Title | ||
---|---|---|
Fast decision tree-based method to index large DNA-protein sequence databases using hybrid distributed-shared memory programming model |
Abstract | ||
---|---|---|
Biological databases have grown significantly in recent years, along with the number of users and the rate of queries, and some databases have now reached terabyte size. There is therefore an increasing need to access these databases as quickly as possible. In this paper, the decision tree indexing model, PDTIM, was parallelised using a hybrid of distributed and shared memory on a resident database, with horizontal and vertical growth through the Message Passing Interface (MPI) and POSIX Threads (PThreads), to reduce the index building time. The PDTIM was implemented using 1, 2, 4 and 5 processors with 1, 2, 3 and 4 threads, respectively. The results show that the hybrid technique improved the speedup compared with a sequential version. The results suggest that the proposed PDTIM is appropriate for large data sets in terms of index building time. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1504/IJBRA.2014.060765 | International Journal of Bioinformatics Research and Applications |
Keywords | Field | DocType |
decision tree,bioinformatics,dna,data mining,searching algorithms | Decision tree,Shared memory,Computer science,POSIX Threads,Message Passing Interface,Bioinformatics,Distributed shared memory,Fractal tree index,Database,Incremental decision tree,Speedup | Journal |
Volume | Issue | Citations |
10 | 3 | 2 |
PageRank | References | Authors |
0.36 | 13 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Khalid Mohammad Jaber | 1 | 2 | 0.36 |
Rosni Abdullah | 2 | 156 | 24.82 |
Nur'Aini Abdul Rashid | 3 | 3 | 0.73 |