Title
Fast decision tree-based method to index large DNA-protein sequence databases using hybrid distributed-shared memory programming model
Abstract
In recent years, biological databases have grown significantly in size, along with the number of users and the rate of queries; some databases have reached terabyte scale. There is therefore an increasing need to access these databases as fast as possible. In this paper, the decision tree indexing model (PDTIM) was parallelised using a hybrid of distributed and shared memory on a resident database, with horizontal and vertical growth through the Message Passing Interface (MPI) and POSIX Threads (PThreads), to reduce the index building time. PDTIM was implemented using 1, 2, 4 and 5 processors with 1, 2, 3 and 4 threads respectively. The results show that the hybrid technique improved the speedup compared with a sequential version. The results suggest that the proposed PDTIM is appropriate for large data sets in terms of index building time.
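The two-level decomposition described in the abstract (records spread across processes, then across threads within each process) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not describe PDTIM's internals, so a toy k-mer index stands in for the decision tree, a plain loop stands in for the MPI processes, and a Python thread pool stands in for PThreads. The names `split`, `index_chunk`, `merge`, `build_index` and `speedup` are hypothetical. Because of Python's global interpreter lock the sketch shows the structure of the partitioning, not a real speedup.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def index_chunk(records, k):
    """Toy index (NOT the paper's decision tree): k-mer -> ids of records containing it."""
    index = defaultdict(set)
    for rec_id, seq in records:
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(rec_id)
    return index


def merge(indexes):
    """Union several partial indexes into one plain dict."""
    merged = defaultdict(set)
    for index in indexes:
        for kmer, ids in index.items():
            merged[kmer] |= ids
    return dict(merged)


def build_index(records, n_procs, n_threads, k=3):
    """Two-level build: outer split over processes, inner split over threads."""
    per_proc = []
    # Assumption: this loop stands in for MPI ranks; here they run one after another.
    for proc_records in split(records, n_procs):
        thread_chunks = split(proc_records, n_threads)
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            partials = list(pool.map(lambda chunk: index_chunk(chunk, k), thread_chunks))
        per_proc.append(merge(partials))
    # Assumption: this final merge stands in for gathering partial results at a root rank.
    return merge(per_proc)


def speedup(t_sequential, t_parallel):
    """Standard speedup definition: sequential time divided by parallel time."""
    return t_sequential / t_parallel
```

Calling `build_index(records, 1, 1)` gives the sequential baseline, and any other process and thread count should return the same index, which is a simple way to check that the partitioning loses no records. In an actual hybrid implementation the outer level would typically use MPI calls such as `MPI_Scatter` and `MPI_Gather` and the inner level `pthread_create` and `pthread_join`; whether the authors used exactly these calls is not stated in the abstract.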
Year
2014
DOI
10.1504/IJBRA.2014.060765
Venue
International Journal of Bioinformatics Research and Applications
Keywords
decision tree, bioinformatics, dna, data mining, searching algorithms
Field
Decision tree, Shared memory, Computer science, POSIX Threads, Message Passing Interface, Bioinformatics, Distributed shared memory, Fractal tree index, Database, Incremental decision tree, Speedup
DocType
Journal
Volume
10
Issue
3
Citations
2
PageRank
0.36
References
13
Authors
3
Name, Order, Citations, PageRank
Khalid Mohammad Jaber, 1, 2, 0.36
Rosni Abdullah, 2, 156, 24.82
Nur'Aini Abdul Rashid, 3, 3, 0.73