Title
Mining Contrast Sequential Patterns based on Subsequence Location Distribution from Biological Sequences
Abstract
With the generation of large amounts of biological data, research on methods that can automatically analyze such data has become a hot spot. Contrast sequential patterns play an important role in identifying the characteristics of different biological sequences. However, previous studies on mining contrast sequential patterns did not consider the effect of the location distribution of genes/amino acids on patterns in given biological sequences. In this paper, we introduce subsequence location distribution into the conditions of contrast sequential pattern mining, extending previous studies, which considered only the support of patterns. We also design a novel algorithm, SLD-tree, which compresses the dataset into a tree to avoid repeated scanning and effectively mines contrast sequential patterns based on subsequence location distribution. An empirical study using real-world biological sequences demonstrates the effectiveness of our method. Moreover, we carry out a classification experiment, and the results verify that our method achieves higher classification accuracy.
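To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the SLD-tree algorithm itself. It assumes the common support-based definition of a contrast pattern (frequent in one class, infrequent in the other) and uses the mean relative position of the first matched symbol as a stand-in for "location distribution"; the paper's actual measure is not given here. All function names and the thresholds `alpha` and `beta` are hypothetical.

```python
from statistics import mean


def match_positions(seq, pattern):
    """Greedy leftmost subsequence match (gaps allowed).

    Returns the list of matched indices in `seq`, or None if `pattern`
    is not a subsequence of `seq`.
    """
    positions = []
    start = 0
    for symbol in pattern:
        idx = seq.find(symbol, start)
        if idx == -1:
            return None
        positions.append(idx)
        start = idx + 1
    return positions


def support(dataset, pattern):
    """Fraction of sequences in `dataset` that contain `pattern`."""
    hits = sum(1 for seq in dataset if match_positions(seq, pattern) is not None)
    return hits / len(dataset)


def mean_relative_start(dataset, pattern):
    """Assumed location measure: mean of (first matched index / sequence length).

    Returns None when no sequence contains the pattern.
    """
    starts = []
    for seq in dataset:
        pos = match_positions(seq, pattern)
        if pos:  # skip non-matches (and the degenerate empty pattern)
            starts.append(pos[0] / len(seq))
    return mean(starts) if starts else None


def is_contrast(pos_set, neg_set, pattern, alpha=0.6, beta=0.3):
    """Support-only contrast test: frequent in pos_set, infrequent in neg_set."""
    return support(pos_set, pattern) >= alpha and support(neg_set, pattern) <= beta


# Toy data, invented purely for illustration.
positive = ["ACGT", "AGCT", "TACG"]
negative = ["TTGA", "GGCA", "ACTT"]
```

With this toy data, `is_contrast(positive, negative, "AG")` holds under the default thresholds, and `mean_relative_start` can then be compared across classes as an additional, location-aware condition. A tree-based method such as the one the abstract describes would avoid rescanning the dataset for each candidate, which this sketch does not attempt.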
Year
2019
DOI
10.1145/3352411.3352443
Venue
Proceedings of the 2019 2nd International Conference on Data Science and Information Technology
Keywords
Classification, Contrast sequential pattern, Location distribution
DocType
Conference
ISBN
978-1-4503-7141-4
Citations
0
PageRank
0.34
References
0
Authors
3
Name            Order   Citations   PageRank
Qing Li         1       83          7.74
Xiangtao Chen   2       20          4.07
Ronghui Wu      3       21          5.72