Mining K-mers of Various Lengths in Biological Sequences. - Citegraph

Paper Info

Title
Mining K-mers of Various Lengths in Biological Sequences.

Abstract
Counting the occurrence frequency of each k-mer in a biological sequence is an important step in many bioinformatics applications. However, most k-mer counting algorithms rely on a given k to produce single-length k-mers, which is inefficient when a sequence must be analyzed for several values of k. Moreover, existing k-mer counters focus more on DNA sequences than on protein sequences. In practice, the analysis of k-mers in protein sequences can provide substantial biological insights into structure, function and evolution. To this end, an efficient algorithm called VLmer (Various Length k-mer mining) is proposed to mine k-mers of various lengths, termed vl-mers, via an inverted-index technique, which is orders of magnitude faster than the conventional forward-index method. Moreover, to the best of our knowledge, VLmer is the first algorithm able to mine k-mers of various lengths in both DNA and protein sequences.
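
The abstract does not spell out how VLmer works, so the following is only a minimal sketch of the general idea of mining k-mers of several lengths with an inverted index; it is not the authors' implementation. The function name `mine_vlmers` and the parameters `k_min`, `k_max` and `min_count` are assumptions made for illustration. The sketch stores, for each k-mer, the list of its start positions, and grows each k-mer by one symbol using those positions instead of rescanning the sequence for every k.

```python
from collections import defaultdict


def mine_vlmers(seq, k_min, k_max, min_count=1):
    """Count every k-mer of length k_min..k_max occurring at least min_count times.

    Illustrative sketch only (hypothetical name and parameters), not VLmer itself.
    Works on any string alphabet, so DNA and protein sequences are handled alike.
    """
    # Inverted index for the shortest length: k-mer -> list of start positions.
    index = defaultdict(list)
    for i in range(len(seq) - k_min + 1):
        index[seq[i:i + k_min]].append(i)
    index = {kmer: pos for kmer, pos in index.items() if len(pos) >= min_count}

    counts = {}
    k = k_min
    while index and k <= k_max:
        for kmer, positions in index.items():
            counts[kmer] = len(positions)

        # Extend each surviving k-mer by one symbol via its position list.
        # Pruning is safe: a (k+1)-mer cannot occur more often than its prefix.
        longer = defaultdict(list)
        for kmer, positions in index.items():
            for i in positions:
                if i + k < len(seq):
                    longer[kmer + seq[i + k]].append(i)
        index = {kmer: pos for kmer, pos in longer.items() if len(pos) >= min_count}
        k += 1
    return counts


if __name__ == "__main__":
    # DNA example: 2-mers and 3-mers that occur at least twice.
    print(mine_vlmers("ACGTACGT", 2, 3, min_count=2))
    # Protein example: the same routine on an amino-acid string.
    print(mine_vlmers("MKVMKV", 1, 2))
```

Because each position list at length k+1 is derived from the list at length k, the sequence is scanned in full only once, for the shortest length. This is the property that makes an inverted index attractive when several values of k are needed.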

Year	2017
DOI	10.1007/978-3-319-59575-7_17
Venue	BIOINFORMATICS RESEARCH AND APPLICATIONS (ISBRA 2017)
Keywords	Sequential pattern mining, K-mer counting, K-mers of various lengths, Biological sequence analysis
Field	Orders of magnitude (numbers), Computer science, Algorithm, DNA, DNA sequencing, Artificial intelligence, Sequential Pattern Mining, Machine learning, Sequence analysis
DocType	Conference
Volume	10330
ISSN	0302-9743
Citations	0
PageRank	0.34
References	16
Authors	7

Authors (7 rows)

Name	Order	Citations	PageRank
Jingsong Zhang	1	38	3.26
Jianmei Guo	2	390	22.80
Xiaoqing Yu	3	75	11.53
Xiangtian Yu	4	38	4.81
Wei-feng Guo	5	8	2.56
Tao Zeng	6	31	9.08
Luonan Chen	7	1485	145.71
