Title
TagSNP-set selection for genotyping using integrated data
Abstract
Single-nucleotide polymorphisms (SNPs) are vital for identifying genetic-level variation in complex diseases. It has been found that the information carried by SNPs on adjacent or identical genes can be represented by a small number of tagSNPs (called a tag SNP-set or tagSNP-set). In this work, we propose a novel method called TagSNP-set Selection by Optimal Iteration with Linkage Disequilibrium (TSOILD) and develop a quantitative tagSNP-set prediction method called the Physical Distance-Linkage Disequilibrium Prediction Method (PDLDPM). To verify the validity of TSOILD and PDLDPM, a large amount of test data is generated with the simulation software HAPGEN2. According to the experimental results, the prediction accuracy of TSOILD is 6.73%, 3.19%, 6.52% and 1.72% higher than that of Random Sampling, the Genetic Algorithm (GA), the Greedy Algorithm and the TagSNP-Set Selection Method with Maximum Information (TSMI), respectively. In addition, PDLDPM, Linkage Coverage and Selection of Tag SNPs to Maximize Prediction Accuracy (STAMPA) are used to evaluate the tagSNP-sets selected by Random Sampling, GA, the Greedy Algorithm and TSMI. The results show that PDLDPM performs better than the other two methods. These methods provide effective assistance for studying genetic-level variation in complex diseases.
Year
2021
DOI
10.1016/j.future.2020.09.007
Venue
Future Generation Computer Systems
Keywords
00-01,99-00
DocType
Journal
Volume
115
ISSN
0167-739X
Citations
0
PageRank
0.34
References
1
Authors
6
Name             Order   Citations   PageRank
Shudong Wang     1       37          7.33
Gaowei Liu       2       0           0.34
Xin-Zeng Wang    3       4           2.09
Yuanyuan Zhang   4       121         11.56
Sicheng He       5       0           0.34
Yulin Zhang      6       52          12.72