Title
Novel applications of multitask learning and multiple output regression to multiple genetic trait prediction.
Abstract
Given a set of biallelic molecular markers, such as SNPs, with genotype values encoded numerically on a collection of plant, animal or human samples, the goal of genetic trait prediction is to predict quantitative trait values by simultaneously modeling all marker effects. Genetic trait prediction is usually formulated as a linear regression model. In many cases, multiple traits are observed for the same set of samples and markers, and some of these traits may be correlated with each other. Modeling all the traits together may therefore improve prediction accuracy. In this work, we view the multitrait prediction problem from a machine learning angle: as either a multitask learning problem or a multiple output regression problem, depending on whether different traits share the same genotype matrix or not. We then adapt multitask learning algorithms and multiple output regression algorithms to solve the multitrait prediction problem. We propose a few strategies to reduce the least squares error of the predictions from these algorithms. Our experiments show that modeling multiple traits together can improve the prediction accuracy for correlated traits.
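The linear formulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a plain closed-form ridge regression fitted to several traits at once, assuming all traits share one genotype matrix coded 0/1/2. The function names, the penalty `lam`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_multi_output_ridge(X, Y, lam=1.0):
    """Closed-form ridge fit for all traits at once (illustrative only).

    X: (n_samples, n_markers) genotypes coded 0/1/2.
    Y: (n_samples, n_traits) quantitative trait values.
    Returns marker means, trait means, and a coefficient matrix B
    of shape (n_markers, n_traits).
    """
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    Xc = X - x_mean  # center markers so no explicit intercept is needed
    Yc = Y - y_mean
    n_markers = X.shape[1]
    # Solve (Xc'Xc + lam*I) B = Xc'Yc for every trait column simultaneously.
    B = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers), Xc.T @ Yc)
    return x_mean, y_mean, B

def predict(model, X):
    x_mean, y_mean, B = model
    return (X - x_mean) @ B + y_mean

# Synthetic example (assumed data): two traits driven by shared marker effects.
rng = np.random.default_rng(0)
n, p, t = 200, 50, 2
X = rng.integers(0, 3, size=(n, p)).astype(float)
shared = rng.normal(size=p)  # effects common to both traits
B_true = np.column_stack([shared + 0.1 * rng.normal(size=p) for _ in range(t)])
Y = X @ B_true + rng.normal(scale=0.5, size=(n, t))

model = fit_multi_output_ridge(X[:150], Y[:150], lam=1.0)
Y_hat = predict(model, X[150:])
mse = ((Y_hat - Y[150:]) ** 2).mean(axis=0)  # per-trait least squares error
```

With a shared genotype matrix and an independent ridge penalty, this joint fit gives the same coefficients as fitting each trait separately, so it serves only as a baseline. Gaining accuracy from trait correlation requires a model that couples the traits, which is what the multitask and multiple output regression methods mentioned in the abstract aim to do.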
Year
2016
DOI
10.1093/bioinformatics/btw249
Venue
BIOINFORMATICS
Field
Data mining, Trait, Computer science, Software, Artificial intelligence, Least square error, Linear regression, Quantitative trait locus, Multi-task learning, Regression, Linear model, Bioinformatics, Machine learning
DocType
Journal
Volume
32
Issue
12
ISSN
1367-4803
Citations
6
PageRank
0.49
References
8
Authors
3
Name            Order   Citations   PageRank
Dan He          1       133         12.54
David N Kuhn    2       10          1.01
Laxmi Parida    3       773         77.21