Title
Missing value estimation for DNA microarray gene expression data: local least squares imputation.
Abstract
Gene expression data often contain missing expression values. Effective missing value estimation methods are needed since many algorithms for gene expression data analysis require a complete matrix of gene array values. In this paper, imputation methods based on the least squares formulation are proposed to estimate missing values in gene expression data; they exploit local similarity structures in the data as well as a least squares optimization process. The proposed local least squares imputation method (LLSimpute) represents a target gene that has missing values as a linear combination of similar genes. The similar genes are chosen as the k nearest neighbors or as the k coherent genes that have large absolute values of Pearson correlation coefficients. A non-parametric missing value estimation method based on LLSimpute is designed by introducing an automatic k-value estimator. In our experiments, the proposed LLSimpute method shows competitive results when compared with other imputation methods for missing value estimation on various datasets and percentages of missing values in the data. The software is available at http://www.cs.umn.edu/~hskim/tools.html. Contact: hpark@cs.umn.edu
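The core idea in the abstract, writing a target gene as a least squares combination of its most similar genes, can be illustrated with a minimal sketch. This is not the authors' software: the function name `lls_impute_gene` is hypothetical, and the choices of Euclidean distance for the neighbor search, of using only complete rows as candidates, and of a fixed user-supplied k (rather than the automatic k-value estimator) are assumptions made for illustration.

```python
import numpy as np


def lls_impute_gene(data, target, k):
    """Fill NaN entries of row `target` using its k nearest complete rows.

    data   : 2-D float array, genes in rows, experiments in columns
    target : index of the gene (row) with missing values
    k      : number of similar genes to use (assumed fixed here)
    """
    g = data[target]
    miss = np.isnan(g)   # columns to estimate
    obs = ~miss          # columns where the target is observed

    # Assumption: candidate genes are rows without any missing value.
    complete = np.where(~np.isnan(data).any(axis=1))[0]
    complete = complete[complete != target]

    # Assumption: similarity is Euclidean distance over observed columns.
    dists = np.linalg.norm(data[complete][:, obs] - g[obs], axis=1)
    nearest = complete[np.argsort(dists)[:k]]

    A = data[nearest][:, obs]    # neighbors at the observed columns
    B = data[nearest][:, miss]   # neighbors at the missing columns

    # Coefficients x minimizing || A^T x - g_obs ||_2.
    x, *_ = np.linalg.lstsq(A.T, g[obs], rcond=None)

    # The same linear combination, applied to the missing columns.
    filled = g.copy()
    filled[miss] = B.T @ x
    return filled
```

Calling `lls_impute_gene(data, i, k)` for each row `i` that contains `NaN` would fill a matrix gene by gene. Selecting neighbors by absolute Pearson correlation instead of distance, as the abstract also mentions, would only change how `nearest` is chosen.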
Year
2005
DOI
10.1093/bioinformatics/btk053
Venue
Bioinformatics/Computer Applications in the Biosciences
Keywords
gene expression data analysis, similar gene, missing value estimation, gene array value, missing expression value, DNA microarray gene expression, non-parametric missing values estimation, effective missing value estimation, missing value, gene expression data, imputation method, squares imputation, least square, DNA microarray, missing values, k nearest neighbor
DocType
Journal
Volume
22
Issue
11
ISSN
1367-4803
Citations
144
PageRank
6.89
References
6
Authors
3
Name            Order  Citations  PageRank
Hyunsoo Kim     1      155        8.12
Gene H. Golub   2      2558       856.07
Haesun Park     3      3546       232.42