Title
Multiple hot-deck imputation for network inference from RNA sequencing data.
Abstract
Motivation: Network inference provides a global view of the relationships between the expression levels of genes in a given transcriptomic experiment (often only for a restricted list of chosen genes). It remains a challenging problem, however: although the cost of sequencing has decreased in recent years, the number of samples in a given experiment is still very small compared to the number of genes. Results: We propose a method to increase the reliability of the inference when RNA-seq expression data have been measured together with an auxiliary dataset that can provide external information on gene expression similarity between samples. Our statistical approach, hd-MI, is based on multiple hot-deck imputation: samples without available RNA-seq data are treated as missing data and imputed with the help of the secondary dataset, on which they are observed. hd-MI can improve the reliability of the inference for missing rates of up to 30% and provides more stable networks with fewer false positive edges. From a biological point of view, hd-MI also proved relevant for inferring networks from RNA-seq data acquired in adipose tissue during a nutritional intervention in obese individuals. These networks highlighted novel links between genes and showed improved comparability between the two steps of the nutritional intervention.
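Illustration
The imputation idea summarized in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the hd-MI implementation: the abstract does not specify how donors are selected, so the Euclidean distance on the auxiliary data, the donor pool size k, and the function name hot_deck_impute are all hypothetical choices made here for illustration. Samples without RNA-seq data are marked with None, each one receives the full expression profile of a donor drawn at random among its k closest samples in the auxiliary dataset, and the draw is repeated to obtain several completed datasets.

```python
import math
import random


def hot_deck_impute(expr, aux, k=3, n_imputations=5, seed=0):
    """Return n_imputations completed copies of expr (illustrative only).

    expr: list with one entry per sample; a list of expression values,
          or None when the sample has no RNA-seq data.
    aux:  list with one auxiliary profile per sample, observed for all.
    k:    assumed size of the donor pool (not taken from the paper).
    """
    rng = random.Random(seed)
    donors = [i for i, row in enumerate(expr) if row is not None]
    recipients = [i for i, row in enumerate(expr) if row is None]
    if not donors:
        raise ValueError("at least one sample must have expression data")

    completed = []
    for _ in range(n_imputations):
        filled = [None if row is None else list(row) for row in expr]
        for r in recipients:
            # Rank donors by their distance to the recipient in the
            # auxiliary space (assumed similarity measure).
            ranked = sorted(donors, key=lambda d: math.dist(aux[d], aux[r]))
            pool = ranked[:k]
            # Hot-deck step: copy the observed profile of one random donor.
            filled[r] = list(expr[rng.choice(pool)])
        completed.append(filled)
    return completed


# Toy usage: sample 1 has no expression data and is closest to sample 0
# in the auxiliary space.
expr = [[1.0, 2.0], None, [5.0, 6.0]]
aux = [[0.0], [0.1], [10.0]]
completed = hot_deck_impute(expr, aux, k=1, n_imputations=2)
```

With k=1 the donor pool for sample 1 contains only sample 0, so both completed datasets copy its profile. With a larger pool the completed datasets differ from one another, and a network would then be inferred on each of them; how the resulting networks are combined is not described in the abstract and is left out of this sketch.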
Year
2018
DOI
10.1093/bioinformatics/btx819
Venue
BIOINFORMATICS
Field
Data mining, Inference, Computer science, Software, Imputation (statistics), Missing data, Comparability, R package
DocType
Journal
Volume
34
Issue
10
ISSN
1367-4803
Citations
0
PageRank
0.34
References
5
Authors
8