Abstract | ||
---|---|---|
Motivation: Recent studies have shown that RNA-sequencing (RNA-seq) can be used to measure mRNA of sufficient quality extracted from formalin-fixed paraffin-embedded (FFPE) tissues to provide whole-genome transcriptome analysis. However, little attention has been given to the normalization of FFPE RNA-seq data, a key step that adjusts for unwanted biological and technical effects that can bias the signal of interest. Existing methods, developed for fresh-frozen or similar-type samples, may perform suboptimally on FFPE data. Results: We propose a new normalization method, MIXnorm, for FFPE RNA-seq data. MIXnorm relies on a two-component mixture model, which models non-expressed genes by zero-inflated Poisson distributions and expressed genes by truncated normal distributions. To obtain maximum likelihood estimates, we developed a nested EM algorithm in which closed-form updates are available in each iteration. Because no numerical optimization is needed in the M-step, the algorithm is easy to implement and computationally efficient. We evaluated MIXnorm through simulations and cancer studies. MIXnorm improves significantly on commonly used normalization methods for RNA-seq expression data. |
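
For readers unfamiliar with the two distribution families named in the abstract, their textbook forms are sketched below, together with a generic two-component mixture and its E-step weight. The symbols ($\pi$, $p$, $\lambda$, $\mu$, $\sigma$, $a$, $\tau$) are assumptions introduced here for illustration. The abstract does not give the paper's exact parameterization, truncation point, or the scale on which the data are modeled, so this should not be read as the MIXnorm likelihood itself.

$$
\begin{aligned}
\text{Zero-inflated Poisson:}\quad
& P(Y = 0) = p + (1 - p)\,e^{-\lambda}, \qquad
  P(Y = y) = (1 - p)\,\frac{e^{-\lambda}\lambda^{y}}{y!}, \quad y = 1, 2, \dots \\
\text{Normal truncated to } [a, \infty)\text{:}\quad
& f_1(x) = \frac{\phi\!\left(\frac{x - \mu}{\sigma}\right)}{\sigma\left[1 - \Phi\!\left(\frac{a - \mu}{\sigma}\right)\right]}, \quad x \ge a \\
\text{Two-component mixture:}\quad
& f(y) = \pi\, f_0(y) + (1 - \pi)\, f_1(y) \\
\text{E-step weight (non-expressed):}\quad
& \tau(y) = \frac{\pi\, f_0(y)}{\pi\, f_0(y) + (1 - \pi)\, f_1(y)}
\end{aligned}
$$

Here $f_0$ denotes the zero-inflated Poisson component, $f_1$ the truncated normal component, and $\phi$ and $\Phi$ the standard normal density and distribution function. In a generic EM scheme, the weights $\tau$ are computed from the current parameter values in the E-step and then used to re-estimate the parameters in the M-step.
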
Year | DOI | Venue |
---|---|---|
2020 | 10.1093/bioinformatics/btaa153 | BIOINFORMATICS |

DocType | Volume | Issue |
---|---|---|
Journal | 36 | 11 |

ISSN | Citations | PageRank |
---|---|---|
1367-4803 | 0 | 0.34 |

References | Authors |
---|---|
0 | 4 |

Name | Order | Citations | PageRank |
---|---|---|---|
Yin Shen | 1 | 2 | 1.75 |
Xinlei Wang | 2 | 7 | 2.54 |
Gaoxiang Jia | 3 | 0 | 0.34 |
Yang Xie | 4 | 33 | 5.79 |