Title
RNASeq_similarity_matrix: visually identify sample mix-ups in RNASeq data using a 'genomic' sequence similarity matrix.
Abstract
Summary: Mistakes in linking a patient's biological samples with their phenotype data can confound RNA-Seq studies. The current method for avoiding such sample mix-ups is to test for inconsistencies between biological data and known phenotype data such as sex. However, in DNA studies a common QC step is to check for unexpected relatedness between samples. Here, we extend this method to RNA-Seq, which allows the detection of duplicated samples without relying on identifying inconsistencies with phenotype data.
Results: We present RNASeq_similarity_matrix: an automated tool to generate a sequence similarity matrix from RNA-Seq data, which can be used to visually identify sample mix-ups. This is particularly useful when a study contains multiple samples from the same individual, but can also detect contamination in studies with only one sample per individual.
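To illustrate the general idea of a sample-by-sample similarity matrix, the following is a minimal sketch and not the tool's actual implementation, which this record does not describe. It assumes each sample has already been reduced to genotype calls at a shared set of variant sites, encoded as alternate-allele counts (0, 1, 2) with None for sites lacking coverage. The function names, the sample labels, and the 0.9 flagging threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

# Assumed encoding: alternate-allele count per site; None means no call.
Genotype = Optional[int]


def concordance(a: List[Genotype], b: List[Genotype]) -> float:
    """Fraction of sites called in both samples where the genotypes match."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")  # no overlapping calls, similarity is undefined
    return sum(x == y for x, y in shared) / len(shared)


def similarity_matrix(
    calls: Dict[str, List[Genotype]]
) -> Dict[Tuple[str, str], float]:
    """Pairwise concordance for every ordered pair of samples."""
    names = sorted(calls)
    return {(r, c): concordance(calls[r], calls[c]) for r in names for c in names}


def flag_pairs(
    matrix: Dict[Tuple[str, str], float], threshold: float = 0.9
) -> List[Tuple[str, str]]:
    """Distinct sample pairs whose similarity meets the (hypothetical) threshold."""
    names = sorted({name for pair in matrix for name in pair})
    return [p for p in combinations(names, 2) if matrix[p] >= threshold]


# Toy data: two samples labelled as the same individual and one unrelated sample.
calls = {
    "P1_blood": [0, 1, 2, 1, 0, 2, None, 1],
    "P1_skin": [0, 1, 2, 1, 0, 2, 1, 1],
    "P2_blood": [2, 0, 1, 0, 1, 0, 1, 2],
}

matrix = similarity_matrix(calls)
suspicious = flag_pairs(matrix)
```

In this toy data the two P1 samples agree at every site called in both, so they are the only flagged pair. In a real study, a flagged pair with different individual labels, or an unflagged pair with the same label, would point to a possible mix-up. Plotting the matrix as a heatmap would give the visual check the abstract refers to.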
Year: 2020
DOI: 10.1093/bioinformatics/btz821
Venue: BIOINFORMATICS
Field: Data mining, Computer science, Computational biology, Similarity matrix
DocType: Journal
Volume: 36
Issue: 6
ISSN: 1367-4803
Citations: 0
PageRank: 0.34
References: 0
Authors: 7
Name               Order  Citations  PageRank
Nicolaas C Kist    1      0          0.34
Robert A Power     2      0          0.34
Andrew Skelton     3      0          0.34
Seth D Seegobin    4      0          0.34
Moira Verbelen     5      0          0.34
Bushan Bonde       6      0          0.34
Karim Malki        7      1          1.07