Title
BIRD: identifying cell doublets via biallelic expression from single cells.
Abstract
Current technologies for single-cell transcriptomics allow thousands of cells to be analyzed in a single experiment. The increased scale of these methods raises the risk of contamination by cell doublets. Available tools and algorithms for identifying doublets and estimating their occurrence in single-cell experimental data focus on doublets of different species, cell types or individuals. In this study, we analyze transcriptomic data from single cells having an identical genetic background. We claim that the ratio of monoallelic to biallelic expression provides discriminating power for doublet identification. We present a pipeline called BIallelic Ratio for Doublets (BIRD) that relies on heterologous genetic variations from single-cell RNA sequencing. For each dataset, doublets were artificially created from the actual data and used to train a predictive model. BIRD was applied to Smart-seq data from 163 primary fibroblast single cells. The model achieved 100% accuracy in annotating the randomly simulated doublets. Bona fide doublets were verified based on a biallelic expression signal from the X-chromosome of female fibroblasts. On data from 10X Genomics microfluidics of human peripheral blood cells, the model achieved on average 83% (±3.7%) accuracy and an area under the curve of 0.88 (±0.04) for a collection of ~13 300 single cells. BIRD addresses instances of doublets that were formed from cell mixtures of identical genetic background and cell identity. Maximal performance is achieved for high-coverage data from Smart-seq. Success in identifying doublets is data-specific and varies according to the experimental methodology, genomic diversity between haplotypes, sequence coverage and depth.
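The idea in the abstract can be illustrated with a minimal sketch. This is not the BIRD pipeline itself. It assumes per-cell read counts for the reference and alternative allele at a shared list of variant sites, and it uses hypothetical helpers (`biallelic_ratio`, `simulate_doublet`, `toy_cell`) and toy parameters (`min_reads`, `p_mono`) that do not come from the paper. It shows why merging two cells that each express mostly one allele per site tends to raise the fraction of sites with reads from both alleles.

```python
import random


def biallelic_ratio(ref_counts, alt_counts, min_reads=2):
    """Fraction of expressed sites that show reads from both alleles.

    A site counts as expressed when ref + alt >= min_reads
    (min_reads is an illustrative threshold, not a value from the paper).
    """
    expressed = 0
    biallelic = 0
    for ref, alt in zip(ref_counts, alt_counts):
        if ref + alt < min_reads:
            continue
        expressed += 1
        if ref > 0 and alt > 0:
            biallelic += 1
    return biallelic / expressed if expressed else 0.0


def simulate_doublet(cell_a, cell_b):
    """Create an artificial doublet by summing per-site allele counts."""
    ref = [a + b for a, b in zip(cell_a[0], cell_b[0])]
    alt = [a + b for a, b in zip(cell_a[1], cell_b[1])]
    return ref, alt


def toy_cell(rng, n_sites=200, p_mono=0.8):
    """Hypothetical toy cell in which most sites express a single allele."""
    ref, alt = [], []
    for _ in range(n_sites):
        depth = rng.randint(0, 6)
        if rng.random() < p_mono:
            # Monoallelic site: all reads come from one randomly chosen allele.
            if rng.random() < 0.5:
                ref.append(depth)
                alt.append(0)
            else:
                ref.append(0)
                alt.append(depth)
        else:
            # Biallelic site: reads are split between the two alleles.
            r = rng.randint(0, depth)
            ref.append(r)
            alt.append(depth - r)
    return ref, alt


rng = random.Random(0)
cells = [toy_cell(rng) for _ in range(50)]
doublets = [simulate_doublet(*rng.sample(cells, 2)) for _ in range(50)]

singlet_ratios = [biallelic_ratio(*c) for c in cells]
doublet_ratios = [biallelic_ratio(*d) for d in doublets]
mean_singlet = sum(singlet_ratios) / len(singlet_ratios)
mean_doublet = sum(doublet_ratios) / len(doublet_ratios)
print(f"mean biallelic ratio, singlets: {mean_singlet:.2f}")
print(f"mean biallelic ratio, simulated doublets: {mean_doublet:.2f}")
```

In a workflow of the kind the abstract describes, ratios like these, computed on real cells and on doublets simulated from them, would serve as input to a predictive model. The choice of model and features here is left open, since the abstract does not specify them.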
Year
2020
DOI
10.1093/bioinformatics/btaa474
Venue
BIOINFORMATICS
Keywords
X-inactivation,Allelic bias,scRNA-seq,single cell,Allele specific expression,X10 Genomics,Single nucleotide polymorphism
DocType
Journal
Volume
36
Issue
SUPnan
ISSN
1367-4803
Citations
0
PageRank
0.34
References
0
Authors
2
Name, Order, Citations, PageRank
Kerem Wainer-Katsir, 1, 0, 0.34
Michal Linial, 2, 1502149.92