Title
cnAnalysis450k: an R package for comparative analysis of 450k/EPIC Illumina methylation array derived copy number data.
Abstract
Motivation: Detailed copy number (CN) variation data can be obtained from 450k or EPIC Illumina methylation assays. However, the effects of different preprocessing strategies (normalization, transformation and selection of gain/loss cutoff values) on variant calling have not been evaluated systematically. Results: We provide an R package that allows direct comparison of any preprocessed CN data. It provides its own CN alteration detection methodology: segments are identified through detection of changes in the variance of CN data and are subsequently filtered for significance. Meaningful cutoffs for gain/loss definition can be identified automatically through analysis of the resulting Delta CN distributions of all analyzed samples. Three example datasets (2x450k, 1xEPIC) were selected for comparative analyses of Raw, Illumina, SWAN, Quantile, Noob, Funnorm and Dasen normalizations. Importantly, all CN data distributions were skewed (skewness -0.66 to -1.2), therefore requiring different gain/loss cutoffs. Depending on the normalization method, prominent baseline differences between samples could be observed. We present a workflow that alleviates both issues: Z-transformation removes baseline differences between samples, and automatic cutoff selection circumvents the problems accompanying the skewed distributions. Additional filtering of candidates by significance yields comparable results for most of the enumerated normalization methods, with the exception of SWAN. In contrast, manual cutoff determination results in highly variable numbers of variant calls that depend strongly on the selected normalization method. Taken together, we present a workflow that allows robust identification of copy number alterations in methylation array data, fairly independently of the applied normalization.
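The two ideas in the abstract, removing per-sample baseline differences by Z-transformation and checking the skew of a CN distribution, can be illustrated with a minimal sketch. This is not the cnAnalysis450k implementation (the package is written in R). The function names `z_transform` and `skewness`, the moment-based skewness definition and the two sample vectors are assumptions chosen purely for illustration.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Center values to mean 0 and scale them to unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("constant input cannot be z-transformed")
    return [(v - mu) / sigma for v in values]


def skewness(values):
    """Moment-based skewness: mean of the cubed standardized values."""
    return mean([z ** 3 for z in z_transform(values)])


# Hypothetical CN values for two samples. sample_b is sample_a shifted by a
# constant, mimicking a baseline difference introduced by normalization.
sample_a = [-0.9, -0.2, 0.0, 0.1, 0.2, 0.3]
sample_b = [v + 0.5 for v in sample_a]

# After Z-transformation both samples lie on the same scale.
print(z_transform(sample_a))
print(z_transform(sample_b))

# A negative value indicates a longer tail towards losses, so a single
# symmetric +/- cutoff would treat gains and losses unequally.
print(skewness(sample_a))
```

Because the Z-transformation subtracts each sample's own mean, the constant offset between the two hypothetical samples disappears, while the asymmetry of the distribution is preserved and still has to be handled when choosing gain and loss cutoffs.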
Year
2017
DOI
10.1093/bioinformatics/btx156
Venue
BIOINFORMATICS
Field
Normalization (statistics), Computer science, Cutoff, Filter (signal processing), EPIC, Preprocessor, Quantile, Bioinformatics, R package
DocType
Journal
Volume
33
Issue
15
ISSN
1367-4803
Citations
1
PageRank
0.43
References
3
Authors
3
Name               Order  Citations  PageRank
Maximilian Knoll   1      1          0.43
Jürgen Debus       2      8          3.23
Amir Abdollahi     3      11         2.56