Title
BIGpre: A Quality Assessment Package for Next-Generation Sequencing Data.
Abstract
The emergence of next-generation sequencing (NGS) technologies has significantly improved sequencing throughput and reduced costs. However, the short read length, duplicate reads, and massive volume of data make data processing much more difficult and complicated than with first-generation sequencing technology. Although some software packages have been developed to assess data quality, those packages either are not easily available to users or require bioinformatics skills and computing resources. Moreover, almost all of the quality assessment software currently available does not take sequencing errors into account when assessing duplicates in NGS data. Here, we present a new user-friendly quality assessment software package called BIGpre, which works for both Illumina and 454 platforms. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read GC-content distribution, and base Ns quality. More importantly, BIGpre incorporates associated programs that detect and remove duplicate reads after taking sequencing errors into account and that trim low-quality reads from the raw data. BIGpre is primarily written in Perl and integrates graphical capability from the statistics package R. This package produces both tabular and graphical summaries of data quality for sequencing datasets from Illumina and 454 platforms. Processing hundreds of millions of reads within minutes, this package provides immediate diagnostic information that lets users manipulate sequencing data for downstream analyses. BIGpre is freely available at http://bigpre.sourceforge.net/.
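The checks named in the abstract can be illustrated with a minimal Python sketch. This is not BIGpre's implementation (BIGpre itself is written in Perl and R, and the abstract does not describe its algorithms); the function names, the Hamming-distance criterion for error-tolerant duplicates, the `max_mismatches` parameter, and the Phred cutoff `min_q` are all assumptions chosen for illustration.

```python
def gc_content(read: str) -> float:
    """Fraction of G/C among called bases; N and other symbols are ignored."""
    called = [base for base in read.upper() if base in "ACGT"]
    if not called:
        return 0.0
    return sum(base in "GC" for base in called) / len(called)


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length reads."""
    return sum(x != y for x, y in zip(a, b))


def remove_near_duplicates(reads, max_mismatches=1):
    """Keep the first read of every group of near-identical reads.

    Two reads count as duplicates when they have the same length and differ
    at no more than max_mismatches positions (an assumed stand-in for
    "taking sequencing errors into account"). The pairwise comparison is
    quadratic, so this is only suitable for small illustrative inputs.
    """
    kept = []
    for read in reads:
        is_duplicate = any(
            len(read) == len(other) and hamming(read, other) <= max_mismatches
            for other in kept
        )
        if not is_duplicate:
            kept.append(read)
    return kept


def trim_low_quality_tail(seq, quals, min_q=20):
    """Drop trailing bases whose Phred score is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "TTTTTTTT", "ACGTACGT"]
    print(remove_near_duplicates(reads))
    print(gc_content("GGCCAATT"))
    print(trim_low_quality_tail("ACGTAC", [30, 30, 30, 30, 10, 5]))
```

With `max_mismatches=0` the duplicate filter reduces to exact-match removal, which is the behaviour the abstract attributes to tools that ignore sequencing errors. A tool handling hundreds of millions of reads would need an indexed or hashed comparison rather than this pairwise loop.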
Year: 2011
DOI: 10.1016/S1672-0229(11)60027-2
Venue: Genomics, Proteomics & Bioinformatics
Keywords: next-generation sequencing, quality assessment, duplicate reads, sequencing error
Field: Data mining, Data processing, Data quality, Computer science, Raw data, Software, DNA sequencing, Bioinformatics, Throughput, Trimming, Perl
DocType: Journal
Volume: 9
Issue: 6
ISSN: 1672-0229
Citations: 4
PageRank: 0.56
References: 6
Authors: 7
Name | Order | Citations | PageRank
Tongwu Zhang | 1 | 4 | 0.56
Yingfeng Luo | 2 | 4 | 0.56
Kan Liu | 3 | 4 | 0.56
Linlin Pan | 4 | 4 | 0.56
Bing Zhang | 5 | 4 | 0.56
Jun Yu | 6 | 5116.26
Songnian Hu | 7 | 8716.83