Title
ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation.
Abstract
The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of 'index' orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html.
Year: 2017
DOI: 10.1093/nar/gkw934
Venue: NUCLEIC ACIDS RESEARCH
Field: Genome, Sequence alignment, Synteny, MICROBIOLOGY PROCEDURES, Protein family, RefSeq, Annotation, Phylogenetic tree, Biology, Genetics, Database
DocType: Journal
Volume: 45
Issue: D1
ISSN: 0305-1048
Citations: 1
PageRank: 0.35
References: 7
Authors: 3
Name | Order | Citations | PageRank
David M. Kristensen | 1 | 1428.54
Yuri I. Wolf | 2 | 54076.15
Eugene V. Koonin | 3 | 986239.69