Title
GOChase-II: correcting semantic inconsistencies from Gene Ontology-based annotations for gene products.
Abstract
BACKGROUND: The Gene Ontology (GO) provides a controlled vocabulary for describing genes and gene products. Despite the undoubted importance of GO, several drawbacks of GO and GO-based annotations have been reported. We identified three types of semantic inconsistency in GO-based annotations: semantically redundant, biological-domain inconsistent, and taxonomy inconsistent annotations. METHODS: To detect semantic inconsistencies in GO annotations, we used the hierarchical structure of the GO graph and the tree structure of the NCBI taxonomy. Twenty-seven biological databases were collected to search for semantically inconsistent annotations. RESULTS: The distributions and possible causes of the semantic inconsistencies were investigated using the twenty-seven biological databases with GO-based annotations. We found that some annotation evidence codes were associated with the inconsistencies. The numbers of gene products and species in a database, which are related to the complexity of database management, also correlate with the inconsistencies. Consequently, numerous annotation errors arise and are propagated throughout biological databases and GO-based high-level analyses. GOChase-II was developed to detect and correct both syntactic and semantic errors in GO-based annotations. CONCLUSIONS: We identified some inconsistencies in GO-based annotations and provide software, GOChase-II, for correcting these semantic inconsistencies, in addition to the corrections for syntactic errors already offered by GOChase-I.
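As an illustration of how the GO hierarchy can expose one of the three inconsistency types, the sketch below flags semantically redundant annotations. It assumes, since the abstract does not spell it out, that an annotation is redundant when the same gene product is also annotated with a descendant of that term, so the ancestor term is already implied. The term IDs, the `PARENTS` map, and all function names are hypothetical placeholders, and this is not GOChase-II's actual implementation.

```python
from typing import Dict, Set

# Hypothetical toy fragment of a GO-like DAG: term -> direct parent terms.
# These IDs are placeholders, not real GO identifiers.
PARENTS: Dict[str, Set[str]] = {
    "GO:C": {"GO:B"},
    "GO:B": {"GO:A"},
    "GO:D": {"GO:A"},
    "GO:A": set(),
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every proper ancestor of `term` by walking parent links."""
    seen: Set[str] = set()
    stack = list(parents.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def redundant_terms(annotated: Set[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Annotated terms that are ancestors of another annotated term
    (assumed definition of 'semantically redundant')."""
    implied: Set[str] = set()
    for term in annotated:
        implied |= ancestors(term, parents)
    return annotated & implied


def remove_redundant(annotated: Set[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Keep only the most specific terms for one gene product."""
    return annotated - redundant_terms(annotated, parents)


# One gene product annotated with a specific term, its root ancestor,
# and an unrelated sibling branch.
annotations = {"GO:C", "GO:A", "GO:D"}
flagged = redundant_terms(annotations, PARENTS)
cleaned = remove_redundant(annotations, PARENTS)
```

A real check would also need to account for evidence codes and annotation qualifiers, which this toy example ignores.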
Year: 2011
DOI: 10.1186/1471-2105-12-S1-S40
Venue: BMC Bioinformatics
Keywords: algorithms, computational biology, microarrays, bioinformatics, biological database, semantics, tree structure, database management systems, database management, proteins, controlled vocabulary
Field: Logic error, Gene, Information retrieval, Computer science, Gene ontology, Controlled vocabulary, Biological database, Bioinformatics, Molecular Sequence Annotation, Semantics
DocType: Journal
Volume: 12
Issue: S-1
ISSN: 1471-2105
Citations: 11
PageRank: 0.38
References: 13
Authors: 5
Name, Order, Citations, PageRank
Yu Rang Park, 1, 36, 7.20
Jihun Kim, 2, 160, 15.11
Hye Won Lee, 3, 16, 1.91
Young Jo Yoon, 4, 13, 1.08
Ju Han Kim, 5, 248, 30.80