Title
Olfactory Receptor Database: A Metadata-Driven Automated Population From Sources Of Gene And Protein Sequences
Abstract
The Olfactory Receptor Database (ORDB; http://senselab.med.yale.edu/senselab/ordb) is a central repository of olfactory receptor (OR) and olfactory receptor-like gene and protein sequences. To deal with the very large OR gene family, we have constructed an algorithm that automatically downloads sequences from web sources such as GenBank and SWISS-PROT into the database. The algorithm uses hypertext markup language (HTML) parsing techniques to extract the information relevant to ORDB. This information is then correlated with the metadata in the ORDB knowledge base to encode the extracted unstructured text in a structured format compliant with the database architecture, Entity-Attribute-Value with Classes and Relationships (EAV/CR), which supports the SenseLab project as a whole. Three population methods (batch, automatic and semi-automatic) are discussed. The data is imported into the database using extensible markup language (XML).
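The pipeline summarized in the abstract (parse HTML, match the extracted text against metadata, emit XML for import) can be illustrated with a minimal sketch. This is not the authors' implementation: the `METADATA` mapping, the attribute names, the `<import>`/`<value>` XML layout, and the sample record are all hypothetical, chosen only to show the general shape of a metadata-driven extraction using the Python standard library.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical metadata: maps labels found in a source page to attribute
# names. These names are illustrative and are not ORDB's actual schema.
METADATA = {
    "LOCUS": "accession",
    "DEFINITION": "description",
    "ORGANISM": "organism",
}


class TextExtractor(HTMLParser):
    """Collects the text content of an HTML page, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(html):
    """Return the page text with markup removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def to_eav(entity_id, text, metadata=METADATA):
    """Turn 'LABEL value' lines into (entity, attribute, value) triples.

    Lines whose label is not listed in the metadata are skipped.
    """
    triples = []
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0] in metadata:
            triples.append((entity_id, metadata[parts[0]], parts[1].strip()))
    return triples


def to_xml(triples):
    """Serialize the triples as a small XML document (assumed layout)."""
    root = ET.Element("import")
    for entity, attribute, value in triples:
        row = ET.SubElement(root, "value", entity=entity, attribute=attribute)
        row.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample page; the accession and description are not real data.
SAMPLE_HTML = """<html><body><pre>
LOCUS       <b>XX000001</b>
DEFINITION  Hypothetical olfactory receptor, made-up sample record
ORGANISM    Mus musculus
COMMENT     ignored because it is not in the metadata
</pre></body></html>"""

triples = to_eav("rec1", extract_text(SAMPLE_HTML))
xml_doc = to_xml(triples)
print(xml_doc)
```

Keeping the label-to-attribute mapping in data rather than in code is the point of the sketch: supporting a new source field means adding a metadata entry, not changing the parser.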
Year: 2002
DOI: 10.1093/nar/30.1.354
Venue: NUCLEIC ACIDS RESEARCH
Keywords: markup language,algorithms,gene family,database management systems,automation,systems integration,internet,protein sequence,structure formation,knowledge base,extensible markup language,amino acid sequence,forecasting,entity attribute value
Field: Metadata,Population,XML,Biology,Parsing,Knowledge base,HTML,GenBank,Database,Entity–attribute–value model
DocType: Journal
Volume: 30
Issue: 1
ISSN: 0305-1048
Citations: 14
PageRank: 1.55
References: 4
Authors: 4
Name | Order | Citations | PageRank
Chiquito J. Crasto | 1 | 91 | 8.30
Luis N. Marenco | 2 | 172 | 33.85
P L Miller | 3 | 445 | 93.86
G M Shepherd | 4 | 508 | 73.75