Title
On-line tools for sequence retrieval and multivariate statistics in molecular biology.
Abstract
We have developed a World-Wide-Web server for browsing sequence collections structured under the ACNUC format and for performing multivariate analyses on sequences. General collections (such as GenBank or EMBL), as well as specialized data banks (such as Hovergen and NRSub), can be accessed. The system allows complex queries to be constructed, and the result of each query, represented as a list of sequences, is stored on the server. This list can then be reused as input for multivariate analyses on the sequences. Two example applications are shown. The first is a study of codon usage, using correspondence analysis, on all the protein-coding genes of Haemophilus influenzae Rd. This study allows the highly expressed genes and the integral membrane proteins of this organism to be identified. The second is an ordination of 70 aligned growth hormone protein sequences using principal coordinate analysis. With this method, we recover the patterns of relationships between the sequences that were previously determined with tree-building programs.
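As a rough illustration of the first application, the sketch below builds the kind of gene-by-codon count table that a correspondence analysis of codon usage takes as input. It is not the server's implementation: the function names (`codon_counts`, `row_profiles`) and the toy sequences are hypothetical, and the correspondence analysis itself is not performed here.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, used as the table columns.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]

def codon_counts(cds):
    """Count codons in one coding sequence, read in frame from position 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    # Triplets containing ambiguous bases (e.g. N) are not in CODONS and are ignored.
    return [counts.get(codon, 0) for codon in CODONS]

def row_profiles(table):
    """Convert each gene's counts to relative frequencies (row profiles)."""
    profiles = []
    for row in table:
        total = sum(row)
        profiles.append([n / total if total else 0.0 for n in row])
    return profiles

# Hypothetical toy sequences, not real Haemophilus influenzae genes.
genes = {
    "geneA": "ATGGCTGCTAAATAA",
    "geneB": "ATGGCCGCCAAGTGA",
}
table = [codon_counts(seq) for seq in genes.values()]
profiles = row_profiles(table)
```

Each row of `table` corresponds to one gene and each column to one codon. A correspondence analysis would then decompose this table to find the axes along which genes differ most in codon usage.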
Year: 1996
DOI: 10.1093/bioinformatics/12.1.63
Venue: COMPUTER APPLICATIONS IN THE BIOSCIENCES
Keywords: world-wide web, sequence data banks, multivariate analysis, retrieval system, codon usage, molecular biology, protein sequence, sequence analysis, multivariate analyses, membrane protein, multivariate statistics, world wide web, correspondence analysis
Field: Sequence alignment, File server, Sequence database, Computer science, Multivariate statistics, Bioinformatics, Correspondence analysis, GenBank, Web server, Sequence analysis
DocType: Journal
Volume: 12
Issue: 1
ISSN: 0266-7061
Citations: 5
PageRank: 1.93
References: 12
Authors: 2
Name
Order
Citations
PageRank
Guy Perrière129240.35
J Thioulouse21715.19