Title
A new method to forecast of Escherichia coli promoter gene sequences: Integrating feature selection and Fuzzy-AIRS classifier system
Abstract
In this work we investigate the real-world task of recognizing biological concepts in DNA sequences. We recognize promoters in strings of nucleotides (each one of A, G, T, or C) using a novel approach that combines feature selection (FS) with an Artificial Immune Recognition System (AIRS) that uses a fuzzy resource allocation mechanism (Fuzzy-AIRS), which we were the first to propose. The aim of this study is to improve the prediction accuracy for Escherichia coli promoter gene sequences. The E. coli promoter gene sequences dataset has 57 attributes and 106 samples, comprising 53 promoters and 53 non-promoters. The proposed system consists of two stages. First, the FS process reduces the dimension of the dataset from 57 attributes to 4 attributes. Second, the Fuzzy-AIRS classifier algorithm is run to predict the E. coli promoter gene sequences. The robustness of the proposed method is examined using prediction accuracy, sensitivity and specificity analysis, k-fold cross-validation, and the confusion matrix. Whereas the Fuzzy-AIRS classifier alone obtained 50% prediction accuracy using 10-fold cross-validation, the proposed system obtained 90% prediction accuracy under the same conditions. These results indicate that the proposed system succeeds in recognizing promoters in strings that represent nucleotides.
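The evaluation measures named in the abstract (prediction accuracy, sensitivity, specificity, confusion matrix, k-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' code: the names `ConfusionMatrix` and `k_fold_indices` are hypothetical, the counts in the usage lines are made-up placeholders rather than results from the paper, and the folds are plain contiguous index blocks with no shuffling or stratification, which is an assumption.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class ConfusionMatrix:
    """2x2 confusion matrix with 'promoter' as the positive class."""
    tp: int  # promoters predicted as promoters
    fn: int  # promoters predicted as non-promoters
    fp: int  # non-promoters predicted as promoters
    tn: int  # non-promoters predicted as non-promoters

    def accuracy(self) -> float:
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total

    def sensitivity(self) -> float:
        # Fraction of true promoters that were recognized.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of true non-promoters that were rejected.
        return self.tn / (self.tn + self.fp)


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds.

    Assumption: no shuffling or stratification; in practice the sample
    order would usually be shuffled first.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical counts for a 106-sample dataset (53 promoters, 53 non-promoters).
cm = ConfusionMatrix(tp=48, fn=5, fp=6, tn=47)
print(cm.accuracy(), cm.sensitivity(), cm.specificity())

# Ten train/test splits over 106 samples.
folds = list(k_fold_indices(106, k=10))
```

In a 10-fold run, one confusion matrix would typically be accumulated over the ten test folds, so that every sample is scored exactly once by a model that did not see it during training.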
Year
2009
DOI
10.1016/j.eswa.2007.09.010
Venue
Expert Syst. Appl.
Keywords
prediction,fuzzy-airs classifier system,e. coli promoter gene,escherichia coli promoter gene sequences,fuzzy-airs classifier,prediction accuracy,artificial immune system,novel system,airs classification system,fuzzy-airs classifier algorithm,escherichia coli promoter gene,fuzzy resource allocation mechanism,feature selection,integrating feature selection,10-fold cross-validation,sequences dataset,proposed system,new method,recognizing promoter,escherichia coli,dna sequence,confusion matrix,cross validation,resource allocation,classification system,nucleotides
Field
Data mining,Promoter,Artificial immune system,Confusion matrix,Feature selection,Computer science,Fuzzy logic,Robustness (computer science),Artificial intelligence,DNA sequencing,Classifier (linguistics),Machine learning
DocType
Journal
Volume
36
Issue
1
ISSN
Expert Systems With Applications
Citations
10
PageRank
0.82
References
1
Authors
2
Name, Order, Citations, PageRank
Kemal Polat, 1, 1348, 97.38
Salih Güneş, 2, 1267, 78.53