Abstract |
---|
Recently, parametric representations based on cochlea behavior have been used in several studies related to Automatic Speech Recognition (ASR). This paper shows how an alternative solution to Lesser and Berkeley's cochlea model, reported in the state of the art, can be used in ASR tasks. It proposes a new way of constructing the filter bank in the parametric representation used to extract MFCC, and uses the resulting filter distribution to obtain a new frequency-domain representation of speech. MFCC parameters use the Mel scale to create the filter bank; here, the Mel scale function is replaced by center frequencies derived from a theory of cochlea behavior. A performance of 98.5% was reached on a task using isolated digits pronounced by 5 different speakers in Spanish, and the SUSAS corpus with neutral sound records was also used, with some advantages in comparison with MFCC. |
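
The substitution described in the abstract can be illustrated with a minimal sketch of how filter-bank center frequencies are usually placed on the Mel scale. The function names, the filter count, and the frequency range below are assumptions for illustration and are not taken from the paper. The `to_scale` / `from_scale` parameters mark the point where a cochlea-derived mapping would replace the Mel functions; the paper's own mapping is not reproduced here.

```python
import math


def hz_to_mel(f_hz):
    """Common Mel-scale formula (O'Shaughnessy variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def center_frequencies(n_filters, f_min, f_max,
                       to_scale=hz_to_mel, from_scale=mel_to_hz):
    """Place n_filters centers uniformly on a warped frequency scale.

    to_scale / from_scale are hypothetical hooks: passing a different
    pair of functions (for example, one derived from a cochlea model)
    changes the filter distribution without touching the rest of the
    MFCC-style pipeline.
    """
    lo, hi = to_scale(f_min), to_scale(f_max)
    step = (hi - lo) / (n_filters + 1)
    # Centers are the interior points; the two endpoints are band edges.
    return [from_scale(lo + step * (i + 1)) for i in range(n_filters)]


# Illustrative values only: 20 filters between 0 Hz and 4000 Hz.
centers = center_frequencies(20, 0.0, 4000.0)
```

With the Mel functions, the centers are densely spaced at low frequencies and spread out toward `f_max`; a different scale pair would yield a different spacing.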
Year | DOI | Venue |
---|---|---|
2014 | 10.1007/978-3-319-12568-8_21 | PROGRESS IN PATTERN RECOGNITION IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2014 |
Keywords | Field | DocType |
Automatic Speech Recognition, Speech recognition, cochlea operation, place theory and bank filter component | Frequency domain, Mel-frequency cepstrum, Place theory, Pattern recognition, Computer science, Mel scale, Speech recognition, Parametric statistics, Artificial intelligence | Conference |
Volume | ISSN | Citations |
8827 | 0302-9743 | 0 |
PageRank | References | Authors |
0.34 | 2 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
José Luis Oropeza Rodríguez | 1 | 5 | 6.49 |
Sergio Suárez-Guerra | 2 | 37 | 8.81 |
Mario Jiménez-Hernández | 3 | 0 | 0.34 |