Abstract
---
The traditional approach to spoken document retrieval (SDR) uses an automatic speech recognizer (ASR) in combination with a word-based information retrieval method. This approach has shown only limited accuracy, partly because ASR systems tend to produce transcriptions of spontaneous speech with significant word error rates. To overcome this limitation, we propose a method that uses word and phonetic-code representations in combination. The idea of this combination is to reduce the impact of transcription errors in the processing of some (presumably complex) queries by representing words with similar pronunciations through the same phonetic code. Experimental results on the CLEF-CLSR-2007 corpus are encouraging: the proposed hybrid method improved the mean average precision and the number of retrieved relevant documents over the traditional word-based approach by 3% and 7%, respectively.
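The core idea above — representing words with similar pronunciations through the same phonetic code, so that ASR transcription errors matter less at retrieval time — can be sketched with a Soundex-style encoder. Soundex is an illustrative assumption here; the abstract does not state which phonetic coding the authors actually use.

```python
def soundex(word: str) -> str:
    """Map a word to a 4-character Soundex code; similar-sounding
    words (e.g. ASR mistranscriptions) collapse to the same code."""
    word = word.upper()
    mapping = {c: d for letters, d in
               [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                ("L", "4"), ("MN", "5"), ("R", "6")]
               for c in letters}
    code = word[0]                    # always keep the first letter
    prev = mapping.get(word[0], "")
    for c in word[1:]:
        if c in "HW":                 # H and W are transparent separators
            continue
        digit = mapping.get(c, "")    # vowels map to "" and reset prev
        if digit and digit != prev:   # skip adjacent duplicate digits
            code += digit
        prev = digit
    return (code + "000")[:4]         # pad/truncate to length 4

# "Smith" and "Smyth" collapse to the same code:
print(soundex("Smith"), soundex("Smyth"))  # S530 S530
```

Indexing both the word transcription and its phonetic codes lets a query term still match a document even when the ASR output spelled the word differently.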
| Year | DOI | Venue |
| --- | --- | --- |
| 2011 | 10.1007/978-3-642-19437-5_38 | CICLing (2) |

| Field | DocType | Volume |
| --- | --- | --- |
| Automatic speech, Transcription (linguistics), Pattern recognition, tf–idf, Computer science, Word error rate, Speech recognition, Transcription error, Artificial intelligence, Natural language processing, Document retrieval, Visual Word | Conference | 6609 |

| ISSN | Citations | PageRank |
| --- | --- | --- |
| 0302-9743 | 1 | 0.34 |

| References | Authors |
| --- | --- |
| 10 | 3 |
| Name | Order | Citations | PageRank |
| --- | --- | --- | --- |
| M. Alejandro Reyes-Barragán | 1 | 7 | 1.72 |
| Manuel Montes-Y-Gómez | 2 | 638 | 83.97 |
| Luis Villaseñor-Pineda | 3 | 403 | 53.74 |