Title: Expansion of training texts to generate a topic-dependent language model for meeting speech recognition
Abstract
This paper proposes methods for expanding the training texts (baseline corpus) used to generate a topic-dependent language model for more accurate recognition of meeting speech. Preparing a universal language model that can cope with the variety of topics discussed in meetings is very difficult, so our strategy is to generate topic-dependent training texts using two methods. The first is text collection from web pages using queries composed of topic-dependent confident terms; these terms are selected from preliminary recognition results based on the TF-IDF (term frequency, inverse document frequency) value of each term. The second is text generation using the participants' names. The topic-dependent language model is then generated from these new texts together with the baseline corpus. Compared with a language model trained on the baseline corpus alone, the model generated by the proposed strategy reduced perplexity by 16.4% and the out-of-vocabulary rate by 37.5%. This improvement was also confirmed in meeting speech recognition experiments.
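The TF-IDF term-selection step described in the abstract can be illustrated with a short sketch. The Python code below is not the authors' implementation; the function name, the toy transcripts, and the choice of a plain logarithmic IDF are assumptions made purely for illustration. It scores each term of one preliminary recognition transcript by TF-IDF against the other transcripts and returns the top-scoring terms, which would then form a web-search query.

import math
from collections import Counter

def select_query_terms(transcripts, target_idx, top_n=10):
    # transcripts: list of token lists (preliminary recognition results,
    # one list per meeting); target_idx picks the meeting of interest.
    n_docs = len(transcripts)

    # Document frequency: number of transcripts containing each term.
    df = Counter()
    for doc in transcripts:
        df.update(set(doc))

    # Term frequency within the target transcript.
    target = transcripts[target_idx]
    tf = Counter(target)

    # TF-IDF: high for terms frequent in this meeting but rare elsewhere.
    scores = {
        term: (count / len(target)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

if __name__ == "__main__":
    meetings = [
        "budget revenue forecast budget expense".split(),
        "speech recognition language model recognition corpus".split(),
        "schedule venue schedule catering venue".split(),
    ]
    # Confident terms that would seed a web query for meeting 1.
    print(select_query_terms(meetings, target_idx=1, top_n=3))

In the paper's pipeline, the highest-scoring terms are treated as topic-dependent confident terms and combined into web queries; the exact selection threshold and query format are not specified in this record.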
Year: 2012
Venue: Signal & Information Processing Association Annual Summit and Conference
Keywords: speech recognition, vocabulary, TF-IDF, web page, baseline corpus, meeting speech recognition, term frequency-inverse document frequency, text collection, text generation technique, topic-dependent confident term query, topic-dependent language model, topic-dependent training text
Field: Speech corpus, Perplexity, Noisy text analytics, Speech analytics, Computer science, Speech recognition, Speaker recognition, Universal language, Artificial intelligence, Natural language processing, Vocabulary, Language model
DocType: Conference
ISSN: 2309-9402
ISBN: 978-1-4673-4863-8
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name               Order   Citations   PageRank
Egashira, K.       1       0           0.34
Kensuke Kojima     2       12          3.69
Masaru Yamashita   3       28          6.46
Katsuya Yamauchi   4       11          1.28