Topic-Sensitive Language Modelling - Citegraph

Paper Info

Title
Topic-Sensitive Language Modelling

Abstract
The paper proposes a new framework for constructing topic-sensitive language models for large-vocabulary speech recognition. Once the domain of discourse has been identified, a model appropriate for the current domain can be built. In our experiments, the target domain was represented by a piece of text. Using appropriate features, a sub-corpus was extracted from a large collection of training text. Our feature selection process is especially suited to languages in which words are formed with many different inflectional affixes. All words with the same meaning (but different grammatical forms) were collected in one cluster and represented as one feature. We used the heuristic word-weighting classifier TFIDF (term frequency / inverse document frequency) to further shrink the feature vector. The final language model was built by interpolating topic-specific models with a general model. Experiments were carried out on English and Slovenian corpora.
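
The pipeline outlined in the abstract (weight features with TFIDF, extract the training documents closest to the target text, then interpolate a topic-specific model with a general one) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the cosine-similarity ranking, the top-k cutoff and the single interpolation weight `lam` are all assumptions made for illustration, and the documents are assumed to be already tokenised, with inflected forms already mapped to one cluster label each.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenised document (hypothetical helper)."""
    n = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        length = len(doc) or 1  # guard against empty documents
        vectors.append(
            {t: (c / length) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_subcorpus(corpus, target, k):
    """Indices of the k corpus documents most similar to the target text.

    Assumption: similarity is cosine similarity over TFIDF vectors; the
    abstract does not say which measure or cutoff was actually used.
    """
    vectors = tfidf_vectors(corpus + [target])
    target_vec = vectors[-1]
    scores = [cosine(vec, target_vec) for vec in vectors[:-1]]
    ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def interpolate(p_topic, p_general, lam):
    """Linear interpolation of two probability estimates for the same word."""
    return lam * p_topic + (1.0 - lam) * p_general


corpus = [
    ["speech", "recognition", "model"],
    ["football", "match", "goal"],
    ["language", "model", "speech"],
]
target = ["speech", "model", "language"]
selected = select_subcorpus(corpus, target, k=2)
```

In this toy example the football document shares no terms with the target and is left out of the selected sub-corpus. In practice the interpolation weight would be tuned on held-out text from the target domain rather than fixed by hand.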

Year	DOI	Venue
2000	10.1007/3-540-45323-7_43	Temporal Logic in Specification
Keywords	Field	DocType
general model,topic-sensitive language model,target domain,feature selection process,topic-sensitive language modelling,topic specific model,different grammatical form,final language model,feature vector,current domain,appropriate feature,inverse document frequency,feature selection,speech recognition,language model,term frequency	Feature vector,Feature selection,tf–idf,Computer science,Speech recognition,Natural language processing,Artificial intelligence,Domain of discourse,Topic model,Classifier (linguistics),Vocabulary,Language model	Conference
Volume	ISSN	ISBN
1902	0302-9743	3-540-41042-2
Citations	PageRank	References
0	0.34	2
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Mirjam Sepesy Maučec	1	506	26.34
Zdravko Kacic	2	240	36.22
