Title
Using Random Forests in the Structured Language Model
Abstract
In this paper, we explore the use of Random Forests (RFs) in the structured language model (SLM), which uses rich syntactic information in predicting the next word based on words already seen. The goal in this work is to construct RFs by randomly growing Decision Trees (DTs) using syntactic information and investigate the performance of the SLM modeled by the RFs in automatic speech recognition. RFs, which were originally developed as classifiers, are a combination of decision tree classifiers. Each tree is grown based on random training data sampled independently and with the same distribution for all trees in the forest, and a random selection of possible questions at each node of the decision tree. Our approach extends the original idea of RFs to deal with the data sparseness problem encountered in language modeling. RFs have been studied in the context of n-gram language modeling and have been shown to generalize well to unseen data. We show in this paper that RFs using syntactic information can also achieve better performance in both perplexity (PPL) and word error rate (WER) in a large vocabulary speech recognition system, compared to a baseline that uses Kneser-Ney smoothing.
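The abstract names the two randomization devices that define an RF: each tree is grown on an independent bootstrap resample of the training data, and only a random subset of candidate questions is considered at each node. The Python sketch below is a rough, self-contained illustration of that recipe applied to (history, next-word) pairs, not the authors' implementation; the toy membership questions, the entropy-based split criterion, and the probability floor at unseen leaves are all illustrative assumptions.

```python
import math
import random
from collections import Counter


def entropy(data):
    """Entropy (nats) of the next-word distribution in `data`."""
    counts = Counter(w for _, w in data)
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())


def split_gain(data, q):
    """Entropy reduction from splitting `data` on question `q`."""
    yes = [(h, w) for h, w in data if q(h)]
    no = [(h, w) for h, w in data if not q(h)]
    if not yes or not no:
        return float("-inf")
    n = len(data)
    return entropy(data) - (len(yes) / n * entropy(yes)
                            + len(no) / n * entropy(no))


def grow_tree(data, questions, k, min_size=4, rng=random):
    """Grow one decision tree over (history, next_word) pairs.

    Only a random subset of k candidate questions is considered at each
    node -- the second source of randomness the abstract describes.
    """
    words = Counter(w for _, w in data)
    if len(data) < min_size or len(words) == 1 or not questions:
        total = sum(words.values())
        return {"leaf": {w: c / total for w, c in words.items()}}
    candidates = rng.sample(questions, min(k, len(questions)))
    best = max(candidates, key=lambda q: split_gain(data, q))
    yes = [(h, w) for h, w in data if best(h)]
    no = [(h, w) for h, w in data if not best(h)]
    if not yes or not no:  # no useful split among the sampled questions
        total = sum(words.values())
        return {"leaf": {w: c / total for w, c in words.items()}}
    rest = [q for q in questions if q is not best]
    return {"q": best,
            "yes": grow_tree(yes, rest, k, min_size, rng),
            "no": grow_tree(no, rest, k, min_size, rng)}


def grow_forest(data, questions, n_trees, k, seed=0):
    """Bagging: each tree sees an independent bootstrap resample."""
    rng = random.Random(seed)
    return [grow_tree([rng.choice(data) for _ in range(len(data))],
                      questions, k, rng=rng)
            for _ in range(n_trees)]


def predict(tree, history):
    while "leaf" not in tree:
        tree = tree["yes"] if tree["q"](history) else tree["no"]
    return tree["leaf"]


def forest_prob(forest, history, word, floor=1e-6):
    """Average the per-tree distributions; the floor stands in for the
    real smoothing a language model would need at unseen leaves."""
    return sum(predict(t, history).get(word, floor)
               for t in forest) / len(forest)


if __name__ == "__main__":
    # Toy bigram data: history is a 1-tuple holding the previous word.
    corpus = [(("the",), "dog"), (("the",), "cat"), (("a",), "dog"),
              (("big",), "dog"), (("a",), "cat"), (("the",), "dog")]
    # Hypothetical questions: "is the previous word w?"
    qs = [lambda h, w=w: h[-1] == w for w in ("the", "a", "big")]
    forest = grow_forest(corpus, qs, n_trees=10, k=2)
    print(forest_prob(forest, ("the",), "dog"))
```

Averaging the per-tree distributions is what lets the forest generalize past the data sparseness that a single tree would suffer from; the paper's SLM variant replaces these toy questions with questions over rich syntactic structure.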
Year
2004
Venue
NIPS
Keywords
speech recognition, word error rate, random forest, decision tree classifier, automatic speech recognition, language model, decision tree
Field
Decision tree, Perplexity, Computer science, Word error rate, Smoothing, Sampling (statistics), Artificial intelligence, Natural language processing, Random forest, Syntax, Machine learning, Language model
DocType
Conference
Citations
0
PageRank
0.34
References
12
Authors
2
Name                Order  Citations  PageRank
Peng Xu             1      1362       2.02
Frederick Jelinek   2      1392       3.22