Abstract |
---|
We investigate the performance of the Structured Language Model (SLM) when one of its components is modeled by a connectionist model. Using a connectionist model with a distributed representation of the items in the history enables the component to use much longer contexts than is possible with the currently used interpolated or backoff models, both because of the connectionist model's inherent ability to combat data sparseness and because the model size grows only sub-linearly with context length. Experiments show a significant improvement in perplexity and a moderate reduction in word error rate over the baseline SLM results on the UPenn Treebank and Wall Street Journal (WSJ) corpora, respectively. The results also show that the probability distribution produced by our model is much less correlated with regular N-grams than that of the baseline SLM. |
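The abstract's claim of "only sub-linear growth in the model size when increasing the context length" can be illustrated with a back-of-the-envelope parameter count. The sketch below is not the authors' implementation; it compares the worst-case table size of an n-gram model against the parameter count of a Bengio-style feed-forward neural language model, with the embedding size `d` and hidden size `h` chosen as hypothetical illustrative values.

```python
def ngram_table_size(vocab: int, context_len: int) -> int:
    """Worst-case number of distinct contexts an n-gram table must cover:
    one entry per possible context, i.e. vocab ** context_len."""
    return vocab ** context_len

def neural_lm_params(vocab: int, context_len: int, d: int = 30, h: int = 100) -> int:
    """Parameter count of a feed-forward neural LM over distributed word
    representations: an embedding table (vocab * d), a hidden layer over the
    concatenated context embeddings (context_len * d * h), and an output
    layer (h * vocab). Only the middle term depends on context length, so
    growth in context_len is linear (hence sub-linear relative to the
    exponential n-gram table)."""
    return vocab * d + context_len * d * h + h * vocab

# Lengthening the context multiplies the n-gram table by the vocabulary size,
# but adds only a fixed d * h parameters to the neural model.
for m in (2, 3, 4):
    print(m, ngram_table_size(10000, m), neural_lm_params(10000, m))
```

Each extra context word here costs the neural model a constant `d * h = 3000` parameters, while the n-gram table grows by a factor of the vocabulary size.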
Year | DOI | Venue |
---|---|---|
2003 | 10.1109/ICASSP.2003.1198795 | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03)
Keywords | Field | DocType
---|---|---|
interpolation,natural languages,neural nets,probability,speech recognition,UPENN treebank corpus,Wall Street Journal corpus,backoff models,connectionist model,context length,data sparseness problem,distributed representation,interpolated models,model size,neural network model,perplexity,probability distribution,structured language model,syntactical based language model,word error rate reduction | Perplexity,Cache language model,Computer science,Probability distribution,Natural language processing,Artificial intelligence,Artificial neural network,Language model,Pattern recognition,Word error rate,Speech recognition,Natural language,Treebank | Conference
Volume | ISSN | ISBN
---|---|---|
1 | 1520-6149 | 0-7803-7663-3
Citations | PageRank | References
---|---|---|
15 | 7.16 | 3
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ahmad Emami | 1 | 138 | 26.52 |
Peng Xu | 2 | 290 | 31.64 |
Frederick Jelinek | 3 | 139 | 23.22 |