Compact Lexicon Selection With Spectral Methods - Citegraph

Paper Info

Title
Compact Lexicon Selection With Spectral Methods

Abstract
In this paper, we introduce the task of selecting a compact lexicon from large, noisy gazetteers. This scenario arises often in practice, in particular in spoken language understanding (SLU). We propose a simple and effective solution based on matrix decomposition techniques: canonical correlation analysis (CCA) and rank-revealing QR (RRQR) factorization. CCA is first used to derive low-dimensional gazetteer embeddings from domain-specific search logs. Then RRQR is used to find a subset of these embeddings whose span approximates the entire lexicon space. Experiments on slot tagging show that our method yields a small set of lexicon entities with an average relative error reduction of more than 50% over a randomly selected lexicon.
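The subset-selection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses greedy column-pivoted Gram-Schmidt, the classic pivoting heuristic behind QR with column pivoting, as a stand-in for whichever RRQR variant the paper uses. The function name `select_by_pivoted_qr` and the toy vectors are hypothetical; in the paper the embeddings come from CCA over search logs, which is not reproduced here.

```python
import math


def select_by_pivoted_qr(embeddings, k):
    """Pick up to k entries whose vectors best span the whole set.

    Assumption: `embeddings` is a list of equal-length lists of floats,
    one vector per lexicon entry (toy stand-ins for CCA embeddings).
    Returns the indices of the chosen entries in selection order.
    """
    residuals = [list(v) for v in embeddings]  # work on a copy
    selected = []
    for _ in range(min(k, len(residuals))):
        # The pivot is the entry with the largest part not yet explained.
        norms = [math.sqrt(sum(x * x for x in r)) for r in residuals]
        pivot = max(range(len(residuals)), key=lambda i: norms[i])
        if norms[pivot] < 1e-12:
            break  # everything left already lies in the selected span
        q = [x / norms[pivot] for x in residuals[pivot]]
        selected.append(pivot)
        # Remove the new direction from every residual vector.
        for r in residuals:
            dot = sum(a * b for a, b in zip(r, q))
            for j in range(len(r)):
                r[j] -= dot * q[j]
    return selected


# Hypothetical toy data: entries 0 and 1 are near-duplicates.
toy = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.5]]
print(select_by_pivoted_qr(toy, 2))  # → [2, 0]
```

The near-duplicate entry 1 is skipped because, once entries 2 and 0 are chosen, nothing of it remains outside their span. For matrices of realistic size, `scipy.linalg.qr` with `pivoting=True` returns a comparable pivot order.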

Year	Venue	Field
2015	Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL) and the 7th International Joint Conference on Natural Language Processing (IJCNLP), Vol 2	Pattern recognition, Computer science, Canonical correlation, Matrix decomposition, Lexicon, Artificial intelligence, Spectral method, Factorization, Natural language processing, Small set, Spoken language, Approximation error
DocType	Volume	Citations
Conference	P15-2	5
PageRank	References	Authors
0.43	11	4

Authors (4 rows)

Name	Order	Citations	PageRank
Young-Bum Kim	1	112	13.60
Karl Stratos	2	328	21.07
Xiaohu Liu	3	18	2.41
Ruhi Sarikaya	4	698	64.49
