Title
Posterior vs Parameter Sparsity in Latent Variable Models
Abstract
We address the problem of learning structured unsupervised models with moment sparsity typical in many natural language induction tasks. For example, in unsupervised part-of-speech (POS) induction using hidden Markov models, we introduce a bias for words to be labeled by a small number of tags. In order to express this bias of posterior sparsity, as opposed to parametric sparsity, we extend the posterior regularization framework (7). We evaluate our methods on three languages — English, Bulgarian, and Portuguese — showing consistent and significant accuracy improvements over EM-trained HMMs, and over HMMs with sparsity-inducing Dirichlet priors trained by variational EM. We increase accuracy with respect to EM by 2.3%-6.5% in a purely unsupervised setting as well as in a weakly-supervised setting where the closed-class words are provided. Finally, we show that our method also yields improvements when the induced clusters are used as features of a discriminative model in a semi-supervised setting.
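The bias the abstract describes — each word type should be labeled by only a small number of tags — is often quantified in the posterior-regularization literature as an ℓ1/ℓ∞ penalty on tag posteriors: for each (word type, tag) pair take the maximum posterior over that word's occurrences, then sum. The sketch below is a hypothetical illustration of that quantity, not the paper's implementation; the function name and interface are invented for this example.

```python
import numpy as np

def l1_linf_penalty(posteriors, word_ids, n_types, n_tags):
    """Illustrative l1/linf posterior-sparsity measure (hypothetical helper).

    posteriors: (n_tokens, n_tags) array of per-token tag posteriors.
    word_ids:   length-n_tokens sequence mapping each token to its word type.
    Returns the sum over word types of the per-tag max posterior; it is
    smallest when every occurrence of a word type concentrates on one tag.
    """
    maxes = np.zeros((n_types, n_tags))
    for token, w in enumerate(word_ids):
        # linf part: per (type, tag) max over that type's occurrences
        maxes[w] = np.maximum(maxes[w], posteriors[token])
    # l1 part: sum the per-type maxima over types and tags
    return maxes.sum()
```

For a word type whose two occurrences are confidently tagged the same way, the penalty is 1.0; if the two occurrences are confidently tagged differently, it rises to 2.0 — so minimizing it pushes each word type toward a small tag set.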
Year
2009
Venue
NIPS
DocType
Conference
Citations
1
PageRank
0.38
References
8
Authors
4

Name               Order  Citations  PageRank
João Graça         1      295        11.19
Kuzman Ganchev     2      737        35.21
Ben Taskar         3      3175       209.33
Fernando Pereira   4      17717      2124.79