Abstract | ||
---|---|---|
Statistical morphological inflectors are typically trained on fully supervised, type-level data. One remaining open research question is the following: How can we effectively exploit raw, token-level data to improve their performance? To this end, we introduce a novel generative latent-variable model for the semi-supervised learning of inflection generation. To enable posterior inference over the latent variables, we derive an efficient variational inference procedure based on the wake-sleep algorithm. We experiment on 23 languages, using the Universal Dependencies corpora in a simulated low-resource setting, and find improvements of over 10% absolute accuracy in some cases. |

Year | Venue | Field |
---|---|---|
2018 | ACL | Autoencoder, Computer science, Inflection, Natural language processing, Artificial intelligence |

DocType | Citations | PageRank |
---|---|---|
Conference | 0 | 0.34 |

References | Authors |
---|---|
12 | 4 |

Name | Order | Citations | PageRank |
---|---|---|---|
Jason Naradowsky | 1 | 186 | 11.73 |
Ryan Cotterell | 2 | 3 | 6.13 |
Sebastian J. Mielke | 3 | 3 | 4.46 |
Lawrence Wolf-Sonkin | 4 | 0 | 1.35 |