Abstract |
---|
We describe a novel approach for inducing unsupervised part-of-speech taggers for languages that have no labeled training data, but have translated text in a resource-rich language. Our method does not assume any knowledge about the target language (in particular no tagging dictionary is assumed), making it applicable to a wide array of resource-poor languages. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in an unsupervised model (Berg-Kirkpatrick et al., 2010). Across eight European languages, our approach results in an average absolute improvement of 10.4% over a state-of-the-art baseline, and 16.7% over vanilla hidden Markov models induced with the Expectation Maximization algorithm. |
Year | Venue | Keywords |
---|---|---|
2011 | ACL | resource-poor language,unsupervised model,novel approach,approach result,european language,bilingual graph-based projection,resource-rich language,expectation maximization algorithm,target language,cross-lingual knowledge transfer,unsupervised part-of-speech taggers,unsupervised part-of-speech |
Field | DocType | Volume |
---|---|---|
Training set,Graph,Expectation–maximization algorithm,Label propagation,Computer science,Knowledge transfer,Part-of-speech tagging,Speech recognition,Artificial intelligence,Natural language processing,Hidden Markov model,Machine learning | Conference | P11-1 |
Citations | PageRank | References |
---|---|---|
120 | 3.57 | 21 |
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Dipanjan Das | 1 | 1619 | 75.14 |
Slav Petrov | 2 | 2405 | 107.56 |