Abstract |
---|
In this paper, we apply the concept of pre-training to hidden-unit conditional random fields (HUCRFs) to enable learning on unlabeled data. We present a simple yet effective pre-training technique that learns to associate words with their clusters, which are obtained in an unsupervised manner. The learned parameters are then used to initialize the supervised learning process. We also propose a word clustering technique based on canonical correlation analysis (CCA) that is sensitive to multiple word senses, to further improve the accuracy within the proposed framework. We report consistent gains over standard conditional random fields (CRFs) and HUCRFs without pre-training in semantic tagging, named entity recognition (NER), and part-of-speech (POS) tagging tasks, which could indicate the task-independent nature of the proposed technique. |
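The abstract describes deriving word clusters via CCA in an unsupervised manner and using them as pre-training targets. Below is a minimal, hedged sketch of one common way such CCA-style clusters can be computed: approximate CCA between words and their neighboring contexts via an SVD of the scaled word–context co-occurrence matrix, then run k-means on the resulting embeddings. The corpus, window size, dimensionality, and cluster count here are illustrative assumptions, not the paper's actual configuration.

```python
# Hedged sketch: CCA-style word clustering via SVD of a scaled
# word-context co-occurrence matrix, followed by simple k-means.
# All hyperparameters (dim, k, window of +/-1) are illustrative.
import numpy as np

def cca_word_clusters(sentences, dim=2, k=2, seed=0):
    vocab = sorted({w for s in sentences for w in s})
    idx = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)

    # Word x context co-occurrence counts over a +/-1 token window.
    C = np.zeros((n, n))
    for s in sentences:
        for i, w in enumerate(s):
            for j in (i - 1, i + 1):
                if 0 <= j < len(s):
                    C[idx[w], idx[s[j]]] += 1

    # CCA approximation: scale counts by inverse square roots of the
    # row/column marginals, then keep the top left singular vectors.
    row = C.sum(axis=1, keepdims=True)
    col = C.sum(axis=0, keepdims=True)
    D = C / np.sqrt(row + 1e-8) / np.sqrt(col + 1e-8)
    U, _, _ = np.linalg.svd(D, full_matrices=False)
    emb = U[:, :dim]

    # Plain k-means (fixed iteration count) on the CCA embeddings.
    rng = np.random.default_rng(seed)
    centers = emb[rng.choice(n, size=k, replace=False)]
    for _ in range(20):
        assign = np.argmin(((emb[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (assign == c).any():
                centers[c] = emb[assign == c].mean(axis=0)

    return {w: int(assign[idx[w]]) for w in vocab}
```

In the proposed framework, cluster IDs produced this way would serve as pseudo-labels for pre-training the HUCRF, whose learned parameters then initialize supervised training.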
Year | Venue | Field
---|---|---
2015 | Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL) and the 7th International Joint Conference on Natural Language Processing (IJCNLP), Vol 2 | Conditional random field, Pattern recognition, Computer science, Canonical correlation, Supervised learning, Artificial intelligence, Natural language processing, Cluster analysis, Named-entity recognition, CRFs

DocType | Volume | Citations
---|---|---
Conference | P15-2 | 4

PageRank | References | Authors
---|---|---
0.42 | 23 | 3
Name | Order | Citations | PageRank |
---|---|---|---|
Young-Bum Kim | 1 | 112 | 13.60 |
Karl Stratos | 2 | 328 | 21.07 |
Ruhi Sarikaya | 3 | 698 | 64.49 |