Title
Coping with Distribution Change in the Same Domain Using Similarity-Based Instance Weighting
Abstract
Lexicons are considered the most crucial features in natural language processing (NLP) and are thus often used in machine learning algorithms applied to NLP tasks. However, because the lexical space is so diverse, machine learning algorithms that rely on lexical features suffer from differences between the distributions of training and test data. To overcome this distribution change, this paper proposes support vector machines with example-wise weights. Training examples are weighted according to their similarity to all test data, so that the training distribution matches the test distribution. Experimental results on text chunking show that a distribution change between training and test data does occur, and that the proposed method, which accounts for this change in its training phase, outperforms ordinary support vector machines.
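The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the similarity measure or the weight formula, so everything here is an assumption made for illustration: each example is treated as a bag of lexical tokens, similarity is cosine similarity, and a training example's weight is its mean similarity to all test examples. The names `cosine` and `instance_weights` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def instance_weights(train, test):
    """Weight each training example by its mean similarity to all test examples.

    Assumption: mean cosine similarity stands in for the paper's
    similarity-based weight, which the abstract does not specify.
    """
    if not test:
        raise ValueError("at least one test example is required")
    test_vecs = [Counter(tokens) for tokens in test]
    weights = []
    for tokens in train:
        vec = Counter(tokens)
        sims = [cosine(vec, tv) for tv in test_vecs]
        weights.append(sum(sims) / len(sims))
    return weights


# Toy data: the first training example shares vocabulary with the test set,
# the second shares none, so it receives a weight of zero.
train = [["the", "stock", "rose"], ["he", "plays", "guitar"]]
test = [["the", "stock", "fell"], ["the", "market", "rose"]]
weights = instance_weights(train, test)
```

Weights of this kind could then be handed to any SVM implementation that accepts per-example weights, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`. Whether that matches the paper's example-wise weighted SVM formulation is not confirmed by the abstract.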
Year
2009
DOI
10.1007/978-3-642-05224-8_27
Venue
ACML
Keywords
ordinary support vector machine, lexical feature, distribution change, test data, lexical space, training phase, test distribution, nlp task, training distribution, weighting training example, similarity-based instance weighting, natural language processing, machine learning, weight training, support vector machine
Field
Weighting, Pattern recognition, Computer science, Coping (psychology), Support vector machine, Test data, Artificial intelligence, Natural language processing, Chunking (psychology), Relevance vector machine, Machine learning
DocType
Conference
Volume
5828
ISSN
0302-9743
Citations
0
PageRank
0.34
References
17
Authors
4
Name | Order | Citations | PageRank
Jeong-Woo Son | 1 | 67 | 10.56
Hyun-Je Song | 2 | 33 | 9.58
Seong-Bae Park | 3 | 311 | 47.31
Seyoung Park | 4 | 76 | 14.48