Title
EasyAug: An Automatic Textual Data Augmentation Platform for Classification Tasks
Abstract
Imbalanced data is a perennial problem that impedes the learning abilities of current machine learning-based classification models. One approach to address it is to leverage data augmentation to expand the training set. For image data, there are a number of suitable augmentation techniques that have proven effective in previous work. For textual data, however, due to the discrete units inherent in natural language, techniques that randomly perturb the signal may be ineffective. Additionally, due to the substantial discrepancy between different textual datasets (e.g., different domains), an augmentation approach that helps classification on one dataset may be detrimental on another. For practitioners, comparing different data augmentation techniques is non-trivial, as the corresponding methods might need to be incorporated into different system architectures, and the implementation of some approaches, such as generative models, is laborious. To address these challenges, we develop EasyAug, a data augmentation platform that provides several augmentation approaches. Users can conveniently compare the classification results and easily choose the most suitable approach for their own dataset. In addition, the system is extensible and can incorporate further augmentation approaches, so that a new method can be compared comprehensively with the baselines with minimal effort.
Year
2020
DOI
10.1145/3366424.3383552
Venue
WWW '20: The Web Conference 2020, Taipei, Taiwan, April 2020
DocType
Conference
ISBN
978-1-4503-7024-0
Citations
1
PageRank
0.36
References
0
Authors
8
Name            Order  Citations  PageRank
Siyuan Qiu      1      1          0.70
Binxia Xu       2      1          0.36
Jie Zhang       3      1995       156.26
Yafang Wang     4      134        13.56
Xiaoyu Shen     5      1          0.70
Gerard de Melo  6      723        53.54
Chong Long      7      94         6.82
Xiaolong Li     8      362        36.92