Abstract
---
Based on recent advances in natural language modeling and text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially synthesize new labeled data for supervised learning, focusing mainly on cases with scarce labeled data. Our method, referred to as language-model-based data augmentation (LAMBADA), involves fine-tuning a state-of-the-art language generator to a specific task through an initial training phase on the existing (usually small) labeled data. Using the fine-tuned model and given a class label, new sentences for the class are generated. Our process then filters these new sentences using a classifier trained on the original data. In a series of experiments, we show that LAMBADA improves classifiers' performance on a variety of datasets. Moreover, LAMBADA significantly outperforms state-of-the-art data augmentation techniques, specifically those applicable to text classification tasks with little data.
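The abstract describes a three-step pipeline: fine-tune a language generator on the labeled data, generate candidate sentences conditioned on each class label, and keep only candidates that a classifier trained on the original data confirms. The filtering step can be sketched as below; this is a minimal illustration, not the authors' implementation, and `filter_generated`, `toy_classifier`, and the threshold value are hypothetical stand-ins (a real system would fine-tune a pre-trained generator and train a baseline classifier on the original labeled set).

```python
from typing import Callable, Dict, List, Tuple

def filter_generated(
    candidates: Dict[str, List[str]],
    classify: Callable[[str], Tuple[str, float]],
    threshold: float = 0.9,
) -> List[Tuple[str, str]]:
    """Keep (sentence, label) pairs only when the baseline classifier
    predicts the intended label with confidence >= threshold."""
    kept = []
    for label, sentences in candidates.items():
        for sentence in sentences:
            predicted, confidence = classify(sentence)
            if predicted == label and confidence >= threshold:
                kept.append((sentence, label))
    return kept

# Toy generated candidates, keyed by the class label they were
# conditioned on; one candidate under "positive" is mislabeled.
candidates = {
    "positive": ["great product, works well", "terrible, broke instantly"],
    "negative": ["never buying again"],
}

def toy_classifier(sentence: str) -> Tuple[str, float]:
    # Stand-in for a classifier trained on the original labeled data.
    negative_cues = ("terrible", "broke", "never")
    if any(cue in sentence for cue in negative_cues):
        return "negative", 0.95
    return "positive", 0.95

augmented = filter_generated(candidates, toy_classifier)
# The mislabeled candidate is dropped; the rest join the training set.
```

The filter trades recall for precision: discarding some valid generations is acceptable because the generator can produce candidates cheaply, while mislabeled examples would directly hurt the downstream classifier.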
Year | Venue | DocType | Volume | ISSN | Citations | PageRank | References | Authors
---|---|---|---|---|---|---|---|---
2020 | National Conference on Artificial Intelligence | Conference | 34 | 2159-5399 | 0 | 0.34 | 0 | 8
Name | Order | Citations | PageRank
---|---|---|---
Ateret Anaby-Tavor | 1 | 0 | 0.34
Boaz Carmeli | 2 | 41 | 6.70
Esther Goldbraich | 3 | 0 | 0.34
Amir Kantor | 4 | 24 | 3.17
George Kour | 5 | 0 | 0.34
Segev Shlomov | 6 | 0 | 0.34
Naama Tepper | 7 | 2 | 1.43
Naama Zwerdling | 8 | 0 | 0.34