Abstract |
---|
In this work we describe a multi-input Convolutional Neural Network for text classification which allows for combining text preprocessed at word level, byte pair encoding level and character level. We conduct experiments on different datasets and we compare the results obtained with other classifiers. We apply the developed model to two different practical use cases: (1) classifying ingredients into their corresponding classes by means of a corpus provided by Northfork; and (2) classifying texts according to the English level of their corresponding writers by means of a corpus provided by ProvenWord. Additionally, we perform experiments on a standard classification task using Yahoo! Answers and GermEval2017 task A datasets. We show that the developed architecture obtains satisfactory results with these corpora, and we compare results obtained for each dataset with different state-of-the-art approaches, obtaining very promising results. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1007/978-3-030-20521-8_49 | ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2019, PT I |
Keywords | Field | DocType |
Text classification, Document classification, CNN, Multi-input network, Gastrofy, ProvenWord, Use case, Northfork, GermEval2017, Agglutinative language, Swedish, German | Document classification, Use case, Pattern recognition, Convolutional neural network, Computer science, Agglutinative language, Byte pair encoding, Artificial intelligence | Conference |
Volume | ISSN | Citations |
11506 | 0302-9743 | 0 |
PageRank | References | Authors |
0.34 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zuzanna Parcheta | 1 | 1 | 2.09 |
Germán Sanchis-Trilles | 2 | 101 | 16.95 |
Francisco Casacuberta | 3 | 1439 | 161.33 |
Robin Redahl | 4 | 0 | 0.34 |