Title | ||
---|---|---|
Character-Based Text Classification using Top Down Semantic Model for Sentence Representation. |
Abstract | ||
---|---|---|
Despite the success of deep learning on many fronts, especially image and speech, its performance in text classification is often still not as good as that of a simple linear SVM on n-gram TF-IDF representations, especially for smaller datasets. Deep learning tends to emphasize sentence-level semantics when learning a representation with models such as recurrent or recursive neural networks; however, the success of the TF-IDF representation suggests that a bag-of-words type of representation has its own strengths. Taking advantage of both representations, we present a model known as TDSM (Top Down Semantic Model) for extracting a sentence representation that considers both word-level semantics, by linearly combining the words with attention weights, and sentence-level semantics, with a BiLSTM, and we use it for text classification. We apply the model to characters, and our results show that it outperforms all the other character-based and word-based convolutional neural network models of Zhang et al. (2015) across seven different datasets with only 1% of their parameters. We also demonstrate that this model beats traditional linear models on TF-IDF vectors on small and polished datasets such as news articles, on which deep learning models typically fall short. |
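
The word-level step mentioned in the abstract, linearly combining words with attention weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `softmax` and `attention_pool`, the toy vectors, and the scores are all hypothetical, and the scores are simply given here rather than produced by a BiLSTM as the abstract suggests.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(word_vectors, scores):
    """Linearly combine word vectors using softmax-normalized attention weights."""
    if len(word_vectors) != len(scores):
        raise ValueError("need exactly one score per word vector")
    weights = softmax(scores)
    dim = len(word_vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, word_vectors))
            for i in range(dim)]

# Toy input (assumed for illustration): three 2-dimensional word vectors.
word_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]  # equal scores reduce to a plain average of the vectors
sentence_vector = attention_pool(word_vectors, scores)
```

With equal scores the result is the unweighted mean of the word vectors, which is the bag-of-words end of the spectrum; unequal scores let some words dominate the sentence representation.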
Year | Venue | Field |
---|---|---|
2017 | arXiv: Computation and Language | Pattern recognition,Convolutional neural network,Linear model,Computer science,Recurrent neural network,Natural language processing,Artificial intelligence,Deep learning,Sentence,Semantics,Semantic role labeling,Semantic data model |
DocType | Volume | Citations |
---|---|---|
Journal | abs/1705.10586 | 1 |
PageRank | References | Authors |
---|---|---|
0.35 | 14 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zhenzhou Wu | 1 | 5 | 1.41 |
Xin Zheng | 2 | 264 | 18.79 |
Daniel Dahlmeier | 3 | 460 | 29.67 |