Paper Info

Title
Distributed Document Representation for Document Classification.

Abstract
Distributed vector representations learned with deep learning frameworks have shown great power in capturing the semantic meaning of words, phrases, and sentences, from which multiple NLP applications have benefited. Just as words combine to form the meaning of sentences, sentences combine to form the meaning of documents, so the idea of representing each document with a dense distributed representation holds promise. In this paper, we propose a supervised framework (Compound RNN) for document classification based on document-level distributed representations learned with a deep learning architecture. Our framework first obtains sentence-level distributed representations by applying a recursive neural network to the parse tree structure, and then obtains the document-level representation by convolving the sentence vectors with a recurrent neural network. Compound RNN outperforms existing document representations such as bag-of-words and LDA in multiple text classification and regression tasks.
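
The two-stage composition described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the dimension `DIM`, the weight matrices `W_tree`, `W_in`, and `W_rec`, the `tanh` nonlinearity, the toy word lookup, and the use of the final hidden state as the document vector are all assumptions made for illustration. Parse trees are represented here as nested pairs of words, and the supervised classifier that would sit on top of the document vector is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4  # hypothetical embedding size

# Hypothetical, randomly initialised parameters (not taken from the paper).
W_tree = rng.normal(scale=0.1, size=(DIM, 2 * DIM))  # recursive composition
W_in = rng.normal(scale=0.1, size=(DIM, DIM))        # recurrent input weights
W_rec = rng.normal(scale=0.1, size=(DIM, DIM))       # recurrent hidden weights
word_vecs = {}  # toy word-embedding table, filled on first lookup


def embed(word):
    """Return a (random, cached) vector for a word; stands in for learned embeddings."""
    if word not in word_vecs:
        word_vecs[word] = rng.normal(scale=0.1, size=DIM)
    return word_vecs[word]


def sentence_vector(tree):
    """Recursive NN: a tree is either a word (str) or a (left, right) pair."""
    if isinstance(tree, str):
        return embed(tree)
    left, right = tree
    children = np.concatenate([sentence_vector(left), sentence_vector(right)])
    return np.tanh(W_tree @ children)


def document_vector(trees):
    """Recurrent NN over sentence vectors; the last hidden state is assumed
    to serve as the document representation."""
    h = np.zeros(DIM)
    for tree in trees:
        h = np.tanh(W_in @ sentence_vector(tree) + W_rec @ h)
    return h


# Two toy "parsed" sentences forming one document.
doc = [(("the", "movie"), ("was", "great")), ("loved", "it")]
doc_vec = document_vector(doc)
```

In a supervised setting, `doc_vec` would be fed to a classifier or regressor and all parameters trained jointly; that step is left out of this sketch.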

Year: 2015
DOI: 10.1007/978-3-319-18038-0_17
Venue: ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PART I
Field: Data mining, Parse tree, Computer science, Machine translation, Recurrent neural network, Artificial intelligence, Natural language processing, Deep learning, Document classification, Architecture, Support vector machine, Sentence, Machine learning
DocType: Conference
Volume: 9077
ISSN: 0302-9743
Citations: 3
PageRank: 0.37
References: 39
Authors: 2

Authors (2 rows)

Name	Order	Citations	PageRank
Rumeng Li	1	3	0.71
Hiroyuki Shindo	2	75	13.80
