Title
Hierarchical Bidirectional Long Short-Term Memory Networks for Chinese Messaging Spam Filtering
Abstract
Messaging spam filtering is an important research area in natural language processing (NLP). In this paper, we propose an approach to Chinese messaging spam filtering based on a hierarchical bidirectional long short-term memory network. Because a message consists of sentences and a sentence consists of words, we design a hierarchical architecture that generates a message representation aggregating the information of each word in each sentence. We also observe that errors produced by Chinese word segmentation may degrade the performance of our model, so we use unsegmented characters as input rather than the segmented words used by most Chinese NLP models. Experimental results demonstrate that our method outperforms most state-of-the-art methods on a dataset manually labeled by an online medical company. We also show that unsegmented characters perform better than segmented words in this task.
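The hierarchical, character-level input described in the abstract can be illustrated with a minimal sketch. It only covers input preparation (message, then sentences, then unsegmented characters, then integer ids) and is not the authors' implementation. The function names, the punctuation set used to split sentences, and the padding sizes are all assumptions made for illustration; the abstract does not specify them.

```python
import re

# Sentence-ending punctuation (Chinese and ASCII). This set is an assumption;
# the abstract does not say how messages are split into sentences.
SENTENCE_END_PATTERN = r"[。！？!?；;\n]+"


def split_sentences(message):
    """Split a message into non-empty sentences on sentence-ending punctuation."""
    parts = re.split(SENTENCE_END_PATTERN, message)
    return [p.strip() for p in parts if p.strip()]


def to_characters(sentence):
    """Return the unsegmented characters of a sentence, skipping whitespace."""
    return [ch for ch in sentence if not ch.isspace()]


def build_vocab(messages):
    """Map every character seen in the messages to an integer id."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for message in messages:
        for sentence in split_sentences(message):
            for ch in to_characters(sentence):
                vocab.setdefault(ch, len(vocab))
    return vocab


def encode_message(message, vocab, max_sentences=4, max_chars=8):
    """Encode a message as a max_sentences x max_chars grid of character ids.

    Each row is one sentence, truncated or padded to max_chars; missing
    sentences are filled with all-padding rows.
    """
    pad, unk = vocab["<pad>"], vocab["<unk>"]
    grid = []
    for sentence in split_sentences(message)[:max_sentences]:
        ids = [vocab.get(ch, unk) for ch in to_characters(sentence)[:max_chars]]
        ids += [pad] * (max_chars - len(ids))
        grid.append(ids)
    while len(grid) < max_sentences:
        grid.append([pad] * max_chars)
    return grid
```

In a hierarchical model of the kind the abstract describes, each row of such a grid would presumably be encoded by a character-level bidirectional LSTM, and the resulting sentence vectors by a second, sentence-level bidirectional LSTM. Working on characters means no word segmenter is needed, so segmentation errors cannot reach the model.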
Year
2017
DOI
10.1109/BIGCOM.2017.25
Venue
2017 3rd International Conference on Big Data Computing and Communications (BIGCOM)
Keywords
short-term memory network, natural language processing, sentence, hierarchical architecture, Chinese segment, segmented word, Chinese NLP models, research area, Chinese messaging spam filtering, hierarchical bidirectional long short-term memory network
Field
Architecture, World Wide Web, Computer science, Filter (signal processing), Long short term memory, Notice, Natural language processing, Artificial intelligence, Sentence, Machine learning
DocType
Conference
ISBN
978-1-5386-3350-2
Citations
0
PageRank
0.34
References
18
Authors
6
Name
Order
Citations
PageRank
Wenliang Shao100.34
Chunhong Zhang2146.37
Tingting Sun365.56
Hang Li43821.94
Ji Yang5358.74
Xiaofeng Qiu601.69