Title
The Comparison of Chinese Spam Filter Based on Generative Model and Discriminative Model
Abstract
Previous studies have shown that discriminative models outperform generative models for spam filtering, but those results were obtained on English datasets, and studies of Chinese spam filtering are rare. We therefore compare three filters on Chinese datasets: Bogo, a classical generative model based on a Bayesian algorithm, and two typical discriminative models, Logistic Regression (LR) and Relaxed Online SVM (ROSVM). The test data are the public Chinese datasets TREC06c, SEWM 2008, SEWM 2010, and SEWM 2011, evaluated with immediate feedback. The discriminative models give better results than the generative model, and ROSVM gives the best performance for Chinese spam filtering.
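To make the immediate-feedback setting concrete, the following is a minimal sketch of an online logistic regression filter that classifies each message and is then told the true label before the next message arrives. It is not the authors' implementation and does not reproduce Bogo or ROSVM. The character-bigram features, the learning rate, the 0.5 threshold, and the toy messages are all assumptions made for illustration.

```python
import math


def char_bigrams(text):
    """Overlapping character bigrams (assumed feature choice; needs no word segmentation)."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


class OnlineLogisticRegression:
    """Logistic regression over binary features, trained one message at a time."""

    def __init__(self, learning_rate=0.1):  # learning rate is an assumed value
        self.learning_rate = learning_rate
        self.weights = {}

    def score(self, features):
        """Return the estimated probability that the message is spam."""
        z = sum(self.weights.get(f, 0.0) for f in features)
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, is_spam):
        """One gradient step on the log-loss for a single labeled message."""
        error = (1.0 if is_spam else 0.0) - self.score(features)
        for f in features:
            self.weights[f] = self.weights.get(f, 0.0) + self.learning_rate * error


def run_with_immediate_feedback(messages, model, threshold=0.5):
    """Classify each message first, then reveal its true label to the model.

    Returns the number of misclassified messages.
    """
    mistakes = 0
    for text, is_spam in messages:
        features = char_bigrams(text)
        predicted_spam = model.score(features) > threshold
        if predicted_spam != is_spam:
            mistakes += 1
        model.update(features, is_spam)  # feedback arrives right after the prediction
    return mistakes


if __name__ == "__main__":
    # Hypothetical toy stream, not drawn from TREC06c or the SEWM datasets.
    stream = [("免费中奖", True), ("会议通知", False)] * 20
    model = OnlineLogisticRegression()
    print("mistakes:", run_with_immediate_feedback(stream, model))
```

Because the label is revealed only after the prediction is recorded, every message serves as both test and training data, which is what lets a single pass over a dataset yield an error count.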
Year
2011
DOI
10.1109/IALP.2011.64
Venue
IALP
Keywords
bogo system,logistic regression,belief networks,chinese spam filter,typical discriminative model,bogo,english dataset,public chinese datasets,regression analysis,rosvm,bayesian algorithm,relaxed online svm,unsolicited e-mail,public chinese dataset,generative model,spam filter,test dataset,information filters,natural language processing,classical generative model,discriminative model,support vector machines,chinese dataset,lr
Field
Bayesian algorithm,Pattern recognition,Computer science,Regression analysis,Support vector machine,Filter (signal processing),Artificial intelligence,Logistic regression,Discriminative model,Machine learning,Generative model
DocType
Conference
ISBN
978-1-4577-1733-8
Citations
1
PageRank
0.38
References
4
Authors
4
Name
Order
Citations
PageRank
Yong Han113.42
Yingying Wang22111.64
Huafu Ding310.38
Haoliang Qi413026.56