Abstract | ||
---|---|---|
We study the usability of linguistic features in the Web spam classification task. The features were computed on two Web spam corpora, Webspam-Uk2006 and Webspam-Uk2007, and we make them publicly available for other researchers. Preliminary analysis seems to indicate that certain linguistic features may be useful for the spam-detection task when combined with features studied elsewhere. |
Year | DOI | Venue |
---|---|---|
2008 | 10.1145/1451983.1451990 | AIRWeb |
Keywords | Field | DocType |
linguistic feature,spam-detection task,web spam classification task,web spam detection,preliminary analysis,web spam corpus,certain linguistic feature,preliminary study,web spam | World Wide Web,Information retrieval,Computer science,Usability,Spambot,Forum spam,Linguistics,Spamdexing | Conference |
Citations | PageRank | References |
32 | 1.36 | 11 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jakub Piskorski | 1 | 435 | 50.04 |
Marcin Sydow | 2 | 264 | 22.71 |
Dawid Weiss | 3 | 585 | 32.08 |