Web Spam Detection with Anti-Trust Rank - Citegraph

Paper Info

Title
Web Spam Detection with Anti-Trust Rank

Abstract
Spam pages on the web use various techniques to artificially achieve high rankings in search engine results. Human experts can do a good job of identifying spam pages and pages whose information is of dubious quality, but it is practically infeasible to use human effort for a large number of pages. Similar to the approach in (1), we propose a method of selecting a seed set of pages to be evaluated by a human. We then use the link structure of the web and the manually labeled seed set to detect other spam pages. Our experiments on the WebGraph dataset (3) show that our approach is very effective at detecting spam pages from a small seed set. It achieves higher precision of spam page detection than the Trust Rank algorithm, and the spam pages it detects have higher PageRank on average.
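
The abstract does not spell out how the labels are propagated from the seed set. As a minimal sketch, assuming distrust flows backward along links (a page that links to spam becomes suspect) and is computed as a seed-biased PageRank on the reversed graph, the idea might look like the following. The function name `anti_trust_rank`, the damping factor, the iteration count, and the toy graph are illustrative assumptions, not details taken from the paper.

```python
def anti_trust_rank(out_links, spam_seeds, damping=0.85, iterations=50):
    """Score pages by distrust propagated backward from labeled spam seeds.

    out_links: dict mapping each page to the list of pages it links to.
    spam_seeds: set of pages a human has labeled as spam (assumed input).
    """
    # Collect every page that appears as a source or a target.
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    nodes = sorted(nodes)

    seeds = [n for n in nodes if n in spam_seeds]
    if not seeds:
        raise ValueError("need at least one spam seed present in the graph")

    # Reverse every edge: if u links to v, distrust flows from v back to u.
    reversed_links = {n: [] for n in nodes}
    for u, targets in out_links.items():
        for v in targets:
            reversed_links[v].append(u)

    # Teleport vector concentrated uniformly on the spam seeds.
    teleport = {n: (1.0 / len(seeds) if n in spam_seeds else 0.0) for n in nodes}
    score = dict(teleport)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for v in nodes:
            targets = reversed_links[v]
            if targets:
                share = damping * score[v] / len(targets)
                for u in targets:
                    nxt[u] += share
            else:
                # No page links to v, so its mass returns to the seeds.
                for s in seeds:
                    nxt[s] += damping * score[v] / len(seeds)
        score = nxt
    return score


# Toy graph: b -> a -> spam, and an unrelated pair c -> good.
toy_graph = {"a": ["spam"], "b": ["a"], "c": ["good"], "good": [], "spam": []}
scores = anti_trust_rank(toy_graph, {"spam"})
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph, pages that reach the spam seed in fewer link hops receive a higher score, while pages with no path to the seed receive none. A real detector would still need a threshold or ranking cutoff, which this sketch leaves open.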

Year	2006
Venue	AIRWeb
Keywords	web spam, search engine
Field	Data mining, Search engine, Information retrieval, Webgraph, TrustRank, Computer science, Spamdexing
DocType	Conference
Citations	65
PageRank	2.32
References	2
Authors	2

Authors (2 rows)

Name	Order	Citations	PageRank
Vijay Krishnan	1	193	11.34
Rashmi Raj	2	68	2.73
