Title
Web Spam Detection with Anti-Trust Rank
Abstract
Spam pages on the web use various techniques to artificially achieve high rankings in search engine results. Human experts can do a good job of identifying spam pages and pages whose information is of dubious quality, but it is practically infeasible to rely on human effort for a large number of pages. Similar to the approach in (1), we propose a method of selecting a seed set of pages to be evaluated by a human. We then use the link structure of the web and the manually labeled seed set to detect other spam pages. Our experiments on the WebGraph dataset (3) show that our approach is very effective at detecting spam pages from a small seed set. It achieves higher precision of spam page detection than the Trust Rank algorithm, and the spam pages it detects have higher PageRank on average.
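The abstract does not spell out how the labeled seed set and the link structure are combined. One plausible reading, sketched below, is a seed-biased PageRank run on the reversed link graph, so that distrust flows from a known spam page to the pages that link to it. The function name `anti_trust_rank`, the damping value 0.85, the fixed iteration count, and the toy graph are all assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict


def anti_trust_rank(links, spam_seeds, damping=0.85, iterations=50):
    """Propagate distrust backward along links from labeled spam seeds.

    links: dict mapping each page to the list of pages it links to.
    spam_seeds: pages a human has labeled as spam (hypothetical input).
    Returns a dict of page -> distrust score; scores sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Reverse the graph: for each page, record who links to it.
    inlinks = defaultdict(list)
    for src, targets in links.items():
        for dst in targets:
            inlinks[dst].append(src)

    seeds = sorted(set(spam_seeds) & pages)
    if not seeds:
        raise ValueError("at least one spam seed must be a page in the graph")

    # Teleport distribution is concentrated on the spam seeds.
    teleport = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    score = dict(teleport)

    for _ in range(iterations):
        nxt = {p: (1.0 - damping) * teleport[p] for p in pages}
        for page, value in score.items():
            sources = inlinks.get(page, [])
            if sources:
                # Split this page's distrust among the pages linking to it.
                share = damping * value / len(sources)
                for src in sources:
                    nxt[src] += share
            else:
                # No in-links: hand the mass back to the seeds.
                for seed in seeds:
                    nxt[seed] += damping * value / len(seeds)
        score = nxt
    return score


# Toy graph: "b" links to "a", which links to a known spam page.
toy_links = {"a": ["spam"], "b": ["a"], "c": ["d"], "d": [], "spam": []}
scores = anti_trust_rank(toy_links, ["spam"])
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph, pages that reach the spam seed through outgoing links accumulate distrust, while the unrelated pages "c" and "d" stay at zero. A real pipeline would still need a threshold or a top-k cutoff to turn scores into spam labels, which this sketch leaves open.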
Year
2006
Venue
AIRWeb
Keywords
web spam, search engine
Field
Data mining, Search engine, Information retrieval, Webgraph, TrustRank, Computer science, Spamdexing
DocType
Conference
Citations
65
PageRank
2.32
References
2
Authors
2
Name, Order, Citations, PageRank
Vijay Krishnan, 1, 193, 11.34
Rashmi Raj, 2, 68, 2.73