Abstract | ||
---|---|---|
Phishing attacks are a significant threat to users of the Internet, causing tremendous economic loss every year. In combating phish, industry relies heavily on manual verification to achieve a low false positive rate, which, however, tends to be slow in responding to the huge volume of unique phishing URLs created by toolkits. Our goal here is to combine the best aspects of human verified blacklists and heuristic-based methods, i.e., the low false positive rate of the former and the broad and fast coverage of the latter. To this end, we present the design and evaluation of a hierarchical blacklist-enhanced phish detection framework. The key insight behind our detection algorithm is to leverage existing human-verified blacklists and apply the shingling technique, a popular near-duplicate detection algorithm used by search engines, to detect phish in a probabilistic fashion with very high accuracy. To achieve an extremely low false positive rate, we use a filtering module in our layered system, harnessing the power of search engines via information retrieval techniques to correct false positives. Comprehensive experiments over a diverse spectrum of data sources show that our method achieves 0% false positive rate (FP) with a true positive rate (TP) of 67.15% using search-oriented filtering, and 0.03% FP and 73.53% TP without the filtering module. With incremental model building capability via a sliding window mechanism, our approach is able to adapt quickly to new phishing variants, and is thus more responsive to the evolving attacks. |
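
The shingling step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word-level shingle size `n`, the Jaccard-style `resemblance` score, the `threshold` value, and the function names are all assumptions chosen for illustration, and the paper's actual parameters are not given in this record.

```python
def shingles(text, n=3):
    """Return the set of word-level n-grams (shingles) in text.

    The shingle size n=3 is an illustrative assumption.
    """
    words = text.lower().split()
    if len(words) < n:
        # Texts shorter than n words yield a single short shingle (or none).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def resemblance(a, b):
    """Jaccard similarity of two shingle sets, in the range [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def looks_like_known_phish(page_text, blacklist_texts, threshold=0.65, n=3):
    """Flag a page whose shingles closely match any blacklisted page.

    blacklist_texts stands in for the text of human-verified phishing
    pages; the threshold of 0.65 is a hypothetical value.
    """
    page = shingles(page_text, n)
    return any(
        resemblance(page, shingles(known, n)) >= threshold
        for known in blacklist_texts
    )


if __name__ == "__main__":
    known = ["please verify your account by entering your password below"]
    variant = "please verify your account by entering your password below now"
    benign = "welcome to our gardening blog about tomatoes and basil"
    print(looks_like_known_phish(variant, known))
    print(looks_like_known_phish(benign, known))
```

Comparing sets of shingles rather than whole pages is what lets a near-duplicate produced by a toolkit still match a blacklisted original after small edits. A sliding window, as the abstract describes, would correspond to periodically refreshing `blacklist_texts` with recent verified phish.
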
Year | DOI | Venue |
---|---|---|
2010 | 10.1007/978-3-642-15497-3_17 | ESORICS |
Keywords | Field | DocType |
low false positive rate,false positive,hierarchical adaptive probabilistic approach,popular near-duplicate detection algorithm,combating phish,true positive rate,false positive rate,detection algorithm,search engine,zero hour phish detection,hierarchical blacklist-enhanced phish detection,new phishing variant,sliding window,spectrum,information retrieval,model building | False positive rate,Data mining,Sliding window protocol,Shingling,Phishing,Computer science,Computer security,Incremental build model,Filter (signal processing),Probabilistic logic,False positive paradox | Conference |
Volume | ISSN | ISBN |
6345 | 0302-9743 | 3-642-15496-4 |
Citations | PageRank | References |
7 | 0.68 | 15 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Guang Xiang | 1 | 382 | 18.31 |
Bryan A. Pendleton | 2 | 428 | 34.15 |
Jason Hong | 3 | 6706 | 518.75 |
Carolyn Rosé | 4 | 2126 | 222.80 |