Title
How to Learn Klingon without a Dictionary: Detection and Measurement of Black Keywords Used by the Underground Economy
Abstract
The online underground economy is an important channel connecting merchants of illegal products with their buyers, and it is constantly monitored by legal authorities. As one common means of evasion, merchants and buyers jointly create a vocabulary of jargon (called "black keywords" in this paper) to disguise their transactions (e.g., "smack" is a street name for "heroin" [1]). Black keywords are often "unfriendly" to outsiders because they are created either by distorting the original meaning of common words or by tweaking other black keywords. Understanding black keywords is of great importance for tracking and disrupting the underground economy, but it is also prohibitively difficult: investigators have to infiltrate the inner circle of criminals to learn their meanings, a task that is both risky and time-consuming. In this paper, we make the first attempt towards capturing and understanding the ever-changing black keywords. We investigate the underground business promoted through blackhat SEO (search engine optimization) and demonstrate that the black keywords targeted by SEOers can be discovered through a fully automated approach. Our insights are two-fold: first, pages indexed under black keywords are more likely to contain malicious or fraudulent content (e.g., SEO pages) and to be flagged by off-the-shelf detectors; second, people tend to query multiple similar black keywords to find the merchandise. Therefore, we can infer whether a search keyword is "black" by inspecting the associated search results, and then use the related search queries to extend our findings. To this end, we built a system called KDES (Keywords Detection and Expansion System) and applied it to the search results of Baidu, China's top search engine. So far, we have identified 478,879 black keywords, which were clustered under 1,522 core words based on text similarity.
We further extracted information such as email addresses, mobile phone numbers, and instant messenger IDs from the pages and domains relevant to the underground business. Such information helps us gain a better understanding of the underground economy, China's in particular. In addition, our work could help search engine vendors purify their search results and disrupt the channel of the underground market. Our co-authors from Baidu compared our results with their blacklist, found that many of them (e.g., long-tail and obfuscated keywords) were not in it, and added them to Baidu's internal blacklist.
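The detect-then-expand idea described in the abstract can be illustrated with a minimal Python sketch. This is not the KDES implementation: the function name `expand_black_keywords`, the injected callables `search`, `is_flagged`, and `related`, the `min_flagged_ratio` threshold, and the toy data are all hypothetical and chosen only for illustration. The sketch assumes a keyword is treated as "black" when a large enough share of its search results is flagged by a detector, and that related queries of a black keyword become new candidates.

```python
from collections import deque
from typing import Callable, Iterable, List, Set


def expand_black_keywords(
    seeds: Iterable[str],
    search: Callable[[str], List[str]],      # hypothetical: keyword -> result URLs
    is_flagged: Callable[[str], bool],       # hypothetical: detector verdict for a URL
    related: Callable[[str], List[str]],     # hypothetical: keyword -> related queries
    min_flagged_ratio: float = 0.5,          # assumed threshold, not from the paper
    max_keywords: int = 1000,
) -> Set[str]:
    """Breadth-first expansion from seed keywords over related queries."""
    black: Set[str] = set()
    seen: Set[str] = set()
    queue = deque(seeds)
    while queue and len(black) < max_keywords:
        keyword = queue.popleft()
        if keyword in seen:
            continue
        seen.add(keyword)
        pages = search(keyword)
        if not pages:
            continue
        # Share of this keyword's search results that the detector flags.
        ratio = sum(is_flagged(page) for page in pages) / len(pages)
        if ratio >= min_flagged_ratio:
            black.add(keyword)
            # Only keywords judged "black" contribute new candidates.
            queue.extend(q for q in related(keyword) if q not in seen)
    return black


# Toy data standing in for a search engine and a page detector.
RESULTS = {
    "kw-a": ["bad1.example", "bad2.example", "ok1.example"],
    "kw-b": ["bad3.example", "bad1.example"],
    "kw-c": ["ok1.example", "ok2.example"],
}
RELATED = {"kw-a": ["kw-b", "kw-c"], "kw-b": ["kw-a"]}
FLAGGED = {"bad1.example", "bad2.example", "bad3.example"}

found = expand_black_keywords(
    ["kw-a"],
    search=lambda kw: RESULTS.get(kw, []),
    is_flagged=lambda url: url in FLAGGED,
    related=lambda kw: RELATED.get(kw, []),
)
print(sorted(found))
```

Passing the search, detection, and related-query steps in as callables keeps the sketch independent of any particular search engine or detector; real counterparts would replace the toy lambdas.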
Year
2017
DOI
10.1109/SP.2017.11
Venue
2017 IEEE Symposium on Security and Privacy (SP)
Keywords
Klingon, black keywords detection, black keywords measurement, online underground economy, illegal products, legal authorities, jargons, underground business, blackhat SEO, search engine optimization, SEO pages, search keyword, search queries, KDES, keywords detection and expansion system, Baidu, text similarity, China, underground market
Field
Original meaning, World Wide Web, Computer science, Computer security, Search engine optimization, Blacklist, Tweaking, Economy, Mobile phone, Obfuscation, Database transaction, Vocabulary
DocType
Conference
ISSN
1081-6011
ISBN
978-1-5090-5534-0
Citations
2
PageRank
0.64
References
33
Authors
9
Name | Order | Citations and PageRank
Yang Hao | 1 | 162.94
Xiulin Ma | 2 | 20.97
Kun Du | 3 | 337.22
Zhou Li | 4 | 44130.45
Haixin Duan | 5 | 23736.86
XiaoDong Su | 6 | 31.32
Guang Liu | 7 | 31.32
Zhifeng Geng | 8 | 31.32
Jianping Wu | 9 | 743121.01