Title
Breaking reCAPTCHA: A Holistic Approach via Shape Recognition.
Abstract
CAPTCHAs are small puzzles that should be easy for humans to solve but hard for computers. They form a security cornerstone of the modern Internet service landscape and are deployed in essentially every kind of login service, where they help distinguish human users from automated attacks. One of the most popular and successful systems today is reCAPTCHA. Like many other systems, reCAPTCHA is based on distorted images of words; the distortion scheme evolves over time, and each change defines a new generation of the system. In this work, we analyze three recent generations of reCAPTCHA and present an algorithm that is capable of solving at least 5% of the challenges generated by these versions. We achieve this by applying a specialized variant of the shape contexts proposed by Belongie et al. to match entire words at once. To handle the ellipse-shaped distortions employed in one of the generations, we propose a machine learning algorithm that virtually eliminates the distortion. Finally, an improved shape matching strategy allows us to use word dictionaries of a reasonable size (approximately 20,000 entries).
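The idea of matching a whole word against dictionary templates with shape contexts can be illustrated with a minimal sketch. This is not the authors' specialized variant: it is a generic, simplified shape context (a log-polar histogram of relative point positions, compared with a chi-square cost), and it swaps the optimal point assignment of Belongie et al. for a plain nearest-descriptor average. All function names, bin counts, and the toy point sets are assumptions made for illustration.

```python
import math

def shape_context(points, index, r_bins=5, theta_bins=12):
    """Normalised log-polar histogram of all other points, seen from points[index].

    Assumes at least two distinct points; bin counts are illustrative defaults.
    """
    px, py = points[index]
    offsets = [(x - px, y - py) for i, (x, y) in enumerate(points) if i != index]
    dists = [math.hypot(dx, dy) for dx, dy in offsets]
    mean_dist = sum(dists) / len(dists)  # dividing by this gives scale invariance
    hist = [0.0] * (r_bins * theta_bins)
    for (dx, dy), d in zip(offsets, dists):
        # Radial bin: log2 of the normalised radius, clamped to [-3, 1].
        log_r = math.log2(max(d / mean_dist, 0.125))
        r_idx = min(r_bins - 1, int((log_r + 3.0) / 4.0 * r_bins))
        # Angular bin: angle in [-pi, pi] mapped onto theta_bins sectors.
        theta = math.atan2(dy, dx)
        t_idx = int((theta + math.pi) / (2.0 * math.pi) * theta_bins) % theta_bins
        hist[r_idx * theta_bins + t_idx] += 1.0 / len(offsets)
    return hist

def chi2_cost(h1, h2):
    """Chi-square distance between two normalised histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def shape_distance(shape_a, shape_b):
    """Symmetric average of nearest-descriptor costs.

    Simplification: a greedy nearest-neighbour match, not an optimal assignment.
    """
    desc_a = [shape_context(shape_a, i) for i in range(len(shape_a))]
    desc_b = [shape_context(shape_b, i) for i in range(len(shape_b))]

    def directed(src, dst):
        return sum(min(chi2_cost(s, d) for d in dst) for s in src) / len(src)

    return 0.5 * (directed(desc_a, desc_b) + directed(desc_b, desc_a))

def best_match(query, templates):
    """Return the dictionary key whose template shape is closest to the query."""
    return min(templates, key=lambda word: shape_distance(query, templates[word]))

# Toy "dictionary" of two point-set templates (hypothetical data).
templates = {
    "line": [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)],
    "ell": [(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)],
}
# A scaled and shifted copy of the "ell" template serves as the query.
query = [(2 * x + 5, 2 * y - 3) for x, y in templates["ell"]]
print(best_match(query, templates))
```

Because each descriptor uses only relative offsets normalised by the mean distance, the sketch tolerates translation and uniform scaling. It does nothing about the ellipse-shaped distortions the abstract mentions, and a dictionary of roughly 20,000 words would need a faster lookup than this exhaustive comparison.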
Year: 2011
DOI: 10.1007/978-3-642-21424-0_5
Venue: IFIP Advances in Information and Communication Technology
Field: Character recognition, Computer science, Login, Theoretical computer science, Internet service, CAPTCHA, Ellipse, Cornerstone, Distortion, Shape context
DocType: Conference
Volume: 354
ISSN: 1868-4238
Citations: 11
PageRank: 0.53
References: 12
Authors: 4
Name             Order  Citations  PageRank
Paul Baecher     1      128        19.00
Niklas Büscher   2      47         4.58
Marc Fischlin    3      1709       92.71
Benjamin Milde   4      42         5.20