Title
A Large-Scale Analysis of the Semantic Password Model and Linguistic Patterns in Passwords
Abstract
In this article, we present a thorough evaluation of semantic password grammars. We report multifactorial experiments that test the impact of sample size, probability smoothing, and linguistic information on password cracking. The semantic grammars are compared with state-of-the-art probabilistic context-free grammar (PCFG) and neural network models, and tested in cross-validation and A vs. B scenarios. We present results that reveal the contributions of part-of-speech (syntactic) and semantic patterns, and suggest that the former are more consequential to the security of passwords. Our results show that in many cases PCFGs are still competitive models compared to their latest neural network counterparts. In addition, we show that there is little performance gain in training PCFGs with more than 1 million passwords. We present qualitative analyses of four password leaks (Mate1, 000webhost, Comcast, and RockYou) based on trained semantic grammars, and derive graphical models that capture high-level dependencies between token classes. Finally, we confirm the similarity inferences from our qualitative analysis by examining the effectiveness of grammars trained and tested on all pairs of leaks.
Year
2021
DOI
10.1145/3448608
Venue
ACM Transactions on Privacy and Security
Keywords
Password guessing, PCFG, probabilistic context-free grammars, semantics
DocType
Journal
Volume
24
Issue
3
ISSN
2471-2566
Citations
0
PageRank
0.34
References
0
Authors
3
Name                 Order  Citations  PageRank
Rafael Veras         1      81         5.18
Christopher Collins  2      1037       49.74
Julie Thorpe         3      466        32.17