Title
Adversarial Authorship Attribution in Open-Source Projects.
Abstract
Open-source software is open to anyone by design, whether it is a community of developers, hackers or malicious users. Authors of open-source software typically hide their identity through nicknames and avatars. However, they have no protection against authorship attribution techniques that are able to create software author profiles just by analyzing software characteristics. In this paper we present an author imitation attack that makes it possible to deceive current authorship attribution systems and mimic the coding style of a target developer. Within this context we explore the potential of the existing attribution techniques to be deceived. Our results show that we are able to imitate the coding style of developers based on data collected from the popular source code repository GitHub. To subvert the author imitation attack, we propose a novel author obfuscation approach that allows us to hide the coding style of the author. Unlike existing obfuscation tools, this new obfuscation technique uses transformations that preserve code readability. We assess the effectiveness of our attacks on several datasets produced by actual developers from GitHub and by participants of the GoogleCodeJam competition. Throughout our experiments we show that author hiding can be achieved by making sensible transformations which reduce the likelihood of current authorship attribution systems identifying the author's style to 0%.
Year
2019
DOI
10.1145/3292006.3300032
Venue
CODASPY
Keywords
Authorship attribution, obfuscation, imitation, open-source software, adversarial, attacks
Field
Source code, Computer science, Computer security, Coding (social sciences), Readability, Hacker, Attribution, Software, Imitation, Obfuscation
DocType
Conference
ISBN
978-1-4503-6099-9
Citations
0
PageRank
0.34
References
0
Authors
4
Name                Order  Citations  PageRank
Alina Matyukhina    1      0          0.68
Natalia Stakhanova  2      336        27.48
Mila Dalla Preda    3      208        19.18
Celine Perley       4      0          0.34