Title
A large scale study of programming languages and code quality in GitHub
Abstract
What is the effect of programming languages on software quality? This question has been a topic of much debate for a very long time. In this study, we gather a very large data set from GitHub (729 projects, 80 million SLOC, 29,000 authors, 1.5 million commits, in 17 languages) in an attempt to shed some empirical light on this question. This reasonably large sample size allows us to use a mixed-methods approach, combining multiple regression modeling with visualization and text analytics, to study the effect of language features such as static vs. dynamic typing and strong vs. weak typing on software quality. By triangulating findings from different methods, and controlling for confounding effects such as team size, project size, and project history, we report that language design does have a significant, but modest, effect on software quality. Most notably, it does appear that strong typing is modestly better than weak typing, and among functional languages, static typing is also somewhat better than dynamic typing. We also find that functional languages are somewhat better than procedural languages. It is worth noting that these modest effects arising from language design are overwhelmingly dominated by process factors such as project size, team size, and commit size. However, we hasten to caution the reader that even these modest effects might quite possibly be due to other, intangible process factors, e.g., the preference of certain personality types for functional, static, and strongly typed languages.
Year: 2014
DOI: 10.1145/2635868.2635922
Venue: SIGSOFT FSE
Keywords: code quality, programming language, software domain, language constructs and features, regression analysis, type system, empirical research, bug fix
Field: Procedural programming, Programming language, Functional programming, Computer science, Visualization, Commit, Strong and weak typing, Software quality, Sample size determination, Empirical research
DocType: Conference
Citations: 94
PageRank: 2.63
References: 18
Authors: 4
Name | Order | Citations | PageRank
Baishakhi Ray | 1 | 737 | 34.84
Daryl Posnett | 2 | 578 | 19.11
Vladimir Filkov | 3 | 1503 | 75.32
Premkumar Devanbu | 4 | 4956 | 357.68