Title
Learning from 6,000 projects: lightweight cross-project anomaly detection
Abstract
Real production code contains a lot of knowledge about the domain, the architecture, and the environment. How can we leverage this knowledge in new projects? Using a novel lightweight source code parser, we have mined more than 6,000 open source Linux projects (totaling 200,000,000 lines of code) to obtain 16,000,000 temporal properties reflecting normal interface usage. New projects can be checked against these rules to detect anomalies, that is, code that deviates from the wisdom of the crowds. In a sample of 20 projects, about 25% of the top-ranked anomalies uncovered actual code smells or defects.
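The mining-and-checking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's lightweight parser and its analysis are replaced here by plain pair counting over already-extracted call sequences, and every name and threshold (temporal_pairs, mine, find_anomalies, min_confidence, min_support) is a hypothetical choice made for illustration.

```python
from collections import Counter


def temporal_pairs(calls):
    """Return all (a, b) pairs where call a occurs before call b in one function."""
    pairs = set()
    for i, a in enumerate(calls):
        for b in calls[i + 1:]:
            if a != b:
                pairs.add((a, b))
    return pairs


def mine(corpus):
    """Count, over all functions in the corpus, how many contain each pair and each call."""
    pair_support = Counter()
    call_support = Counter()
    for calls in corpus:
        pair_support.update(temporal_pairs(calls))
        call_support.update(set(calls))
    return pair_support, call_support


def find_anomalies(calls, pair_support, call_support,
                   min_confidence=0.9, min_support=2):
    """Report common 'a before b' pairs that a function calling a does not exhibit.

    Confidence is the fraction of corpus functions calling a that also
    show the pair (a, b). Results are sorted by confidence, highest first.
    """
    present = temporal_pairs(calls)
    called = set(calls)
    anomalies = []
    for (a, b), n in pair_support.items():
        if a not in called or (a, b) in present or n < min_support:
            continue
        confidence = n / call_support[a]
        if confidence >= min_confidence:
            anomalies.append((confidence, a, b))
    return sorted(anomalies, reverse=True)


if __name__ == "__main__":
    # Toy stand-in for call sequences mined from many projects.
    corpus = [
        ["fopen", "fread", "fclose"],
        ["fopen", "fwrite", "fclose"],
        ["fopen", "fread", "fclose"],
        ["fopen", "fgets", "fclose"],
    ]
    pair_support, call_support = mine(corpus)
    # A new function that opens and reads a file but never closes it.
    for confidence, a, b in find_anomalies(["fopen", "fread"],
                                           pair_support, call_support):
        print(f"missing '{b}' after '{a}' (confidence {confidence:.2f})")
```

Ranking by confidence mirrors the abstract's notion of top-ranked anomalies only loosely; a real system would need a far larger corpus and a more careful ranking scheme.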
Year: 2010
DOI: 10.1145/1831708.1831723
Venue: ISSTA
Keywords: normal interface usage, lightweight cross-project anomaly detection, real production code, top-ranked anomaly, temporal property, novel lightweight source code, open source linux project, actual code, new project, lines of code, product code, anomaly detection, formal concept analysis, source code
DocType: Conference
Citations: 42
PageRank: 2.46
References: 14
Authors: 3
Name                 Order  Citations  PageRank
Natalie Gruska       1      43         3.15
Andrzej Wasylkowski  2      250        10.73
Andreas Zeller       3      5697       303.71