Title
Mining the Ecosystem to Improve Type Inference for Dynamically Typed Languages
Abstract
Dynamically typed languages lack information about the types of variables in the source code. Developers care about this information as it supports program comprehension. Basic type inference techniques are helpful, but may yield many false positives or negatives. We propose to mine information from the software ecosystem on how frequently given types are inferred unambiguously to improve the quality of type inference for a single system. This paper presents an approach to augment existing type inference techniques by supplementing the information available in the source code of a project with data from other projects written in the same language. For all available projects, we track how often messages are sent to instance variables throughout the source code. Then, predictions for the type of a variable are made based on the messages sent to it. The evaluation of a proof-of-concept prototype shows that this approach works well for types that are sufficiently popular, like those found in the standard libraries, and tends to create false positives for unpopular or domain-specific types. The false positives are, in most cases, fairly easy to identify. The evaluation data also shows a substantial increase in the number of correctly inferred types compared to the non-augmented type inference.
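The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' prototype: the abstract does not specify the scoring scheme, so the ranking below (sum of per-message counts from unambiguous inferences) is an assumption, as are all function names, the toy type interfaces, and the Smalltalk-style selectors, which were chosen purely for illustration.

```python
from collections import Counter, defaultdict

def candidate_types(messages, type_selectors):
    """Basic inference: every type whose interface understands all messages."""
    return {t for t, sels in type_selectors.items() if messages <= sels}

def mine_ecosystem(observations, type_selectors):
    """For each message, count how often each type was inferred unambiguously.

    `observations` holds one set of messages per instance variable found
    anywhere in the ecosystem (hypothetical input format).
    """
    freq = defaultdict(Counter)
    for messages in observations:
        cands = candidate_types(messages, type_selectors)
        if len(cands) == 1:  # keep only unambiguous inferences
            (inferred,) = cands
            for m in messages:
                freq[m][inferred] += 1
    return freq

def rank_types(messages, type_selectors, freq):
    """Augmented inference: order the basic candidates by ecosystem evidence."""
    cands = candidate_types(messages, type_selectors)
    score = Counter()
    for m in messages:
        per_type = freq.get(m, Counter())
        for t in cands:
            score[t] += per_type[t]  # a Counter yields 0 for unseen types
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(cands, key=lambda t: (-score[t], t))

# Toy data, assumed for illustration only.
type_selectors = {
    "OrderedCollection": {"add:", "do:", "size", "removeFirst"},
    "Set": {"add:", "do:", "size"},
    "Dictionary": {"at:put:", "do:", "size", "keys"},
}
observations = [
    {"add:", "removeFirst"},         # only OrderedCollection fits
    {"add:", "do:", "removeFirst"},  # only OrderedCollection fits
    {"at:put:", "keys"},             # only Dictionary fits
    {"add:", "size"},                # ambiguous, so it is skipped
]
freq = mine_ecosystem(observations, type_selectors)
ranking = rank_types({"add:", "do:"}, type_selectors, freq)
```

In this toy setup the messages add: and do: alone leave two candidates, and the mined counts are what place one ahead of the other. A type that is rarely inferred unambiguously elsewhere would collect little evidence, which is consistent with the weaker results the abstract reports for unpopular or domain-specific types.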
Year
2014
DOI
10.1145/2661136.2661141
Venue
Onward!
Keywords
ecosystem mining, miscellaneous, object-oriented programming, type inference
Field
Data mining, Information retrieval, Computer science, Source code, Type inference, Instance variable, Program comprehension, Software ecosystem, False positive paradox
DocType
Conference
Citations
6
PageRank
0.44
References
20
Authors
3
Name               Order   Citations   PageRank
Boris Spasojevic   1       6           1.12
Mircea Lungu       2       545         39.17
Oscar Nierstrasz   3       2404        346.86