Title
Symbolic automata for representing big code
Abstract
Analysis of massive codebases ("big code") presents an opportunity for drawing insights about programming practice and enabling code reuse. One of the main challenges in analyzing big code is finding a representation that captures sufficient semantic information, can be constructed efficiently, and is amenable to meaningful comparison operations. We present a formal framework for representing code in large codebases. In our framework, the semantic descriptor for each code snippet is a partial temporal specification that captures the sequences of method invocations on an API. The main idea is to represent partial temporal specifications as symbolic automata, that is, automata where transitions may be labeled by variables, and a variable can be substituted by a letter, a word, or a regular language. Using symbolic automata, we construct an abstract domain for static analysis of big code, capturing both the partialness of a specification and the precision of a specification. We show interesting relationships between lattice operations of this domain and common operators for manipulating partial temporal specifications, such as building a more informative specification by consolidating two partial specifications, and comparing partial temporal specifications.
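To make the idea of variable-labeled transitions concrete, the following is a minimal sketch, not the paper's formalization. It assumes a symbolic automaton is given as a list of (source, label, target) edges, where a label is either a concrete method name or a variable, and it covers only substitution of a variable by a word; substitution by a regular language is left out. The names `Var`, `substitute`, `accepts`, and the methods `open`, `read`, `close` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A variable label standing for unknown behavior (hypothetical name)."""
    name: str


def substitute(transitions, assignment):
    """Replace each variable-labeled edge by a chain of edges spelling the
    word assigned to that variable. Unassigned variables are left in place."""
    result = []
    fresh = 0  # counter for intermediate states introduced by the chain
    for src, label, dst in transitions:
        if isinstance(label, Var) and label.name in assignment:
            word = assignment[label.name]
            if not word:
                raise ValueError("empty words are not handled in this sketch")
            current = src
            for i, letter in enumerate(word):
                if i == len(word) - 1:
                    nxt = dst
                else:
                    nxt = ("fresh", fresh)
                    fresh += 1
                result.append((current, letter, nxt))
                current = nxt
        else:
            result.append((src, label, dst))
    return result


def accepts(transitions, start, finals, word):
    """NFA-style acceptance; only concrete labels can match a letter."""
    current = {start}
    for letter in word:
        current = {dst for src, label, dst in transitions
                   if src in current and label == letter}
    return bool(current & finals)


# Partial specification: open, then some unknown behavior X, then close.
partial_spec = [(0, "open", 1), (1, Var("X"), 2), (2, "close", 3)]

# Substituting X by the word "read read" yields a concrete automaton.
concrete = substitute(partial_spec, {"X": ["read", "read"]})
```

Under these assumptions, `accepts(concrete, 0, {3}, ["open", "read", "read", "close"])` holds, while the shorter sequence `["open", "close"]` is rejected because the substituted word must appear between the two calls.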
Year
2016
DOI
10.1007/s00236-015-0234-1
Venue
Acta Inf.
Field
Symbolic simulation, Program Dependence Graph, Programming language, Computer science, Static analysis, Theoretical computer science, Operator (computer programming), Code reuse, Equivalence class, Regular language, Snippet
DocType
Journal
Volume
53
Issue
4
ISSN
1432-0525
Citations
1
PageRank
0.35
References
26
Authors
4
Name | Order | Citations | PageRank
Hila Peleg | 1 | 47 | 5.04
Sharon Shoham | 2 | 342 | 26.67
Eran Yahav | 3 | 1706 | 79.49
Hongseok Yang | 4 | 2313 | 115.85