Abstract |
---|
Analysis of massive codebases ("big code") presents an opportunity for drawing insights about programming practice and enabling code reuse. One of the main challenges in analyzing big code is finding a representation that captures sufficient semantic information, can be constructed efficiently, and is amenable to meaningful comparison operations. We present a formal framework for representing code in large codebases. In our framework, the semantic descriptor for each code snippet is a partial temporal specification that captures the sequences of method invocations on an API. The main idea is to represent partial temporal specifications as symbolic automata: automata where transitions may be labeled by variables, and a variable can be substituted by a letter, a word, or a regular language. Using symbolic automata, we construct an abstract domain for static analysis of big code, capturing both the partialness of a specification and the precision of a specification. We show interesting relationships between lattice operations of this domain and common operators for manipulating partial temporal specifications, such as building a more informative specification by consolidating two partial specifications, and comparing partial temporal specifications. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1007/s00236-015-0234-1 | Acta Inf. |
Field | DocType | Volume |
Symbolic simulation, Program Dependence Graph, Programming language, Computer science, Static analysis, Theoretical computer science, Operator (computer programming), Code reuse, Equivalence class, Regular language, Snippet | Journal | 53 |
Issue | ISSN | Citations |
4 | 1432-0525 | 1 |
PageRank | References | Authors |
0.35 | 26 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hila Peleg | 1 | 47 | 5.04 |
Sharon Shoham | 2 | 342 | 26.67 |
Eran Yahav | 3 | 1706 | 79.49 |
Hongseok Yang | 4 | 2313 | 115.85 |