Title
Enabling clone detection for Ethereum via smart contract birthmarks
Abstract
The Ethereum ecosystem has introduced a pervasive blockchain platform with programmable transactions. Anyone can develop and deploy smart contracts. Such flexibility can lead to a large collection of similar contracts, i.e., clones, especially because Ethereum applications are highly domain-specific and may share similar functionality within the same domain; for example, token contracts often provide interfaces for money transfer and balance inquiry. Although smart contract clones affect a wide range of applications, e.g., security, they remain relatively understudied. Clone detection is a long-standing research topic, but blockchain smart contracts introduce new challenges, such as syntactic diversity caused by the trade-off between storage and execution, and the need to understand high-level business logic. In this paper, we present the first attempt at clone detection for Ethereum smart contracts. To overcome the new challenges, we introduce the concept of a smart contract birthmark, i.e., a semantics-preserving and computable representation of smart contract bytecode. The birthmark captures high-level semantics by sketching symbolic execution traces (e.g., data access dependencies, path conditions) and also maintains syntactic regularities (e.g., the type and number of instructions). The clone detection problem is then reduced to computing the statistical similarity between two contract birthmarks. We have implemented a clone detector called EClone and evaluated it on Ethereum. The empirical results demonstrate the potential of EClone to identify clones accurately. We have also extended EClone for vulnerability search and detected instances of CVE-2018-10376.
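The reduction of clone detection to a similarity computation between birthmarks can be illustrated with a deliberately simplified sketch. It covers only the syntactic side (counts of instruction types) and omits symbolic execution entirely. The abstract does not specify EClone's actual birthmark layout or similarity metric, so the opcode categories, the names `toy_birthmark` and `cosine_similarity`, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical, coarse grouping of real EVM opcodes; not taken from the paper.
CATEGORIES = {
    "ADD": "arith", "SUB": "arith", "MUL": "arith", "DIV": "arith",
    "SLOAD": "storage", "SSTORE": "storage",
    "MLOAD": "memory", "MSTORE": "memory",
    "JUMP": "control", "JUMPI": "control",
    "CALL": "call", "DELEGATECALL": "call",
}


def toy_birthmark(opcodes):
    """Count instructions per category (syntactic regularities only)."""
    return Counter(CATEGORIES.get(op, "other") for op in opcodes)


def cosine_similarity(a, b):
    """Cosine similarity of two count vectors; 0.0 if either is empty."""
    keys = set(a) | set(b)
    dot = sum(a[k] * b[k] for k in keys)  # Counter yields 0 for missing keys
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two made-up opcode sequences that differ in one control instruction,
# plus an unrelated sequence for contrast.
token_a = ["SLOAD", "SUB", "SSTORE", "SLOAD", "ADD", "SSTORE", "JUMPI"]
token_b = ["SLOAD", "SUB", "SSTORE", "SLOAD", "ADD", "SSTORE", "JUMP"]
other = ["CALL", "CALL", "MSTORE"]

score_clone = cosine_similarity(toy_birthmark(token_a), toy_birthmark(token_b))
score_other = cosine_similarity(toy_birthmark(token_a), toy_birthmark(other))
```

In this toy setting `score_clone` is close to 1 because both sequences have the same category counts, while `score_other` is 0 because the sequences share no category. A pair would be flagged as a clone when its score exceeds a chosen threshold; a realistic birthmark would also need the semantic features the abstract describes.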
Year: 2019
DOI: 10.1109/ICPC.2019.00024
Venue: Proceedings of the 27th International Conference on Program Comprehension
Keywords: clone detection, ethereum, smart contract birthmark, symbolic execution
Field: Data mining, Computer science, Business logic, Symbolic execution, Data access, Security token, Bytecode, Syntax, Semantics, Distributed computing, Smart contract
DocType: Conference
ISSN: 2643-7147
ISBN: 978-1-7281-1520-7
Citations: 4
PageRank: 0.42
References: 12
Authors: 5
Name            Order   Citations   PageRank
Han Liu         1       69          8.01
Zhiqiang Yang   2       13          4.35
Yu Jiang        3       346         56.49
Wenqi Zhao      4       21          2.96
Jia-guang Sun   5       1807        134.30