Title
Clustering test steps in natural language toward automating test automation
Abstract
For large industrial applications, system test cases are still often described in natural language (NL), and their number can reach thousands. Test automation is to automatically execute the test cases. Achieving test automation typically requires substantial manual effort for creating executable test scripts from these NL test cases. In particular, given that each NL test case consists of a sequence of NL test steps, testers first implement a test API method for each test step and then write a test script for invoking these test API methods sequentially for test automation. Across different test cases, multiple test steps can share semantic similarities, supposedly mapped to the same API method. However, due to numerous test steps in various NL forms under manual inspection, testers may not realize those semantically similar test steps and thus waste effort to implement duplicate test API methods for them. To address this issue, in this paper, we propose a new approach based on natural language processing to cluster similar NL test steps together such that the test steps in each cluster can be mapped to the same test API method. Our approach includes domain-specific word embedding training along with measurement based on Relaxed Word Mover’sDistance to analyze the similarity of test steps. Our approach also includes a technique to combine hierarchical agglomerative clustering and K-means clustering post-refinement to derive high-quality and manually-adjustable clustering results. The evaluation results of our approach on a large industrial mobile app, WeChat, show that our approach can cluster the test steps with high accuracy, substantially reducing the number of clusters and thus reducing the downstream manual effort. In particular, compared with the baseline approach, our approach achieves 79.8% improvement on cluster quality, reducing 65.9% number of clusters, i.e., the number of test API methods to be implemented.
Year
DOI
Venue
2020
10.1145/3368089.3417067
ESEC/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering Virtual Event USA November, 2020
DocType
ISBN
Citations 
Conference
978-1-4503-7043-1
1
PageRank 
References 
Authors
0.37
16
10
Name
Order
Citations
PageRank
Linyi Li124.09
Zhenwen Li210.37
Weijie Zhang310.37
Jun Zhou44815.65
Pengcheng Wang510.37
Jing Wu610.37
Guanghua He710.37
Xia Zeng8321.84
Yuetang Deng9594.81
Tao Xie105978304.97