Reverse engineering variability from requirement documents based on probabilistic relevance and word embedding.

Paper Info

Title
Reverse engineering variability from requirement documents based on probabilistic relevance and word embedding.

Abstract
Feature and variability extraction from different artifacts is an indispensable activity to support the systematic integration of single software systems and Software Product Lines (SPLs). Beyond manual variability extraction, a variety of approaches, such as feature location in source code and feature extraction from requirements, have been proposed to automatically identify features and their variation points. Compared with source code, requirements contain more complete variability information and provide traceability links to other artifacts from early development phases. In this paper, we propose a method to automatically extract features and their relationships based on probabilistic relevance and word embedding. In particular, our technique consists of three steps: First, we apply word2vec to obtain a prediction model, which we use to determine the word-level similarity of requirements. Second, based on word-level similarity and the significance of a word in a domain, we compute the requirements-level similarity using probabilistic relevance. Third, we adopt hierarchical clustering to group features, and we define four criteria to detect variation points between identified features. We perform a case study to evaluate the usability and robustness of our method and to compare its results with those of other related approaches. Initial results reveal that our approach identifies the majority of features correctly and also extracts variability information with reasonable accuracy.
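The three steps named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The toy vectors stand in for a trained word2vec model, the BM25-style IDF weight is an assumed stand-in for the paper's "probabilistic relevance" and "significance of a word in a domain", and the average-linkage clustering with a fixed threshold is one possible reading of "hierarchical clustering". All names (`TOY_VECTORS`, `requirement_similarity`, `cluster`, `REQS`) and the threshold value are hypothetical. The four variation-point criteria are not spelled out in the abstract, so they are left out.

```python
import math
from itertools import combinations

# Hypothetical toy embeddings standing in for a trained word2vec model.
TOY_VECTORS = {
    "open": (0.9, 0.1, 0.0), "unlock": (0.8, 0.2, 0.1),
    "door": (0.1, 0.9, 0.0), "gate": (0.2, 0.8, 0.1),
    "play": (0.0, 0.1, 0.9), "music": (0.1, 0.0, 0.8),
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def word_similarity(w1, w2):
    """Step 1: word-level similarity from the embedding vectors."""
    if w1 == w2:
        return 1.0
    if w1 in TOY_VECTORS and w2 in TOY_VECTORS:
        return cosine(TOY_VECTORS[w1], TOY_VECTORS[w2])
    return 0.0

def idf(word, docs):
    """BM25-style inverse document frequency (assumed significance weight)."""
    n = sum(1 for d in docs if word in d)
    return math.log((len(docs) - n + 0.5) / (n + 0.5) + 1.0)

def requirement_similarity(r1, r2, docs):
    """Step 2: IDF-weighted mean of each r1 word's best match in r2."""
    weights = [idf(w, docs) for w in r1]
    score = sum(wt * max(word_similarity(w, v) for v in r2)
                for w, wt in zip(r1, weights))
    return score / sum(weights)

def cluster(reqs, threshold=0.7):
    """Step 3: average-linkage agglomerative clustering of requirements."""
    docs = [set(r) for r in reqs]

    def sim(i, j):
        # Symmetrize the directed requirement-level similarity.
        return 0.5 * (requirement_similarity(reqs[i], reqs[j], docs)
                      + requirement_similarity(reqs[j], reqs[i], docs))

    clusters = [[i] for i in range(len(reqs))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for a, b in combinations(range(len(clusters)), 2):
            links = [sim(i, j) for i in clusters[a] for j in clusters[b]]
            avg = sum(links) / len(links)
            if avg > best:
                best, pair = avg, (a, b)
        if best < threshold:
            break  # remaining clusters are too dissimilar to merge
        a, b = pair
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters

# Tokenized toy requirements (hypothetical, non-empty by assumption).
REQS = [["open", "door"], ["unlock", "gate"], ["play", "music"]]
print(cluster(REQS))
```

In this toy run the first two requirements share no words but are still grouped, because their words lie close together in the embedding space, which is the motivation for combining word embeddings with a relevance weighting.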

Year	Venue	Field
2018	SPLC	Data mining, Source code, Computer science, Reverse engineering, Software system, Feature extraction, Control engineering, Software product line, Probabilistic logic, Word2vec, Word embedding
DocType	Citations	PageRank
Conference	1	0.35
References	Authors
19	3

Authors (3 rows)

Name	Order	Citations	PageRank
Yang Li	1	659	125.00
Sandro Schulze	2	259	23.43
Gunter Saake	3	3255	639.75
