Title
Reverse Engineering Variability From Natural Language Documents: A Systematic Literature Review
Abstract
Identifying features and their relations (i.e., variation points) is crucial in the process of migrating single software systems to software product lines (SPL). Various approaches have been proposed to perform feature extraction automatically from different artifacts, for instance, feature location in legacy code. Usually such approaches a) omit variability information and b) rely on artifacts that reside in advanced phases of the development process, and are thus of only limited usefulness in the context of SPLs. In contrast, feature and variability extraction from natural language (NL) documents is more favorable, because a mapping to several other artifacts is usually established from the very beginning. In this paper, we provide a multi-dimensional overview of approaches for feature and variability extraction from NL documents by means of a systematic literature review (SLR). We selected 25 primary studies and carefully evaluated them regarding different aspects such as techniques used, tool support, or accuracy of the results. In a nutshell, our key insights are that i) standard NLP techniques are commonly used, ii) post-processing often includes clustering and machine learning algorithms, iii) only in rare cases do the approaches support variability extraction, iv) tool support, apart from text pre-processing, is often not available, and v) many approaches lack a comprehensive evaluation. Based on these observations, we derive future challenges, arguing that more effort needs to be invested to make such approaches applicable in practice.
Year
2017
DOI
10.1145/3106195.3106207
Venue
21ST INTERNATIONAL SYSTEMS & SOFTWARE PRODUCT LINE CONFERENCE (SPLC 2017), VOL 1
Keywords
Feature Identification, Variability Extraction, Reverse Engineering, Software Product Lines, Natural Language Documents, Systematic Literature Review
Field
Systematic review, Computer science, Reverse engineering, Feature extraction, Software system, Software, Natural language, Natural language processing, Legacy code, Artificial intelligence, Cluster analysis
DocType
Conference
Citations
6
PageRank
0.46
References
23
Authors
3
Name
Order
Citations
PageRank
Yang Li 1 659125.00
Sandro Schulze 2 25923.43
Gunter Saake 3 255639.75