Leveraging Machine Learning for Software Redocumentation - Citegraph

Paper Info

Title
Leveraging Machine Learning for Software Redocumentation

Abstract
Source code comments contain key information about the underlying software system. Many redocumentation approaches, however, cannot exploit this valuable source of information. This is mainly because not all comments have the same goals and target audience, so they can only be used selectively for redocumentation. Performing the required classification manually, e.g., in the form of heuristic rules, is usually time-consuming and error-prone, and it depends strongly on the programming languages and guidelines of concrete software systems. By leveraging machine learning, it should be possible to classify comments and thus transfer valuable information from the source code into documentation with less effort but the same quality. We applied different machine learning techniques to a COBOL legacy system and compared the results with an industry-strength heuristic classification. We found that machine learning outperforms the heuristics, producing fewer errors with less effort.

Field	Value
Year	2020
DOI	10.1109/SANER48275.2020.9054838
Venue	2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER)
Keywords	software redocumentation, legacy system, comment classification pipeline, heuristic rules, machine learning, NLP, CNNs
DocType	Conference
ISSN	1534-5351
ISBN	978-1-7281-5144-1
Citations	0
PageRank	0.34
References	8
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Verena Geist	1	29	7.92
Michael Moser	2	18	5.88
Josef Pichler	3	28	6.79
Stefanie Beyer	4	35	3.20
Martin Pinzger	5	2147	120.49
