Abstract |
---|
Duplicate Bug Detection is the problem of identifying whether a newly reported bug is a duplicate of an existing bug in the system and retrieving the original or similar bugs from the past. This is required to avoid costly rediscovery and redundant work. In typical software projects, the number of duplicate bugs reported may run into the thousands, making manual intervention expensive in both cost and time. This makes duplicate or similar bug detection an important problem in the Software Engineering domain. However, automated solutions are not yet accurate enough in practice, despite many reported approaches using various machine learning techniques. In this work, we propose a retrieval and classification model using Siamese Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) for accurate detection and retrieval of duplicate and similar bugs. We report an accuracy close to 90% and a recall rate close to 80%, which makes practical use of such a system possible. We describe our model in detail along with related discussions from the Deep Learning domain. By presenting detailed experimental results, we illustrate the effectiveness of the model in practical systems, including repositories for which supervised training data is not available. |

Year | DOI | Venue |
---|---|---|
2017 | 10.1109/ICSME.2017.69 | 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME) |

Keywords | Field | DocType |
---|---|---|
Information Retrieval, Duplicate Bug Detection, Deep Learning, Natural Language Processing, Word Embeddings, Siamese Networks, Convolutional Neural Networks, Long Short Term Memory | Data mining, Recall rate, Convolutional neural network, Computer science, Software bug, Long short term memory, Software, Supervised training, Artificial intelligence, Deep learning, Artificial neural network, Machine learning | Conference |

ISSN | ISBN | Citations |
---|---|---|
1063-6773 | 978-1-5386-0993-4 | 7 |

PageRank | References | Authors |
---|---|---|
0.43 | 23 | 5 |

Name | Order | Citations | PageRank |
---|---|---|---|
Jayati Deshmukh | 1 | 7 | 1.10 |
K. M. Annervaz | 2 | 17 | 4.58 |
Sanjay Podder | 3 | 38 | 11.58 |
Shubhashis Sengupta | 4 | 158 | 21.17 |
Neville Dubash | 5 | 34 | 2.36 |