Title
Towards Accurate Duplicate Bug Retrieval Using Deep Learning Techniques
Abstract
Duplicate bug detection is the problem of identifying whether a newly reported bug duplicates an existing bug in the system, and of retrieving the original or similar bugs from the past. Solving it avoids costly rediscovery and redundant work. In typical software projects, the number of duplicate bug reports can run into the thousands, which makes manual intervention expensive in both cost and time. Duplicate and similar bug detection is therefore an important problem in the Software Engineering domain. However, despite many reported approaches based on various machine learning techniques, automated solutions are not yet accurate enough in practice. In this work, we propose a retrieval and classification model using Siamese Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) networks for accurate detection and retrieval of duplicate and similar bugs. We report an accuracy close to 90% and a recall rate close to 80%, which makes practical use of such a system possible. We describe our model in detail, along with related discussions from the Deep Learning domain. Through detailed experimental results, we illustrate the effectiveness of the model in practical systems, including repositories for which supervised training data is not available.
Year
2017
DOI
10.1109/ICSME.2017.69
Venue
2017 IEEE International Conference on Software Maintenance and Evolution (ICSME)
Keywords
Information Retrieval, Duplicate Bug Detection, Deep Learning, Natural Language Processing, Word Embeddings, Siamese Networks, Convolutional Neural Networks, Long Short Term Memory
Field
Data mining, Recall rate, Convolutional neural network, Computer science, Software bug, Long short term memory, Software, Supervised training, Artificial intelligence, Deep learning, Artificial neural network, Machine learning
DocType
Conference
ISSN
1063-6773
ISBN
978-1-5386-0993-4
Citations
7
PageRank
0.43
References
23
Authors
5
Name, Order, Citations, PageRank
Jayati Deshmukh, 1, 7, 1.10
K. M. Annervaz, 2, 17, 4.58
Sanjay Podder, 3, 38, 11.58
Shubhashis Sengupta, 4, 158, 21.17
Neville Dubash, 5, 34, 2.36