research-article

An Analysis of Software Bug Reports Using Machine Learning Techniques

Published: 29 June 2019

Abstract

Bug tracking systems manage bug reports to help assure the quality of software products. A bug report (also referred to as a trouble, problem, ticket, or defect report) contains several features used for problem management and resolution. Severity and priority are two essential features of a bug report: they define the impact level of the bug and the order in which it should be fixed. Determining these features is challenging and depends heavily on human judgment, e.g., that of software developers or system operators, especially when a large number of error and warning events occurring in software products or network services must be assessed. This study first compares machine learning techniques for assessing the severity and priority of software bug reports and then selects an approach based on optimal decision trees, or random forest, for further investigation. This approach constructs multiple decision trees from subsets of the existing bug dataset and its features, and then selects the best decision trees to assess the severity and priority of new bugs. The approach can be applied to detecting and forecasting faults in today's large, complex communication networks and distributed systems. We present the applicability of random forest to bug report analysis and report several experiments on software bug datasets obtained from open source bug tracking systems. Random forest yields an average accuracy score of 0.75, which can be sufficient for assisting system operators in determining these features. We also provide an analysis of the experimental results.
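The tree-ensemble idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bug reports, the feature names (`component`, `platform`, `reporter_type`), the severity labels, and all function names below are hypothetical, and the sketch combines trees by majority vote (the usual random forest aggregation) rather than by any tree-selection step the paper may use.

```python
import random
from collections import Counter

# Hypothetical toy bug reports (invented for illustration, not the paper's data).
FEATURES = ["component", "platform", "reporter_type"]
BUGS = [
    ({"component": "network", "platform": "linux", "reporter_type": "developer"}, "critical"),
    ({"component": "network", "platform": "windows", "reporter_type": "user"}, "critical"),
    ({"component": "ui", "platform": "linux", "reporter_type": "user"}, "minor"),
    ({"component": "ui", "platform": "windows", "reporter_type": "developer"}, "minor"),
    ({"component": "storage", "platform": "linux", "reporter_type": "user"}, "major"),
    ({"component": "storage", "platform": "windows", "reporter_type": "developer"}, "major"),
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def build_tree(rows, features, depth=0, max_depth=3):
    """Grow a categorical decision tree; a leaf is a label, a node is a tuple."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1 or not features:
        return majority

    def split_score(feature):
        # Weighted Gini impurity after splitting on this feature.
        groups = {}
        for x, y in rows:
            groups.setdefault(x[feature], []).append(y)
        return sum(len(g) / len(rows) * gini(g) for g in groups.values())

    best = min(features, key=split_score)
    groups = {}
    for x, y in rows:
        groups.setdefault(x[best], []).append((x, y))
    remaining = [f for f in features if f != best]
    children = {
        value: build_tree(group, remaining, depth + 1, max_depth)
        for value, group in groups.items()
    }
    return (best, children, majority)


def predict_tree(tree, x):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, tuple):
        feature, children, majority = tree
        if x[feature] not in children:
            return majority
        tree = children[x[feature]]
    return tree


def build_forest(rows, features, n_trees=25, n_features=2, seed=0):
    """Train each tree on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # sampling with replacement
        subset = rng.sample(features, n_features)   # random subset of features
        forest.append(build_tree(sample, subset))
    return forest


def predict_forest(forest, x):
    """Majority vote over the predictions of all trees."""
    votes = Counter(predict_tree(tree, x) for tree in forest)
    return votes.most_common(1)[0][0]


forest = build_forest(BUGS, FEATURES)
new_bug = {"component": "network", "platform": "linux", "reporter_type": "user"}
print(predict_forest(forest, new_bug))
```

Each tree sees only part of the data and part of the feature set, so individual trees disagree; the vote across trees is what gives the ensemble its stability. A priority model would follow the same pattern with priority labels in place of severity labels.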

Published In

SN Computer Science, Volume 1, Issue 1
Jan 2020
823 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 29 June 2019
Accepted: 10 June 2019
Received: 15 April 2019

Author Tags

  1. Network fault detection
  2. Fault management
  3. Machine learning
  4. Data analytics
  5. Software bug report

Qualifiers

  • Research-article

Funding Sources

  • Vietnam National University in Ho Chi Minh City
