摘要:The number of bug reports in complex software increases dramatically. Since bugs are still triaged manually, bug triage or assignment is a labor-intensive and time-consuming task. Without knowledge about the structure of the software, testers often specify the component of a new bug incorrectly. Meanwhile, it is difficult for triagers to determine the component of the bug only by its description. For instance, we dig out the components of 28,829 bugs from the Eclipse bug project, which have been specified incorrectly and modified at least once, and indicated that these bugs have to be reassigned and the process of bug fixing has to be delayed. The average time of fixing incorrectly specified bugs is longer than that of correctly specified ones. In order to solve the problem automatically, we use historical fixed bug reports as training corpus and build classifiers based on support vector machines and Naïve Bayes to predict the component of a new bug. The best predicting precision reaches up to 81.21% on our validation corpus of Eclipse project.
关键词:bug reports;bug triage;text classification;predictive model