Article Information

Title: Using Source Code and Process Metrics for Defect Prediction - A Case Study of Three Algorithms and Dimensionality Reduction
Authors: Wenjing Han; Chung-Horng Lung; Samuel Ajila et al.
Journal: Journal of Software
Print ISSN: 1796-217X
Year: 2016
Volume: 11
Issue: 9
Pages: 883-902
DOI: 10.17706/jsw.11.9.883-902
Publisher: Academy Publisher
Abstract: Software defect prediction is very important in helping the software development team allocate test resources efficiently and better understand the root cause of defects. Furthermore, it can help find the reason why a project is failure-prone. This paper applies binary classification to predict whether a software component has a bug, using three widely used machine learning algorithms: Random Forest (RF), Neural Networks (NN), and Support Vector Machine (SVM). The paper investigates the application of these algorithms to the challenging issue of predicting defects in software components. To this end, it combines source code metrics and process metrics as indicators for the Eclipse environment, applying the three algorithms to a sample of weekly Eclipse features. In addition, this paper deals with the complex issue of data dimensionality, and our results confirm the predictive capabilities of data dimension reduction techniques such as Variable Importance (VI) and PCA. In our case, the results of using only two features (NBD_max and Pre-defects) are comparable to the results of using 61 features. Furthermore, we evaluate the performance of the three algorithms vis-à-vis the data; both Neural Network and Random Forest turned out to have the best fit.
Other keywords: Software defect prediction, data analysis, Eclipse, machine learning techniques