Journal: Advances in Mathematical Finance and Applications
Print ISSN: 2538-5569
Electronic ISSN: 2645-4610
Publication year: 2022
Volume: 7
Issue: 3
Pages: 811-835
DOI: 10.22034/amfa.2021.1905055.1461
Language: English
Publisher: Islamic Azad University of Arak
Abstract: From a machine learning perspective, predicting financial distress is challenging because the class distribution is extremely imbalanced. This study compares the performance of financial distress prediction models on imbalanced datasets with different class proportions. Data from the year preceding financial distress were used, covering 760 company-years over the period 2007-2017. The models compared were traditional classifiers (logistic regression, linear discriminant analysis, and an artificial neural network), the least squares support vector machine with four kernel functions, random forest, and the k-nearest neighbors (KNN) algorithm. The area under the ROC curve (AUC) was used as the performance measure, and the Friedman and Nemenyi tests were used to determine the average rank of the models and the significance of the differences in their AUC. The optimal parameters of each model were selected by combining grid search optimization with cross-validation. The results of this experimental study showed that random forest performed best on the balanced datasets and on the less imbalanced datasets. On the more imbalanced datasets, the best performance belonged to the least squares support vector machine with sigmoid, radial, and linear kernel functions; the performance of the KNN algorithm did not differ significantly from that of the other models, and the performance of the artificial neural network was average to adequate. The linear models, logistic regression and linear discriminant analysis, performed worse than the nonlinear models.
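
The evaluation procedure named in the abstract (AUC per model, then average ranks and a Friedman test across datasets) can be illustrated with a minimal sketch. It is not the authors' code. The function names `auc`, `average_ranks`, and `friedman_statistic` are hypothetical, the AUC is computed through its rank-based (Mann-Whitney) form, and the Friedman statistic uses the standard chi-square formula without a tie correction. The Nemenyi post-hoc step and the grid search are omitted.

```python
from typing import Dict, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair. Labels are 1 (distressed)
    and 0 (healthy); this encoding is an assumption of the sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_ranks(auc_table: Dict[str, List[float]]) -> Dict[str, float]:
    """Average rank of each model across datasets.

    auc_table maps a model name to its AUC on each dataset, in the same
    dataset order for every model. Rank 1 is the highest AUC on a dataset;
    tied models share the mean of the ranks they occupy.
    """
    models = list(auc_table)
    n_datasets = len(auc_table[models[0]])
    totals = {m: 0.0 for m in models}
    for d in range(n_datasets):
        values = [auc_table[m][d] for m in models]
        for m in models:
            v = auc_table[m][d]
            higher = sum(1 for x in values if x > v)
            tied = sum(1 for x in values if x == v)  # includes the model itself
            totals[m] += higher + (tied + 1) / 2.0
    return {m: totals[m] / n_datasets for m in models}


def friedman_statistic(auc_table: Dict[str, List[float]]) -> float:
    """Friedman chi-square statistic computed from the average ranks.

    With k models and n datasets:
    chi2 = 12 n / (k (k + 1)) * (sum_j R_j^2 - k (k + 1)^2 / 4)
    """
    k = len(auc_table)
    n = len(next(iter(auc_table.values())))
    ranks = average_ranks(auc_table)
    sum_sq = sum(r * r for r in ranks.values())
    return 12.0 * n / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom. A significant result is what would justify a pairwise post-hoc comparison such as the Nemenyi test mentioned in the abstract.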