Abstract: Banks play a vital role in the financial system, and their survival is crucial for the stability of the economy. This paper attempts to build an efficient and appropriate predictive model, using a machine learning approach, for an early warning system of bank failure. It uses data on failed and surviving public and private sector banks located in India for the period 2000–2017. Bank-specific variables, as well as macroeconomic and market structure variables, are used to identify the stress level of banks. Because the number of failed banks in India is very small compared with the number of surviving banks, the data are imbalanced, and most machine learning algorithms do not perform well on such data. This paper applies the Synthetic Minority Oversampling Technique (SMOTE) to convert the imbalanced data into a balanced form. Lasso regression is used to remove redundant features from the failure prediction model. To avoid bias and over-fitting in the models, random forest and AdaBoost techniques are applied and compared with logistic regression to identify the best predictive model. The results of the study are applicable to various stakeholders, such as shareholders, lenders, and borrowers, for measuring the financial stress of banks. The study offers an analytical approach that spans the selection of the most significant bank failure indicators using lasso regression, the conversion of imbalanced data to a balanced form using SMOTE, and the choice of appropriate machine learning techniques to predict bank failure.
Keywords: failure prediction; imbalanced data; SMOTE; lasso regression; random forest; AdaBoost