Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Publication year: 2022
Volume: 13
Issue: 3
DOI: 10.14569/IJACSA.2022.0130318
Language: English
Publisher: Science and Information Society (SAI)
Abstract: Health insurance is integral to society's economic well-being, and fraud creates numerous challenges in providing affordable health care. To reduce the losses incurred through fraud, a powerful model is needed that accurately predicts fraud from claims data. This paper implements a more sophisticated machine learning technique for fraud detection: HEMClust (Heterogeneous Ensemble Model with Clustering). The first stage of the model improves the quality of the claims data through effective preprocessing. The second stage addresses overlapping instances across provider specialties by grouping them with k-prototype clustering. The final stage builds a heterogeneous stacking ensemble that performs classification on multiple levels, with four base learners at level 0 and a meta-learner at level 1. The results were assessed using evaluation metrics and statistical tests, such as the Friedman and Nemenyi tests, to compare the performance of the base classifiers against the proposed HEMClust. The empirical results show that HEMClust achieved overall precision and recall rates of 94% and 96% on the dataset, an increase of 45% to 50% in the fraud detection rate for each class in the data.
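
The clustering stage mentioned in the abstract relies on k-prototype clustering, which handles records that mix numeric and categorical fields. The following is a minimal sketch of the standard k-prototypes dissimilarity (squared Euclidean distance on numeric fields plus a weighted count of categorical mismatches) and the cluster-assignment step. It is not the paper's implementation: the function names, the weight `gamma`, and the example claim fields are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# A record or prototype is a pair: (numeric fields, categorical fields).
Record = Tuple[Sequence[float], Sequence[str]]


def mixed_dissimilarity(a: Record, b: Record, gamma: float = 1.0) -> float:
    """k-prototypes cost between two mixed-type records.

    Numeric part: squared Euclidean distance.
    Categorical part: number of mismatched fields, weighted by gamma
    (gamma is an assumed tuning weight, not a value from the paper).
    """
    num_a, cat_a = a
    num_b, cat_b = b
    numeric_cost = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical_cost = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric_cost + gamma * categorical_cost


def nearest_prototype(record: Record, prototypes: List[Record],
                      gamma: float = 1.0) -> int:
    """Return the index of the prototype with the lowest mixed cost."""
    costs = [mixed_dissimilarity(record, p, gamma) for p in prototypes]
    return min(range(len(costs)), key=costs.__getitem__)


# Hypothetical prototypes: (scaled numeric features, [specialty, setting]).
prototypes: List[Record] = [
    ([0.0, 0.0], ["cardiology", "inpatient"]),
    ([5.0, 5.0], ["dermatology", "outpatient"]),
]
claim: Record = ([1.0, 0.5], ["cardiology", "outpatient"])

cluster = nearest_prototype(claim, prototypes)
```

In a full k-prototypes run, the assignment step above alternates with a prototype update (mean of numeric fields, mode of categorical fields) until assignments stop changing. The resulting cluster labels could then be passed to the stacking ensemble as an additional feature, although the abstract does not specify how the clusters are used downstream.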