
Basic article information

  • Title: Empirical Oversampling Threshold Strategy for Machine Learning Performance Optimisation in Insurance Fraud Detection
  • Authors: Bouzgarne Itri; Youssfi Mohamed; Bouattane Omar
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Electronic ISSN: 2156-5570
  • Publication year: 2020
  • Volume: 11
  • Issue: 10
  • DOI: 10.14569/IJACSA.2020.0111054
  • Publisher: Science and Information Society (SAI)
  • Abstract: Insurance fraud is one of the most common forms of fraud across economic sectors. As underwriters devise increasingly imaginative fraud scenarios and organized crime groups emerge, fraud detection based on artificial intelligence remains one of the most effective approaches. Real-world datasets are usually imbalanced: they consist mainly of the "non-fraudulent" class, with only a very small percentage of "fraudulent" examples available for training, so the performance of prediction models degrades severely when the target class is so poorly represented. The present work therefore proposes an approach that improves the relevance of the results of the best-known machine learning algorithms and handles imbalanced classes in classification problems for insurance fraud prediction. We use one of the most efficient approaches for rebalancing training data, SMOTE, and apply the supervised method to the automobile claims dataset "carclaims.txt". We compare the results across different metrics and question the relevance of those metrics for imbalanced, labeled datasets. This work shows that the SMOTE method combined with the KNN algorithm can achieve a better True Positive Rate than previous research. The goal of this work is to study algorithm selection and performance evaluation across different ML classification algorithms, and to propose a new approach, TH-SMOTE, that improves performance with SMOTE by defining the optimum oversampling threshold according to the G-mean measure.
  • Keywords: Machine learning; oversampling; SMOTE; insurance fraud
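
The abstract describes choosing an oversampling threshold for SMOTE by the G-mean measure, with KNN as the classifier. The sketch below illustrates that general idea in plain Python; it is not the authors' TH-SMOTE procedure, which this record does not spell out. Everything here is an assumption made for illustration: the function names, the reading of "threshold" as a target minority-to-majority ratio, the use of a held-out validation set, and the simplified SMOTE and KNN routines.

```python
import math
import random


def g_mean(y_true, y_pred):
    """Geometric mean of the true positive rate and the true negative rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(tpr * tnr)


def smote(minority, n_new, k, rng):
    """Simplified SMOTE: interpolate between a minority point and one of its
    k nearest minority neighbours. Needs at least two minority points."""
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (labels are 0 or 1)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = sum(train_y[i] for i in order[:k])
    return 1 if 2 * votes > k else 0


def best_oversampling_ratio(train_X, train_y, val_X, val_y, ratios, k=3, seed=0):
    """Hypothetical helper: try each target minority/majority ratio and keep
    the one with the highest validation G-mean."""
    minority = [x for x, y in zip(train_X, train_y) if y == 1]
    n_major = len(train_y) - len(minority)
    scores = {}
    for r in ratios:
        rng = random.Random(seed)  # same seed per ratio for comparability
        n_new = max(0, round(r * n_major) - len(minority))
        synth = smote(minority, n_new, k, rng)
        X = list(train_X) + synth
        y = list(train_y) + [1] * len(synth)
        preds = [knn_predict(X, y, x, k) for x in val_X]
        scores[r] = g_mean(val_y, preds)
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `best_oversampling_ratio(train_X, train_y, val_X, val_y, [0.2, 0.5, 1.0])` returns the selected ratio together with the G-mean obtained at each candidate. G-mean is used as the selection criterion because it drops to zero whenever either class is entirely misclassified, unlike accuracy, which can stay high on imbalanced data while every fraudulent case is missed.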