Basic Article Information

  • Title: Probability Based Cluster Expansion Oversampling Technique for Imbalanced Data
  • Authors: Shaukat Ali Shahee; Usha Ananthakumar
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Publication year: 2018
  • Volume: 8
  • Issue: 6
  • Pages: 77-90
  • DOI: 10.5121/csit.2018.80607
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: In many applications of data mining, class imbalance is noticed when examples in one class are overrepresented. Traditional classifiers result in poor accuracy on the minority class due to the class imbalance. Further, the presence of within-class imbalance, where classes are composed of multiple sub-concepts with different numbers of examples, also affects the performance of the classifier. In this paper, we propose an oversampling technique that handles between-class and within-class imbalance simultaneously and also takes into consideration the generalization ability in data space. The proposed method is based on two steps: performing model-based clustering with respect to classes to identify the sub-concepts, and then computing the separating hyperplane based on equal posterior probability between the classes. The proposed method is tested on 10 publicly available data sets, and the results show that the proposed method is statistically superior to other existing oversampling methods.
  • Keywords: Supervised learning; Class Imbalance; Oversampling; Posterior Distribution
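
The abstract mentions a separating boundary defined by equal posterior probability between the classes. The sketch below is not the authors' implementation. It only illustrates that idea in the simplest setting one could assume: two univariate Gaussian components with a shared standard deviation, where the equal-posterior point has a closed form. The function names and all numeric values (priors 0.2 and 0.8, means 0.0 and 4.0, standard deviation 1.0) are hypothetical and chosen purely for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(x, priors, means, stds):
    """Posterior probability of each component at x, via Bayes' rule."""
    joint = [p * gaussian_pdf(x, m, s) for p, m, s in zip(priors, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


def equal_posterior_point(prior_a, mean_a, prior_b, mean_b, std):
    """Point where two equal-variance Gaussian components have equal posterior.

    Setting prior_a * N(x | mean_a, std) = prior_b * N(x | mean_b, std)
    and solving for x gives the midpoint of the means plus a shift that
    depends on the log ratio of the priors.
    """
    if mean_a == mean_b:
        raise ValueError("component means must differ")
    midpoint = 0.5 * (mean_a + mean_b)
    shift = std ** 2 * math.log(prior_b / prior_a) / (mean_a - mean_b)
    return midpoint + shift


# Hypothetical example: a minority component (prior 0.2) centred at 0.0
# and a majority component (prior 0.8) centred at 4.0, both with std 1.0.
boundary = equal_posterior_point(0.2, 0.0, 0.8, 4.0, 1.0)
post = posterior(boundary, [0.2, 0.8], [0.0, 4.0], [1.0, 1.0])
```

In this toy setting the boundary lies closer to the minority component's mean than the midpoint of the two means does, because the larger prior of the majority component pushes the equal-posterior point toward the minority side. How the paper derives and uses its hyperplane in higher dimensions is described only in the full text, not in this record.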