Article information

  • Title: An Empirical And Comparatively Research On Under-Sampling & Over- Sampling Defect-Prone Data-Sets Model In Light Of Machine Learning
  • Authors: Salahuddin Shaikh; Liu Changan; Maaz Rasheed Malik
  • Journal: International Journal of Advanced Networking and Applications
  • Electronic ISSN: 0975-0290
  • Publication year: 2021
  • Volume: 12
  • Issue: 5
  • Pages: 4719-4724
  • DOI: 10.35444/IJANA.2021.12508
  • Language: English
  • Publisher: Eswar Publications
  • Abstract: Only a few researchers have addressed class imbalance in the analysis of datasets. Two types of class imbalance occur in datasets. In the first, between-class imbalance, some classes have many more instances than others. In the second, within-class imbalance, some subsets of one class have fewer instances than other subsets of the same class. Over-sampling and under-sampling techniques play a significant role in tackling the class-imbalance problem, and many variants of both are used on class-imbalanced datasets. This paper applies two sampling techniques to imbalanced datasets: over-sampling with SMOTE and under-sampling with spread-sub-sample. All experimental results are reported using performance measures suited to class imbalance, namely precision, recall, F-measure, and area under the curve, and 12 different classifiers are used to compare the two sampling techniques. The overall analysis shows that over-sampling improves the rate of correctly classified instances for a few classifiers compared with under-sampling. In terms of TP rate and positive accuracy under both techniques, stacking is the worst classifier in these experiments, and multi classification and LMT could not increase the TP rate under under-sampling. Overall, both techniques improve results compared with using no sampling, but over-sampling is the more valuable technique for solving the class-imbalance problem.
  • Keywords: Software prediction; Under-sampling; Over-sampling; Sampling; Class imbalance; Defect-Prone
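
The two resampling ideas and the class-imbalance measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and this record does not say which tooling they used. The function names, parameters, and toy data below are hypothetical. The over-sampler follows the general SMOTE idea of interpolating between a minority point and one of its nearest minority neighbours. The under-sampler is a simplified stand-in for spread-sub-sample that randomly drops majority points until a maximum class ratio is met.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (SMOTE-style interpolation).

    Assumes `minority` holds at least two numeric tuples of equal length.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority points.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def undersample_to_spread(majority, minority, max_spread=1.0, seed=0):
    """Randomly keep majority points so that
    len(kept) / len(minority) <= max_spread."""
    rng = random.Random(seed)
    keep = min(len(majority), int(max_spread * len(minority)))
    return rng.sample(majority, keep)


def precision_recall_f(tp, fp, fn):
    """Precision, recall, and F-measure for the positive class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy data (hypothetical): 20 majority points, 4 minority points.
majority = [(float(i), float(i % 5)) for i in range(20)]
minority = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]

# Over-sampling: grow the minority class to the majority size.
oversampled_minority = minority + smote_like_oversample(minority, n_new=16)

# Under-sampling: shrink the majority class to the minority size.
reduced_majority = undersample_to_spread(majority, minority, max_spread=1.0)
```

Both routes end with balanced classes, but over-sampling keeps every original majority instance while under-sampling discards most of them. That loss of information is one plausible reason the two approaches can behave differently across classifiers.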