
Article information

  • Title: A novel ensemble approach for heterogeneous data with active learning
  • Authors: Mohamed Salama ; Hatem Abdelkader ; Amira Abdelwahab
  • Journal: International Journal of Engineering Business Management
  • Print ISSN: 1847-9790
  • Electronic ISSN: 1847-9790
  • Publication year: 2022
  • Volume: 14
  • DOI: 10.1177/18479790221082605
  • Language: English
  • Publisher: InTech
  • Abstract: Millions of internet users now contribute a huge amount of data. This data is extremely heterogeneous, which makes it hard to analyze and to derive information from, even though it is an indispensable source for decision-makers. Because of this massive growth, data classification and analysis have become an important research subject, and these enormous volumes of data must be processed to uncover hidden information and thereby improve data analysis and, in turn, classification accuracy. In this paper, we first develop an ensemble machine-learning model based on active learning that identifies the most effective feature extraction strategy for heterogeneous data analysis, and we compare it with traditional machine-learning algorithms. Second, we evaluate the proposed model experimentally on five heterogeneous datasets from various domains: a Health Care Reform dataset, the Sander Frandsen dataset, the Financial Phrase Bank dataset, the SMS Spam Collection dataset, and a Textbook sales dataset. According to the results, the proposed approach performed better than conventional methods. Finally, the study’s findings confirmed the validity of the suggested technique and met the study’s goal of using ensemble methods with active learning to raise the model’s overall accuracy in classifying and analyzing heterogeneous data, reduce the time and money spent training the model, and deliver superior analysis performance as well as insights into other aspects of extracting information from heterogeneous data.