Article Information

  • Title: Uncertain Data Analysis with Regularized XGBoost
  • Authors: G.V. Suresh; E. Sreenivasa Reddy
  • Journal: Webology
  • Print ISSN: 1735-188X
  • Publication year: 2022
  • Volume: 19
  • Issue: 1
  • Pages: 3722-3740
  • DOI: 10.14704/WEB/V19I1/WEB19245
  • Language: English
  • Publisher: University of Tehran
  • Abstract: Uncertainty is a ubiquitous element of the available knowledge about the real world. Data sampling error, obsolete sources, network latency, and transmission error all contribute to this uncertainty. These kinds of uncertainty have to be handled cautiously, or else classification results could be unreliable or even erroneous. Numerous methodologies have been developed to comprehend and control uncertainty in data. Uncertainty has many faces, i.e., inconsistency, imprecision, ambiguity, incompleteness, vagueness, unpredictability, noise, and unreliability. Missing information is inevitable in real-world data sets. While some conventional multiple imputation approaches are well studied and have shown empirical validity, they are limited in processing large datasets with complex data structures. In addition, these standard approaches tend to be computationally inefficient for medium and large datasets. In this paper, we propose a scalable multiple imputation framework based on XGBoost, bootstrapping, and a regularized method. XGBoost, one of the fastest implementations of gradient boosted trees, is able to automatically retain interactions and non-linear relations in a dataset while achieving high computational efficiency with the aid of bootstrapping and regularized methods. In the context of high-dimensional data, this methodology provides less biased estimates and reflects imputation variability more acceptably than previous regression approaches. We validate our adaptive imputation approaches against standard methods on numerical and real data sets and show promising results.
  • Keywords: Uncertainty; Missing Data; Multiple Imputation; Bootstrapping; Regularized Method; XGBoost
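
The abstract describes multiple imputation built from bootstrapping and a regularized learner, without giving the algorithm itself. The general idea can be sketched roughly as follows; this is an illustration under stated assumptions, not the authors' implementation. The names `RidgeStandIn` and `bootstrap_multiple_impute` are hypothetical, the sketch imputes a single column and assumes all other columns are fully observed, and a small L2-regularized linear model is swapped in for XGBoost so that the block needs only NumPy.

```python
import numpy as np


class RidgeStandIn:
    """Tiny L2-regularized linear learner (hypothetical stand-in for XGBoost)."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def fit(self, X, y):
        Xb = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
        penalty = self.lam * np.eye(Xb.shape[1])
        penalty[0, 0] = 0.0                           # leave the intercept unpenalized
        self.coef_ = np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)
        return self

    def predict(self, X):
        Xb = np.column_stack([np.ones(len(X)), X])
        return Xb @ self.coef_


def bootstrap_multiple_impute(data, target_col, m=5, make_model=RidgeStandIn, seed=0):
    """Return m completed copies of `data` with NaNs in `target_col` filled in.

    Assumption: every column other than `target_col` is fully observed.
    """
    rng = np.random.default_rng(seed)
    missing = np.isnan(data[:, target_col])
    features = np.delete(data, target_col, axis=1)
    X_obs, y_obs = features[~missing], data[~missing, target_col]

    completed = []
    for _ in range(m):
        # Resample the observed rows with replacement so each fit differs;
        # the spread across fits is what carries the imputation variability.
        idx = rng.integers(0, len(y_obs), size=len(y_obs))
        model = make_model().fit(X_obs[idx], y_obs[idx])
        filled = data.copy()
        filled[missing, target_col] = model.predict(features[missing])
        completed.append(filled)
    return completed


# Synthetic demo data (made up for illustration): y depends linearly on x.
demo_rng = np.random.default_rng(42)
x = demo_rng.normal(size=200)
y = 2.0 * x + demo_rng.normal(scale=0.1, size=200)
data = np.column_stack([x, y])
data[demo_rng.random(200) < 0.2, 1] = np.nan         # knock out about 20% of y

imputations = bootstrap_multiple_impute(data, target_col=1, m=5)
# One estimate of the mean of y per completed data set.
estimates = [d[:, 1].mean() for d in imputations]
```

Any learner with `fit` and `predict` methods can be passed as `make_model`. If the `xgboost` package is installed, something like `make_model=lambda: xgboost.XGBRegressor(reg_lambda=1.0, reg_alpha=0.1)` would plug in a regularized gradient-boosted model instead; the specific hyperparameter values here are placeholders, not values taken from the paper.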