
Article information

  • Title: Imputing missing values using cumulative linear regression
  • Author: Samih M. Mostafa
  • Journal: CAAI Transactions on Intelligence Technology
  • Electronic ISSN: 2468-2322
  • Publication year: 2019
  • Volume: 4
  • Issue: 3
  • Pages: 182-200
  • DOI: 10.1049/trit.2019.0032
  • Publisher: IET Digital Library
  • Abstract: Handling missing data is an important step when applying statistical methods to a dataset. Statisticians and researchers may draw inaccurate inferences about the data if missing values are not handled properly. Python and R now provide diverse packages for handling missing data. This study proposes an imputation algorithm, cumulative linear regression, which is based on the linear regression technique. It differs from existing methods in that it accumulates the imputed variables: those variables are incorporated into the linear regression equation used to fill in the missing values of the next incomplete variable. The author performed a comparative study of the proposed method and those packages. Performance was measured in terms of imputation time, root-mean-square error, mean absolute error, and coefficient of determination (R²). An analysis of five datasets with different missing values generated by different mechanisms showed that performance varies with dataset size, missing percentage, and missingness mechanism. The results showed that the proposed method performs slightly better.
  • Keywords: imputation time; linear regression technique; missing data handling; statistical methods; root-mean-square error; imputed variables; missing values; mean absolute error; coefficient of determination; cumulative linear regression; linear regression equation; imputation algorithm
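
The abstract describes the idea only in outline, so the following is a minimal Python sketch of one way it could work, not the author's implementation. It assumes that at least one column is complete, that incomplete columns are visited from fewest to most missing values, and that an ordinary least-squares fit with an intercept is used at each step. The function names (cumulative_lr_impute, rmse, mae, r2) are hypothetical. The three metric helpers correspond to the error measures named in the abstract.

```python
import numpy as np


def cumulative_lr_impute(X):
    """Hypothetical sketch: fill NaNs column by column, reusing imputed columns."""
    X = np.asarray(X, dtype=float).copy()
    n_missing = np.isnan(X).sum(axis=0)

    # Assumption: the initial predictors are the fully observed columns.
    predictors = [j for j in range(X.shape[1]) if n_missing[j] == 0]
    if not predictors:
        raise ValueError("this sketch assumes at least one complete column")

    # Assumption: visit incomplete columns from fewest to most missing values.
    incomplete = sorted(
        (j for j in range(X.shape[1]) if n_missing[j] > 0),
        key=lambda j: n_missing[j],
    )

    for j in incomplete:
        miss = np.isnan(X[:, j])
        # Design matrix: intercept plus every column that is complete so far.
        A = np.column_stack([np.ones(X.shape[0]), X[:, predictors]])
        # Fit on rows where the target is observed, predict the missing rows.
        coef, *_ = np.linalg.lstsq(A[~miss], X[~miss, j], rcond=None)
        X[miss, j] = A[miss] @ coef
        # The "cumulative" step: the imputed column becomes a predictor.
        predictors.append(j)
    return X


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

To evaluate such a sketch, one could mask known entries of a complete dataset, impute them, and compare the imputed values with the held-out originals using the three metrics.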