Article information

  • Title: Scalable Analytics Model Calibration with Online Aggregation
  • Authors: Florin Rusu; Chengjie Qin; Martin Torres
  • Journal: Bulletin of the Technical Committee on Data Engineering
  • Year: 2015
  • Volume: 38
  • Issue: 3
  • Publisher: IEEE Computer Society
  • Abstract: Model calibration is a major challenge faced by the plethora of statistical analytics packages that are increasingly used in Big Data applications. Identifying the optimal model parameters is a time-consuming process that has to be executed from scratch for every dataset/model combination, even by experienced data scientists. We argue that the lack of support to quickly identify sub-optimal configurations is the principal cause. In this paper, we apply parallel online aggregation to identify sub-optimal configurations early in the processing by incrementally sampling the training dataset and estimating the objective function corresponding to each configuration. We design concurrent online aggregation estimators and define halting conditions to stop the execution accurately and in a timely manner. The end result is online approximate gradient descent, a novel optimization method for scalable model calibration. We show how online approximate gradient descent can be represented as generic database aggregation and implement the resulting solution in GLADE, a state-of-the-art Big Data analytics system.
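
The idea in the abstract of sampling the training data incrementally, estimating the objective for every configuration at once, and halting early on the sub-optimal ones can be illustrated with a minimal sketch. This is not the authors' estimator or the GLADE implementation: the function `online_prune`, its parameters, and the normal-approximation confidence interval used as the halting condition are all assumptions made for illustration.

```python
import math
import random


def online_prune(data, configs, loss, chunk_size=100, z=1.96, seed=0):
    """Return the configurations that survive sampling-based pruning.

    Hypothetical helper: `configs` are hashable candidates and
    `loss(config, example)` returns a per-example objective value.
    """
    rng = random.Random(seed)
    order = list(range(len(data)))
    rng.shuffle(order)  # random order, so every prefix is a uniform sample

    # Per configuration: [count, sum of losses, sum of squared losses].
    stats = {c: [0, 0.0, 0.0] for c in configs}
    active = set(configs)

    for start in range(0, len(order), chunk_size):
        # Update the running aggregates of all active configurations together.
        for i in order[start:start + chunk_size]:
            for c in active:
                v = loss(c, data[i])
                s = stats[c]
                s[0] += 1
                s[1] += v
                s[2] += v * v

        # Assumed halting rule: normal-approximation interval on the mean loss.
        bounds = {}
        for c in active:
            n, total, sq = stats[c]
            mean = total / n
            var = max(sq / n - mean * mean, 0.0)
            half = z * math.sqrt(var / n)
            bounds[c] = (mean - half, mean + half)

        # Drop every configuration whose lower bound already exceeds the
        # smallest upper bound: it is unlikely to be the best one.
        best_upper = min(hi for _, hi in bounds.values())
        active = {c for c in active if bounds[c][0] <= best_upper}
        if len(active) == 1:
            break

    return active
```

As a usage example, with `data` holding numbers scattered around 3 and `loss = lambda c, x: (c - x) ** 2`, calling `online_prune(data, [0.0, 3.0, 10.0], loss)` would be expected to discard 0.0 and 10.0 after examining only a small sample of the data. In a gradient descent setting the configurations could be, for instance, candidate step sizes, though that mapping is also an assumption here.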