Journal: Bulletin of the Technical Committee on Data Engineering
Publication year: 2014
Volume: 37
Issue: 3
Publisher: IEEE Computer Society
Abstract: SystemML enables declarative, large-scale machine learning (ML) via a high-level language with R-like syntax. Data scientists use this language to express their ML algorithms with full flexibility but without the need to hand-tune distributed runtime execution plans and system configurations. These ML programs are dynamically compiled and optimized based on data and cluster characteristics using rule- and cost-based optimization techniques. The compiler automatically generates hybrid runtime execution plans ranging from in-memory, single node execution to distributed MapReduce (MR) computation and data access. This paper describes the SystemML optimizer, its compilation chain, and selected optimization phases for generating efficient execution plans.