
Basic Article Information

  • Title: iotools: High-Performance I/O Tools for R
  • Authors: Taylor Arnold; Michael J. Kane; Simon Urbanek
  • Journal: R News
  • Print ISSN: 1609-3631
  • Year of publication: 2017
  • Volume: 9
  • Issue: 1
  • Pages: 6-13
  • Language: English
  • Publisher: The R Foundation for Statistical Computing
  • Abstract: The iotools package provides a set of tools for input- and output-intensive data processing in R. The functions chunk.apply and read.chunk are supplied to allow for iteratively loading contiguous blocks of data into memory as raw vectors. These raw vectors can then be efficiently converted into matrices and data frames with the iotools functions mstrsplit and dstrsplit. These functions minimize copying of data and avoid the use of intermediate strings in order to drastically improve performance. Finally, we also provide read.csv.raw to allow users to read an entire dataset into memory with the same efficient parsing code. In this paper, we present these functions through a set of examples with an emphasis on the flexibility provided by chunk-wise operations. We provide benchmarks comparing the speed of read.csv.raw to data loading functions provided in base R and other contributed packages.
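
The workflow described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the example data (three columns of the built-in mtcars data set), the temporary file, and the variable names are assumptions chosen for illustration, and the sketch assumes the iotools package is installed. It reads the file once in full with read.csv.raw, then again chunk by chunk with read.chunk, parsing each raw vector with dstrsplit and accumulating column sums.

```r
# Sketch only: assumes the iotools package is installed from CRAN.
library(iotools)

# Hypothetical example data: three numeric columns written without a header,
# so that every line of every chunk is a data record.
cols <- c("mpg", "cyl", "hp")
path <- tempfile(fileext = ".csv")
write.table(mtcars[, cols], path, sep = ",",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

# Whole-file read with the raw-vector parser.
whole <- read.csv.raw(path, header = FALSE)

# Chunk-wise read: pull one block of complete lines at a time as a raw
# vector, parse it into a data frame, and accumulate running column sums.
col_types <- c(mpg = "numeric", cyl = "numeric", hp = "numeric")
reader <- chunk.reader(path)
totals <- numeric(length(cols))
n_rows <- 0L
while (length(chunk <- read.chunk(reader)) > 0) {
  block <- dstrsplit(chunk, col_types = col_types, sep = ",")
  totals <- totals + colSums(block)
  n_rows <- n_rows + nrow(block)
}
```

Because only one chunk is held in memory at a time, the same loop could in principle be pointed at a file much larger than available memory. With a file this small, the loop will most likely run for a single chunk.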