首页    期刊浏览 2024年09月19日 星期四
登录注册

文章基本信息

  • 标题:Stochastic Streams: Sample Complexity vs. Space Complexity
  • 本地全文:下载
  • 作者:Michael Crouch ; Andrew McGregor ; Gregory Valiant
  • 期刊名称:LIPIcs : Leibniz International Proceedings in Informatics
  • 电子版ISSN:1868-8969
  • 出版年度:2016
  • 卷号:57
  • 页码:32:1-32:15
  • DOI:10.4230/LIPIcs.ESA.2016.32
  • 出版社:Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
  • 摘要:We address the trade-off between the computational resources needed to process a large data set and the number of samples available from the data set. Specifically, we consider the following abstraction: we receive a potentially infinite stream of IID samples from some unknown distribution D, and are tasked with computing some function f(D). If the stream is observed for time t, how much memory, s, is required to estimate f(D)? We refer to t as the sample complexity and s as the space complexity. The main focus of this paper is investigating the trade-offs between the space and sample complexity. We study these trade-offs for several canonical problems studied in the data stream model: estimating the collision probability, i.e., the second moment of a distribution, deciding if a graph is connected, and approximating the dimension of an unknown subspace. Our results are based on techniques for simulating different classical sampling procedures in this model, emulating random walks given a sequence of IID samples, as well as leveraging a characterization between communication bounded protocols and statistical query algorithms.
  • 关键词:data streams; sample complexity; frequency moments; graph connectivity; subspace approximation
国家哲学社会科学文献中心版权所有