
Basic article information

  • Title: The Blessing of Dimensionality: Separation Theorems in the Thermodynamic Limit (title footnote: The work is partially supported by Innovate UK, Technology Strategy Board, Knowledge Transfer Partnership grant KTP009890.)
  • Authors: Alexander N. Gorban; Ivan Yu. Tyukin; Ilya Romanenko
  • Journal: IFAC PapersOnLine
  • Print ISSN: 2405-8963
  • Publication year: 2016
  • Volume: 49
  • Issue: 24
  • Pages: 64-69
  • DOI: 10.1016/j.ifacol.2016.10.755
  • Language: English
  • Publisher: Elsevier
  • Abstract: We consider and analyze properties of large sets of randomly selected (i.i.d.) points in high-dimensional spaces. In particular, we consider the problem of whether a single data point that is randomly chosen from a finite set of points can be separated from the rest of the data set by a linear hyperplane. We formulate and prove stochastic separation theorems, including: 1) with probability close to one, a random point may be separated from a finite random set by a linear functional; 2) with probability close to one, for every point in a finite random set there is a linear functional separating this point from the rest of the data. The total number of points in the random sets is allowed to be exponentially large with respect to the dimension. Various laws governing the distributions of points are considered, and explicit formulae for the probability of separation are provided. These theorems reveal an interesting implication for machine learning and data mining applications that deal with large data sets (big data) and high-dimensional data (many attributes): simple linear decision rules and learning machines are surprisingly efficient tools for separating and filtering out arbitrarily assigned points in high dimensions.
  • Keywords: Measure concentration; separation theorems; big data; machine learning
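
The separation claim in the abstract can be probed numerically. The sketch below is illustrative only and is not taken from the paper: it assumes points drawn i.i.d. from the uniform distribution on the unit ball, and it tests one simple candidate functional, l(y) = <x, y> with threshold <x, x>, which may differ from the constructions the authors actually use. The function names `sample_ball`, `dot`, and `fraction_separable` are hypothetical.

```python
import math
import random


def sample_ball(n, rng):
    """Draw one point uniformly from the unit ball in R^n."""
    # A normalized Gaussian vector gives a uniform direction on the sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    # A radius of U^(1/n) makes the point uniform in volume.
    r = rng.random() ** (1.0 / n)
    return [r * v / norm for v in g]


def dot(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def fraction_separable(n, m, seed=0):
    """Fraction of m random points x such that <x, y> < <x, x> for every
    other point y, i.e. x is cut off from the rest by the hyperplane
    {z : <x, z> = <x, x>} (an assumed, simple choice of functional)."""
    rng = random.Random(seed)
    pts = [sample_ball(n, rng) for _ in range(m)]
    count = 0
    for i, x in enumerate(pts):
        xx = dot(x, x)
        if all(dot(x, y) < xx for j, y in enumerate(pts) if j != i):
            count += 1
    return count / m


if __name__ == "__main__":
    # Compare a low-dimensional and a high-dimensional case.
    for n in (2, 100):
        print(n, fraction_separable(n, 200, seed=1))
```

Under these assumptions, the fraction of points that can be cut off by this single functional should be small in dimension 2 and close to one in dimension 100, which is the qualitative behaviour the abstract describes.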