
Article information

  • Title: Combined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL): A Robust Method for Selection of Cluster Number, K
  • Authors: Timothy E. Sweeney; Albert C. Chen; Olivier Gevaert
  • Journal: Scientific Reports
  • Electronic ISSN: 2045-2322
  • Publication year: 2015
  • Volume: 5
  • Issue: 1
  • DOI: 10.1038/srep16971
  • Language: English
  • Publisher: Springer Nature
  • Abstract: In order to discover new subsets (clusters) of a data set, researchers often use algorithms that perform unsupervised clustering, namely, the algorithmic separation of a dataset into some number of distinct clusters. Deciding whether a particular separation (or number of clusters, K) is correct is a sort of ‘dark art’, with multiple techniques available for assessing the validity of unsupervised clustering algorithms. Here, we present a new technique for unsupervised clustering that uses multiple clustering algorithms, multiple validity metrics, and progressively bigger subsets of the data to produce an intuitive 3D map of cluster stability that can help determine the optimal number of clusters in a data set, a technique we call COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL). COMMUNAL locally optimizes algorithms and validity measures for the data being used. We show its application to simulated data with a known K, and then apply this technique to several well-known cancer gene expression datasets, showing that COMMUNAL provides new insights into clustering behavior and stability in all tested cases. COMMUNAL is shown to be a useful tool for determining K in complex biological datasets, and is freely available as a package for R.
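
The general idea in the abstract (several clustering algorithms, scored by a validity metric, over progressively bigger subsets of the data and a range of K) can be illustrated with a minimal sketch. This is not the COMMUNAL R package or its API. It is a standard-library Python illustration under stated assumptions: k-means and a simple k-medoids stand in for "multiple clustering algorithms", mean silhouette width is the only validity metric, "progressively bigger subsets" is taken to mean the top-m highest-variance variables, and the simulated cluster centres and all function and variable names are hypothetical.

```python
import random
from statistics import mean, pvariance

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then keep adding the farthest point."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(dist(p, c) for c in centres)))
    return centres

def assign(points, centres):
    """Label each point with the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda j: dist(p, centres[j])) for p in points]

def kmeans(points, k, iters=20):
    centres = farthest_first(points, k)
    for _ in range(iters):
        labels = assign(points, centres)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # move the centre to the mean of its members
                centres[j] = tuple(mean(col) for col in zip(*members))
    return assign(points, centres)

def kmedoids(points, k, iters=20):
    centres = farthest_first(points, k)
    for _ in range(iters):
        labels = assign(points, centres)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # move the centre to the member closest to all others
                centres[j] = min(members, key=lambda m: sum(dist(m, q) for q in members))
    return assign(points, centres)

def silhouette(points, labels):
    """Mean silhouette width; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = mean(own)                                   # cohesion
        b = min(mean(d) for d in by_cluster.values())   # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)

# Simulated data with a known K = 3 (hypothetical centres, 20 points per cluster).
rng = random.Random(0)
centres = [(0, 0, 0, 0), (10, 0, 6, 0), (5, 9, 3, 0)]
data = [tuple(rng.gauss(mu, 1.0) for mu in c) for c in centres for _ in range(20)]

# Assumption: subsets are the top-m variables ranked by variance.
columns = list(zip(*data))
ranked = sorted(range(len(columns)), key=lambda j: pvariance(columns[j]), reverse=True)

algorithms = [kmeans, kmedoids]
subset_sizes = (2, 3, 4)
k_range = range(2, 6)

# stability_map[(m, k)] = validity averaged over algorithms: one cell of the "map".
stability_map = {}
for m in subset_sizes:
    subset = [tuple(row[j] for j in ranked[:m]) for row in data]
    for k in k_range:
        scores = [silhouette(subset, algo(subset, k)) for algo in algorithms]
        stability_map[(m, k)] = mean(scores)

# Pick the K whose score is highest on average across all subset sizes.
best_k = max(k_range, key=lambda k: mean([stability_map[(m, k)] for m in subset_sizes]))
print("selected K:", best_k)
```

Averaging over subset sizes is only one way to collapse the map into a single choice of K; it is used here for brevity, and inspecting the full `stability_map` for a K that stays high across subsets is closer in spirit to reading a stability map.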