Journal: International Journal of Applied Mathematics and Computer Science
Electronic ISSN: 2083-8492
Publication year: 2019
Volume: 29
Issue: 1
Pages: 1-11
DOI: 10.2478/amcs-2019-0006
Publisher: De Gruyter Open
Abstract: Finding clusters in high-dimensional data is a challenging research problem. Subspace clustering algorithms aim to find
clusters in all possible subspaces of the dataset, where a subspace is a subset of the dimensions of the data. However, the number
of subspaces grows exponentially with the dimensionality of the data, which renders most of these algorithms inefficient as well
as ineffective. Moreover, these algorithms have data dependencies ingrained in the clustering process, which makes
parallelization difficult and inefficient. SUBSCALE is a recent subspace clustering algorithm that scales
with the number of dimensions and contains independent processing steps that can be exploited through parallelism. In this paper,
we aim to leverage the computational power of widely available multi-core processors to improve the runtime performance
of the SUBSCALE algorithm. The experimental evaluation shows linear speedup. Moreover, we develop an approach using
graphics processing units (GPUs) for fine-grained data parallelism to accelerate the computation further. First tests of the
GPU implementation show very promising results.
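
The two ideas the abstract relies on, the exponential number of subspaces and the parallelism available when per-dimension steps are independent, can be illustrated with a small sketch. This is not the SUBSCALE algorithm and is not taken from the paper: the function names, the `eps` neighbourhood parameter, and the per-dimension "close points" step are all hypothetical, chosen only to show the shape of the computation.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def count_subspaces(d):
    """Number of non-empty subsets of d dimensions: 2^d - 1."""
    return 2 ** d - 1


def enumerate_subspaces(dims):
    """Yield every non-empty subset of the given dimensions (exponential in len(dims))."""
    for k in range(1, len(dims) + 1):
        yield from combinations(dims, k)


def close_points_1d(values, eps):
    """Hypothetical per-dimension step (an assumption, not SUBSCALE's step):
    indices of points that have at least one other point within eps on this dimension."""
    return {
        i
        for i, v in enumerate(values)
        if any(j != i and abs(v - w) <= eps for j, w in enumerate(values))
    }


def process_dimensions_parallel(data, eps, workers=4):
    """Run the per-dimension step for every dimension concurrently.

    Each dimension is handled without reading any other dimension's result,
    which is the kind of independence that makes parallel execution possible.
    """
    columns = list(zip(*data))  # one tuple of values per dimension
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda col: close_points_1d(col, eps), columns))


if __name__ == "__main__":
    print(count_subspaces(10), count_subspaces(20))
    points = [(0.0, 5.0), (0.1, 9.0), (3.0, 5.05)]
    print(process_dimensions_parallel(points, eps=0.2))
```

A thread pool keeps the sketch short; for CPU-bound work in CPython, a process pool (with a module-level function instead of the lambda) or a native multi-threaded implementation would be needed to obtain an actual speedup on multiple cores.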