Abstract: Spectral clustering is a popular and successful approach for partitioning the nodes of a graph into clusters for which the ratio of outside connections compared to the volume (sum of degrees) is small. In order to partition into k clusters, one first computes an approximation of the bottom k eigenvectors of the (normalized) Laplacian of G, uses it to embed the vertices of G into k-dimensional Euclidean space R^k, and then partitions the resulting points via a k-means clustering algorithm. It is an important task for theory to explain the success of spectral clustering. Peng et al. (COLT, 2015) made an important step in this direction. They showed that spectral clustering provably works if the gap between the (k+1)-th and the k-th eigenvalue of the normalized Laplacian is sufficiently large. They proved a structural and an algorithmic result. The algorithmic result needs a considerably stronger gap assumption and does not analyze the standard spectral clustering paradigm; it replaces spectral embedding by heat kernel embedding and k-means clustering by locality sensitive hashing. We extend their work in two directions. Structurally, we improve the quality guarantee for spectral clustering by a factor of k and simultaneously weaken the gap assumption. Algorithmically, we show that the standard paradigm for spectral clustering works. Moreover, it even works with the same gap assumption as required for the structural result.
Keywords: spectral embedding; k-means clustering; power method; gap assumption
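The abstract describes the standard spectral clustering paradigm: embed the vertices via the bottom k eigenvectors of the normalized Laplacian, then cluster the embedded points with k-means. The following is a minimal sketch of that paradigm, not the paper's exact algorithm or analysis; it assumes a dense adjacency matrix and uses numpy and scikit-learn, and the function name is illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(A, k):
    """Sketch: partition a graph with symmetric, non-negative adjacency
    matrix A into k clusters via the normalized Laplacian (illustrative)."""
    degrees = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degrees, 1e-12))
    # Normalized Laplacian L = I - D^{-1/2} A D^{-1/2}
    L = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    # Bottom k eigenvectors give the spectral embedding of the vertices into R^k
    # (np.linalg.eigh returns eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(L)
    embedding = eigvecs[:, :k]
    # k-means on the embedded points yields the final partition.
    return KMeans(n_clusters=k, n_init=10).fit_predict(embedding)
```

In practice, implementations often differ in details the abstract leaves open, e.g. how the eigenvectors are approximated (the keywords mention the power method) and whether the embedded rows are rescaled before k-means; the sketch above uses an exact eigendecomposition and no rescaling purely for brevity.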