Abstract: In this paper, we evaluate the performance of two clustering algorithms: EFC and an AIC-based algorithm. Both face the cluster validation problem, in which the number of components must be estimated. The EFC algorithm is a direct method, whereas the AIC-based algorithm is a verificative one. For a fair quantitative evaluation, comparisons are conducted on numerical data and on image histogram data. We also propose using artificial data that satisfy a given overlapping rate between adjacent components. The artificial data are modeled as a mixture of univariate normal densities, since such mixtures can approximate a wide class of continuous densities.
Keywords: performance evaluation; probability density function; clustering algorithm; unsupervised learning; univariate normal mixtures; gray-level histogram
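
To make the setting concrete, the following is a minimal sketch of how artificial data from a univariate normal mixture might be generated and how candidate component counts might be compared with the Akaike information criterion in its standard form, AIC = 2k - 2 ln L. It is not the paper's EFC or AIC-based algorithm. The function names, the mixture parameters, and the sample size are illustrative assumptions, and the two-component model is scored at its generating parameters rather than at fitted ones, which is a simplification.

```python
import math
import random


def mixture_pdf(x, weights, means, sigmas):
    """Density at x of a univariate normal mixture."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, sigmas)
    )


def sample_mixture(n, weights, means, sigmas, seed=0):
    """Draw n points: pick a component by weight, then sample from it."""
    rng = random.Random(seed)
    labels = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(means[j], sigmas[j]) for j in labels]


def aic(data, weights, means, sigmas):
    """Standard AIC, 2k - 2 ln L, for a mixture with M components.

    k = 3M - 1 free parameters: M means, M standard deviations,
    and M - 1 independent mixing weights.
    """
    log_lik = sum(math.log(mixture_pdf(x, weights, means, sigmas)) for x in data)
    n_params = 3 * len(weights) - 1
    return 2 * n_params - 2 * log_lik


# Hypothetical artificial data: two well-separated components.
true_w, true_mu, true_sd = [0.5, 0.5], [0.0, 6.0], [1.0, 1.0]
data = sample_mixture(500, true_w, true_mu, true_sd, seed=0)

# One-component candidate: a single normal fitted by maximum likelihood.
mean_1 = sum(data) / len(data)
sd_1 = math.sqrt(sum((x - mean_1) ** 2 for x in data) / len(data))
aic_one = aic(data, [1.0], [mean_1], [sd_1])

# Two-component candidate, scored at the generating parameters
# (a simplification; a real procedure would fit them, e.g. by EM).
aic_two = aic(data, true_w, true_mu, true_sd)

print("AIC, 1 component:", aic_one)
print("AIC, 2 components:", aic_two)
```

A lower AIC indicates the preferred model, so with components this far apart the two-component candidate is favored. Moving the means closer together increases the overlap between adjacent components and makes the two scores harder to separate.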