Journal: International Journal of Software Engineering & Applications (IJSEA)
Print ISSN: 0976-2221
Electronic ISSN: 0975-9018
Publication year: 2012
Volume: 3
Issue: 3
Page: 1
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: The extensive amount of data stored in medical documents requires methods that help users find what they are looking for effectively by organizing large amounts of information into a small number of meaningful clusters. The produced clusters contain groups of objects that are more similar to each other than to the members of any other group. Thus, the aim of high-quality document clustering algorithms is to determine a set of clusters in which inter-cluster similarity is minimized and intra-cluster similarity is maximized. The most important feature of many clustering algorithms is that they treat the clustering problem as an optimization process, that is, maximizing or minimizing a particular clustering criterion function defined over the whole clustering solution. The only real difference between agglomerative algorithms is how they choose which clusters to merge. The main purpose of this paper is to compare hierarchical agglomerative clustering algorithms that use different criterion functions, based on the quality of the clusters they produce, for the problem of clustering medical documents. Our experimental results showed that the agglomerative algorithm that uses I1 as its criterion function for choosing which clusters to merge produced better cluster quality than the other criterion functions in terms of entropy and purity as external measures.
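
The two external measures named in the abstract, entropy and purity, can be illustrated with a minimal sketch. It assumes each cluster is given as a list of ground-truth class labels and that per-cluster entropy is normalized by the logarithm of the number of classes; the record does not state the paper's exact normalization, so that choice and the function names `purity` and `entropy` are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def purity(clusters):
    """Fraction of documents that belong to the majority class of their cluster.

    `clusters` is a list of lists; each inner list holds the true class
    labels of the documents assigned to that cluster (assumed input format).
    """
    n = sum(len(c) for c in clusters)
    return sum(max(Counter(c).values()) for c in clusters if c) / n


def entropy(clusters, num_classes):
    """Size-weighted average of per-cluster class entropy (lower is better).

    Assumption: each cluster's entropy is divided by log(num_classes) so the
    result lies in [0, 1]; the paper's own normalization is not given here.
    """
    n = sum(len(c) for c in clusters)
    total = 0.0
    for c in clusters:
        if not c:
            continue  # empty clusters contribute nothing
        n_r = len(c)
        e_r = -sum((k / n_r) * math.log(k / n_r) for k in Counter(c).values())
        if num_classes > 1:
            e_r /= math.log(num_classes)
        total += (n_r / n) * e_r
    return total


# Hypothetical toy clustering of five documents with two true classes.
example = [["a", "a", "b"], ["b", "b"]]
example_purity = purity(example)
example_entropy = entropy(example, num_classes=2)
```

In this toy input, four of the five documents sit in the majority class of their cluster, so purity is 0.8. A clustering in which every cluster contains a single class gives purity 1 and entropy 0, which is the direction a better criterion function should move both measures.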