Article Information

  • Title: OCCURRENCE BASED CATEGORICAL DATA CLUSTERING USING COSINE AND BINARY MATCHING SIMILARITY MEASURE
  • Authors: S. ANITHA ELAVARASI ; J. AKILANDESWARI
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2014
  • Volume: 68
  • Issue: 1
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Clustering is the process of grouping a set of physical objects into classes of similar objects. Real-world objects are described by both numerical and categorical data. Categorical data cannot be analyzed in the same way as numerical data because they lack an inherent ordering. This paper describes an occurrence-based categorical data clustering (OBCDC) technique built on the cosine similarity measure and the simple binary matching similarity measure. The OBCDC system consists of four modules: data pre-processing, similarity matrix generation, cluster formation, and validation. Similarity matrix generation uses three functions, namely FrequencyComputation, OccurranceBasedCosine and OccurranceBasedSBMS. The time complexity of the algorithms is discussed, and their performance on real-world data is measured using accuracy and error rate.
  • Keywords: Clustering; Unsupervised Learning; Categorical Data; Cosine Similarity; Simple Binary Matching Similarity
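
The two similarity measures named in the abstract can be illustrated with a minimal sketch. This is not the paper's FrequencyComputation, OccurranceBasedCosine, or OccurranceBasedSBMS, whose definitions are not given in this record. The function names, the toy records, and in particular the choice to weight each category by its relative frequency in the data set are assumptions made only for illustration.

```python
from collections import Counter
from math import sqrt


def simple_matching(a, b):
    """Fraction of attribute positions where two records hold the same category."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def value_frequencies(records):
    """Relative frequency of each (attribute index, category) pair in the data set."""
    counts = Counter((i, v) for rec in records for i, v in enumerate(rec))
    n = len(records)
    return {key: c / n for key, c in counts.items()}


def frequency_cosine(a, b, freq):
    """Cosine similarity of sparse vectors weighted by value frequency.

    The frequency weighting is an assumed reading of "occurrence based",
    not a definition taken from the paper.
    """
    va = {(i, v): freq[(i, v)] for i, v in enumerate(a)}
    vb = {(i, v): freq[(i, v)] for i, v in enumerate(b)}
    dot = sum(w * vb[k] for k, w in va.items() if k in vb)
    norm_a = sqrt(sum(w * w for w in va.values()))
    norm_b = sqrt(sum(w * w for w in vb.values()))
    return dot / (norm_a * norm_b)


# Hypothetical toy data: three records with three categorical attributes each.
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "square"),
]
freq = value_frequencies(records)
sbm = simple_matching(records[0], records[1])
cos = frequency_cosine(records[0], records[1], freq)
print(f"simple matching: {sbm:.3f}, frequency-weighted cosine: {cos:.3f}")
```

Computing either measure for every pair of records would give a similarity matrix, which a clustering step could then consume. With unweighted one-hot vectors the cosine reduces to the simple matching score, so some form of weighting is needed for the two measures to differ.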