Article Information

  • Title: A Hybrid Method for XML Clustering by Structure and Content
  • Authors: Piao, Yong; Wang, Xiukun
  • Journal: Journal of Software
  • Print ISSN: 1796-217X
  • Year of publication: 2011
  • Volume: 6
  • Issue: 12
  • Pages: 2361-2368
  • DOI: 10.4304/jsw.6.12.2361-2368
  • Language: English
  • Publisher: Academy Publisher
  • Abstract: This paper presents an effective XML clustering method, the neighbor center clustering (NCC) algorithm, whose similarity measure draws on both the structural and the content information contained in XML files. Structural similarity is first measured with a frequency-path model, and a similarity calculation algorithm based on the longest common subsequence with position and frequency weights is introduced. To improve performance and precision, the frequency-path model is then extended to consider structure and content information simultaneously. Experiments show that NCC combined with the hybrid similarity calculation method achieves high purity and F-measure values and is effective and applicable for clustering XML documents with both homogeneous and heterogeneous structures.
  • Keywords: neighbor center clustering; position and frequency weight; longest common subsequence; hybrid similarity calculation
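
The abstract names the longest common subsequence (LCS) as the basis of the structural similarity calculation. As a rough illustration only, the sketch below compares two XML element paths by plain LCS length, normalized by the longer path. It is not the authors' algorithm: the position and frequency weights and the content extension mentioned in the abstract are not specified in this record and are omitted, and the names `lcs_length` and `path_similarity`, the slash-separated path format, and the normalization are all assumptions made for the example.

```python
def lcs_length(a, b):
    """Return the length of the longest common subsequence of a and b."""
    # Standard dynamic programme that keeps only the previous row.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def path_similarity(p, q):
    """Unweighted similarity of two slash-separated element paths.

    Hypothetical normalization: LCS length divided by the longer path
    length, so identical paths score 1.0. The paper's position and
    frequency weights are NOT modelled here.
    """
    a = p.strip("/").split("/")
    b = q.strip("/").split("/")
    return lcs_length(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    print(path_similarity("/book/chapter/title", "/book/title"))
```

In this sketch, two paths that share element names in the same order score close to 1 even when one path has extra intermediate elements, which is the property that makes LCS a natural starting point for comparing document structure.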