Article Information

  • Title: Matching XML Documents in Hierarchical Data
  • Authors: P.K.R. Madhuri; Ch. Sita Kameswari
  • Journal: International Journal of Computer Science & Technology
  • Print ISSN: 2229-4333
  • Electronic ISSN: 0976-8491
  • Year of publication: 2014
  • Volume: 5
  • Issue: 4
  • Pages: 303-305
  • Language: English
  • Publisher: Ayushmaan Technologies
  • Abstract: Duplicate detection is the task of identifying multiple representations of the same real-world object, for every object represented in a data source. It is relevant to data cleaning and data integration applications and has been studied extensively for relational data describing a single type of object in a single table. This paper focuses on iterative duplicate detection in XML data. We consider detecting duplicates among multiple types of objects that are related to each other, and we devise methods adapted to semi-structured XML data. Relationships between different types of objects form either a hierarchical structure or a graph structure. Iterative duplicate detection requires a similarity measure to compare pairs of object representations, called candidates, based on the descriptive information of each candidate. The distinction between candidates and their descriptions is not straightforward in XML, but we show that these descriptions can be determined semi-automatically using heuristics and conditions.
  • Keywords: XML Data; Hierarchical Data; Duplicate Detection
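
The abstract describes comparing pairs of candidates through a similarity measure computed over each candidate's description. The following is a minimal sketch of that pairwise step only, not the authors' method: it does not model the iterative part or relationships between object types. The sample XML, the heuristic that a description is the text of a candidate's child elements, the averaged string similarity, and the 0.7 threshold are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample data: the first two <movie> elements describe the same film.
XML = """
<movies>
  <movie><title>The Matrix</title><year>1999</year></movie>
  <movie><title>Matrix, The</title><year>1999</year></movie>
  <movie><title>Signs</title><year>2002</year></movie>
</movies>
"""

def description(candidate):
    """Assumed heuristic: a candidate's description is the text of its child elements."""
    return {child.tag: (child.text or "").strip().lower() for child in candidate}

def similarity(a, b):
    """Average string similarity over the tags that both descriptions share."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    return sum(SequenceMatcher(None, a[t], b[t]).ratio() for t in shared) / len(shared)

def find_duplicates(root, tag, threshold=0.7):
    """Return index pairs of candidates whose similarity reaches the (assumed) threshold."""
    candidates = root.findall(tag)
    pairs = []
    for (i, x), (j, y) in combinations(enumerate(candidates), 2):
        if similarity(description(x), description(y)) >= threshold:
            pairs.append((i, j))
    return pairs

root = ET.fromstring(XML)
duplicates = find_duplicates(root, "movie")
print(duplicates)
```

On this sample, only the first two `movie` elements are flagged as a duplicate pair. Comparing every pair is quadratic in the number of candidates, so a practical system would likely need some form of candidate filtering before this step.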