Article Information

  • Title: Progressive Detection of Duplicate Data
  • Authors: Pranali Turankar; Vaishali Zungare; Pooja Thakare
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Electronic ISSN: 2320-9801
  • Year: 2017
  • Volume: 5
  • Issue: 3
  • Page: 3964
  • DOI: 10.15680/IJIRCCE.2017.0503047
  • Publisher: S&S Publications
  • Abstract: Duplicate detection is the process of identifying multiple representations of the same real-world entities. Today, duplicate detection methods must process ever larger datasets in ever shorter time, and maintaining the quality of a dataset while tracking its duplicated entities becomes increasingly difficult. This application focuses on duplicates in hierarchical data such as XML files. Duplicates in the data can be identified using the detection methods. The datasets are loaded into the application, and processing, extraction, cleaning, separation, and detection are carried out to remove the duplicated data. Comprehensive experiments show that our progressive algorithms can double the efficiency over time of traditional duplicate detection and significantly improve upon related work.
  • Keywords: Duplicate detection; entity resolution; progressiveness; data cleaning