Article Information

  • Title: A Detailed Survey on Various Record Deduplication Methods
  • Authors: Lalitha L.; Maheswari B.; Karthik S.
  • Journal: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
  • Print ISSN: 2278-1323
  • Year of publication: 2012
  • Volume: 1
  • Issue: 8
  • Pages: 160-163
  • Publisher: Shri Pannalal Research Institute of Technology
  • Abstract: Deduplication is a key operation when integrating data from multiple sources. Achieving higher-quality information and a simpler data representation requires data preprocessing, and data cleaning is one of the preprocessing steps. Data cleaning includes parsing, data transformation, duplicate elimination, and statistical methods. Two records that represent the same real-world entity are called duplicate records, and the problem of detecting and eliminating them is called record deduplication. This paper presents an analysis of record deduplication techniques and algorithms that detect and remove duplicate records.
  • Keywords: Deduplication; Data cleaning; Data preprocessing; Record linkage; Record matching
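
The abstract defines duplicate records as records that represent the same real-world entity. The minimal sketch below illustrates that idea only; it is not taken from the surveyed paper. The function names `normalize`, `is_duplicate`, and `deduplicate`, the 0.9 similarity threshold, and the use of Python's standard `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(record):
    """Lowercase each field, drop punctuation, collapse whitespace, and join the fields."""
    cleaned = []
    for field in record:
        text = re.sub(r"[^\w\s]", "", field.lower())
        cleaned.append(" ".join(text.split()))
    return " ".join(cleaned)


def is_duplicate(a, b, threshold=0.9):
    """Treat two records as duplicates when their normalized text is similar enough.

    The threshold is a hypothetical value, not one recommended by the paper.
    """
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def deduplicate(records, threshold=0.9):
    """Keep the first record of each group of near-identical records (pairwise comparison)."""
    unique = []
    for record in records:
        if not any(is_duplicate(record, kept, threshold) for kept in unique):
            unique.append(record)
    return unique


records = [
    ("John Smith", "12 Main St."),
    ("john  smith", "12 Main St"),
    ("Mary Jones", "4 Oak Ave"),
]
result = deduplicate(records)
```

Here the first two tuples differ only in case, spacing, and punctuation, so the second is dropped as a duplicate of the first. Comparing every pair is quadratic in the number of records, which is acceptable for a small illustration but would likely need a blocking or indexing step on larger data.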