Abstract: Record linkage is a central problem in duplicate data elimination: it detects records that refer to the same entity so that duplicates can be removed, which improves data quality. Record linkage is computationally expensive because of the large number of record comparisons it requires, and comparing every pair of records is impractical in large data warehouses. Blocking methods group records so that only records within the same block are compared, which reduces the number of comparisons. This paper reviews and compares existing blocking methods and discusses the selection of a token-based blocking key for record comparison.
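
To illustrate how blocking can reduce the number of comparisons, the following is a minimal sketch of one possible token-based blocking scheme. It is not the method evaluated in this paper. The function names `token_blocks` and `candidate_pairs`, the sample records, and the choice of whitespace-separated lowercase tokens as blocking keys are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def token_blocks(records, field):
    """Group record ids under every lowercase token of the chosen field.

    Assumption: tokens are whitespace-separated words; a real system
    might normalize, stem, or filter tokens differently.
    """
    blocks = defaultdict(set)
    for rec_id, rec in records.items():
        for token in rec[field].lower().split():
            blocks[token].add(rec_id)
    return blocks


def candidate_pairs(blocks):
    """Return the record-id pairs that share at least one block."""
    pairs = set()
    for ids in blocks.values():
        # Only records inside the same block are paired for comparison.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical sample data, not taken from the paper.
records = {
    1: {"name": "John Smith"},
    2: {"name": "Smith John"},
    3: {"name": "Mary Jones"},
    4: {"name": "Jon Smith"},
}

blocks = token_blocks(records, "name")
pairs = candidate_pairs(blocks)
all_pairs = len(records) * (len(records) - 1) // 2

print(sorted(pairs))          # candidate pairs produced by blocking
print(len(pairs), all_pairs)  # comparisons with blocking vs. without
```

In this toy example, record 3 shares no token with the other records, so it is never compared with them. Only pairs that share a token are passed on to the more expensive record comparison step.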