Journal: Bulletin of the Technical Committee on Data Engineering
Publication year: 2010
Volume: 33
Issue: 01
Publisher: IEEE Computer Society
Abstract: The importance of supporting keyword search on relations has been widely recognized. Unlike
existing keyword search techniques on relations, this paper focuses on nearly duplicate records
in relational databases that arise from abbreviations and typos. Processing keyword searches in the
presence of such records involves many unique challenges. In this paper we discuss the motivation and present
a system, RSEARCH, to illustrate the challenges in supporting keyword search over nearly duplicate records
and the key techniques involved, including identifying nearly duplicate records and generating results efficiently.