Journal title: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Electronic ISSN: 0976-8491
Publication year: 2012
Volume: 3
Issue: 2
Pages: 1123-1128
Language: English
Publisher: Ayushmaan Technologies
Abstract: Conditional Functional Dependencies (CFDs) have recently been introduced in the context of data cleaning as a new type of semantic rule that extends traditional functional dependencies. They can be seen as a unification of Functional Dependencies (FDs) and Association Rules (ARs), since they allow attributes and attribute/value pairs to be mixed in a dependency. CFDs are dependencies that hold on instances of relations. The constraint used in CFDs is equality, which allows particular constant values to be fixed for attributes. CFDs have shown great potential for detecting and repairing inconsistent data. The theoretical search space for the minimal set of CFDs is the set of minimal generators and their closures in the data. This search space is used by the currently most efficient constant CFD discovery algorithm. In this paper, we propose pruning criteria that further reduce this theoretical search space, and we design a fast algorithm for constant CFD discovery. We evaluate the proposed algorithm on a number of medium to large real-world data sets. The proposed algorithm is faster than the currently most efficient constant CFD discovery algorithm, and its running time is linear in the size of the data set.
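To make the notion of a constant CFD concrete, the following is a minimal sketch of how such a rule could be checked against a relation instance. It is an illustration only and not the discovery algorithm proposed in the paper. The function name `violations`, the attribute names, and the sample rows are all hypothetical.

```python
from typing import Dict, List

Row = Dict[str, str]


def violations(rows: List[Row], lhs: Dict[str, str], rhs: Dict[str, str]) -> List[int]:
    """Return indices of rows that violate a constant CFD.

    The rule reads: if a row equals every constant in `lhs`,
    then it must also equal every constant in `rhs`.
    Both the rule encoding and this helper are hypothetical.
    """
    bad = []
    for i, row in enumerate(rows):
        # The condition part uses equality on fixed constant values.
        matches_condition = all(row.get(attr) == val for attr, val in lhs.items())
        if matches_condition and any(row.get(attr) != val for attr, val in rhs.items()):
            bad.append(i)
    return bad


# Hypothetical sample data, invented for illustration.
rows = [
    {"country": "UK", "area_code": "131", "city": "Edinburgh"},
    {"country": "UK", "area_code": "131", "city": "London"},
    {"country": "US", "area_code": "131", "city": "Springfield"},
]

# Hypothetical rule: (country = UK, area_code = 131) -> (city = Edinburgh)
result = violations(rows, {"country": "UK", "area_code": "131"}, {"city": "Edinburgh"})
print(result)  # [1]
```

Only the second row is flagged: it satisfies the condition on `country` and `area_code` but carries a different `city`. The third row does not match the condition, so the rule says nothing about it, which is what distinguishes a conditional dependency from a traditional functional dependency that must hold on every row.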