Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Online ISSN: 0976-8491
Publication year: 2013
Volume: 4
Issue: 3
Pages: 188-191
Language: English
Publisher: Ayushmaan Technologies
Abstract: Conditional Functional Dependencies (CFDs) extend Functional Dependencies (FDs) by supporting patterns of semantically related constants, and can be used as rules for cleaning relational data. However, finding CFDs is an expensive process that involves intensive manual effort. To identify data cleaning rules effectively, we consider four techniques for deriving them from sample relations. CFDMiner is based on techniques for mining closed item sets and is used to detect constant CFDs, that is, CFDs with constant patterns only. It provides an efficient heuristic algorithm for discovering patterns from a fixed FD, and it leverages closed item set mining to reduce the search space. CTANE works well when the arity of a sample relation is small and the support threshold is high, but it scales poorly as the arity of the relation increases. FastCFD is more efficient when the arity of a relation is large. The Greedy Method is formally based on the desirable properties of support and confidence; it studies the computational complexity of automatically generating optimal tables and provides an efficient approximation algorithm. These techniques have already been implemented in previous papers. We take the algorithms of these four techniques and determine the time and space complexity of each, in order to learn which technique is helpful in which case, and we display the results as line and bar charts.
Keywords: Integrity; Conditional Functional Dependency; Functional Dependency; Free Item Set; Closed Item Set
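
Illustrative note: the abstract refers to constant CFDs and to the support and confidence of a rule without defining them. The Python sketch below shows one common reading, under stated assumptions: a constant CFD is modelled as a pair of attribute-to-constant patterns, support is the number of tuples matching both sides, and confidence is the fraction of left-hand-side matches that also match the right-hand side. The function names, the sample relation, and these exact definitions are hypothetical and are not taken from the paper, whose definitions may differ.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def matches(row: Row, pattern: Dict[str, str]) -> bool:
    """Return True if the row carries every constant required by the pattern."""
    return all(row.get(attr) == value for attr, value in pattern.items())


def support_and_confidence(
    rows: List[Row], lhs: Dict[str, str], rhs: Dict[str, str]
) -> Tuple[int, float]:
    """Score a constant CFD (lhs pattern -> rhs pattern) on a sample relation.

    Assumed definitions (not from the paper):
      support    = number of rows matching both lhs and rhs
      confidence = support / number of rows matching lhs (0.0 if none match)
    """
    lhs_rows = [r for r in rows if matches(r, lhs)]
    both = [r for r in lhs_rows if matches(r, rhs)]
    support = len(both)
    confidence = support / len(lhs_rows) if lhs_rows else 0.0
    return support, confidence


# Hypothetical sample relation: country code, area code, city.
SAMPLE: List[Row] = [
    {"country": "44", "area": "131", "city": "EDI"},
    {"country": "44", "area": "131", "city": "EDI"},
    {"country": "44", "area": "131", "city": "GLA"},  # violates the rule below
    {"country": "01", "area": "908", "city": "MH"},
]

# Candidate rule: (country = 44, area = 131) -> (city = EDI)
support, confidence = support_and_confidence(
    SAMPLE, {"country": "44", "area": "131"}, {"city": "EDI"}
)
print(support, round(confidence, 3))
```

Under these assumptions, a discovery algorithm would keep a candidate rule only if its support and confidence clear chosen thresholds; a row that matches the left-hand side but not the right-hand side, like the third one here, is a candidate for cleaning.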