
Article information

  • Title: Performance of Data Cleaning Techniques by Using C-Tane Algorithm
  • Authors: B. Revanth; P. Naga Raju; M. Durga Satish
  • Journal: International Journal of Computer Science & Technology
  • Print ISSN: 2229-4333
  • Electronic ISSN: 0976-8491
  • Year of publication: 2013
  • Volume: 4
  • Issue: 3
  • Pages: 188-191
  • Language: English
  • Publisher: Ayushmaan Technologies
  • Abstract: Conditional Functional Dependencies (CFDs) extend Functional Dependencies (FDs) by supporting patterns of semantically related constants, and can be used as rules for cleaning relational data. However, finding CFDs is an expensive process that involves intensive manual effort. To identify data cleaning rules effectively, we consider four techniques for discovering CFDs from sample relations. CFDMiner is based on techniques for mining closed item sets and is used to discover constant CFDs, namely CFDs with constant patterns only. It provides an efficient heuristic algorithm for discovering patterns from a fixed FD and leverages closed item set mining to reduce the search space. CTANE works well when the arity of a sample relation is small and the support threshold is high, but it scales poorly as the arity of the relation increases. FastCFD is more efficient when the arity of a relation is large. The Greedy Method is formally based on the desirable properties of support and confidence; it studies the computational complexity of automatically generating optimal tables and provides an efficient approximation algorithm. These techniques were already implemented in previous papers. We take the algorithms of these four techniques and determine the time and space complexity of each, in order to learn which technique is helpful in which case, and we display the results as line and bar charts.
  • Keywords: Integrity; Conditional Functional Dependency; Functional Dependency; Free Item Set; Closed Item Set
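
To make the notion of a constant CFD in the abstract concrete, the following is a minimal sketch of how such a rule might be checked against a small relation. It is not taken from the paper and is not an implementation of CFDMiner, CTANE, FastCFD, or the Greedy Method. The function names (`matches`, `check_constant_cfd`), the attribute names (`cc`, `ac`, `city`), and the sample rows are all hypothetical. Support is assumed here to be the number of tuples that match the rule's constants on both sides.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def matches(row: Row, pattern: Row) -> bool:
    """True if the row carries every constant named in the pattern."""
    return all(row.get(attr) == value for attr, value in pattern.items())


def check_constant_cfd(rows: List[Row], lhs: Row, rhs: Row) -> Tuple[int, List[Row]]:
    """Check a constant CFD (lhs constants -> rhs constants) against rows.

    Returns (support, violations): support counts the rows that match both
    sides (an assumed definition); violations are the rows that match the
    left-hand side but not the right-hand side.
    """
    support = 0
    violations: List[Row] = []
    for row in rows:
        if not matches(row, lhs):
            continue  # the rule does not apply to this row
        if matches(row, rhs):
            support += 1
        else:
            violations.append(row)
    return support, violations


# Hypothetical sample relation: country code, area code, city.
customers = [
    {"cc": "44", "ac": "131", "city": "EDI"},
    {"cc": "44", "ac": "131", "city": "EDI"},
    {"cc": "44", "ac": "131", "city": "NYC"},  # dirty row: violates the rule
    {"cc": "01", "ac": "908", "city": "MH"},   # rule does not apply
]

# Hypothetical rule: (cc = 44, ac = 131) -> (city = EDI)
support, violations = check_constant_cfd(
    customers, {"cc": "44", "ac": "131"}, {"city": "EDI"}
)
print(support, violations)
```

A discovery algorithm works in the opposite direction: it searches for rules of this form whose support meets a threshold, which is why the abstract ties performance to the arity of the relation and to the support threshold.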