Article Information

  • Title: A Novel Technique of Replicate Data Detection for Best Precision in Cloud-Based Computing
  • Authors: Pritaj Yadav; Prof. Alka Gulati; Prof. Vineet Richharia
  • Journal: International Journal of Computer Technology and Applications
  • Electronic ISSN: 2229-6093
  • Publication year: 2012
  • Volume: 3
  • Issue: 6
  • Pages: 2066-2072
  • Publisher: Technopark Publications
  • Abstract: We propose a highly efficient and scalable duplicate-search technique based on a hash algorithm, which requires very low computational and memory cost. Cloud-based computing is an emerging practice that offers significantly more infrastructure and financial flexibility than traditional computing models. Larger enterprises may have implemented very strong security approaches that may or may not be equaled by cloud providers, but one should not simply assume that security is a problem; look for the same type of security functionality you would look for in an in-house solution. Documents may be mirrored to avoid delays or to provide fault tolerance. Our algorithm, RDDA, detects replicate documents, a task that is critical in applications where data is obtained from multiple sources. Removing replicate documents is necessary not only to reduce run time but also to improve search accuracy. Today, search engine crawlers retrieve billions of unique URLs, of which hundreds of millions are replicates of some form. RDDA rapidly compares large numbers of files for identical content by computing the SHA-256 hash of each file and flagging files whose hashes match as replicates. The probability of two non-identical files having the same hash, even in a hypothetical directory containing millions of files, is exceedingly remote. By efficiently presenting only unique documents, user satisfaction is likely to increase.
  • Keywords: unique documents; detecting replicate; replication; search engine; SHA.
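
The hash-based comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' RDDA implementation, whose details are not given in this record; it only shows the general idea of hashing each file with SHA-256 and grouping files that share a digest. The function names `sha256_of_file` and `find_replicates`, the chunk size, and the recursive directory walk are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks
    so that large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_replicates(directory):
    """Group regular files under `directory` by content hash.

    Hypothetical helper: returns a list of groups, where each group is a
    list of paths whose contents produced the same SHA-256 digest.
    Files with a unique digest are left out.
    """
    groups = defaultdict(list)
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            groups[sha256_of_file(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Each file is read once and only one digest per file is kept, which is consistent with the low computational and memory cost the abstract claims for a hash-based approach. On the collision claim: if SHA-256 is assumed to behave like a random function, the birthday bound puts the chance of any collision among n distinct files at roughly n^2 / 2^257, which is on the order of 10^-64 for n = 10^7.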