
Article Information

  • Title: Effective Sampling Selection Strategy with Reduced Effort Implied, In Tuning Large Scale Deduplication
  • Authors: Ashwini R.; Sridevi S.
  • Journal: International Journal of Computer Science and Network
  • Print ISSN: 2277-5420
  • Publication year: 2016
  • Volume: 5
  • Issue: 3
  • Pages: 523-525
  • Publisher: IJCSN publisher
  • Abstract: Tuning a deduplication process typically relies on a set of manually labeled pairs. For very large datasets, however, producing these labeled pairs is tedious. This article proposes a two-stage sampling selection procedure, T3S, that reduces the set of pairs needed to tune the deduplication process. In the first stage, a balanced subset of the data is produced for labeling. In the second stage, redundant and duplicated data are removed, and only the deduplicated data are produced as output.
  • Keywords: Deduplication; FS-Dedup; T3S
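
The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the algorithm's details, so everything below is an assumption made for illustration. Candidate pairs are assumed to carry a similarity score in [0, 1]; "balanced" is interpreted as sampling equally from fixed-width similarity bins; and "redundant" is interpreted as sharing a rounded similarity value with a pair already kept. The names `balanced_sample`, `drop_redundant`, `n_bins`, `per_bin`, and `precision` are hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(pairs, n_bins=10, per_bin=2, seed=0):
    """Stage 1 (sketch): draw up to `per_bin` pairs from each similarity bin.

    `pairs` is a list of (id_a, id_b, similarity) tuples with similarity
    assumed to lie in [0, 1]. Fixed-width binning is an assumption.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for pair in pairs:
        sim = pair[2]
        # Clamp so that similarity == 1.0 falls into the last bin.
        idx = min(int(sim * n_bins), n_bins - 1)
        bins[idx].append(pair)

    sample = []
    for idx in sorted(bins):
        bucket = bins[idx]
        sample.extend(rng.sample(bucket, min(per_bin, len(bucket))))
    return sample


def drop_redundant(sample, precision=2):
    """Stage 2 (sketch): keep one pair per rounded similarity value.

    Treating equal rounded similarity as "redundant" is a stand-in
    criterion, chosen only to keep the example small.
    """
    seen = set()
    kept = []
    for id_a, id_b, sim in sample:
        key = round(sim, precision)
        if key not in seen:
            seen.add(key)
            kept.append((id_a, id_b, sim))
    return kept


# Synthetic candidate pairs with similarities 0.00, 0.01, ..., 0.99.
pairs = [(i, i + 1, i / 100) for i in range(100)]
stage1 = balanced_sample(pairs)              # pairs sent for manual labeling
stage2 = drop_redundant(stage1, precision=1)  # reduced set after pruning
```

In this sketch the first stage bounds the labeling effort by `n_bins * per_bin` pairs regardless of dataset size, and the second stage can only shrink that set further.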