
Article Information

  • Title: Efficiently Finding Near Duplicate Figures in Archives of Historical Documents
  • Authors: Rakthanmanon, Thanawin; Zhu, Qiang; Keogh, Eamonn J.
  • Journal: Journal of Multimedia
  • Print ISSN: 1796-2048
  • Year of publication: 2012
  • Volume: 7
  • Issue: 2
  • Pages: 109-123
  • DOI: 10.4304/jmm.7.2.109-123
  • Language: English
  • Publisher: Academy Publisher
  • Abstract: The increasing interest in archiving all of humankind’s cultural artifacts has resulted in the digitization of millions of books, and soon a significant fraction of the world’s books will be online. Most of the data in historical manuscripts is text, but there is also a significant fraction devoted to images. This fact has driven much of the recent increase in interest in query-by-content systems for images. While querying/indexing systems can undoubtedly be useful, we believe that the historical manuscript domain is finally ripe for true unsupervised discovery of patterns and regularities. To this end, we introduce an efficient and scalable system that can detect approximately repeated occurrences of shape patterns both within and between historical texts. We show that this ability to find repeated shapes allows automatic annotation of manuscripts, and allows users to trace the evolution of ideas. We demonstrate our ideas on datasets of scientific and cultural manuscripts dating back to the fourteenth century.
  • Keywords: component; cultural artifacts; duplication detection; repeated patterns