
Article information

  • Title: Opening Digitized Newspapers Corpora: Europeana's Full-Text Data Interoperability Case
  • Authors: Nuno Freire; Antoine Isaac; Twan Goosen
  • Journal: OASIcs : OpenAccess Series in Informatics
  • Electronic ISSN: 2190-6807
  • Publication year: 2019
  • Volume: 70
  • Pages: 22:1-22:14
  • DOI: 10.4230/OASIcs.LDK.2019.22
  • Publisher: Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
  • Abstract: Cultural heritage institutions hold collections of printed newspapers that are valuable resources for the study of history, linguistics and other Digital Humanities scientific domains. Effective retrieval of newspapers content based on metadata only is a task nearly impossible, making the retrieval based on (digitized) full-text particularly relevant. Europeana, Europe's Digital Library, is in the position to provide access to large newspapers collections with full-text resources. Full-text corpora are also relevant for Europeana's objective of promoting the usage of cultural heritage resources for use within research infrastructures. We have derived requirements for aggregating and publishing Europeana's newspapers full-text corpus in an interoperable way, based on investigations into the specific characteristics of cultural data, the needs of two research infrastructures (CLARIN and EUDAT) and the practices being promoted in the International Image Interoperability Framework (IIIF) community. We have then defined a "full-text profile" for the Europeana Data Model, which is being applied to Europeana's newspaper corpus.
  • Keywords: Metadata; Full-text; Interoperability; Data aggregation; Cultural Heritage; Research Infrastructures