
Basic article information

  • Title: Grasping the Anti-Modern Discourse on Europe in the Swiss Digitised Press, or can Text Mining Generate a Research Corpus from an Article Collection?
  • Author: Estelle Bunout
  • Journal: Journal of Open Humanities Data
  • Electronic ISSN: 2059-481X
  • Publication year: 2021
  • Volume: 7
  • DOI: 10.5334/johd.37
  • Language: English
  • Publisher: Ubiquity Press
  • Abstract: In this paper, we discuss how different types of automatic annotation of digitised newspaper articles can be integrated into the iterative questioning of the source material and the creation of research corpora out of a collection of unstructured texts (kept in a structured collection). We annotate a sizeable collection of Swiss press articles (183,270), extracted via the impresso interface, using topic modelling (MALLET) as well as a naïve Bayes classifier (script by Milan van Lange). The methodological discussion we propose is to explore how text mining can help identify historical discourses that are difficult to query with keywords because of their inherent ambiguity and how to grasp them in a large corpus. We argue that the automated annotations can provide a body of corroborating evidence of the searched discourse, to be used as an intermediary and heuristic analysis step.