Basic article information

  • Title: Guidelines for normalising Early Modern English corpora: Decisions and justifications
  • Authors: Dawn Archer; Merja Kytö; Alistair Baron
  • Journal: ICAME Journal
  • Print ISSN: 0801-5775
  • Electronic ISSN: 1502-5462
  • Publication year: 2015
  • Volume: 39
  • Issue: 1
  • Pages: 5-24
  • DOI: 10.1515/icame-2015-0001
  • Language: English
  • Publisher: School of Computing
  • Abstract: Corpora of Early Modern English have been collected and released for research for a number of years. With large-scale digitisation activities gathering pace in the last decade, much more historical textual data is now available for research on numerous topics including historical linguistics and conceptual history. We summarise previous research which has shown that it is necessary to map historical spelling variants to modern equivalents in order to successfully apply natural language processing and corpus linguistics methods. Manual and semi-automatic methods have been devised to support this normalisation and standardisation process. We argue that it is important to develop a linguistically meaningful rationale to achieve good results from this process. In order to do so, we propose a number of guidelines for normalising corpora and show how these guidelines have been applied in the Corpus of English Dialogues.