Basic article information

Title: Supporting the corpus-based study of Shakespeare’s language: Enhancing a corpus of the First Folio
Authors: Jonathan Culpeper; Andrew Hardie; Jane Demmen et al.
Journal: ICAME Journal
Print ISSN: 0801-5775
Electronic ISSN: 1502-5462
Publication year: 2021
Volume: 45
Issue: 1
Pages: 37-86
DOI: 10.2478/icame-2021-0002
Language: English
Publisher: School of Computing
Abstract: This article explores challenges in the corpus linguistic analysis of Shakespeare’s language, and Early Modern English more generally, with particular focus on elaborating possible solutions and the benefits they bring. An account of work that took place within the Encyclopedia of Shakespeare’s Language Project (2016–2019) is given, which discusses the development of the project’s data resources, specifically, the Enhanced Shakespearean Corpus. Topics covered include the composition of the corpus and its subcomponents; the structure of the XML markup; the design of the extensive character metadata; and the word-level corpus annotation, including spelling regularisation, part-of-speech tagging, lemmatisation and semantic tagging. The challenges that arise from each of these undertakings are not exclusive to a corpus-based treatment of Shakespeare’s plays, but it is in the context of Shakespeare’s language that they are so severe as to seem almost insurmountable. The solutions developed for the Enhanced Shakespearean Corpus – often combining automated manipulation with manual interventions, and always principled – offer a way through.