Article information

  • Title: The Corpus for Idiolectal Research (CIDRE)
  • Authors: Olga Seminck ; Philippe Gambette ; Dominique Legallois
  • Journal: Journal of Open Humanities Data
  • Electronic ISSN: 2059-481X
  • Publication year: 2021
  • Volume: 7
  • DOI: 10.5334/johd.42
  • Language: English
  • Publisher: Ubiquity Press
  • Abstract: The Corpus for Idiolectal Research (CIDRE) is a collection of fiction works by 11 prolific 19th-century French authors (4 women, 7 men; 22–62 works per author; 37 million words in total). Every work is dated with the year in which it was written. The works were gathered by script from open source platforms, for example La Bibliothèque électronique du Québec, and stripped of paratext (text that is not part of the novel, e.g. prefaces). We distribute the text files, the dating, other metadata and the scripts under an open source license. CIDRE is the first French-language resource for the diachronic study of style and idiolect (i.e. stylochronometry) on a larger scale.