Article information

  • Title: esTenTen, a Vast Web Corpus of Peninsular and American Spanish
  • Authors: Adam Kilgarriff; Irene Renau
  • Journal: Procedia - Social and Behavioral Sciences
  • Print ISSN: 1877-0428
  • Publication year: 2013
  • Volume: 95
  • Pages: 12-19
  • DOI: 10.1016/j.sbspro.2013.10.617
  • Language: English
  • Publisher: Elsevier
  • Abstract: Everyone working on general language would like their corpus to be bigger, wider-coverage, cleaner, duplicate-free, and with richer metadata. As a response to that wish, Lexical Computing Ltd. has a programme to develop very large ‘TenTen’ web corpora. In this paper we introduce the Spanish corpus, esTenTen, of 8 billion words and 19 different national varieties of Spanish. We investigate the distance between the national varieties as represented in the corpus, and examine in detail the keywords of Peninsular Spanish vs. American Spanish, finding a wide range of linguistic, cultural and political contrasts.
  • Keywords: corpus linguistics; Sketch Engine; Spanish dialects; TenTen corpora