
Article Information

  • Title: Language Identification: a Neural Network Approach
  • Authors: Alberto Simões; José João Almeida; Simon D. Byers
  • Journal: OASIcs: OpenAccess Series in Informatics
  • Electronic ISSN: 2190-6807
  • Publication year: 2014
  • Volume: 38
  • Pages: 251-265
  • DOI: 10.4230/OASIcs.SLATE.2014.251
  • Publisher: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
  • Abstract: One of the first tasks when building a natural language application is detecting the language used, so that the system can be adapted to that language. This task has been addressed several times. Nevertheless, most of these attempts were made long ago, when the amount of computer data and the available computational power were limited. In this article we analyze and explain the use of a neural network for language identification in which features can be extracted automatically, making it easy to adapt to new languages. Our experiments produced some surprises, namely with the two Chinese variants, which forced us to apply some language-dependent tweaking to the neural network. In the end, the network achieved a precision of 95%, failing only for the Portuguese language.
  • Keywords: language identification; neural networks; language models; trigrams
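
The keywords name trigrams as one of the features involved. As a rough illustration only, and not the authors' code, character-trigram features of the kind commonly fed to a language classifier might be computed as below. The function names `char_trigrams` and `trigram_features`, the space padding, and the lowercasing are all assumptions made for this sketch; the record does not describe the actual feature extraction.

```python
from collections import Counter


def char_trigrams(text: str) -> Counter:
    """Count overlapping character trigrams (hypothetical helper).

    The text is lowercased and padded with one space on each side so that
    word-initial and word-final trigrams are captured. Both choices are
    assumptions of this sketch.
    """
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def trigram_features(text: str, vocabulary: list[str]) -> list[float]:
    """Map text to relative trigram frequencies over a fixed vocabulary.

    The resulting vector could serve as input to a classifier such as a
    neural network; trigrams outside the vocabulary are ignored.
    """
    counts = char_trigrams(text)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[t] / total for t in vocabulary]
```

Calling `trigram_features(sample, vocabulary)` with a vocabulary of trigrams collected from training text for each language would yield fixed-length vectors; how the paper selects or weights its features is not stated in this record.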