
Basic article information

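  • Title:ハブの抑制によるコンパラブルコーパスからの対訳抽出精度の改善 (Improving the accuracy of bilingual lexicon extraction from comparable corpora by suppressing hubs)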
  • Authors:重藤 優太郎 ; 鈴木 郁美 ; 原 一夫
  • Journal:人工知能学会論文誌
  • Print ISSN:1346-0714
  • Online ISSN:1346-8030
  • Publication year:2016
  • Volume:31
  • Issue:2
  • Pages:E-F43_1-12
  • DOI:10.1527/tjsai.E-F43
  • Publisher:The Japanese Society for Artificial Intelligence
  • Abstract:Most existing approaches to bilingual lexicon extraction (BLE) first map words in the source and target languages into a single vector space and then measure the similarity of words across the two languages in this space. We point out that existing BLE methods suffer from the so-called hubness phenomenon: a small number of translation candidates (hub candidates) are chosen by the systems as likely translations of many source words, which consequently degrades the accuracy of the extracted translations. We show that this phenomenon can be alleviated by centering the data or by using the mutual proximity measure, two known techniques that effectively reduce hubness in standard nearest-neighbor search settings. Our empirical evaluation shows that naive nearest-neighbor search combined with these methods outperforms a recently proposed BLE method based on label propagation.
  • Keywords:bilingual lexicon extraction;hubness phenomenon;nearest neighbor search
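
The hubness effect and the centering remedy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 2-D vectors, the use of the plain inner product as the similarity, the choice to subtract the centroid of the target candidates from both queries and candidates, and all function names (`dot`, `mean_vector`, `center`, `nearest_candidates`) are assumptions made for illustration only. Mutual proximity is not covered here.

```python
from collections import Counter


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_vector(vectors):
    """Component-wise mean (centroid) of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def center(vectors, centroid):
    """Subtract the given centroid from every vector."""
    return [[a - c for a, c in zip(v, centroid)] for v in vectors]


def nearest_candidates(queries, candidates):
    """For each query, return the index of the candidate with the largest inner product."""
    return [
        max(range(len(candidates)), key=lambda j: dot(q, candidates[j]))
        for q in queries
    ]


# Hypothetical toy data: candidate 0 has a large norm and lies along the mean direction.
candidates = [[4.0, 4.0], [3.0, 1.0], [1.0, 3.0]]
queries = [[3.0, 1.2], [1.2, 3.0], [4.0, 3.9]]

# Without centering, candidate 0 is the nearest neighbor of every query (a hub).
raw_nn = nearest_candidates(queries, candidates)
raw_counts = Counter(raw_nn)

# Centering (assumption: shift by the candidate centroid) removes the shared mean direction.
centroid = mean_vector(candidates)
centered_nn = nearest_candidates(center(queries, centroid), center(candidates, centroid))
centered_counts = Counter(centered_nn)

print(raw_nn)       # [0, 0, 0]
print(centered_nn)  # [1, 2, 0]
```

In this toy case the count of how often each candidate is returned as a nearest neighbor drops from 3 for the single hub to 1 for every candidate after centering, which is the kind of skew reduction the abstract refers to.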