Basic Article Information

  • Title: Study on Supervised Learning of Vietnamese Word Sense Disambiguation Classifiers
  • Authors: Minh Hai Nguyen; Kiyoaki Shirai
  • Journal: Information and Media Technologies
  • Electronic ISSN: 1881-0896
  • Publication year: 2012
  • Volume: 7
  • Issue: 3
  • Pages: 1083-1108
  • DOI: 10.11185/imt.7.1083
  • Publisher: Information and Media Technologies Editorial Board
  • Abstract: Vietnamese is said to be a language with highly ambiguous words. However, no Word Sense Disambiguation (WSD hereafter) research on this language has been published. The current research is the first attempt to study Vietnamese WSD. In particular, we explore which features are effective for training WSD classifiers and verify the applicability of the ‘pseudoword’ technique both to investigating the effectiveness of features and to training WSD classifiers. Three tasks were conducted using two corpora: one built manually from the Vietnamese Treebank, and one built automatically by applying the pseudoword technique. Experimental results showed that the Bag-of-Words feature performs well for all three categories of words (verbs, nouns, and adjectives). However, combining it with POS, Collocation, or Syntactic features cannot significantly improve the performance of WSD classifiers. Moreover, the experimental results confirmed that the pseudoword technique is suitable for exploring the effectiveness of features in the disambiguation of Vietnamese verbs and adjectives. Furthermore, we empirically evaluated the applicability of the pseudoword technique as an unsupervised learning method for real Vietnamese WSD.
  • Keywords: Word Sense Disambiguation; Vietnamese; Supervised Machine Learning; Feature for WSD; Pseudoword