
Article Information

  • Title: Effectiveness of Combined Features for Machine Learning Based Question Classification
  • Authors: Marcin Skowron; Kenji Araki
  • Journal: Information and Media Technologies
  • Electronic ISSN: 1881-0896
  • Publication year: 2006
  • Volume: 1
  • Issue: 1
  • Pages: 461-481
  • DOI: 10.11185/imt.1.461
  • Publisher: Information and Media Technologies Editorial Board
  • Abstract: Question classification is of crucial importance for question answering. In question classification, ML algorithms have been found to be significantly more accurate than other approaches. The two key issues in classification with an ML-based approach are classifier design and feature selection. Support Vector Machines are known to work well for sparse, high-dimensional problems. However, the frequently used Bag-of-Words approach does not take full advantage of the information contained in a question. To exploit this information we introduce three new feature types: Subordinate Word Category, Question Focus and Syntactic-Semantic Structure. As the results demonstrate, the inclusion of the new features provides higher question classification accuracy than the standard Bag-of-Words approach and other ML-based methods such as SVM with the Tree Kernel, SVM with Error Correcting Codes and SNoW. The classification accuracy of 85.6% obtained using the three introduced feature types is, as yet, the highest reported in the literature, an error reduction of 27% compared to the Bag-of-Words approach.
  • Keywords: Question Classification; Feature Selection; SVM; Machine Learning; Question Answering
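
The abstract quotes an accuracy (85.6%) and a relative error reduction (27%) but not the Bag-of-Words baseline itself. The sketch below shows how the two figures relate, assuming "error reduction" means the usual relative reduction in error rate (1 minus accuracy). The function names are illustrative, and the implied baseline is only approximate because both quoted figures are rounded.

```python
def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Fraction of the baseline's errors that the new system removes."""
    baseline_err = 1.0 - baseline_acc
    new_err = 1.0 - new_acc
    return (baseline_err - new_err) / baseline_err


def implied_baseline_accuracy(new_acc: float, reduction: float) -> float:
    """Baseline accuracy implied by a new accuracy and a relative error reduction."""
    new_err = 1.0 - new_acc
    # new_err = baseline_err * (1 - reduction), solved for baseline_err
    return 1.0 - new_err / (1.0 - reduction)


# Figures quoted in the abstract: 85.6% accuracy, 27% error reduction.
baseline = implied_baseline_accuracy(0.856, 0.27)
print(f"implied Bag-of-Words accuracy: {baseline:.3f}")
```

Under this assumption the quoted figures imply a Bag-of-Words accuracy of roughly 80%; the exact value should be taken from the paper rather than from this back-calculation.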