Article Information

  • Title: Optimization of Word Sense Disambiguation using Clustering in WEKA
  • Authors: Neetu Sharma; Dr. S. Niranjan
  • Journal: International Journal of Computer Technology and Applications
  • Electronic ISSN: 2229-6093
  • Publication year: 2012
  • Volume: 3
  • Issue: 4
  • Pages: 1598-1604
  • Publisher: Technopark Publications
  • Abstract: In the Natural Language Processing (NLP) community, Word Sense Disambiguation (WSD) has been described as the task of selecting the appropriate meaning (sense) of a given word in a text or discourse, where this meaning is distinguishable from the other senses potentially attributable to that word. These senses can be seen as the target labels of a classification problem. Clustering and classification are two important data mining techniques. Classification is a supervised learning problem: assigning an object to one of several predefined categories based on the object's attributes. Clustering, by contrast, is an unsupervised learning problem that groups objects based on distance or similarity; each group is known as a cluster. In this paper we use the data file poach.arff, containing 7 attributes and 37 instances, to integrate the clustering and classification techniques of data mining. We compare the results of simple classification (using the Random Forest classifier) with the results of integrated clustering and classification, based on various parameters, using WEKA (Waikato Environment for Knowledge Analysis), a data mining tool. The experimental results show that integrating clustering and classification gives promising results, with the best accuracy rate and robustness.
  • Keywords: machine learning software; data mining; data preprocessing; data visualization; WEKA; WORDNET; K-Means; Random Forest
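
The abstract does not say how clustering and classification are integrated. One common approach, assumed here rather than taken from the paper, is to run k-means first and append each instance's cluster id as an extra attribute before training the classifier. The sketch below illustrates only that clustering step in plain Python. The function names, the toy data, and the choice to seed centroids with the first k points are all illustrative assumptions; the paper's poach.arff data and WEKA's own implementation are not reproduced.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means.

    Assumption for reproducibility: the first k points seed the centroids
    (real tools usually use random or smarter initialization).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        # by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def add_cluster_attribute(points, labels):
    """Append the cluster id to each instance as one extra attribute."""
    return [list(p) + [lab] for p, lab in zip(points, labels)]


# Hypothetical toy data: two well-separated groups of 2-attribute instances.
toy = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
labels, centroids = kmeans(toy, k=2)
augmented = add_cluster_attribute(toy, labels)
```

The `augmented` rows would then be passed to a classifier such as Random Forest in place of the original rows, so that the classifier can use cluster membership as an additional feature.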