
Article Information

  • Title: A Data Mining of Supervised learning Approach based on K-means Clustering
  • Authors: Bilal Sowan; Hazem Qattous
  • Journal: International Journal of Computer Science and Network Security
  • Print ISSN: 1738-7906
  • Year: 2017
  • Volume: 17
  • Issue: 1
  • Pages: 18-24
  • Publisher: International Journal of Computer Science and Network Security
  • Abstract: Many application fields generate a massive number of datasets. Each dataset consists of a number of variables (features), one of which is treated as the dependent (target) variable and is used for prediction in supervised learning tasks. Data mining is necessary for building automatic analyses that extract knowledge from datasets. The extracted knowledge is useful for recommendation systems and decision making. The data type and characteristics of the dependent variable play an important role in selecting a specific data mining task. One of the most challenging issues in data mining research is selecting the most appropriate technique for a particular dataset. This paper proposes a supervised learning approach that uses k-means clustering to convert a regression task into a classification task. The proposed approach is flexible and employs a variety of techniques. Flexibility here means that a dataset whose dependent variable is numeric is not restricted to a regression task: the same dataset can also be used for classification by categorizing the dependent variable into class labels. The experimental results validate the proposed approach on two datasets. The first is the CPU dataset from the UCI repository, and the second is a road traffic dataset from a real-world domain. The results show the effectiveness of the proposed approach, which integrates different techniques, namely MLP, REPTree, and CART, that are widely used for both classification and regression tasks. The results also demonstrate that clustering the dependent variable from numeric values into class labels can produce high accuracy on the datasets used.
  • Keywords: Data Mining; Regression Classification; K-means clustering; Clustering; Supervised learning.
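
The central step described in the abstract, clustering a numeric dependent variable so that each cluster becomes a class label, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans_1d` and `discretize_target`, the deterministic centroid initialisation, the class names, and the sample values are all assumptions made for illustration. It uses only the Python standard library.

```python
from typing import List, Sequence, Tuple


def kmeans_1d(values: Sequence[float], k: int,
              max_iter: int = 100) -> Tuple[List[float], List[int]]:
    """Cluster scalar values with Lloyd's k-means.

    Returns (centroids, labels), where label 0 is the cluster with the
    smallest centroid and label k-1 the cluster with the largest.
    """
    distinct = sorted(set(values))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")

    # Assumption of this sketch: deterministic initialisation that spreads
    # the starting centroids evenly over the sorted distinct values.
    if k == 1:
        centroids = [float(distinct[0])]
    else:
        centroids = [float(distinct[(i * (len(distinct) - 1)) // (k - 1)])
                     for i in range(k)]

    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated

    # Renumber clusters so that labels follow ascending centroid order.
    order = sorted(range(k), key=lambda c: centroids[c])
    rank = {old: new for new, old in enumerate(order)}
    return [centroids[c] for c in order], [rank[lab] for lab in labels]


def discretize_target(y: Sequence[float], class_names: Sequence[str]) -> List[str]:
    """Replace each numeric target value with the name of its cluster."""
    _, labels = kmeans_1d(y, len(class_names))
    return [class_names[lab] for lab in labels]


# Made-up numeric targets, for illustration only.
y = [10, 12, 11, 50, 55, 52, 200, 210, 190]
classes = discretize_target(y, ["low", "medium", "high"])
print(classes)
```

The resulting labels can then stand in for the numeric target when training a classifier, so that the same dataset serves either a regression or a classification task. The number of clusters is a free choice in this sketch.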