Article Information

Title: Research of Feature Selection for Text Clustering Based on Cloud Model
Authors: Zhao, Junmin; Zhang, Kai; Wan, Jian; et al.
Journal: Journal of Software
Print ISSN: 1796-217X
Year: 2013
Volume: 8
Issue: 12
Pages: 3246-3252
DOI: 10.4304/jsw.8.12.3246-3252
Language: English
Publisher: Academy Publisher
Abstract: Text clustering is a form of unsupervised machine learning, so the discriminability of class attributes cannot be measured during clustering, and traditional text feature selection methods cannot effectively solve the high-dimensionality problem. To overcome these weaknesses in existing feature selection, this paper proposes a new method that introduces cloud model theory into feature selection and constructs a cloud filter for clustering documents. The distribution of document words is modeled at a microscopic level. By employing the digital characteristics of the cloud model, the separability between feature words can be computed more accurately. Experimental results with the K-means algorithm show that the method remarkably improves the accuracy of text clustering.
Keywords: feature selection; cloud model; TF-IDF; K-means algorithm