Article Information

Title: An Incremental Feature Clustering Algorithm for Text Classification
Authors: Johny Thomas; Abishek Nair; Arpit Gupta et al.
Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2015
Volume: 6
Issue: 2
Pages: 1848-1851
Publisher: TechScience Publications
Abstract: Text classification is challenging because of the high dimensionality of the feature vector. To alleviate this problem, feature reduction techniques are applied to reduce the time and complexity of text classification. In this paper, we propose a novel fuzzy self-constructing algorithm for feature clustering. Feature clustering is a feature reduction method that drastically reduces the dimensionality of feature vectors for text classification. Here, words are grouped into clusters based on their degree of similarity. Each cluster is characterized by a membership function with a statistical mean and deviation. As words are fed in, a word that is similar to an existing cluster is added to that cluster; otherwise, a new cluster is created. The derived feature vectors properly describe the real distribution of the training data. The user need not specify the number of extracted features in advance.
Keywords: Feature Clustering; Clustering Algorithm; Text; Classification.
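
The clustering step summarized in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the Gaussian-shaped membership function, the `threshold` and `init_dev` parameters, and the names `Cluster` and `incremental_cluster` are hypothetical and are not taken from the paper. Each word is assumed to be represented by a numeric pattern, for example its distribution over the classes.

```python
import math


class Cluster:
    """One word cluster, described by a per-dimension mean and deviation."""

    def __init__(self, pattern, init_dev):
        self.n = 1
        self.total = list(pattern)                  # running sum per dimension
        self.total_sq = [x * x for x in pattern]    # running sum of squares
        self.init_dev = init_dev                    # assumed floor on the deviation

    @property
    def mean(self):
        return [s / self.n for s in self.total]

    @property
    def dev(self):
        devs = []
        for s, sq in zip(self.total, self.total_sq):
            var = max(sq / self.n - (s / self.n) ** 2, 0.0)  # guard rounding error
            devs.append(math.sqrt(var) + self.init_dev)
        return devs

    def membership(self, pattern):
        # Assumed Gaussian-shaped membership: 1.0 at the mean, decaying with distance.
        dist = sum(((x - m) / d) ** 2
                   for x, m, d in zip(pattern, self.mean, self.dev))
        return math.exp(-dist)

    def add(self, pattern):
        # Incremental update: only the running sums change.
        self.n += 1
        for i, x in enumerate(pattern):
            self.total[i] += x
            self.total_sq[i] += x * x


def incremental_cluster(patterns, threshold=0.5, init_dev=0.2):
    """Feed patterns in one at a time; join the best cluster or open a new one."""
    clusters, labels = [], []
    for p in patterns:
        scores = [c.membership(p) for c in clusters]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            clusters[best].add(p)
        else:
            clusters.append(Cluster(p, init_dev))
            best = len(clusters) - 1
        labels.append(best)
    return clusters, labels


# Four hypothetical word patterns over two classes.
words = [[0.9, 0.1], [0.85, 0.15], [0.1, 0.9], [0.12, 0.88]]
clusters, labels = incremental_cluster(words)
print(len(clusters), labels)
```

In this sketch the number of clusters is not passed in; it follows from the data and the assumed threshold, which mirrors the abstract's remark that the user need not specify the number of extracted features in advance.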