Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Year: 2013
Volume: 13
Issue: 5
Pages: 100-106
Publisher: International Journal of Computer Science and Network Security
Abstract: Owing to the growth of the World Wide Web and the rapid development of Internet technology, the increasing volume of digital textual data has become more and more unmanageable, and text classification has therefore gained significant attention. Text classification poses some specific challenges, such as high dimensionality, with each document (data point) containing only a very small subset of the features and carrying multiple labels at the same time. Feature clustering is a powerful method for reducing the dimensionality of feature vectors for text classification, and many researchers have worked on feature clustering for efficient text classification. Recently, a fuzzy-based feature clustering method was proposed in which a Gaussian distribution is used as the fuzzy membership function for clustering. However, the problem of skewness may occur with this distribution. To overcome this, we propose an efficient fuzzy-similarity-based membership function for clustering; with the proposed algorithm, satisfactory results were obtained.
Keywords: Dimensionality reduction; skewness; feature extraction; fuzzy clustering; split normal distribution.
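
The abstract contrasts a Gaussian fuzzy membership function with an alternative that copes with skewness, and the keywords mention the split normal distribution. The record does not give the paper's actual formulas, so the following is only a minimal illustrative sketch: the function names, parameters, and the unnormalized form are assumptions, not the authors' method. It shows how a symmetric Gaussian membership treats points on either side of the center identically, while a split-normal-shaped membership (separate left and right spreads) does not.

```python
import math


def gaussian_membership(x, mean, sigma):
    """Symmetric Gaussian-shaped fuzzy membership, equal to 1.0 at x == mean.

    Hypothetical helper for illustration; not taken from the paper.
    """
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def split_normal_membership(x, mode, sigma_left, sigma_right):
    """Asymmetric membership using a different spread on each side of the mode.

    Hypothetical helper for illustration; not taken from the paper.
    """
    sigma = sigma_left if x < mode else sigma_right
    return math.exp(-((x - mode) ** 2) / (2.0 * sigma ** 2))


# Two points equally far from the center 0.5.
left, right = 0.4, 0.6

# The Gaussian membership cannot distinguish them.
g_left = gaussian_membership(left, 0.5, 0.1)
g_right = gaussian_membership(right, 0.5, 0.1)

# With a narrow left spread and a wide right spread (assumed values),
# the right-hand point receives the higher membership.
s_left = split_normal_membership(left, 0.5, 0.05, 0.2)
s_right = split_normal_membership(right, 0.5, 0.05, 0.2)
```

When the values of a feature within a cluster are skewed, a single spread parameter either overstates membership on the short side or understates it on the long side; using two spreads is one simple way to model that asymmetry.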