Journal title: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2014
Volume: 63
Issue: 1
Publisher: Journal of Theoretical and Applied
Abstract: Pattern mining is an important research issue in data mining with several kinds of applications. Many text mining methods have been proposed for mining useful patterns in text documents. This work mainly focuses on approximately identifying different entities such as terms, phrases, and patterns. We use feature evaluation to reduce the dimensionality of the high-dimensional text vector. The system then assigns a frequency to each word, and the weights of the document are used for pattern clustering. Pattern clustering is one of the favorable methods for feature extraction in text classification. In this paper we propose a fuzzy-estimated, similarity-based self-generating algorithm for text classification. It overcomes the low-frequency problem and also calculates the similarity between different patterns and words in an effective manner. Experiments on the RCV1 data collection and TREC topics indicate that the proposed approach achieves better performance.
Keywords: Text Mining; Feature Clustering; Text Classification; Feature Extraction; Pattern.