Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2017
Volume: 95
Issue: 8
Publisher: Journal of Theoretical and Applied
Abstract: The advent of the Internet has made abundant information available, but it is usually stored only to serve current needs, so the data sit unclassified in dump repositories. If the data were instead stored in a classified repository, or classified at a later stage, navigating and reaching them would be easier, which could help in decision making. Classification commonly adopts either the supervised or the unsupervised paradigm. Semi-supervised learning lies between the two: in addition to unlabeled data, the algorithm is given some supervision information, though not necessarily for every example. A blend of supervised and unsupervised classification is explored through the formation of fuzzy clusters based on the importance of the terms in each class. Enhancements to the traditional KNN algorithm are also explored, assigning different weights to the features based on their variance in each class. Finally, the results obtained under the supervised and semi-supervised paradigms are compared.
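The abstract does not give the paper's weighting formula, so the following is only a minimal sketch of one way a variance-based feature weighting for KNN might look. It assumes, purely for illustration, that a feature's weight within a class is the inverse of its within-class variance (normalized to sum to 1), and that the distance to each training point uses the weights of that point's class. The function names `class_feature_weights` and `weighted_knn_predict`, the `eps` smoothing term, and the toy data are all hypothetical and not taken from the paper.

```python
from collections import Counter
from statistics import pvariance


def class_feature_weights(X, y, eps=1e-6):
    """Assumed scheme: weight = inverse within-class variance, normalized per class."""
    weights = {}
    for label in set(y):
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        # zip(*rows) iterates over feature columns of this class
        inv = [1.0 / (pvariance(col) + eps) for col in zip(*rows)]
        total = sum(inv)
        weights[label] = [v / total for v in inv]
    return weights


def weighted_knn_predict(X, y, query, k=3):
    """Majority vote among the k nearest points under class-weighted Euclidean distance."""
    weights = class_feature_weights(X, y)
    dists = []
    for x, label in zip(X, y):
        w = weights[label]  # use the weights of the training point's own class
        d = sum(wi * (a - b) ** 2 for wi, a, b in zip(w, x, query)) ** 0.5
        dists.append((d, label))
    dists.sort(key=lambda t: t[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy data: feature 0 is stable within each class, feature 1 is noisy.
X = [[1.0, 10.0], [1.1, 20.0], [0.9, 30.0],
     [5.0, 12.0], [5.2, 22.0], [4.8, 28.0]]
y = ["a", "a", "a", "b", "b", "b"]

print(weighted_knn_predict(X, y, [1.05, 25.0]))
print(weighted_knn_predict(X, y, [5.1, 15.0]))
```

Under this assumed scheme, a feature that varies little within a class dominates the distance to that class's points, so the noisy second feature in the toy data has almost no influence on the vote.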