期刊名称:International Journal of Advanced Computer Science and Applications(IJACSA)
印刷版ISSN:2158-107X
电子版ISSN:2156-5570
出版年度:2017
卷号:8
期号:10
DOI:10.14569/IJACSA.2017.081009
出版社:Science and Information Society (SAI)
摘要:Traditional supervised machine learning techniques require training on large volumes of data to acquire efficiency and accuracy. As opposed to traditional systems Active Learning systems minimizes the size of training data significantly because the selection of the data is done based on a strong mathematical model. This helps in achieving the same accuracy levels of the results as baseline techniques but with a considerably small training dataset. In this paper, the active learning approach has been implemented with a modification into the traditional system of active learning with version space algorithm. The version space concept is replaced with the divisive analysis (DIANA) algorithm and the core idea is to pre-cluster the instances before distributing them into training and testing data. The results obtained by our system have justified our reasoning that pre-clustering instead of the traditional version space algorithm can bring a good impact on the accuracy of the overall system’s classification. Two types of data have been tested, the binary class and multi-class. The proposed system worked well on the multi-class but in case of binary, the version space algorithm results were more accurate.