Journal: International Journal of Grid and Distributed Computing
Print ISSN: 2005-4262
Year: 2014
Volume: 7
Issue: 1
Pages: 67-76
Publisher: SERSC
Abstract: A theoretical upper bound on the generalization error of ECOC SVMs is derived from the fat-shattering dimension and the covering number, and the factors that affect the generalization performance of ECOC SVMs are analyzed. The analysis suggests that, in real classification tasks, the performance of an ECOC depends on the performance of the classifiers that correspond to its code columns rather than on the mathematical properties of the ECOC itself. The essential problem for ECOC SVMs is therefore how to construct an optimal voting machine from a number of SVMs: how to choose Sub-SVMs with better generalization ability, and how to determine the number of Sub-SVMs that take part in the vote. Data sets including "Segment" are used for testing. All ECOC code columns are constructed with an exhaustive technique. A Sub-SVM is trained for each code column, and the generalization ability of each Sub-SVM is evaluated by its classification margin and by its error rate estimated through cross-validation. The ECOC code columns are then sorted by the generalization performance of their Sub-SVMs. Three categories of ECOC SVMs (superior, inferior and ordinary) are constructed from the sorted code columns by taking them in forward, backward and original order, respectively. Experimental results show that ECOC SVMs built from Sub-SVMs with better generalization ability perform better, and vice versa, which supports this view and indicates a direction for improving ECOC SVMs.
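The column-ranking procedure described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the exhaustive construction is assumed to enumerate every distinct two-way split of the classes (complements counted once), the names `exhaustive_columns`, `rank_columns` and `decode` are hypothetical, the `cv_error` values are made up, and decoding is assumed to use minimum Hamming distance. Training the Sub-SVMs themselves is omitted.

```python
from itertools import product


def exhaustive_columns(k):
    """Enumerate all distinct binary splits of k classes as +1/-1 tuples.

    Class 0 is fixed to +1 so a column and its complement are listed once;
    the all +1 column is dropped because it separates nothing.
    This yields 2**(k-1) - 1 columns.
    """
    cols = []
    for rest in product((1, -1), repeat=k - 1):
        col = (1,) + rest
        if -1 in col:
            cols.append(col)
    return cols


def rank_columns(columns, cv_error):
    """Sort columns by the estimated error of their sub-classifier, best first."""
    order = sorted(range(len(columns)), key=lambda j: cv_error[j])
    return [columns[j] for j in order]


def decode(outputs, columns):
    """Return the class whose codeword is nearest to outputs in Hamming distance."""
    k = len(columns[0])

    def distance(c):
        return sum(1 for col, o in zip(columns, outputs) if col[c] != o)

    return min(range(k), key=distance)


columns = exhaustive_columns(4)  # 4 classes give 7 columns

# Hypothetical cross-validated error rates, one per column (illustration only).
cv_error = [0.10, 0.02, 0.30, 0.05, 0.20, 0.01, 0.15]

ranked = rank_columns(columns, cv_error)
n = 3                        # number of sub-classifiers that vote (a free choice)
superior = ranked[:n]        # forward order: best sub-classifiers
inferior = ranked[::-1][:n]  # backward order: worst sub-classifiers
ordinary = columns[:n]       # original order, no ranking

# Simulated outputs of error-free sub-classifiers for a sample of class 2.
perfect_outputs = [col[2] for col in columns]
predicted = decode(perfect_outputs, columns)
```

In a real experiment `cv_error` would come from cross-validating one binary SVM per column, and `n` would be varied to study how many Sub-SVMs should take part in the vote.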