Publisher: The Japanese Society for Artificial Intelligence
Abstract: AdaBoost has been successfully applied to a number of classification tasks, seemingly defying problems of overfitting. AdaBoost performs gradient descent on an error function defined with respect to the margin. This method concentrates on the patterns that are hardest to learn. However, this property of AdaBoost can be disadvantageous for noisy problems. Indeed, theoretical analysis has shown that the margin distribution plays a crucial role in understanding this phenomenon. Loosely speaking, some outliers should be tolerated if this has the benefit of substantially increasing the margin on the remaining points. In this paper, we propose new noise-robust boosting methods using the concepts of ν-Support Vector Classification and Arc-GV. These methods allow a pre-specified fraction ν of points to lie in the margin area or even on the wrong side of the decision boundary. These algorithms give a nicely interpretable way of controlling the trade-off between minimizing the training error and the capacity.
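To make the mechanism concrete, below is a minimal Python sketch of standard AdaBoost with decision stumps, illustrating how the exponential reweighting concentrates mass on the hardest (possibly noisy) examples. The abstract does not specify the exact ν-Arc update, so as an illustrative assumption the sketch caps each example's weight at 1/(ν·N), one common way a fraction-ν tolerance is enforced in soft-margin boosting; the function name adaboost_nu and the cap are assumptions, not the authors' algorithm.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_nu(X, y, n_rounds=50, nu=0.1):
    """AdaBoost with an illustrative per-example weight cap of 1/(nu*N).

    y must take values in {-1, +1}. The cap limits how much weight any
    single (possibly noisy) example can accumulate; this stands in for the
    fraction-nu tolerance, not for the paper's exact nu-Arc update.
    """
    n = len(y)
    w = np.full(n, 1.0 / n)                 # uniform initial example weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w[pred != y]) / np.sum(w)
        if err >= 0.5 or err == 0.0:        # stop if no useful weak learner
            break
        alpha = 0.5 * np.log((1 - err) / err)       # hypothesis weight
        w *= np.exp(-alpha * y * pred)              # up-weight mistakes
        w = np.minimum(w / w.sum(), 1.0 / (nu * n)) # cap: no point dominates
        w /= w.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def predict(learners, alphas, X):
    # Sign of the weighted vote; its magnitude is the (unnormalized) margin.
    F = sum(a * h.predict(X) for h, a in zip(learners, alphas))
    return np.sign(F)

Without the cap, the multiplicative update w *= exp(-alpha * y * pred) drives essentially all weight onto repeatedly misclassified points, which is exactly the behavior the abstract identifies as harmful on noisy data; the cap lets roughly a fraction ν of points be "given up" so the margin on the remaining points can grow.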