Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Publication year: 2020
Volume: 11
Issue: 4
DOI: 10.14569/IJACSA.2020.0110412
Publisher: Science and Information Society (SAI)
Abstract: This study proposes a self-paced learning scheme that integrates self-training and deep learning to select and learn from labeled and unlabeled data samples, with the goal of classifying anterior-posterior chest images as either pneumonia-infected or normal. In this approach, a model is first trained on labeled data. The model is then evaluated on unlabeled data to generate pseudo labels for that data. Using a novel selection scheme, pseudo-labeled samples are selected to update the model in the next training iteration of the semi-supervised training process. The pseudo-labeled images selected for the next training iteration are those with the most confident predicted probabilities in every class. This selection scheme prevents mistake reinforcement, which is a prevalent problem in self-training. Deep models tend to latch onto well-represented class samples while ignoring less transferable and less represented classes, especially when the data is unbalanced. To counter this, the proposed method uses a novel algorithm for generating and selecting reliable top-K pseudo-labeled samples to update the model during the next training phase. This approach not only forces the model to learn the hard samples in the training data, but also enlarges the training set by generating enough samples to satisfy the data hunger of deep models. Extensive experimental evaluation shows that the proposed method achieves higher accuracy than methods reported in the literature on the same dataset, which indicates its effectiveness.
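The class-wise top-K selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes that the model outputs a matrix of class probabilities for the unlabeled images, that the pseudo label is the arg-max class, and that confidence is the maximum probability. The function name `select_top_k_per_class` and the parameter `k` are hypothetical.

```python
import numpy as np


def select_top_k_per_class(probs, k):
    """Pick the k most confident pseudo-labeled samples for each predicted class.

    probs: array-like of shape (n_samples, n_classes) holding the model's
           predicted class probabilities for the unlabeled samples.
    k:     maximum number of samples to keep per class (assumed hyperparameter).

    Returns (indices, pseudo_labels) for the selected samples, with indices
    sorted in ascending order.
    """
    probs = np.asarray(probs, dtype=float)
    pseudo_labels = probs.argmax(axis=1)  # assumed: pseudo label = arg-max class
    confidence = probs.max(axis=1)        # assumed: confidence = max probability

    selected = []
    for c in range(probs.shape[1]):
        # Samples whose pseudo label is class c.
        idx = np.flatnonzero(pseudo_labels == c)
        # Order them by descending confidence and keep at most k.
        order = np.argsort(-confidence[idx], kind="stable")[:k]
        selected.extend(idx[order].tolist())

    selected = np.array(sorted(selected), dtype=int)
    return selected, pseudo_labels[selected]
```

Selecting per class, rather than taking a global top-K, gives a minority class a share of the next training set even when the majority class dominates the high-confidence predictions. In a self-training loop, the returned samples would be added to the labeled pool before the model is retrained.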