Article Information

Title: Reverse Difference Network for Highlighting Small Objects in Aerial Images
Authors: Ni, Huan; Chanussot, Jocelyn; Niu, Xiaonan; et al.
Journal: ISPRS International Journal of Geo-Information
Electronic ISSN: 2220-9964
Publication year: 2022
Volume: 11
Issue: 9
Pages: 1-25
DOI: 10.3390/ijgi11090494
Language: English
Publisher: MDPI AG
Abstract: The large-scale variation issue in high-resolution aerial images significantly lowers the accuracy of segmenting small objects. For a deep-learning-based semantic segmentation model, the main reason is that the deeper layers generate high-level semantics over considerably large receptive fields, thus improving the accuracy for large objects but ignoring small objects. Although the low-level features extracted by shallow layers contain small-object information, large-object information has predominant effects. When the model, using low-level features, is trained, the large objects push the small objects aside. This observation motivates us to propose a novel reverse difference mechanism (RDM). The RDM eliminates the predominant effects of large objects and highlights small objects from low-level features. Based on the RDM, a novel semantic segmentation method called the reverse difference network (RDNet) is designed. In the RDNet, a detailed stream is proposed to produce small-object semantics by enhancing the output of RDM. A contextual stream for generating high-level semantics is designed by fully accumulating contextual information to ensure the accuracy of the segmentation of large objects. Both high-level and small-object semantics are concatenated when the RDNet performs predictions. Thus, both small- and large-object information is depicted well. Two semantic segmentation benchmarks containing vital small objects are used to fully evaluate the performance of the RDNet. Compared with existing methods that exhibit good performance in segmenting small objects, the RDNet has lower computational complexity and achieves 3.9–18.9% higher accuracy in segmenting small objects.
Keywords: difference; semantic segmentation; convolutional networks; attention; deep learning