Noise-robust lightweight deep learning for remote sensing scene image classification and retrieval
Abstract
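Objective Remote sensing image processing methods based on deep neural networks usually need large amounts of accurately labeled data for training; once the labeled data contain label noise, the performance of the network degrades significantly. To address this noise-induced degradation, a noise-robust lightweight deep learning method for remote sensing scene image classification and retrieval is proposed. It completes the classification and hash retrieval tasks at the same time and effectively improves the classification and hash retrieval performance of deep neural networks on remote sensing data with label noise. Method A lightweight neural network is selected as the backbone, a double-branch structure is then designed to complete classification and hash retrieval at the same time, and finally a regularization method that sets a loss benchmark effectively reduces over-fitting to noise, yielding a noise-robust classification and retrieval model. Result Classification is tested on two public remote sensing scene datasets and compared with eight methods. On the AID (aerial image datasets) dataset, the classification accuracy of the proposed method is on average 7.8% higher than that of the second-best method across all noise ratios; on the NWPU-RESISC45 (benchmark created by Northwestern Polytechnical University for remote sensing image scene classification covering 45 scene classes) dataset, it is on average 8.1% higher. In terms of efficiency, the inference speed of the proposed method is 2.8 times faster than that of the CLEOT (classification loss with entropic optimal transport) method, while its computation and parameter count are both no more than 5% of those of CLEOT. In the remote sensing image hash retrieval task on the AID dataset, the mean average precision (mAP) of the proposed method is on average 5.9% higher than that of the MiLaN (metric-learning based deep hashing network) method across three hash bit lengths. Conclusion The proposed method completes remote sensing image classification and hash retrieval at the same time and, while keeping the model lightweight and efficient, effectively improves the robustness of deep neural networks on remote sensing data with label noise.
Keywords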
A robust lightweight deep learning method for remote sensing scene image classification and retrieval under label noise
Wang Yapeng, Li Yang, Wang Jiabao, Zhao Xun, Miao Zhuang (Command and Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China)
Abstract
Objective With the development of deep learning, deep neural networks have been widely used in remote sensing tasks such as image retrieval, scene classification and change detection. Although these methods keep raising the accuracy of remote sensing applications on specific datasets, they require massive data with millions of reliable annotations, which is impractical or expensive in real-world applications. To reduce labeling cost and speed up labeling, researchers have proposed a variety of greedy annotation methods that improve labeling efficiency by using clustering and crowdsourcing information. Labels produced in this way can contain noise, and once label noise is introduced into a dataset, the performance of deep learning methods declines dramatically. It is therefore necessary to build a noise-robust deep learning method for remote sensing image processing to improve generalization. To resolve this performance degradation, a noise-robust and lightweight deep learning method for remote sensing scene classification and retrieval is proposed, which effectively improves classification and hash retrieval performance on remote sensing datasets under label noise. Furthermore, the proposed method completes the classification and hash retrieval tasks at the same time. Method First, the lightweight deep neural network mobile GPU-aware network C (MoGA-C), proposed by Xiaomi AI Lab, is used as the backbone to keep the model lightweight. MoGA-C is obtained with the mobile GPU-aware (MoGA) neural architecture search algorithm, and its design integrates a variety of lightweight network design techniques.
Next, a double-branch structure is placed after the backbone to perform classification and retrieval simultaneously. This not only avoids the drop in classification performance caused by inserting a hash layer, but also effectively increases classification accuracy under label noise by combining the outputs of the two branches. Finally, the whole network is fine-tuned during training to improve its learning ability, which effectively improves classification performance under low-ratio label noise. A loss benchmark is set during fine-tuning to reduce over-fitting to label noise in the middle and later stages of training; it bounds the training loss from below and effectively reduces over-fitting under high-ratio noise. Result The proposed method is compared with eight state-of-the-art methods on two public remote sensing classification datasets. It performs well under different noise ratios: its classification accuracy is on average 7.8% higher than that of the second-best method on the aerial image datasets (AID) dataset and 8.1% higher on the benchmark created by Northwestern Polytechnical University for remote sensing image scene classification covering 45 scene classes (NWPU-RESISC45) dataset. Its inference is 2.8 times faster than that of the classification loss with entropic optimal transport (CLEOT) method, while its floating point operations (FLOPs) and parameters are both less than 5% of those of CLEOT. In remote sensing image retrieval on the AID dataset, its mean average precision (mAP) is on average 5.9% higher than that of the metric-learning based deep hashing network (MiLaN) method across three hash bit lengths.
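To make the retrieval metric concrete, the following minimal Python sketch shows one common way to compute mean average precision (mAP) for hash retrieval: database codes are ranked by Hamming distance to the query code, and precision is averaged over the ranks at which a same-class item appears. The function names, the toy binary codes, and ranking over the whole database without a top-k cutoff are assumptions made for illustration; the exact evaluation protocol used in the experiments is not stated here.

```python
def hamming_distance(code_a, code_b):
    """Number of bit positions in which two equal-length binary codes differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))


def average_precision(query_code, query_label, db_codes, db_labels):
    """Average precision of one query over a database ranked by Hamming distance.

    Assumption: the whole database is ranked (no top-k cutoff) and an item is
    relevant when its label equals the query label.
    """
    # Rank database indices from nearest to farthest hash code.
    ranking = sorted(
        range(len(db_codes)),
        key=lambda i: hamming_distance(query_code, db_codes[i]),
    )
    hits = 0
    precision_sum = 0.0
    for rank, index in enumerate(ranking, start=1):
        if db_labels[index] == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(query_codes, query_labels, db_codes, db_labels):
    """Mean of the per-query average precision values."""
    scores = [
        average_precision(code, label, db_codes, db_labels)
        for code, label in zip(query_codes, query_labels)
    ]
    return sum(scores) / len(scores)
```

In practice the codes would come from binarizing the output of the hash branch, and a longer code (more hash bits) gives a finer-grained Hamming ranking at the cost of more storage per image.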
Conclusion A lightweight and noise-robust method for remote sensing scene classification and retrieval is presented to address the performance degradation of remote sensing image processing methods under label noise. The proposed method performs classification and hash retrieval at the same time and effectively improves both under label noise. First, a lightweight network is chosen as the backbone to keep the model lightweight. Second, a parallel double-branch structure is designed to complete the classification and hash retrieval tasks at the same time, and classification performance is further improved by combining the predictions of the two branches. Finally, a loss benchmark is set so that the training loss stays at a positive value, which effectively reduces over-fitting to label noise. For comparison with other methods, classification and hash retrieval experiments are conducted on two public datasets. The experimental results show that the proposed method is not only highly efficient but also robust to different ratios of label noise.
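The idea of keeping the training loss above a positive benchmark can be sketched as follows. This is only an illustration under stated assumptions: it uses the flooding-style form |L - b| + b, which is one known way to bound a loss from below, and the abstract does not say whether the proposed method uses exactly this form. The function names and the benchmark value in the usage note are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one sample computed from raw logits (stable log-sum-exp)."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def benchmarked_loss(batch_logits, batch_labels, benchmark):
    """Mean cross-entropy that is never smaller than `benchmark`.

    Assumption: the flooding-style form |L - b| + b. While the mean loss L is
    above b the value is simply L; once L falls below b the value rises again,
    so in a gradient-based trainer the update direction would reverse and the
    model would stop fitting the (possibly noisy) labels more tightly.
    """
    losses = [
        cross_entropy(logits, label)
        for logits, label in zip(batch_logits, batch_labels)
    ]
    mean_loss = sum(losses) / len(losses)
    return abs(mean_loss - benchmark) + benchmark
```

For example, calling `benchmarked_loss([[0.0, 0.0]], [0], 0.1)` leaves an uncertain prediction's loss unchanged, whereas a very confident prediction such as `[[10.0, 0.0, 0.0]]` with label `0` yields a value slightly below twice the benchmark rather than a value near zero.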
Keywords