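Remote sensing image retrieval combining discriminant correlation analysis and feature fusion
Abstract
Objective In high-resolution remote sensing image retrieval, a single feature can hardly describe the complex information in remote sensing images accurately. To make full use of the learned parameters of different convolutional neural networks (CNNs) and improve the feature representation of remote sensing images, a method based on discriminant correlation analysis is proposed to fuse the high-level features of different CNNs. Method The high-level features are treated as special convolutional-layer features. To better preserve the original spatial information of the image, the different high-level features are extracted at the original input size of the image, and max pooling is then applied to them to obtain salient features. The between-class scatter matrix of the high-level features is computed and combined with discriminant correlation analysis to strengthen the connection between features of the same class and to highlight the differences between features of different classes, thereby improving the discriminative power of the features. Concatenation and summation are chosen to fuse the different features, and the resulting fused features are used to retrieve high-resolution remote sensing images. Result Experiments on the UC-Merced, RSSCN7, and WHU-RS19 datasets show that, compared with a single high-level feature, the vast majority of fused features effectively improve both retrieval accuracy and retrieval time. The mean average precision (mAP) on the three datasets improves by 10.4% to 14.1%, 5.7% to 9.9%, and 5.9% to 17.6%, respectively. The performance gain is more pronounced when features with similar retrieval capability are fused. On the UC-Merced dataset, the average normalized modified retrieval rank (ANMRR) and mAP of the fused feature reach 13.21% and 84.06%, which gives the method a certain advantage over several recent remote sensing image retrieval methods. Conclusion The proposed feature fusion method based on discriminant correlation analysis effectively combines the salient information of the high-level features of different CNNs. It reduces feature redundancy while improving the representational power of the features, and thereby improves remote sensing image retrieval performance.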
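For reference, the two evaluation measures quoted in the results can be computed per query from a ranked relevance list, as in the hedged sketch below. The function names are illustrative, the ranked list is assumed to cover the whole database, and the ANMRR cutoff `k` defaults to twice the number of relevant images, which is one common convention and not necessarily the one used in the experiments.

```python
import numpy as np


def average_precision(relevant):
    """AP for one query; `relevant` marks relevant items in ranked order."""
    relevant = np.asarray(relevant, dtype=bool)
    if not relevant.any():
        return 0.0
    hits = np.cumsum(relevant)                    # relevant items seen so far
    ranks = np.arange(1, relevant.size + 1)       # 1-based rank positions
    # Mean of precision@rank taken at the rank of every relevant item.
    return float(np.mean(hits[relevant] / ranks[relevant]))


def nmrr(relevant, k=None):
    """Normalized modified retrieval rank for one query (0 is best, 1 is worst)."""
    relevant = np.asarray(relevant, dtype=bool)
    ng = int(relevant.sum())                      # number of relevant items
    if ng == 0:
        raise ValueError("query has no relevant items")
    k = 2 * ng if k is None else k                # assumed cutoff convention
    ranks = np.flatnonzero(relevant) + 1.0        # ranks of the relevant items
    ranks[ranks > k] = 1.25 * k                   # penalty for items ranked past k
    avr = ranks.mean()
    return float((avr - 0.5 * (1 + ng)) / (1.25 * k - 0.5 * (1 + ng)))


def mean_average_precision(relevance_lists):
    """mAP: mean of per-query AP values."""
    return float(np.mean([average_precision(r) for r in relevance_lists]))


def anmrr(relevance_lists):
    """ANMRR: mean of per-query NMRR values."""
    return float(np.mean([nmrr(r) for r in relevance_lists]))
```

A higher mAP and a lower ANMRR both indicate better retrieval, which is why the two figures move in opposite directions when a fused feature improves on a single one.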
Remote sensing image retrieval combining discriminant correlation analysis and feature fusion
Abstract
Objective With the rapid development of remote sensing technology, numerous high-resolution remote sensing images have become available. As a result, the effective retrieval of remote sensing images has become a challenging research topic. Feature extraction is key to the performance of high-resolution remote sensing image retrieval. Traditional feature extraction methods are mainly based on handcrafted features, and such shallow features are easily affected by human intervention. Convolutional neural networks (CNNs) can learn feature representations automatically and are thus suitable for high-resolution remote sensing images with complex content. However, the parameters of CNNs are difficult to train fully because currently available public remote sensing datasets are small. For this reason, transfer learning with CNNs has attracted much attention. CNNs pretrained on large-scale datasets generalize well, and their parameters can be transferred effectively to small-scale data. Therefore, extracting CNN features on the basis of transfer learning has become an effective method in remote sensing image retrieval. Given the abundant and complex visual content of high-resolution remote sensing images, it is difficult to express their content accurately using a single feature. Thus, feature fusion is a useful way to improve the feature representation of remote sensing images. To make full use of the learned parameters of different CNNs in representing the content of remote sensing images, a method based on discriminant correlation analysis (DCA) is proposed to fuse the high-level features of different CNNs. Method First, CNN parameters from VGGM (visual geometry group medium), VGG16 (visual geometry group 16), GoogLeNet, and ResNet50 are transferred to high-resolution remote sensing images, and the high-level features are treated as special convolutional features.
To preserve the original spatial information of the image, the high-level features are extracted at the original input image size, and the three-dimensional tensor form of the output is retained. Then, max pooling is applied to the high-level features to extract salient features. Second, DCA is adopted to enhance the feature representation. DCA is the first method to incorporate class structure into feature-level fusion, and it has low computational complexity. To maximize the correlation of corresponding features across the two feature sets and at the same time decorrelate features that belong to different classes within each feature set, the between-class scatter matrices of the two sets of high-level features are calculated, and matrix diagonalization and singular value decomposition are used to transform the features. The transformation matrix contains the most significant eigenvectors of the between-class scatter matrix, and the dimension of the transformed features is reduced accordingly. Thus, the transformed feature vectors have strong discriminative power and low dimension. Lastly, concatenation and summation are used to fuse the transformed feature vectors, and the fused features are normalized via Gaussian normalization. The similarities between the query and dataset features are calculated using the Euclidean distance, and the retrieval results are returned ranked by similarity. Result Experimental results on the UC-Merced, RSSCN7, and WHU-RS19 datasets show that most fused features effectively improve both retrieval accuracy and retrieval time in comparison with a single high-level feature; the mean average precision (mAP) of the fused features is improved by 10.4% to 14.1%, 5.7% to 9.9%, and 5.9% to 17.6% on the three datasets, respectively. The retrieval results of the fused features obtained by concatenation are better than those obtained by summation.
Multifeature fusion experiments show that the best result on the UC-Merced dataset is obtained from the fusion of four features, whereas the best results on the RSSCN7 and WHU-RS19 datasets are obtained from the fusion of three features. This finding indicates that fusing more features does not necessarily yield better performance; selecting appropriate features is crucial for feature fusion. In particular, when the different features are individually strong and have similar retrieval capabilities, their fusion achieves good retrieval performance. On the UC-Merced dataset, the average normalized modified retrieval rank (ANMRR) and mAP of the proposed fused feature reach 0.1321 and 84.06%, respectively, and these results indicate that the proposed method outperforms the state-of-the-art approaches it is compared with. Conclusion The feature fusion method based on discriminant correlation analysis combines the salient information of different high-level features. This method reduces feature redundancy while improving feature discrimination. It fuses features with comparable retrieval capabilities particularly well, thus effectively improving the retrieval performance of high-resolution remote sensing images.
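The Method steps described in this abstract (spatial max pooling, DCA transformation, fusion, Gaussian normalization, Euclidean ranking) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: all function names are hypothetical, random arrays stand in for pooled CNN features, the whitening projection is scaled so that the projected between-class scatter becomes the identity, and Gaussian normalization is assumed to take the 3-sigma form.

```python
import numpy as np


def global_max_pool(fmap):
    """Spatial max pooling: (H, W, C) feature tensor -> C-dimensional vector."""
    return fmap.max(axis=(0, 1))


def between_class_whitener(X, y, r):
    """Mean and projection W (d, r) with W.T @ S_b @ W = I for features X (n, d)."""
    mean = X.mean(axis=0)
    # One column per class: sqrt(n_k) * (class mean - global mean).
    Phi = np.stack(
        [np.sqrt(np.sum(y == k)) * (X[y == k].mean(axis=0) - mean) for k in np.unique(y)],
        axis=1,
    )
    # Diagonalize the small (c, c) matrix Phi.T @ Phi instead of the (d, d) scatter.
    evals, evecs = np.linalg.eigh(Phi.T @ Phi)
    keep = np.argsort(evals)[::-1][:r]            # r largest eigenvalues, r <= c - 1
    return mean, Phi @ evecs[:, keep] / evals[keep]


def fit_dca(X, Y, y, r):
    """Learn (mean, projection) pairs for two feature sets that share labels y."""
    mx, Wbx = between_class_whitener(X, y, r)
    my, Wby = between_class_whitener(Y, y, r)
    Xp, Yp = (X - mx) @ Wbx, (Y - my) @ Wby
    # SVD of the between-set covariance diagonalizes the cross-correlation.
    U, s, Vt = np.linalg.svd(Xp.T @ Yp)
    return (mx, Wbx @ U / np.sqrt(s)), (my, Wby @ Vt.T / np.sqrt(s))


def transform(F, params):
    """Center features with the training mean, then project them."""
    mean, W = params
    return (F - mean) @ W


def fuse(Xs, Ys, mode="concat"):
    """Concatenation doubles the dimension; summation keeps it at r."""
    return np.hstack([Xs, Ys]) if mode == "concat" else Xs + Ys


def gaussian_normalize(F, mu, sigma):
    """Assumed 3-sigma form of Gaussian normalization."""
    return (F - mu) / (3.0 * sigma)


def retrieve(query, database, top_k=5):
    """Indices of the top_k database rows closest to the query (Euclidean)."""
    return np.argsort(np.linalg.norm(database - query, axis=1))[:top_k]


# Synthetic stand-ins for pooled features from two CNNs (64-D and 48-D).
rng = np.random.default_rng(0)
n_classes, per_class, r = 4, 30, 3
labels = np.repeat(np.arange(n_classes), per_class)
X = rng.normal(size=(n_classes, 64))[labels] * 3 + rng.normal(size=(labels.size, 64))
Y = rng.normal(size=(n_classes, 48))[labels] * 3 + rng.normal(size=(labels.size, 48))

px, py = fit_dca(X, Y, labels, r)
fused = fuse(transform(X, px), transform(Y, py), mode="concat")
db = gaussian_normalize(fused, fused.mean(axis=0), fused.std(axis=0))
ranked = retrieve(db[0], db, top_k=5)             # query with the first image
```

Because the projected dimension `r` is bounded by the number of classes minus one, the fused vector is far shorter than the original CNN features, which is consistent with the reduced retrieval time reported in the results.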
Keywords
remote sensing image retrieval; convolutional neural network (CNN); high-level feature fusion; discriminant correlation analysis (DCA); max pooling