Multi-spectral remote sensing image segmentation using a fully convolutional neural network

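Yao Jianhua1, Wu Jiamin1, Yang Yong1, Shi Zuxian2,3 (1. Ningxia Institute of Remote Sensing, Survey and Mapping, Yinchuan 750021; 2. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083; 3. Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083)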

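Abstract
Objective Traditional remote sensing image segmentation methods require extensive manual involvement in feature selection and parameter tuning, and shallow machine learning algorithms cannot produce high-accuracy segmentation results. This work therefore exploits the ability of convolutional neural networks to learn features automatically, borrows network structures that perform well in semantic segmentation of natural images, and proposes a new remote sensing image segmentation method based on a fully convolutional neural network, tailored to the characteristics of remote sensing datasets. Method Because targets in remote sensing images are compactly arranged and vary widely in size, a fully convolutional neural network based on pyramid pooling and the DUC (dense upsampling convolution) structure is proposed. The network uses an improved DenseNet as the base network to extract image features, a spatial pyramid pooling structure to obtain context information, and the DUC structure to upsample and recover detail. In the data processing stage, bands are fused using remote sensing knowledge to generate multi-source data, and a vegetation index and a normalized water index are computed to add features. Because remote sensing images are large and ordinary prediction methods leave splicing traces, a sliding-step prediction method based on ensemble learning is proposed: each pixel is predicted 1 to 4 times, each time at a different position in a different image block, and the multiple predictions are combined by voting. After prediction, fully connected conditional random fields (CRFs) are used to post-process the results, refining ground object boundaries and optimizing the segmentation. Result Fusing bands into multi-source data using remote sensing knowledge improves segmentation accuracy by 3.19%; the sliding-step prediction method based on ensemble learning improves segmentation accuracy by 1.44% compared with not using it; post-processing the predictions with fully connected CRFs improves segmentation accuracy by 1.03%. Conclusion For the semantic segmentation of remote sensing images of Ningxia's distinctive terrain, a new network structure based on a fully convolutional neural network is proposed. On this basis, the sliding-step prediction method based on ensemble learning and post-processing with fully connected conditional random fields optimize the segmentation results and improve the accuracy of semantic segmentation of remote sensing images.
Keywords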
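The index features mentioned in the Method part can be illustrated with a minimal sketch. The abstract does not say which formulas were used, so this assumes the common definitions NDVI = (NIR - Red) / (NIR + Red) and NDWI = (Green - NIR) / (Green + NIR); the function names and the default band positions are hypothetical and depend on the sensor and on how the bands were stacked.

```python
import numpy as np


def normalized_difference(a, b, eps=1e-6):
    """Return (a - b) / (a + b); eps guards against division by zero."""
    a = a.astype(np.float32)
    b = b.astype(np.float32)
    return (a - b) / (a + b + eps)


def add_index_channels(image, red=2, green=1, nir=3):
    """Append NDVI and NDWI channels to an (H, W, C) multi-spectral image.

    The band positions (red, green, nir) are hypothetical defaults,
    not values taken from the paper.
    """
    ndvi = normalized_difference(image[..., nir], image[..., red])
    ndwi = normalized_difference(image[..., green], image[..., nir])
    return np.concatenate(
        [image.astype(np.float32), ndvi[..., None], ndwi[..., None]],
        axis=-1,
    )
```

Appending the indices as extra channels lets the network see them alongside the raw bands, which is one plausible reading of "generating multi-source data".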
Segmentation in multi-spectral remote sensing images using the fully convolutional neural network

Yao Jianhua1, Wu Jiamin1, Yang Yong1, Shi Zuxian2,3 (1. Ningxia Institute of Remote Sensing, Survey and Mapping, Yinchuan 750021, China; 2. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; 3. Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China)

Abstract
Objective Traditional remote sensing image segmentation methods require extensive manual involvement in feature selection and parameter tuning, and shallow machine learning algorithms cannot achieve high segmentation accuracy. Convolutional neural networks can learn features automatically, so this study draws on network structures that perform well in semantic segmentation of natural images and proposes a novel segmentation method for remote sensing images based on a fully convolutional neural network, tailored to the characteristics of remote sensing datasets. It studies the fusion of multi-spectral image bands to increase the learnable features and improve segmentation accuracy. Considering the large size of remote sensing images, ensemble-learning-based prediction and conditional random field post-processing are investigated to reduce misclassification, restore the boundaries of ground features, and further improve segmentation accuracy. This study extracts ground features from multi-spectral remote sensing images; the results can be applied to subsequent change detection tasks and thus support automated analysis of changes in surface cover types. Method Because targets in remote sensing images are compactly arranged and vary widely in size, a fully convolutional neural network based on pyramid pooling and the dense upsampling convolution (DUC) structure is proposed. The proposed network can automatically interpret remote sensing images. It uses an improved DenseNet as the base network to extract image features, a spatial pyramid pooling structure to obtain context information, and the DUC structure to upsample and recover detailed information.
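The upsampling step of DUC can be sketched as a pure rearrangement of channels into space. This is an illustration of the general idea under stated assumptions, not the authors' implementation: it assumes a preceding convolution has already produced `rate * rate * num_classes` score channels on the coarse grid, and the channel ordering (row offset, column offset, class) is a convention chosen here; frameworks differ. The function name `duc_rearrange` is hypothetical.

```python
import numpy as np


def duc_rearrange(features, num_classes, rate):
    """Turn (H, W, rate*rate*num_classes) scores into (H*rate, W*rate, num_classes).

    Each coarse cell carries the scores for a rate x rate patch of
    full-resolution pixels, so no interpolation is involved.
    """
    h, w, c = features.shape
    if c != rate * rate * num_classes:
        raise ValueError("channel count must equal rate * rate * num_classes")
    # Split channels into (row offset, column offset, class).
    x = features.reshape(h, w, rate, rate, num_classes)
    # Interleave offsets with the coarse grid: (h, rate, w, rate, classes).
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(h * rate, w * rate, num_classes)
```

Because every output pixel gets its own learned scores, fine detail does not have to be recovered by bilinear interpolation, which is the motivation usually given for this kind of upsampling.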
In the data processing stage, bands are fused using remote sensing knowledge to generate multi-source data, and a vegetation index and a normalized water index are computed to add features. A sliding-step prediction method based on ensemble learning is proposed to address the splicing traces that ordinary prediction methods leave on large remote sensing images. Each pixel is predicted 1 to 4 times, each time at a different position in a different image block, and the multiple predictions are combined by voting. After prediction, the results are post-processed using fully connected conditional random fields (CRFs) to refine the boundaries of ground features and optimize the segmentation results. Result To verify the validity of the proposed network model and post-processing method, the U-Net model, the FCN-8s fully convolutional network model, and the Hdc-DUC model are compared in experiments on a self-built dataset. A model trained on the multi-source data is more accurate than one trained on the original data: mIoU improves by 3.19%, which confirms the value of the multi-source data generated by band fusion guided by remote sensing knowledge. The sliding-step prediction method based on ensemble learning improves segmentation accuracy by 1.44%, which verifies that the characteristics of remote sensing images affect the prediction phase of the model. Although fully connected CRFs may smooth away small features, post-processing the prediction results with CRFs improves segmentation accuracy by 1.03%. The main reason is that the self-built dataset has low image resolution, relatively blurry images, highly complex ground features, and inaccurate labels.
The data distribution is therefore difficult for the fully convolutional neural network to learn, and the raw predictions have low accuracy, so fully connected CRFs can improve the segmentation results substantially. Experimental results verify the effectiveness of the proposed network model and post-processing method. Conclusion This study investigates the semantic segmentation of remote sensing images, a problem in computer vision and pattern recognition. The goal is to have a computer identify the category of each pixel in a remote sensing image, that is, to perform remote sensing image interpretation. Remote sensing image interpretation is a basic problem in remote sensing. It is an important means of obtaining information from remote sensing images, and the ground object information it yields provides an important reference for tasks such as change detection and disaster relief. Improving the segmentation accuracy of remote sensing images has long been an active research topic. This study proposes a new network structure based on a fully convolutional neural network, tailored to the characteristics of remote sensing images. On this basis, a sliding-step prediction method based on ensemble learning is proposed, and fully connected conditional random fields are adopted for post-processing to optimize the segmentation results and achieve high-accuracy semantic segmentation of remote sensing images.
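The sliding-step prediction with voting described in the Method part can be sketched as follows. The abstract gives no tile size or stride, so this assumes a stride of half the tile size, which is one setting under which each pixel receives between 1 and 4 predictions (corners 1, edges 2, interior 4). `predict_fn` is a hypothetical stand-in for the trained network, and the function name and the default tile size of 256 are likewise illustrative.

```python
import numpy as np


def sliding_vote_predict(image, predict_fn, num_classes, tile=256):
    """Label an (H, W, C) image by majority vote over overlapping tiles.

    predict_fn is a hypothetical callable that maps a (tile, tile, C)
    block to a (tile, tile) array of integer class labels.
    Returns the voted label map and the number of votes per pixel.
    """
    step = tile // 2  # assumed half-tile stride, not stated in the abstract
    h, w = image.shape[:2]
    if h < tile or w < tile or (h - tile) % step or (w - tile) % step:
        raise ValueError("pad the image so that tiles cover it exactly")
    votes = np.zeros((h, w, num_classes), dtype=np.int32)
    rows = np.arange(tile)[:, None]
    cols = np.arange(tile)[None, :]
    for top in range(0, h - tile + 1, step):
        for left in range(0, w - tile + 1, step):
            labels = predict_fn(image[top:top + tile, left:left + tile])
            # One vote for the predicted class of each pixel in this tile.
            votes[top + rows, left + cols, labels] += 1
    # Ties are broken in favour of the lowest class index.
    return votes.argmax(axis=-1), votes.sum(axis=-1)
```

Because every pixel away from the image border appears at a different position in several tiles, errors that occur mainly near tile edges tend to be outvoted, which is how this kind of scheme suppresses splicing traces.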
Keywords
