U-Net for urban green space classification in Gaofen-2 remote sensing images

Xu Zhiyu; Zhou Yi; Wang Shixin; Wang Litao; Wang Zhenqing

Published: 2021-03-19
DOI: 10.11834/jig.200052
2021 | Volume 26 | Number 3

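Xu Zhiyu¹,², Zhou Yi¹, Wang Shixin¹, Wang Litao¹, Wang Zhenqing¹,² (1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China)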

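Abstract

Objective Gaofen-2 (GF-2) is China's first civilian high-spatial-resolution optical satellite. It combines sub-meter spatial resolution with wide coverage and provides important data support for many fields, including urban green space information extraction. Using GF-2 multispectral remote sensing images, this study applies an improved U-Net convolutional neural network to urban green space classification for the first time and proposes an automatic classification and extraction technique for urban green space in high-resolution remote sensing images. Method First, to address the overfitting that small training sets tend to cause, the U-Net network is improved by adding batch normalization (BN) and dropout layers, yielding the U-Net+ model. Then, random cropping and random data augmentation are used to expand the dataset, which preserves sample randomness while making full use of the image information and improves model stability. Result The classification accuracy of the U-Net+ model is compared with that of three traditional classification methods, namely maximum likelihood estimation (MLE), neural networks (NNs), and support vector machine (SVM), and three deep learning semantic segmentation models, namely U-Net, SegNet, and DeepLabv3+. The improved U-Net+ model effectively prevents overfitting, and its overall classification accuracy is 1.06 percentage points higher than that of the model before improvement. The overall accuracy of urban green space classification based on the improved U-Net+ model is 92.73%, and the average F₁ score is 91.85%. Ranked by overall classification accuracy from highest to lowest, the methods are U-Net+ (92.73%), U-Net (91.67%), SegNet (88.98%), DeepLabv3+ (87.41%), SVM (81.32%), NNs (79.92%), and MLE (77.21%). Deep learning methods for urban green space classification can fully mine the spectral, textural, and latent feature information of the data, effectively reduce the "salt-and-pepper noise" produced during classification, and tolerate errors in the training samples well, which makes them better suited to urban green space information extraction than traditional remote sensing classification methods. Conclusion The improved U-Net+ convolutional neural network model effectively improves the accuracy of automatic classification and extraction of urban green space from high-resolution remote sensing images and provides a new intelligent interpretation method for urban green space classification.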
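The dataset expansion step in the Method can be illustrated with a minimal sketch. The abstract does not say which augmentations were used, so the flips and 90-degree rotations below are assumptions, as are the function names and the tile size. The key point it shows is that the image tile and its label mask must receive the same random crop window and the same random transform.

```python
import random


def random_crop(image, label, size, rng):
    """Cut the same size x size window out of an image and its label mask."""
    rows, cols = len(image), len(image[0])
    top = rng.randrange(rows - size + 1)
    left = rng.randrange(cols - size + 1)
    image_tile = [row[left:left + size] for row in image[top:top + size]]
    label_tile = [row[left:left + size] for row in label[top:top + size]]
    return image_tile, label_tile


def rotate90(grid):
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def random_augment(image, label, rng):
    """Apply one random flip/rotation combination to image and label alike.

    Flips and quarter turns are assumed here; the source only says
    'random data augmentation'.
    """
    if rng.random() < 0.5:  # horizontal flip
        image = [row[::-1] for row in image]
        label = [row[::-1] for row in label]
    if rng.random() < 0.5:  # vertical flip
        image, label = image[::-1], label[::-1]
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        image, label = rotate90(image), rotate90(label)
    return image, label


# Toy 8x8 scene: each image pixel stores its own (row, col) position and the
# label stores row * 8 + col, so alignment can be checked after augmentation.
rng = random.Random(0)
scene = [[(r, c) for c in range(8)] for r in range(8)]
mask = [[r * 8 + c for c in range(8)] for r in range(8)]
tile, tile_mask = random_augment(*random_crop(scene, mask, 4, rng), rng)
```

Because every call draws a new window and a new transform, repeated sampling from one scene yields many distinct training tiles without breaking the pixel-to-label correspondence.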

Keywords

urban green space; convolutional neural network; U-Net; high-resolution remote sensing; semantic segmentation

U-Net for urban green space classification in Gaofen-2 remote sensing images

Xu Zhiyu¹,², Zhou Yi¹, Wang Shixin¹, Wang Litao¹, Wang Zhenqing¹,² (1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China)

Abstract

Objective High-precision monitoring of the spatial distribution of urban green space yields social, economic, and ecological benefits: it supports optimizing the spatial structure of green space, maintaining urban ecological balance, and building green cities. As China's first civilian optical satellite with high spatial resolution, Gaofen-2 (GF-2) combines sub-meter spatial resolution with wide coverage. GF-2 provides important data support for multiple fields, such as urban environmental monitoring and urban green space information extraction. However, traditional classification methods still face problems: effective classifiers are difficult to train on massive data, and classification accuracy is generally low. Using massive high-resolution remote sensing images to extract urban green space distribution rapidly and accurately over large areas is an urgent task for urban planners and managers. With the rapid development of deep learning, fully convolutional networks (FCNs) opened new possibilities for semantic segmentation by realizing pixel-level image classification with deep learning for the first time. Inspired by the U-Net network structure, we apply an improved U-Net to urban green space classification for the first time and propose an automatic classification technique for urban green space in high-resolution remote sensing images. Method First, we improve the U-Net model to obtain the U-Net+ model. The main structure of U-Net+ consists of an encoder and a decoder and can be trained end to end. The encoding path extracts multi-scale image features through four max-pooling operations, and the decoding path restores positional and detail information through upsampling.
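The effect of the pooling and upsampling steps on feature-map size can be illustrated with a small sketch. It assumes 2x2 max pooling with stride 2 and nearest-neighbour 2x upsampling, neither of which is specified in the abstract, and the 256-pixel tile size is hypothetical. The function names are illustrative only.

```python
def max_pool_2x2(grid):
    """Keep the largest value in each non-overlapping 2x2 window.

    Assumes the grid has an even number of rows and columns.
    """
    return [
        [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
         for c in range(0, len(grid[0]), 2)]
        for r in range(0, len(grid), 2)
    ]


def upsample_2x(grid):
    """Nearest-neighbour upsampling: repeat every value in a 2x2 block."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(2)]
        out.extend([wide, list(wide)])
    return out


feature = [[1, 2, 0, 0],
           [3, 4, 0, 5],
           [0, 0, 6, 0],
           [7, 0, 0, 0]]
pooled = max_pool_2x2(feature)   # [[4, 5], [7, 6]]
restored = upsample_2x(pooled)   # back to 4x4, but the fine detail is gone

# Side lengths of a hypothetical 256-pixel tile after each of four poolings.
sizes = [256 // 2 ** step for step in range(5)]  # [256, 128, 64, 32, 16]
```

The restored grid has the original size but not the original values, which is the information loss that the encoder-to-decoder feature fusion in U-Net-style networks is meant to offset.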
The network uses skip connections to fuse feature information of the same scale from different levels, compensating for the accuracy loss caused by upsampling. In addition, we add batch normalization (BN) after each convolution operation, which regulates the input to each network layer and improves training speed and generalization capability. To address the overfitting that a limited training set easily produces, we add dropout layers with a 50% probability of dropping neurons after the convolution operations of the fourth and fifth layers of the network. Second, deep learning requires a large amount of labeled data related to the classification objectives for training. However, existing open-source datasets cannot meet the requirements of the urban green space classification task, so an urban green space label dataset must be built manually. We selected three typical urban green space sample areas in Beijing (urban parks, residential areas, and golf courses) as study areas. By combining GF-2 images with summer and winter Google Earth remote sensing images, we delineated all types of urban green space in the study areas through visual interpretation in ArcGIS. The visual interpretation results were corrected through field investigation. Third, random cropping and data augmentation are used to expand the dataset, ensuring sample randomness and enhancing model stability while making full use of the image information. We adopt the Adam optimizer with an initial learning rate of 0.0001. Result 1) The overall classification accuracy of the U-Net+ model is 1.06 percentage points higher than that of the original U-Net (92.73% versus 91.67%). After 40 training epochs, the accuracy of the U-Net+ model reaches a high level, and the loss function converges rapidly. The U-Net+ model effectively prevents overfitting and improves generalization capability.
2) To verify the effectiveness of our method, the classification accuracy of U-Net+ is compared with that of three traditional classification methods, namely maximum likelihood estimation (MLE), neural networks (NNs), and support vector machine (SVM), and three semantic segmentation models, namely U-Net, SegNet, and DeepLabv3+. Among the seven classification methods, the U-Net+ model achieves the highest overall classification accuracy for urban green space. Ranked by overall classification accuracy from highest to lowest, the seven methods are U-Net+ (92.73%) > U-Net (91.67%) > SegNet (88.98%) > DeepLabv3+ (87.41%) > SVM (81.32%) > NNs (79.92%) > MLE (77.21%). 3) Among the three types of urban green space, evergreen trees have the highest classification accuracy (F₁ = 93.65%), followed by grassland (F₁ = 92.55%) and deciduous trees (F₁ = 86.55%). 4) Deep learning tolerates errors in the training samples well. By learning from a large amount of labeled data, it can effectively reduce the impact of such errors and improve recognition capability, making it more suitable for urban green space information extraction than traditional remote sensing classification methods. Conclusion Deep learning methods for urban green space classification can fully mine the spectral, textural, and latent feature information of the data. The U-Net+ model proposed in this study also effectively reduces the salt-and-pepper noise produced in the classification process and achieves high-precision pixel-level classification of urban green space. The improved U-Net+ effectively improves the accuracy of automatic urban green space classification in high-resolution remote sensing images and provides a new intelligent interpretation method for urban green space classification.
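The two metrics reported above, overall classification accuracy and the per-class F₁ score, can be computed from a confusion matrix as in the following minimal sketch. It uses the standard definitions (correct pixels over all pixels, and 2TP / (2TP + FP + FN)); the matrix values are invented for illustration and are not data from the study, and the function names are assumptions.

```python
def overall_accuracy(cm):
    """Share of correctly classified pixels (diagonal sum over total)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def f1_scores(cm):
    """Per-class F1 = 2TP / (2TP + FP + FN).

    Rows of cm are reference classes, columns are predicted classes.
    """
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # reference k, predicted other
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, reference other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Made-up 3-class confusion matrix (pixel counts), for illustration only.
toy_cm = [[8, 1, 1],
          [2, 7, 1],
          [0, 1, 9]]
oa = overall_accuracy(toy_cm)       # 24 correct out of 30 pixels
f1 = f1_scores(toy_cm)
mean_f1 = sum(f1) / len(f1)         # unweighted average over classes
```

Overall accuracy weights every pixel equally, so large classes dominate it, whereas the unweighted mean F₁ gives each class the same weight regardless of its area.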

Keywords

urban green space; convolutional neural network (CNN); U-Net; high-resolution remote sensing; semantic segmentation
