Image super-resolution reconstruction with an edge-enhanced deep network
Abstract
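Objective Learning-based image super-resolution reconstruction algorithms tend to lose edge information and produce visual artifacts. To address these problems, this paper proposes a deep network model based on edge enhancement for image super-resolution reconstruction. Method The algorithm first uses a preprocessing network to extract low-level features from the input low-resolution image and then feeds these features into two branches. One branch obtains high-level features through a convolutional network built from cascaded convolutional layers; the other reconstructs image edges by cascading a convolutional network with a deconvolution network that mirrors it. Finally, a bypass connection fuses the outputs of the two branches, and the fused result passes through one convolutional layer to produce the final edge-enhanced high-resolution image. Result Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are used as evaluation metrics. In experiments at a magnification factor of 3 on the commonly used test sets Set5, Set14, and B100, the PSNR/SSIM values are 33.24 dB/0.915 6, 30.60 dB/0.852 1, and 28.45 dB/0.787 3, respectively, a considerable improvement over other methods. Conclusion Quantitative and qualitative experimental results show that the high-resolution images reconstructed by the proposed edge-enhanced deep network not only recover image edge information better but also improve considerably in both objective evaluation and subjective visual quality.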
Keywords
Image super-resolution reconstruction via deep network based on edge-enhancement
Xie Zhenzhu, Wu Congzhong, Zhan Shu (School of Computer and Information, Hefei University of Technology, Hefei 230009, China)
Abstract
Objective Image super-resolution reconstruction is a branch of image restoration that concerns the problem of generating a plausible and visually pleasing high-resolution output image from a low-resolution input image. It has many practical applications, ranging from video surveillance imaging to medical imaging and satellite remote-sensing image processing. Although some methods have achieved reasonable results in recent years, they have mainly focused on visual artifacts, while the loss of edge information has rarely been addressed. To address this weakness, this study proposes a novel image super-resolution reconstruction method that uses a deep network based on edge enhancement. Method Given that deep learning has demonstrated excellent performance in computer vision problems, some scholars have used convolutional neural networks to design deep architectures for image super-resolution. Dong et al. successfully introduced deep learning into super-resolution; they demonstrated that convolutional neural networks can learn the mapping from a low-resolution image to a high-resolution image end to end and achieved state-of-the-art results. In addition, inspired by semantic segmentation based on deconvolution networks, we introduce a deconvolution network to reconstruct edge information. The proposed model takes a low-resolution image interpolated to the desired size as input. A preprocessing network extracts low-level features from the input image, and these features are passed to the mixture network. The mixture network consists of two branches. One branch obtains high-level features by cascading many convolutional layers, and the other reconstructs the image edges by cascading a convolutional network with a deconvolution network that mirrors it.
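The preprocessing network and two-path mixture network described above can be sketched as a shape-level forward pass. This is only an illustration in NumPy, not the authors' Caffe model: the layer counts, channel width, kernel size, and random weights are hypothetical, the fusion of the two paths is assumed to be an element-wise sum (the abstract only says they are fused via a bypass connection), and pooling/unpooling layers are omitted. The names `conv2d_same`, `deconv2d_same`, `init_params`, and `forward` are introduced here for illustration.

```python
import numpy as np


def conv2d_same(x, w):
    """Stride-1 convolution (cross-correlation) with zero padding.

    x: (C_in, H, W) feature maps; w: (C_out, C_in, k, k) kernels, k odd.
    Padding k // 2 pixels on every side keeps the output at (C_out, H, W).
    """
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))  # zero padding by default
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Contribution of kernel tap (i, j) to every output pixel.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j],
                             xp[:, i:i + h, j:j + wd])
    return out


def deconv2d_same(x, w):
    """Stride-1 transposed convolution; w: (C_in, C_out, k, k), k odd.

    At stride 1 with padding k // 2 this equals a convolution whose kernel
    is flipped spatially and has its channel axes swapped.
    """
    return conv2d_same(x, w.transpose(1, 0, 2, 3)[:, :, ::-1, ::-1])


def relu(x):
    return np.maximum(x, 0.0)


def init_params(rng, channels=8, depth=2, k=3):
    """Random weights; channels, depth and k are hypothetical choices."""
    def w(a, b):
        return rng.normal(0.0, 0.1, size=(a, b, k, k))
    return {
        "pre": w(channels, 1),
        "conv_branch": [w(channels, channels) for _ in range(depth)],
        "edge_conv": [w(channels, channels) for _ in range(depth)],
        "edge_deconv": [w(channels, channels) for _ in range(depth)],
        "recon": w(1, channels),
    }


def forward(y_interp, params):
    """y_interp: (H, W) luminance image already interpolated to target size."""
    feat = relu(conv2d_same(y_interp[None], params["pre"]))  # low-level features
    high = feat
    for w in params["conv_branch"]:        # path 1: cascaded conv layers
        high = relu(conv2d_same(high, w))
    edge = feat
    for w in params["edge_conv"]:          # path 2: conv layers ...
        edge = relu(conv2d_same(edge, w))
    for w in params["edge_deconv"]:        # ... then mirrored deconv layers
        edge = relu(deconv2d_same(edge, w))
    fused = high + edge                    # assumption: fuse by element-wise sum
    return conv2d_same(fused, params["recon"])[0]
```

Calling `forward(y, init_params(np.random.default_rng(0)))` on an `(H, W)` array returns an `(H, W)` array, which reflects the point that padded stride-1 layers keep the feature-map size unchanged through both paths.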
The stacked convolutional and deconvolution layers retain the feature map size through pixel-wise padding. The outputs of the two branches are fused via a bypass connection and passed through a convolutional layer to obtain the final reconstruction result. We select the rectified linear unit as the activation function to accelerate training and avoid vanishing gradients. We use 91 images as the training set and evaluate performance on Set5, Set14, and B100 at scaling factors of 2, 3, and 4. To prevent overfitting in the deep network, the training set is further augmented by rotating the original images by 90°, 180°, and 270° and flipping them upside down. Because human vision is more sensitive to details in intensity than in color, we first convert the color images from RGB space into YCbCr space. We then apply the proposed algorithm to the luminance (Y) channel only, and the Cb and Cr channels are upscaled by bicubic interpolation. Result All experiments are implemented with the Caffe package. Peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) are used as evaluation metrics. The PSNR/SSIM results on Set5, Set14, and B100 at a scale factor of 3 are 33.24 dB/0.915 6, 30.60 dB/0.852 1, and 27.99 dB/0.784 8, respectively. Compared with bicubic, ScSR, A+, SelfEx, SRCNN, and CSCN, the proposed algorithm improves PSNR/SSIM by 2.85 dB/4.74, 1.9 dB/2.87, 0.66 dB/0.68, 0.66 dB/0.63, 0.49 dB/0.66, and 0.14 dB/0.12, respectively. The GPU version takes only 0.62 s on Set5 at a scale factor of 3, which is faster than the other methods. Conclusion Convolutional neural networks have become increasingly popular in image super-resolution reconstruction. This study employs a deep network that contains convolution, deconvolution, and unpooling layers to reconstruct image edge information. The experimental results demonstrate that the proposed edge-enhancement model achieves better quantitative and qualitative reconstruction performance than the other methods.
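The data preparation and evaluation steps mentioned in the abstract (rotation and flip augmentation, working on the luminance channel, and PSNR) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions rather than the authors' code: the augmentation is read as the four rotations plus an upside-down flip of each, the luminance conversion is assumed to use the ITU-R BT.601 coefficients with Y in [16, 235] for 8-bit input, and the PSNR peak is assumed to be 255. The function names are hypothetical.

```python
import numpy as np


def augment(img):
    """Return 8 variants: rotations by 0/90/180/270 degrees and the
    upside-down flip of each (one reading of the augmentation described)."""
    rotations = [np.rot90(img, k) for k in range(4)]
    return rotations + [np.flipud(r) for r in rotations]


def rgb_to_luma(rgb):
    """Luminance (Y) channel of an 8-bit RGB image of shape (H, W, 3).

    Assumption: ITU-R BT.601 coefficients, giving Y in [16, 235].
    """
    scaled = rgb.astype(np.float64) / 255.0
    return 16.0 + scaled @ np.array([65.481, 128.553, 24.966])


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

In an evaluation of this kind, `psnr` would typically be applied to `rgb_to_luma` of the ground-truth image and the reconstructed Y channel; whether border pixels are cropped before scoring is not stated in the abstract and is left out here.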
Keywords
super-resolution reconstruction; convolutional neural networks; deconvolution; unpooling; edge enhancement