Region-level channel attention for single image super-resolution combining high frequency loss

Zhou Bo1, Li Chenghua2, Chen Wei1 (1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China; 2. Institute of Automation, Chinese Academy of Sciences, Beijing 100019, China)

Abstract
Objective The channel attention mechanism has been widely used in image super-resolution, but most current algorithms select feature maps of interest only at the channel level and ignore information at the spatial level, so that local spatial information in the feature maps cannot be exploited properly. To address this problem, an image super-resolution algorithm with region-level channel attention is proposed. Method A non-local residual dense network is designed as the backbone, consisting of a non-local module and residual dense attention modules. The non-local module extracts non-local similarity information and passes it to the subsequent network; the residual dense attention module adds a region-level channel attention mechanism to the residual dense block structure, assigning different attention to the channels of different spatial regions so that spatial information is also fully utilized. To counter the tendency of the widely used L1 and L2 loss functions to produce overly smooth results, a high-frequency aware loss is proposed, which raises the weight of losses at high-frequency detail locations so that the network attends better to the high-frequency details of the image during later fine-tuning. Result In ×4 upscaling experiments on the four standard test sets Set5, Set14, BSD100 (Berkeley segmentation dataset), and Urban100, the average PSNR (peak signal to noise ratio) of the proposed method is about 3.15 dB and 1.58 dB higher than the interpolation method and the SRCNN (image super-resolution using deep convolutional networks) algorithm, respectively. Conclusion The proposed algorithm adaptively adjusts the network's attention to channels in different spatial regions via the region-level channel attention mechanism and, combined with the high-frequency aware loss, strengthens the attention paid to high-frequency details, so that the generated high-resolution images have better visual quality.
Keywords
Region-level channel attention for single image super-resolution combining high frequency loss

Zhou Bo1, Li Chenghua2, Chen Wei1 (1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China; 2. Institute of Automation, Chinese Academy of Sciences, Beijing 100019, China)

Abstract
Objective As an important branch of image processing, image super-resolution has attracted extensive attention from many scholars. The attention mechanism was originally applied to machine translation in deep learning; its extension, the channel attention mechanism, has been widely used in image super-resolution. This paper proposes a single image super-resolution method using region-level channel attention, a mechanism that assigns different attention to different channels in different spatial regions. Meanwhile, a high-frequency aware loss is proposed to counter the tendency of the commonly used L1 and L2 losses to produce very smooth results; this loss strengthens the weight of the losses at high-frequency positions to improve the generation of high-frequency details. Method The network consists of three parts: low-level feature extraction, high-level feature extraction, and image reconstruction. The low-level feature extraction part uses one 3×3 convolutional layer. The high-level feature extraction part contains a non-local module and several residual dense block attention modules. The non-local module extracts non-local similarity information via the non-local operation; a sub-pixel convolutional layer is applied before this computation so that it is conducted at low resolution. Dense connections in the residual dense block attention modules let the network adaptively accumulate features from different layers, while residual learning further eases gradient propagation. The region-level channel attention mechanism is introduced to attend to information in different regions adaptively. The initial non-local similarity information is added to the last layer by a skip connection.
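The abstract does not give the exact formulation of region-level channel attention; the following is a minimal NumPy sketch of the idea, assuming the feature map is split into a fixed grid of regions and each region receives its own channel-wise sigmoid gating. The function name, grid size, and the random stand-in for the learned gating weights are all illustrative, not the authors' implementation.

```python
import numpy as np

def region_level_channel_attention(feat, grid=2, rng=None):
    """Rescale each spatial region's channels by its own attention weights.

    feat: (C, H, W) feature map; the map is split into grid x grid regions.
    The gating weights here are random for illustration; in the network
    they would be learned (e.g. via 1x1 convolutions).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    C, H, W = feat.shape
    Wg = rng.standard_normal((C, C)) * 0.1   # stand-in for learned gating weights
    out = feat.copy()
    hs, ws = H // grid, W // grid
    for i in range(grid):
        for j in range(grid):
            region = out[:, i*hs:(i+1)*hs, j*ws:(j+1)*ws]   # view into out
            pooled = region.mean(axis=(1, 2))               # per-channel descriptor
            attn = 1.0 / (1.0 + np.exp(-(Wg @ pooled)))     # sigmoid gate in (0, 1)
            region *= attn[:, None, None]                   # channel-wise rescale
    return out

x = np.ones((4, 8, 8))
y = region_level_channel_attention(x)
```

Because the gate is computed per region rather than once per channel, two regions with different content receive different channel weightings, which is the distinction from ordinary channel attention.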
In the image reconstruction part, sub-pixel convolution up-samples the features and a 3×3 convolutional layer produces the final reconstruction result. In terms of the loss function, the high-frequency aware loss enhances the network's ability to reconstruct high-frequency details. Before training, the locations of high-frequency details in each image are extracted; during training, more weight is given to the losses at these locations so that the reconstruction of high-frequency details is learned better. The whole training process is divided into two stages: in the first stage, the network is trained with the L1 loss; in the second stage, the high-frequency aware loss and the L1 loss together fine-tune the model from the first stage. Result Region-level channel attention and the high-frequency aware loss are verified via an ablation study. The model using region-level channel attention is significantly better in peak signal-to-noise ratio (PSNR), and fine-tuning with the high-frequency aware loss plus the L1 loss yields a higher PSNR than fine-tuning with the L1 loss alone; the benefit of using both components together is also verified. Set5, Set14, the Berkeley segmentation dataset (BSD100), and Urban100 are selected for testing in comparison with other algorithms: Bicubic, image super-resolution using deep convolutional networks (SRCNN), accurate image super-resolution using very deep convolutional networks (VDSR), image super-resolution using very deep residual channel attention networks (RCAN), feedback network for image super-resolution (SRFBN), and single image super-resolution via a holistic attention network (HAN). For subjective comparison, three of the ×4 results are selected for display.
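The abstract does not specify how high-frequency locations are detected; a sketch of the loss, assuming a discrete Laplacian response marks high-frequency positions and the L1 loss is up-weighted there. The threshold, the extra weight of 2.0, and both function names are illustrative assumptions, not the paper's values.

```python
import numpy as np

def high_freq_mask(img, thresh=0.1):
    """Mark high-frequency locations via a discrete Laplacian response."""
    lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4 * img)
    return (np.abs(lap) > thresh).astype(img.dtype)

def high_freq_aware_l1(sr, hr, extra_weight=2.0):
    """L1 loss with extra weight at the high-frequency positions of the target."""
    mask = high_freq_mask(hr)
    weight = 1.0 + extra_weight * mask   # plain L1 everywhere else
    return np.mean(weight * np.abs(sr - hr))

hr = np.zeros((8, 8)); hr[4, 4] = 1.0   # a single sharp detail
sr = np.full((8, 8), 0.1)               # an overly smooth prediction
loss = high_freq_aware_l1(sr, hr)
plain = np.mean(np.abs(sr - hr))        # loss > plain: the detail is penalized more
```

A smooth prediction thus pays a larger penalty exactly where the target has sharp detail, which is what pushes the fine-tuned network toward reconstructing high frequencies.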
The results generated by the proposed algorithm are richer in texture, without blurring or distortion. For the objective comparison, PSNR and structural similarity (SSIM) are used as indicators under three scale factors of 2, 3, and 4. With a scale factor of 4, the PSNR of the model on the four standard test sets is 32.51 dB, 28.82 dB, 27.72 dB, and 26.66 dB, respectively. Conclusion The proposed super-resolution algorithm extends the commonly used channel attention to the region level. With the high-frequency aware loss, the network reconstructs high-frequency details better by increasing its attention to high-frequency detail locations. The experimental results show that, by using the region-level channel attention mechanism and the high-frequency aware loss, the proposed algorithm is superior in both objective indicators and subjective effects.
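PSNR, the main objective indicator quoted above, is a standard quantity and can be computed as follows (the peak value of 255 assumes 8-bit images; the example inputs are synthetic, not results from the paper):

```python
import numpy as np

def psnr(sr, hr, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reconstruction and its target."""
    mse = np.mean((sr.astype(np.float64) - hr.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

hr = np.full((4, 4), 128.0)
sr = hr + 16.0           # uniform error of 16 -> MSE = 256
value = psnr(sr, hr)     # 10 * log10(255**2 / 256) ~ 24.05 dB
```

Higher is better: halving the per-pixel error raises PSNR by about 6 dB, so the roughly 1.6 dB average gain over SRCNN reported above is a substantial improvement.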
Keywords
