Directional filtering scale prediction model for texture smoothing
Abstract
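Objective Traditional image-processing methods for texture filtering struggle to distinguish strong-gradient textures from object structures, while deep learning methods rely on training sets generated in unreasonable ways and on coarse model representations. To address this, we design a directional filtering scale prediction model for texture smoothing and generate a new labeled texture filtering dataset. Method Multiple texture images are filled into existing structure images, one connected region at a time, to generate a texture filtering dataset that benefits model training. We design a directional filtering scale prediction model comprising a scale-aware sub-network and an image smoothing sub-network. The filtering scale map predicted by the former not only reflects whether a pixel belongs to the same texture as its surrounding pixels, but also implicitly encodes whether the pixel is a structural pixel. The latter takes the stack of the filtering scale map and the original image as input and quickly produces the texture filtering result with a small number of convolutional layers. Result On our texture filtering dataset, in comparison with 7 algorithms, the peak signal to noise ratio (PSNR) and structural similarity (SSIM) are 2.79 dB and 0.0133 higher than the second-best, the mean squared error (MSE) is 6.863 8 lower than the second-best, and the running time is 0.002 s faster than the second-best. Experimental comparisons on other datasets also show that our algorithm preserves structures and smooths textures better. Comparing the same network model trained on different datasets confirms that our texture filtering dataset helps strengthen the model's ability to distinguish strong-gradient textures from object structures. Conclusion The texture filtering dataset we built enables the model to better distinguish strong-gradient textures from object structures and improves its generalization ability. The proposed directional filtering scale prediction model outperforms most existing texture smoothing methods, and performs especially well at suppressing strong-gradient textures and preserving weak-gradient structures.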
Keywords
Texture-smoothing-oriented directional filtering scales-predicting model
Lin Junyan, Liu Chunxiao, Zhang Jinkai, Li Hongyi (School of Computer Science and Information Engineering, Zhejiang Gongshang University, Hangzhou 310018, China)
Abstract
Objective Texture filtering is a low-level task in image processing and computer vision that aims to smooth texture details while preserving the essential image structure. Current texture filtering algorithms fall mainly into two categories: local methods and global methods. Traditional methods commonly struggle to distinguish image structure from strong-gradient textures. For lack of a reliable training set, recent deep learning algorithms often use the results of existing traditional methods as ground truth, so they cannot overcome the shortcomings of those methods. For example, the texture and structure aware filtering network (TSAFN) synthesizes data by filling the whole image with the same texture, but textures should be object-dependent. Such synthesis leaves a large gap between synthetic images and real-world images. To solve these problems, we generate a new dataset for texture filtering training and propose an image smoothing algorithm based on a directional filtering scales-predicting model. Method First, a texture filtering dataset for deep learning is generated by filling texture images into existing structure images, one object structure at a time. We also apply smoothing and compression to the structure images. Hence, the generated dataset both improves the algorithm's ability to distinguish strong-gradient texture from structure and reduces the domain gap between synthetic images and real images. Then, the image smoothing algorithm based on the directional filtering scales-predicting model is designed, which includes a scale-aware sub-network and an image smoothing sub-network. The scale-aware sub-network predicts a directional texture filtering scales map.
This map not only reflects whether a pixel and its surrounding pixels are in the same texture, but also implies whether the pixel is a structural pixel. The image smoothing sub-network takes the stack of the scales map predicted by the scale-aware sub-network and the original image as input, and produces the filtered image through a small number of convolutional layers. It quickly completes the smoothing and corrects imperfections in the output of the scale-aware sub-network. For the scale-aware sub-network, we adopt the classic U-Net because it makes direct use of low-level features, and we change its input and output dimensions. The input of the scale-aware sub-network is the stack of the RGB image and its gradient map, and the output is a six-dimensional scales map. The image smoothing sub-network consists of seven convolutional layers. The first six layers are followed by ReLU and batch normalization, while the last layer is followed by a sigmoid to keep pixel values within bounds. The input of the image smoothing sub-network is the stack of an image and the six-dimensional scales map, and the output is the filtered image. Our training, test and validation sets contain 10 000, 1 500 and 1 000 images, respectively; they were selected randomly from our dataset and do not overlap. Our network is implemented in PyTorch. The input images and ground truth images are cropped to 224×224 pixels for training, the momentum parameter is set to 0.9, the learning rate is set to 1E-2, and the weight decay is 0.000 2. The learning rate is adapted during training: if the loss does not decrease by more than 0.003 for 5 epochs, the learning rate is halved. Training with stochastic gradient descent (SGD) is accelerated on an NVIDIA RTX 2080 GPU.
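The adaptive learning-rate rule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' training code: the class name `HalveOnPlateau` and its parameter names are hypothetical, and the sketch assumes that "does not decrease by more than 0.003" is measured against the best loss seen so far and that the counter resets after each halving.

```python
class HalveOnPlateau:
    """Halve the learning rate when the loss stops improving.

    Assumed reading of the rule in the text: an epoch counts as an
    improvement only if the loss falls more than `min_drop` below the best
    loss so far; after `patience` consecutive epochs without improvement
    the learning rate is halved and the counter restarts.
    """

    def __init__(self, lr=1e-2, min_drop=0.003, patience=5):
        self.lr = lr                # initial learning rate (1E-2 in the text)
        self.min_drop = min_drop    # required decrease (0.003 in the text)
        self.patience = patience    # epochs to wait (5 in the text)
        self.best = float("inf")    # best loss observed so far
        self.bad_epochs = 0         # consecutive epochs without improvement

    def step(self, loss):
        """Record one epoch's loss and return the learning rate to use next."""
        if loss < self.best - self.min_drop:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= 0.5
                self.bad_epochs = 0
        return self.lr


if __name__ == "__main__":
    schedule = HalveOnPlateau()
    # Made-up loss values, used only to exercise the rule.
    losses = [0.50, 0.40, 0.399, 0.399, 0.398, 0.398, 0.398, 0.30]
    for epoch, loss in enumerate(losses, start=1):
        print(epoch, loss, schedule.step(loss))
```

In PyTorch, `torch.optim.lr_scheduler.ReduceLROnPlateau` with `factor=0.5`, `patience=5`, `threshold=0.003` and `threshold_mode='abs'` provides a comparable mechanism. Whether the authors used it is not stated, and its patience counting may differ slightly from this sketch.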
Result We compared our algorithm with five traditional algorithms and two deep learning algorithms on our dataset and on other real-world image datasets. The quantitative evaluation metrics used on our dataset are the peak signal to noise ratio (PSNR), the structural similarity (SSIM), the mean squared error (MSE) and the running time. Among the filtering algorithms compared on our dataset, our PSNR is 2.79 dB higher than the second-best (higher is better), our SSIM is 0.013 3 higher than the second-best (higher is better), our MSE is 6.863 8 lower than the second-best (lower is better), and our running time is 0.002 s faster than the second-best. All deep learning algorithms were retrained on our dataset. Comparisons of texture filtering results on real-world images show that our algorithm remains the best at discriminating structure from strong-gradient texture. Comparing the same model trained on different datasets shows that our dataset gives the model better generalization ability and a stronger ability to distinguish strong-gradient texture from structure. Conclusion Our dataset contains a variety of textures and structures, which helps the model distinguish strong-gradient texture from object structure. Our data synthesis method also improves the model's generalization ability. Additionally, the image smoothing algorithm designed around the directional filtering scales-predicting model surpasses existing methods in performance and speed.
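For reference, two of the metrics reported above, MSE and PSNR, follow standard definitions and can be sketched as below. This is a minimal plain-Python illustration rather than the evaluation code used for the reported numbers: the function names are hypothetical, images are represented as nested lists of grayscale values, and the peak value of 255 assumes 8-bit images, which the text does not state. SSIM is omitted because it needs local window statistics.

```python
import math


def mse(image_a, image_b):
    """Mean squared error between two equally sized 2-D grayscale images."""
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same shape")
    total = 0.0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += (pixel_a - pixel_b) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return total / count


def psnr(image_a, image_b, peak=255.0):
    """Peak signal to noise ratio in dB; `peak` is the assumed maximum pixel value."""
    error = mse(image_a, image_b)
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / error)


if __name__ == "__main__":
    # Tiny made-up 2x2 images, used only to show the calls.
    filtered = [[52, 55], [61, 59]]
    ground_truth = [[50, 55], [60, 62]]
    print("MSE:", mse(filtered, ground_truth))
    print("PSNR (dB):", psnr(filtered, ground_truth))
```

Because PSNR is a logarithmic function of MSE, a lower MSE always corresponds to a higher PSNR for a fixed peak value, which is consistent with the two metrics moving in opposite directions in the comparison above.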
Keywords