Automatic extraction of line art images with cycle-consistent generative adversarial networks

Wang Suqin, Zhang Jiaqi, Shi Min, Zhao Yinjun (School of Control and Computer Engineering, North China Electric Power University, Beijing 102206)

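Abstract
Objective Drawing and coloring line art is time-consuming and laborious in animation production, so much research aims to automate the production process. Data-driven automation research is developing rapidly, but no public line art dataset is available. To address the difficulty of obtaining real line art images and the distortion produced by existing line art extraction methods, this paper proposes an automatic line art extraction model based on cycle-consistent generative adversarial networks. Method The model builds on the cycle-consistent generative adversarial network structure to handle training on asymmetric data. Input images at different scales and their boundary maps are then fed into mask-guided convolution units to adaptively select intermediate network features. To further improve extraction quality, a boundary consistency constraint loss function is proposed to ensure that the generated result is consistent with the input image in its gradient changes. Result On the public color animation image dataset Danbooru2018, the line art extracted by the proposed model has less noise and clearer lines than that of existing extraction methods, and is close to line art drawn by real cartoonists. In the experiments, 30 users aged 20 to 25 were invited to score the line art extracted by the proposed method and four other methods. Across 30 groups of test samples, the line art extracted by the proposed method was judged best in 84% of the samples. Conclusion Introducing mask-guided units into the cycle-consistent generative adversarial network extracts line art from color images more reasonably, and user scoring against existing methods shows that the proposed method outperforms the compared methods for animation line art extraction. In addition, the model does not need a large amount of real line art training data; only about 1 000 real line art images were collected in the experiments. The model provides data support for subsequent research on animation drawing and coloring and also offers a new solution for image edge extraction.
Keywords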
Image extraction of cartoon line art based on cycle-consistent adversarial networks

Wang Suqin, Zhang Jiaqi, Shi Min, Zhao Yinjun(School of Control & Computer Engineering, North China Electric Power University, Beijing 102206, China)

Abstract
Objective With the continuous development of digital media, demand for animation works keeps growing. Producing a high-quality two-dimensional animation usually takes a great deal of time and effort. In the production process, the key-frame line art images are usually drawn by the original artist, the intermediate-frame line art images are drawn by multiple ordinary animators, and finally all the line art images are colored by the coloring staff. To improve the production efficiency of two-dimensional animation, researchers have worked to automate more of the production process. Data-driven deep learning is developing rapidly and provides a new way to improve the production efficiency of animation works. Although many data-driven automated methods have been proposed, training datasets are very difficult to obtain, and there is no public dataset of paired color images and line art images. For this reason, automatically extracting line art images from color animation images will provide data support for animation production-related research. Method Early image edge extraction methods depend on parameter settings, and fixed parameters cannot be applied to all images. Data-driven image edge extraction methods, in turn, are limited by the collection and size of the dataset. Therefore, researchers usually use data augmentation techniques or use images similar to line art, such as boundary images (edge information extracted from color images). This study proposes an automatic line art image extraction model based on cycle-consistent adversarial networks to address the difficulty of obtaining real line art images and the distortion produced by existing line art extraction methods.
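The cycle-consistency idea behind training on unpaired color images and line art images can be illustrated with a minimal sketch. It assumes the standard CycleGAN formulation (an L1 penalty on round-trip reconstruction); the abstract does not give the exact loss used in this study. The generators `invert` and `flatten_to_gray` are hypothetical per-pixel stand-ins, and the images are toy single-channel grids rather than real color images.

```python
# Toy sketch of a CycleGAN-style cycle-consistency loss on unpaired data.
# G (color -> line art) and F (line art -> color) are hypothetical stand-ins:
# simple per-pixel functions, not the trained generators from the paper.

def l1_distance(a, b):
    """Mean absolute difference between two equally sized 2D images."""
    diffs = [abs(pa - pb)
             for row_a, row_b in zip(a, b)
             for pa, pb in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

def invert(image):
    """Per-pixel inversion, used here as an invertible stand-in generator."""
    return [[1.0 - p for p in row] for row in image]

def flatten_to_gray(image):
    """A lossy stand-in generator that discards all image content."""
    return [[0.5 for _ in row] for row in image]

def cycle_consistency_loss(x, y, G, F):
    """x -> G -> F should reproduce x, and y -> F -> G should reproduce y."""
    return l1_distance(x, F(G(x))) + l1_distance(y, G(F(y)))

# Unpaired samples: x and y differ in size and depict unrelated content.
x = [[0.0, 0.25], [0.5, 1.0]]   # sample from the color-image domain
y = [[1.0, 0.0, 1.0]]           # sample from the line-art domain

# Generators that invert each other leave nothing to penalize.
loss_invertible = cycle_consistency_loss(x, y, invert, invert)
# A generator that throws information away is penalized.
loss_lossy = cycle_consistency_loss(x, y, invert, flatten_to_gray)
```

Because the loss only compares each sample with its own round-trip reconstruction, no color image ever needs a matching ground-truth line art image, which is what makes a small, unpaired line art collection usable.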
First, this study uses a cycle-consistent adversarial network structure to handle the lack of a dataset that pairs real color images with their corresponding line art images. The model parameters are learned from only a few collected real line art images and a large number of color images. Then, a mask-guided convolution unit and a mask-guided residual unit are proposed so that the network can adaptively select its intermediate output features. Specifically, input images at different scales and their corresponding boundary images are fed into the mask-guided convolution unit to learn the mask parameters of the intermediate feature layer, where the boundary map determines the line area of the line art image and the input image provides prior information. To avoid losing information during encoding, the network uses no operations such as pooling that discard information; instead, the image resolution is reduced by controlling the convolution kernel size and stride. Finally, this study proposes a boundary constraint loss function. Because no supervision information corresponding to the input image is available, the loss function is designed to measure the difference between the gradient information of the input image and that of the output image. A regularization constraint is also added, so that the generated result stays consistent with the gradient of the input image. Result On the public color animation image dataset Danbooru2018, the line art extraction results of the proposed method are compared with those of the Canny edge detection operator, cycle-consistent adversarial networks (CycleGAN), holistically-nested edge detection (HED), and SketchKeras.
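The boundary constraint described in the Method part compares the gradient information of the input and output images. A minimal sketch of that comparison follows. It is an assumption-laden illustration, not the paper's formula: it uses forward differences, an L1 penalty, and grayscale images stored as nested lists, and it omits the regularization term because the abstract does not specify its form. The function names are hypothetical.

```python
# Sketch of a gradient-consistency penalty between an input image and a
# generated image. Assumptions (not from the paper): forward differences,
# |dx| + |dy| as gradient magnitude, mean absolute difference as the penalty.

def gradient_magnitude(image):
    """Forward-difference gradient magnitude |dx| + |dy| of a 2D grayscale image."""
    h, w = len(image), len(image[0])
    grad = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Differences past the image border are treated as zero.
            dx = image[i][j + 1] - image[i][j] if j + 1 < w else 0.0
            dy = image[i + 1][j] - image[i][j] if i + 1 < h else 0.0
            grad[i][j] = abs(dx) + abs(dy)
    return grad

def boundary_consistency_loss(input_gray, output_gray):
    """Mean absolute difference between the two gradient-magnitude maps."""
    g_in = gradient_magnitude(input_gray)
    g_out = gradient_magnitude(output_gray)
    h, w = len(g_in), len(g_in[0])
    total = sum(abs(g_in[i][j] - g_out[i][j]) for i in range(h) for j in range(w))
    return total / (h * w)

# A 3x4 input with one vertical edge between columns 1 and 2.
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
# An output that keeps the edge (with inverted intensities) and one that loses it.
keeps_edge = [[1.0, 1.0, 0.0, 0.0] for _ in range(3)]
loses_edge = [[1.0, 1.0, 1.0, 1.0] for _ in range(3)]

loss_keep = boundary_consistency_loss(step, keeps_edge)
loss_lose = boundary_consistency_loss(step, loses_edge)
```

The penalty depends only on where intensity changes, not on the intensities themselves, so an output can differ in tone from the input and still score well as long as its boundaries sit in the same places.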
The Canny edge detection operator only extracts the position information of the image gradient. The lines extracted by CycleGAN are blurred, with missing information, and the lines in some areas cannot be extracted correctly. The line art images extracted by HED have obvious outer contours but seriously lack internal details. The line art images extracted by SketchKeras are closer to edge information images and contain rich gradient change information, which makes the lines unclear and noisy. The results of the proposed model are clear, have less noise, and are closer to what human animators draw. To assess the practical performance of the proposed method, 30 users aged 20 to 25 are invited to score the cartoon line art images extracted by the five methods. A total of 30 sets of test samples are provided. Each user selects the best line art image in each group according to whether its lines are clear, whether there is noise, and whether it is close to a real cartoonist's line art image. The statistical results show that the line art images extracted by the proposed method are superior to those of the other methods in image quality and authenticity. Moreover, the proposed method can extract line art not only from color animation images but also from real-world color images: in the experiment, the model was applied to real-world color images and produced results similar to animation line art images. The proposed model is also better at extracting black border lines, which may be because the borders in the color animation images of the training set are black lines. Conclusion This study proposes a model for extracting line art images from color animation images.
It trains network parameters on asymmetric data and does not require a large number of real cartoon line art images. The proposed mask-guided convolution unit and mask-guided residual unit constrain the intermediate output features of the network through the input image and the corresponding boundary image to obtain clearer lines. The proposed boundary consistency loss function introduces a Gaussian regular term that makes boundaries more pronounced in regions with severe gradient change and makes regions with weak gradient change smoother, reducing the noise in the generated line art image. Finally, the proposed method extracts line art images from the public color animation dataset Danbooru2018, provides data support for subsequent research on line art drawing and line art coloring, and can also extract results similar to an animator's sketch from real-world color images.
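One way to read the mask-guided units is as a gate on intermediate features, computed from the input image and its boundary map. The sketch below illustrates that reading only. The multiplicative sigmoid gate, the per-pixel (1x1) weighting, and the values of `w_image`, `w_edge`, and `bias` are all assumptions made for illustration; in a real network these weights would be learned, and the abstract does not state the unit's exact form. The image and boundary map are assumed to be already resized to the feature resolution.

```python
import math

# Hypothetical sketch of mask-guided feature selection: a per-pixel mask in
# (0, 1), computed from the input image and its boundary map, scales the
# intermediate features. The gate form and the weights are assumptions.

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def mask_guided_gate(features, image, edges, w_image=1.0, w_edge=4.0, bias=-2.0):
    """Scale each feature value by sigmoid(w_image*image + w_edge*edge + bias)."""
    gated = []
    for f_row, i_row, e_row in zip(features, image, edges):
        gated.append([
            f * sigmoid(w_image * i + w_edge * e + bias)
            for f, i, e in zip(f_row, i_row, e_row)
        ])
    return gated

features = [[1.0, 1.0, 1.0]]   # one row of intermediate feature values
image = [[0.0, 0.0, 0.0]]      # input image at the feature resolution
edges = [[0.0, 1.0, 0.0]]      # boundary map: a line at the middle pixel

gated = mask_guided_gate(features, image, edges)
```

With these illustrative weights, the feature at the boundary pixel passes through almost unchanged while features away from the boundary are suppressed, which matches the abstract's description of the boundary map determining the line area.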
Keywords
