  • Published: 2024-09-11
3D Stylized Face Generation and Structured Modeling

Hu Jiaping, Zhou Yang (College of Computer Science and Software Engineering, Shenzhen University)

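Abstract
Existing 3D face stylization methods struggle to generate face views under large camera poses, or stop at producing multi-angle face views rather than a structured 3D mesh model. This paper proposes an example-based method for 3D face stylization and structured modeling. The method not only synthesizes novel stylized face views over the full range of camera viewpoints but also generates a structured 3D face model, that is, a 3D face mesh together with its texture map. Specifically, we propose a two-stage framework for structured 3D stylized face generation with two main steps: domain transfer of a 3D-aware face generator, and face texture optimization under multi-view constraints. First, we fine-tune the 3D-aware generator using a data augmentation strategy based on 2D face stylization. We then apply a view alignment strategy to align the views rendered from the implicit neural field with the views rendered from the 3D mesh, optimize the texture map of the face model by backpropagating gradients under multi-view constraints, and finally fuse multiple texture maps into the final texture map. The results show that the method effectively builds high-quality structured 3D stylized face models and generates high-quality full-angle stylized face views and texture maps. In addition, the explicitly constructed structured face model can be used more conveniently in downstream 3D face tasks.
Keywords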
Example-based 3D Stylized and Structured Face Modeling

Hu Jiaping, Zhou Yang (College of Computer Science and Software Engineering, Shenzhen University)

Abstract
Facial image stylization and 3D face modeling are important tasks in computer graphics and vision, with significant applications in virtual reality and social media, including popular technologies such as virtual live streaming, virtual imaging, and digital avatars. This paper addresses 3D facial stylization and generation: given a real face image and a style reference, the goal is to produce novel stylized facial views that can be rendered at any requested angle by inputting a camera pose in 3D space. These views must remain consistent across viewpoints while expressing the exaggerated geometry and colors characteristic of the given artistic style.

Existing methods for 3D facial generation fall broadly into two types: those based on 3D morphable models and those based on implicit neural representations. Methods based on 3D morphable models often struggle to express non-facial components such as hairstyles and glasses, which severely limits the quality of the generated results. Methods based on implicit neural representations can achieve good generation results but tend to produce severely distorted facial views under large camera poses, such as side profiles.
In addition, the output of implicit methods typically consists only of facial geometry and multi-view facial renderings, which are difficult to integrate with mature rendering pipelines and therefore hard to apply in practice. Consequently, both families of existing methods struggle to produce high-quality 3D stylized facial models in a structured form, i.e., a 3D facial mesh with a topologically complete texture map.

To address these shortcomings, this paper proposes a novel approach for 3D stylized facial generation and structured modeling. The goal is to train a 3D-aware stylized facial generator within the style domain of a specified artistic facial example. The generator should produce high-quality facial views from any angle in the specified style, including large-pose side profiles and back views. Furthermore, from the multi-view facial data, the method should produce a structured 3D facial model consisting of a facial geometric mesh and the corresponding texture map. To this end, the paper proposes a two-stage method comprising 3D-aware facial generator domain transfer and multi-view constrained facial texture optimization.

In the first stage, existing 2D facial stylization methods are used as priors to augment the artistic style example, generating a small-scale artistic style facial dataset. The camera pose and facial mask of each image in this dataset are then extracted. The annotated stylized dataset is used to fine-tune a 3D-aware generator trained in the natural style domain. The fine-tuned generator can produce high-quality multi-view facial views and 3D mesh models. The second stage focuses on optimizing the facial texture using multi-view images rendered from a set of directions.
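As a rough illustration of optimizing a texture from multi-view supervision, the toy sketch below fits a texel array to several views with a pixel-level squared-error loss. It is not the paper's implementation: the fixed pixel-to-texel index maps stand in for a differentiable mesh renderer, and `optimize_texture`, its parameters, and the per-texel gradient normalization are assumptions made for this example.

```python
import numpy as np

def optimize_texture(views, n_texels, steps=100, lr=0.25):
    """Fit one texture to several views by gradient descent.

    views: list of (texel_idx, target) pairs, one per camera view.
        texel_idx: (P,) int array, the texel sampled by each rendered pixel
                   (hypothetical stand-in for a differentiable rasterizer).
        target:    (P, 3) float array, colors of the aligned reference view.
    """
    # Stack all views so every pixel contributes to the same loss.
    idx = np.concatenate([v[0] for v in views])
    target = np.concatenate([v[1] for v in views])
    counts = np.bincount(idx, minlength=n_texels)  # pixels seen per texel

    texture = np.zeros((n_texels, 3))
    for _ in range(steps):
        residual = texture[idx] - target           # per-pixel error
        grad = np.zeros_like(texture)
        # Scatter the squared-error gradient from pixels back to texels.
        np.add.at(grad, idx, 2.0 * residual)
        # Normalize by visibility so often-seen texels do not take larger steps.
        texture -= lr * grad / np.maximum(counts, 1)[:, None]
    return texture

# Two toy views that overlap on texel 1; texel 3 is never observed.
view_a = (np.array([0, 0, 1]),
          np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
view_b = (np.array([1, 2]),
          np.array([[0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]))
texture = optimize_texture([view_a, view_b], n_texels=4)
```

In this toy setting, a texel observed by several views settles at the average of the colors those views assign to it, and an unobserved texel keeps its initial value, which is one reason a practical pipeline needs enough view directions to cover the whole mesh.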
The facial mesh is first smoothed and UV-unwrapped. To align the volume-rendered facial views with the differentiably rendered facial views, so that a pixel-level loss can be used to optimize the texture, the paper proposes a simple and effective view alignment strategy based on a mask affine transformation. Multi-view facial supervision is then used to optimize the facial textures, and the final texture map is obtained through texture fusion.

To evaluate the proposed method, the paper compares the two-stage approach with existing advanced baselines in terms of the quality of 3D-aware stylized facial generation and the effectiveness of structured facial mesh generation. Ablation studies examine the contribution of the key components of each stage, and user studies provide further support for the method. Qualitative and quantitative experiments show that the proposed method can effectively construct high-quality structured 3D stylized facial models and generate high-quality stylized facial views. Moreover, the explicitly modeled structured facial models can be applied more conveniently to downstream tasks related to 3D faces and in practical scenarios.

In conclusion, this paper presents an approach to 3D stylized facial generation and structured modeling that addresses limitations of existing methods. By combining generator domain transfer with multi-view constrained texture optimization, the proposed method achieves high-quality results in both facial view generation and structured modeling.
The effectiveness of the method is demonstrated through extensive experiments, highlighting its potential for practical applications in virtual reality, social media, and other related fields.
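
The abstract does not spell out how the mask-based affine alignment is computed. One minimal way to realize such an alignment, assuming an axis-aligned scale and translation estimated from the bounding boxes of two binary face masks, is sketched below. The function names and the bounding-box estimate are hypothetical and are not taken from the paper.

```python
import numpy as np

def mask_bbox(mask):
    """Return (x0, y0, x1, y1) of the nonzero region, with exclusive upper bounds."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask is empty")
    return xs.min(), ys.min(), xs.max() + 1, ys.max() + 1

def affine_from_masks(src_mask, dst_mask):
    """2x3 affine (scale + translation) mapping the src bounding box onto the dst one."""
    sx0, sy0, sx1, sy1 = mask_bbox(src_mask)
    dx0, dy0, dx1, dy1 = mask_bbox(dst_mask)
    scale_x = (dx1 - dx0) / (sx1 - sx0)
    scale_y = (dy1 - dy0) / (sy1 - sy0)
    return np.array([[scale_x, 0.0, dx0 - scale_x * sx0],
                     [0.0, scale_y, dy0 - scale_y * sy0]])

def warp_nearest(image, affine, out_shape):
    """Warp an image with the affine using inverse mapping and nearest-neighbor sampling."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Map each output pixel center back into source coordinates.
    src_x = (xs + 0.5 - affine[0, 2]) / affine[0, 0]
    src_y = (ys + 0.5 - affine[1, 2]) / affine[1, 1]
    ix = np.floor(src_x).astype(int)
    iy = np.floor(src_y).astype(int)
    valid = (ix >= 0) & (ix < image.shape[1]) & (iy >= 0) & (iy < image.shape[0])
    out = np.zeros((h, w) + image.shape[2:], dtype=image.dtype)
    out[valid] = image[iy[valid], ix[valid]]   # pixels outside the source stay zero
    return out

# Toy masks: a 2x2 face region in one render, a 4x4 region in the other.
src_mask = np.zeros((8, 8), dtype=np.uint8)
src_mask[2:4, 2:4] = 1
dst_mask = np.zeros((8, 8), dtype=np.uint8)
dst_mask[1:5, 3:7] = 1
affine = affine_from_masks(src_mask, dst_mask)
aligned = warp_nearest(src_mask, affine, dst_mask.shape)
```

The same `affine` would be applied to the color image that belongs to `src_mask`, so that both renders share pixel coordinates before a pixel-level loss is computed. A bounding-box estimate ignores rotation and shear, so it only suits views that already share a camera pose.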
Keywords
