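Face pose correction combining a morphable model with image inpainting
Abstract
Objective Face pose deviation is an important factor affecting face recognition accuracy. Using the 3D morphable model commonly employed in 3D face reconstruction together with a deep convolutional neural network, this paper proposes a face pose correction algorithm for multi-pose face recognition that improves recognition accuracy under large poses to a certain extent. Method The traditional 3D morphable model fitting method is improved: the 3D morphable model is built from face shape parameters and expression parameters, and landmarks in different facial regions are assigned different weights, so that the weighted fit works better for face images with varied poses and facial expressions. The 3D face model is then pose-corrected, and deep learning is used to inpaint the face image, filling the irregular hole regions of the face. The recent partial convolution technique is adopted, and the convolutional neural network is retrained on a new dataset to optimize the network parameters. Result The proposed algorithm is compared with other methods on the LFW (labeled faces in the wild) face database and the StirlingESRC (Economic Social Research Council) 3D face database, and the experimental results show that it improves face recognition accuracy to a certain extent. On the LFW database, after pose correction and inpainting of face images with arbitrary poses, the proposed method reaches a face recognition accuracy of 96.57%. On the StirlingESRC database, at face poses of ±22° the recognition accuracy increases by 5.195% and 2.265%, respectively; at face poses of ±45° it increases by 5.875% and 11.095%, respectively; and the average recognition rate increases by 5.53% and 7.13%, respectively. The comparative experiments show that the proposed face pose correction algorithm effectively improves face recognition accuracy. Conclusion The proposed face pose correction algorithm combines the strengths of the 3D morphable model and the deep learning model and improves face recognition accuracy to some degree at every face pose angle.
Keywords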
Face pose correction based on morphable model and image inpainting
Wu Congzhong1, Zheng Rongsheng1, Zang Huaijuan1, Liu Mingwei1, Xu Jiajia2, Zhan Shu1
(1. School of Computer and Information, Hefei University of Technology, Hefei 230009, China; 2. iFLYTEK Co. Ltd., Hefei 230009, China)
Abstract
Objective Face recognition has long been a widely studied topic in computer vision. In the past few decades, great progress in face recognition has been achieved owing to the capacity and wide application of convolutional neural networks. However, pose variations remain a great challenge and warrant further study. To the best of our knowledge, existing methods that address this problem fall into two classes: feature-based methods and deep learning-based methods. Feature-based methods attempt to obtain pose-invariant representations directly from non-frontal faces or to design handcrafted local feature descriptors that are robust to face pose. However, it is often too difficult to obtain a pose-robust representation with handcrafted local feature descriptors, so these methods cannot produce satisfactory results, especially when the pose angle is large. In recent years, convolutional neural networks have been introduced into face recognition because of their outstanding performance in image classification tasks. Unlike traditional methods, convolutional neural networks do not require manual extraction of local feature descriptors. They try to directly rotate a face image of arbitrary pose and illumination into the target pose while preserving the facial identity features well. In addition, because of their powerful image generation ability, generative adversarial networks are also used in frontal face image synthesis and have achieved great progress. Compared with traditional methods, deep learning-based methods can obtain a higher face recognition rate. However, their disadvantage is that face images synthesized from large poses have low credibility, which leads to poor face recognition accuracy.
To deal with the limitations of these two kinds of methods, we present a face pose correction algorithm based on a 3D morphable model (3DMM) and image inpainting. Method In this study, we propose a face frontalization method that combines a deep learning model with a 3DMM and can generate a photorealistic frontal view of the face image. In detail, we first detect facial landmarks using a well-known facial landmark detector that is robust to large pose variations. We detect a total of 68 facial landmarks to fit the face image more accurately. Then, we perform accurate 3DMM fitting for the face image with facial landmark weighting. Next, we estimate the depth information of the face image and rotate the 3D face model into the frontal view using a 3D transformation. Finally, we use a deep learning model to inpaint the irregular invisible facial regions caused by self-occlusion. We fine-tune a pre-trained model to obtain our image inpainting model. In the training process, all of the convolutional layers are replaced with partial convolutional layers. Our training set consists of 13 223 face images selected from the labeled faces in the wild (LFW) dataset. Our image inpainting network is implemented in Keras. The batch size is set to 4, the learning rate is set to 10⁻⁴, and the weight decay is 0.000 5. Training is accelerated with NVIDIA GTX 1080 Ti GPUs and takes approximately 10 days in total. Result We compare our method with state-of-the-art methods, including a traditional method and a deep learning method, on two public face datasets, namely, the LFW dataset and the StirlingESRC 3D face dataset. The quantitative evaluation metric is the face recognition rate under different face poses, and we also provide several frontal face images synthesized by our method. The synthesized frontal face images show that our method can produce more photorealistic results than other methods on the LFW dataset.
We achieve 96.57% face recognition accuracy on the LFW face dataset. In addition, the quantitative experimental results show that our method outperforms all other methods on the StirlingESRC 3D face dataset and that its face recognition accuracy improves under different face poses. Compared with the other two methods on the StirlingESRC 3D face dataset, the face recognition accuracy increases by 5.195% and 2.265%, respectively, at face poses of ±22° and by 5.875% and 11.095%, respectively, at face poses of ±45°. Moreover, the average face recognition rate increases by 5.53% and 7.13%, respectively. These results show that the proposed multi-pose face recognition algorithm improves the accuracy of face recognition. Conclusion In this study, we propose a face pose correction algorithm for multi-pose face recognition that combines a 3DMM with a deep learning model. The qualitative and quantitative experimental results show that our method can synthesize more photorealistic frontal face images than other methods and can improve the accuracy of multi-pose face recognition.
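The landmark-weighted 3DMM fitting step mentioned in the Method part can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the pose is already known and the model has been projected to 2D, so that fitting reduces to a regularized weighted least-squares problem over linear shape/expression coefficients. The function name `fit_weighted_landmarks`, the regularization term `reg`, the random stand-in basis, and the particular region weights (which assume the common 68-point landmark index layout) are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_weighted_landmarks(mean, basis, observed, weights, reg=1e-3):
    """Solve min_a sum_i w_i * ||mean_i + B_i a - x_i||^2 + reg * ||a||^2.

    mean:     (N, 2) projected mean landmark positions (hypothetical model)
    basis:    (N, 2, K) linear shape/expression basis per landmark
    observed: (N, 2) detected 2D landmarks
    weights:  (N,) per-landmark weights (larger = more trusted)
    Returns the (K,) coefficient vector a.
    """
    n, d, k = basis.shape
    A = basis.reshape(n * d, k)              # stack x/y rows of every landmark
    b = (observed - mean).reshape(n * d)     # residual to explain with the basis
    w = np.repeat(weights, d)                # same weight for x and y of a landmark
    Aw = A * w[:, None]                      # rows scaled by their weights (W A)
    lhs = A.T @ Aw + reg * np.eye(k)         # A^T W A + reg * I
    rhs = Aw.T @ b                           # A^T W b
    return np.linalg.solve(lhs, rhs)

# Illustrative data only: a random basis stands in for a real 3DMM.
rng = np.random.default_rng(0)
mean = rng.normal(size=(68, 2))
basis = rng.normal(size=(68, 2, 6))
true_coeffs = rng.normal(size=6)
observed = mean + basis @ true_coeffs + 0.01 * rng.normal(size=(68, 2))

# Assumed region weights: down-weight the jaw contour (indices 0-16) and
# up-weight the eyes (indices 36-47) in the common 68-point layout.
weights = np.ones(68)
weights[0:17] = 0.5
weights[36:48] = 2.0

coeffs = fit_weighted_landmarks(mean, basis, observed, weights)
print(np.round(coeffs - true_coeffs, 3))
```

In a full fitting pipeline this solve would typically alternate with pose estimation; the sketch isolates only the effect of the per-landmark weights.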
Keywords
multi-pose face recognition; 3D morphable model (3DMM); convolutional neural network (CNN); image inpainting; deep learning
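As a companion to the partial convolutional layers mentioned in the abstract, the following is a minimal single-channel NumPy sketch of one partial convolution step. It follows the commonly published formulation (convolve only over valid pixels, rescale by the fraction of valid pixels in the window, then mark the output location as valid) and is an assumption about the general technique, not the authors' Keras network. The function name `partial_conv2d` and the toy inputs are hypothetical.

```python
import numpy as np

def partial_conv2d(image, mask, kernel, bias=0.0):
    """Single-channel partial convolution, stride 1, no padding.

    image:  (H, W) input values
    mask:   (H, W) binary mask, 1 for valid pixels and 0 for holes
    kernel: (kh, kw) filter, applied as cross-correlation (no flipping)
    Returns (output, updated_mask), each of shape (H-kh+1, W-kw+1).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    new_mask = np.zeros((out_h, out_w))
    window_size = kh * kw
    for i in range(out_h):
        for j in range(out_w):
            m = mask[i:i + kh, j:j + kw]
            valid = m.sum()
            if valid > 0:
                x = image[i:i + kh, j:j + kw]
                # Rescale so that missing pixels do not shrink the response.
                scale = window_size / valid
                output[i, j] = (kernel * x * m).sum() * scale + bias
                new_mask[i, j] = 1.0   # window saw at least one valid pixel
            # else: output stays 0 and the location remains a hole
    return output, new_mask

# Toy example: a constant image with one missing pixel in the centre.
image = np.ones((5, 5))
mask = np.ones((5, 5))
mask[2, 2] = 0.0
kernel = np.full((3, 3), 1.0 / 9.0)   # averaging filter
out, updated = partial_conv2d(image, mask, kernel)
print(out)
print(updated)
```

Because the mask is updated after every layer, stacking such layers progressively shrinks the hole region, which is why this operation suits irregular self-occlusion holes.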