Advances and challenges in facial expression analysis
Abstract
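Facial expression analysis is a technique by which computers attempt to understand human emotions by analyzing facial information, and it has become a popular topic in computer vision. Its challenges include the difficulty of data annotation, poor label consistency across annotators, and large face poses and occlusions in natural environments. To promote the development of facial expression analysis, this paper reviews its related tasks, progress, challenges, and future trends. First, several common tasks, the basic algorithm framework, and the datasets are briefly described. Second, facial expression recognition (FER) methods are surveyed, including traditional hand-crafted feature methods and deep learning methods. Third, the open problems and challenges of FER are summarized and discussed. Finally, future trends are discussed. The review and discussion lead to the following points: 1) For the small scale of reliable FER datasets, transfer learning from face recognition models and semi-supervised learning on unlabeled data are two important strategies. 2) Owing to ambiguous expressions, low-quality images, and annotator subjectivity, the labels of facial expression data collected in uncontrolled natural scenes carry some uncertainty; suppressing these factors helps deep networks learn genuine expression features. 3) For face occlusion and large poses, fusing local patches is an effective strategy; another strategy worth considering is to first learn an occlusion- and pose-robust model on a large-scale face recognition dataset and then transfer it to FER. 4) Because deep learning-based FER methods are affected by many hyperparameters, current FER methods are hard to compare, so different FER methods need to be evaluated on different simple baselines. Although expression analysis in uncontrolled natural environments has developed quickly, the above problems and challenges remain to be solved. Facial expression analysis is a fairly practical task; future work should consider not only accuracy but also running time and storage cost, and may also consider inferring expression categories from high-accuracy facial action unit detection results obtained in uncontrolled environments.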
Keywords
Advances and challenges in facial expression analysis
Peng Xiaojiang, Qiao Yu (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China)
Abstract
Facial expression analysis aims to understand human emotions by analyzing visual face information and has become a popular topic in the computer vision community. Its main challenges include annotation difficulty, inconsistent labels across annotators, and large face poses and occlusions in the wild. To promote the advance of facial expression analysis, this paper reviews recent advances, challenges, and future trends. First, several common tasks for facial expression analysis, the basic algorithm pipeline, and the datasets are introduced. In general, facial expression analysis includes three main tasks: facial expression recognition (i.e., basic emotion recognition), facial action unit detection, and valence-arousal regression. The well-known pipeline of facial expression analysis consists of three steps: face extraction (which includes face detection, face alignment, and face cropping), facial feature extraction, and classification (or regression for the valence-arousal task). Datasets for facial expression analysis come from laboratory and real-world environments, and recent datasets mainly focus on in-the-wild conditions. Second, this paper surveys algorithms for facial expression recognition, including hand-crafted feature-based methods, deep learning-based methods, and action unit (AU)-based methods. For hand-crafted features, both appearance- and geometry-based features can be used. Traditional appearance-based features mainly include local binary patterns and Gabor features, whereas geometry-based features are mainly computed from facial key points. Among deep learning-based methods, early studies apply deep belief networks, while almost all recent methods use deep convolutional neural networks. Apart from direct facial expression recognition, some methods leverage the mapping between AUs and emotion categories and infer the categories from detected AUs.
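As a brief illustration of the appearance-based features mentioned above, the following is a minimal sketch of the basic 3x3 local binary pattern in plain Python. It is not an implementation taken from any surveyed method: the function names `lbp_codes` and `lbp_histogram`, the clockwise neighbour ordering, and the `>=` thresholding convention are assumptions made for this example.

```python
def lbp_codes(image):
    """Return basic 3x3 LBP codes for the interior pixels of a 2-D grayscale image.

    `image` is a list of equal-length rows of numbers. The neighbour ordering
    (clockwise from the top-left pixel) is an assumption of this sketch.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the center sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(image):
    """Return the normalized 256-bin histogram of LBP codes (the feature vector)."""
    hist = [0] * 256
    for row in lbp_codes(image):
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

In practice, such histograms are usually computed per face region and concatenated before being passed to a classifier; that step is omitted here.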
Third, this paper summarizes and discusses recent challenges in facial expression recognition (FER), including the small scale of reliable FER datasets, uncertainties in large-scale FER datasets, occlusion and large pose problems in FER datasets, and the comparability of FER algorithms. Lastly, this paper discusses the future trends of facial expression analysis. Our review and discussion lead to the following points. 1) For the small scale of reliable FER data, two important strategies are transfer learning from face recognition models and semi-supervised methods based on large-scale unlabeled facial data. 2) Owing to ambiguous expressions, low-quality face images, and the subjectivity of annotators, uncertain annotations inevitably exist in large-scale in-the-wild FER datasets. Suppressing these uncertain annotations helps deep networks learn better facial expression features. 3) For the challenges posed by occlusion and large poses, combining different local regions is effective, and another valuable strategy is to first learn an occlusion- and pose-robust face recognition model and then transfer it to facial expression recognition. 4) Current FER methods are difficult to compare because deep learning methods are affected by many hyperparameters. Thus, different FER methods need to be evaluated against several simple baselines. For example, a facial expression recognition method should be compared both when learning from scratch and when starting from pretrained models. Although facial expression analysis has recently made great progress, the abovementioned challenges remain unsolved. Facial expression analysis is a practical task, and future algorithms should pay attention to time and memory consumption in addition to accuracy. In the deep learning era, facial action unit detection in the wild has achieved great progress, and using the results of action unit detection to infer facial expression categories in the wild may become feasible in the future.
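The idea of inferring expression categories from detected action units can be sketched as follows. This is a minimal illustration under stated assumptions, not a method proposed in this paper: the AU prototypes in `EMOTION_PROTOTYPES` are illustrative sets of the kind commonly cited in the FACS literature (exact sets vary across sources), and the function name `infer_emotion` and the Jaccard-overlap scoring rule are hypothetical choices made for this example.

```python
# Illustrative AU prototypes (assumption: not taken from this paper;
# published prototype sets differ in detail).
EMOTION_PROTOTYPES = {
    "happiness": {6, 12},
    "sadness": {1, 4, 15},
    "surprise": {1, 2, 5, 26},
    "fear": {1, 2, 4, 5, 7, 20, 26},
    "anger": {4, 5, 7, 23},
    "disgust": {9, 15, 16},
}


def infer_emotion(detected_aus, prototypes=EMOTION_PROTOTYPES):
    """Pick the emotion whose AU prototype best overlaps the detected AUs.

    Overlap is scored with the Jaccard index |A & B| / |A | B|.
    Returns (label, score); falls back to ("neutral", 0.0) when nothing matches.
    """
    detected = set(detected_aus)
    best_label, best_score = "neutral", 0.0
    for label, proto in prototypes.items():
        union = detected | proto
        score = len(detected & proto) / len(union) if union else 0.0
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


if __name__ == "__main__":
    # AU numbers would come from an upstream AU detector (not shown here).
    print(infer_emotion([6, 12]))
    print(infer_emotion([1, 2, 5, 26]))
```

A rule-based mapping like this depends entirely on the accuracy of the upstream AU detector, which is why the approach is tied to progress in AU detection in the wild.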
Keywords
facial expression analysis; facial expression recognition (FER); convolutional neural network (CNN); deep learning; transfer learning