Two-stream face liveness detection network combined with hybrid pooling

Wang Yahang, Song Xiaoning, Wu Xiaojun (School of Internet of Things Engineering, Jiangnan University, Wuxi 214122)

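Abstract
Objective Face recognition plays an important role in many fields, but a large number of spoofing attacks, such as print attacks and replay attacks, threaten it. Traditional liveness detection methods extract features by hand and do not consider the temporal dimension, which leads to poor detection performance. To address these problems, a two-stream liveness detection network combined with hybrid pooling is proposed. Method Optical flow images are extracted from the dataset and face detection is performed, which yields the two inputs of the two-stream network. A hybrid of spatial pyramid pooling and global average pooling is added at the end of the two-stream network, and a fully connected layer classifies the pooled features before fusion at the score level. The spatial stream network and the temporal stream network are fused to obtain an optimal result, and the influence of different color spaces on detection performance is also considered. Result Multiple groups of comparative experiments were conducted on two datasets, CASIA-FASD (CASIA face anti-spoofing database) and replay-attack. On CASIA-FASD, the equal error rate (EER) is 1.701%; on replay-attack, the EER and the half total error rate (HTER) are 0.091% and 0.082%, respectively. Conclusion The two-stream liveness detection network combined with hybrid pooling fully considers the temporal dimension, and the proposed hybrid of spatial pyramid pooling and global average pooling makes effective use of the features. On datasets that contain multiple attack types and large differences in image quality, the proposed network model achieves low error rates.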
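The two metrics reported above can be illustrated with a minimal sketch. It assumes that a higher score means "more likely a live face", that labels are 1 for genuine and 0 for attack, and that candidate thresholds are simply the observed scores; the function names are illustrative and are not taken from the paper. EER is estimated at the threshold where the false acceptance rate (FAR) and false rejection rate (FRR) are closest, and HTER is the mean of FAR and FRR at a fixed threshold, commonly one chosen on a development set.

```python
from typing import List, Tuple


def far_frr(scores: List[float], labels: List[int], threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when samples with score >= threshold are accepted as live.

    labels: 1 = genuine (live) face, 0 = spoofing attack (assumed encoding).
    """
    attacks = [s for s, y in zip(scores, labels) if y == 0]
    genuine = [s for s, y in zip(scores, labels) if y == 1]
    far = sum(s >= threshold for s in attacks) / len(attacks)  # attacks wrongly accepted
    frr = sum(s < threshold for s in genuine) / len(genuine)   # live faces wrongly rejected
    return far, frr


def equal_error_rate(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Sweep the observed scores as thresholds and return (EER estimate, threshold).

    The estimate is the mean of FAR and FRR at the threshold where they are closest.
    """
    best = None
    for t in sorted(set(scores)):
        far, frr = far_frr(scores, labels, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def hter(scores: List[float], labels: List[int], threshold: float) -> float:
    """Half total error rate: mean of FAR and FRR at a fixed threshold."""
    far, frr = far_frr(scores, labels, threshold)
    return (far + frr) / 2


# Toy data (made up for illustration, not results from the paper).
dev_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
dev_labels = [1, 1, 1, 1, 0, 0, 0, 0]
eer, eer_threshold = equal_error_rate(dev_scores, dev_labels)

test_scores = [0.95, 0.85, 0.65, 0.2]
test_labels = [1, 1, 0, 0]
print(eer, eer_threshold, hter(test_scores, test_labels, eer_threshold))
```

Fixing the threshold on development data before computing HTER on test data keeps the test-set error from being tuned on the test set itself.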
Keywords
Two-stream face spoofing detection network combined with hybrid pooling

Wang Yahang, Song Xiaoning, Wu Xiaojun (School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China)

Abstract
Objective Biometric identification is convenient and efficient, which has driven extensive research on face recognition. Face recognition is now widely used in authorization scenarios such as mobile phone unlocking, access control, and face payment. The spread of Internet of Things terminals, including mobile phone cameras, makes users' private facial information easy to obtain, which poses a serious threat to traditional face identification systems. Detecting face spoofing attacks, including replay and print attacks, more efficiently and accurately has therefore become an urgent problem for reducing threats and ensuring system security. Face anti-spoofing mainly ensures that a real person, rather than a video or a photo, is in front of the camera. Many techniques, including convolutional neural network (CNN) methods, have been proposed to address this issue and have attracted sustained attention in recent years. Typically, hand-crafted features such as LBP (local binary pattern) and HOG (histogram of oriented gradient), combined with a classifier such as SVM (support vector machine), are the most commonly used technique for face anti-spoofing. However, the diversity of spoofing means makes it difficult to extract effective features manually. Most CNN methods cannot fully exploit the valuable information in the temporal dimension, which results in poor detection. To address this issue, this study presents a two-stream spoofing detection network based on a hybrid pooling scheme. Method Methods that use only spatial information to solve the face anti-spoofing problem do not generalize well. The face anti-spoofing framework proposed in this paper includes spatial stream, temporal stream, and fusion modules.
The design is divided into two parts: a spatial net that captures spatial information and a temporal net that captures temporal information. The algorithm is implemented as follows. First, optical flow pictures are extracted from the dataset, and face detection is performed. The optical flow pictures are used as the input of the temporal stream to learn temporal information, and the original face pictures are used as the input of the spatial stream to learn spatial information. Second, a shallow network is adopted for the spatial stream, in which spatial pyramid pooling is combined with global average pooling at the end of the network to obtain hybrid pooling features. Classification is then performed using a fully connected layer. For the temporal stream, a deep network using residual blocks is adopted because effective temporal features are difficult to extract. Spatial pyramid pooling and global average pooling are also combined at the end of this network for classification. The scores obtained from the spatial and temporal stream networks are then fused to improve the final classification. Because the choice of color space may affect the final result, we compare different color spaces experimentally and determine the optimal color space for the presented model. Result The proposed method is evaluated on two public benchmark datasets, namely CASIA-FASD (CASIA face anti-spoofing database) and replay-attack. On CASIA-FASD, we obtain 2.141% EER (equal error rate) in the spatial stream, 9.005% EER in the temporal stream, and 1.701% EER after fusion. On replay-attack, we obtain 0.071% EER and 0.109% HTER (half total error rate) in the spatial stream, 17.045% EER and 21.781% HTER in the temporal stream, and 0.091% EER and 0.082% HTER after fusion.
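The score-level fusion of the two streams can be sketched as follows. The abstract does not state the fusion rule, so the weighted average used here, the `spatial_weight` parameter, the 0.5 decision threshold, and the function names are all assumptions made for illustration. Each stream is assumed to output one "live" probability per sample.

```python
from typing import List


def fuse_scores(spatial: List[float], temporal: List[float],
                spatial_weight: float = 0.5) -> List[float]:
    """Weighted average of per-sample live scores from the two streams.

    The weighted-average rule is an assumption; the source only says that
    scores from the spatial and temporal streams are fused.
    """
    if len(spatial) != len(temporal):
        raise ValueError("both streams must score the same samples")
    if not 0.0 <= spatial_weight <= 1.0:
        raise ValueError("spatial_weight must lie in [0, 1]")
    return [spatial_weight * s + (1.0 - spatial_weight) * t
            for s, t in zip(spatial, temporal)]


def classify(fused: List[float], threshold: float = 0.5) -> List[bool]:
    """True means the sample is accepted as a live face (threshold is hypothetical)."""
    return [score >= threshold for score in fused]


# Toy scores (made up for illustration, not results from the paper).
spatial_scores = [0.9, 0.4, 0.2]
temporal_scores = [0.7, 0.8, 0.1]
fused = fuse_scores(spatial_scores, temporal_scores)
print(fused, classify(fused))
```

In practice the weight would be chosen on development data; an equal weighting is only the simplest starting point.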
Conclusion The two-stream spoofing detection network combined with the hybrid pooling scheme achieves promising results on different datasets. The proposed method exploits the effective information in the temporal dimension. The hybrid of spatial pyramid pooling and global average pooling exploits effective information at multiple scales. The method also relieves the burden of the high dimensionality of the datasets. Experimental results on popular datasets demonstrate the merits of the proposed method, especially its robustness to diverse spoofing means and to large differences in image quality across datasets. These results should encourage further work on synthesizing an informative stream and may lead to solutions for other application domains.
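The hybrid pooling idea can be illustrated with a minimal pure-Python sketch. The abstract does not say how the two pooled outputs are combined, so concatenation is an assumption here, as are the pyramid levels (1, 2, 4), the use of max pooling inside each pyramid bin, and all function names. The feature map is represented as nested lists in channels x height x width order.

```python
from typing import List, Sequence

FeatureMap = List[List[List[float]]]  # channels x height x width


def spatial_pyramid_pool(fmap: FeatureMap, levels: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Max-pool every channel over an n x n grid for each n in levels.

    The output length, channels * sum(n * n), does not depend on height or width.
    """
    out: List[float] = []
    for n in levels:
        for channel in fmap:
            h, w = len(channel), len(channel[0])
            for i in range(n):
                r0, r1 = (i * h) // n, ((i + 1) * h + n - 1) // n  # floor / ceil bin edges
                for j in range(n):
                    c0, c1 = (j * w) // n, ((j + 1) * w + n - 1) // n
                    out.append(max(channel[r][c]
                                   for r in range(r0, r1)
                                   for c in range(c0, c1)))
    return out


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Mean of each channel over all spatial positions."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]


def hybrid_pool(fmap: FeatureMap, levels: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Concatenate multi-scale SPP features with per-channel averages (assumed combination)."""
    return spatial_pyramid_pool(fmap, levels) + global_average_pool(fmap)


# Toy 2-channel 4 x 4 feature map (made up for illustration).
ramp = [[float(4 * r + c) for c in range(4)] for r in range(4)]
ones = [[1.0] * 4 for _ in range(4)]
features = hybrid_pool([ramp, ones], levels=(1, 2))
print(len(features), features)
```

Because the pooled vector has a fixed length for any input size, it can feed a fully connected classifier directly, which is the role the abstract assigns to the layer after pooling.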
Keywords
