Gaze tracking method using geometric features of the human eye
Abstract
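Objective Gaze tracking is an auxiliary system for human-computer interaction. To address the high misjudgment rate and long running time of traditional iris localization methods, this paper proposes a gaze tracking method based on geometric features of the human eye, with the aim of improving gaze tracking accuracy in a 2D environment. Method First, a face localization algorithm locates the face; the feature points from facial feature point detection locate the eye corners, and the eye position is computed from the eye corners. Applying an iris center localization algorithm directly is time-consuming, so to speed it up an iris template is first built from iris images and used to detect the iris region, and a precise iris center localization algorithm then locates the iris center. Finally, the eye corner points, the iris center point, and related information are extracted, and the angle and distance information contained in these points is combined into an eye movement vector feature. A neural network model performs the classification, establishing the mapping to gaze points and thereby realizing gaze tracking. Image preprocessing enhances the images, after which the relative iris center is extracted. The required feature points are extracted, and relatively stable geometric features are built to represent eye movement. Result Under ordinary experimental lighting with a fixed head pose, the recognition rate peaks at 98.9% and averages 95.74%. When the head pose varies within a restricted area, the recognition rate remains high, averaging above 90%. Experimental analysis shows that the proposed method is robust within the restricted area of head movement. Conclusion This paper proposes combining template matching with precise iris localization to locate the iris center quickly, and uses a neural network to map the gaze landing point and compute the gaze landing area. Experiments show that the proposed method achieves high accuracy.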
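The coarse iris detection step (sliding an iris template over the eye region) might look roughly like the sketch below. It is only an illustration: the abstract does not state which similarity score is used, so zero-mean normalized cross-correlation is an assumption here, and the function name `match_template` and the tiny synthetic image are hypothetical.

```python
import math

def match_template(image, template):
    """Return ((row, col), score) for the best-matching window.

    image and template are 2D lists of grayscale values. The score is
    zero-mean normalized cross-correlation (an assumed choice, not
    necessarily the one used in the paper).
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))

    best_pos, best_score = (0, 0), -2.0
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w_vals = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            if w_norm == 0 or t_norm == 0:
                continue  # flat window or template: correlation undefined
            score = sum(a * b for a, b in zip(w_dev, t_dev)) / (w_norm * t_norm)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score

# Hypothetical 3x3 "iris" template: dark center on a bright background.
template = [[200, 50, 200],
            [50, 10, 50],
            [200, 50, 200]]

# Synthetic 6x7 eye region with the template pasted at row 2, column 3.
image = [[200] * 7 for _ in range(6)]
for i in range(3):
    for j in range(3):
        image[2 + i][3 + j] = template[i][j]

pos, score = match_template(image, template)
# Rough iris center: the middle of the best-matching window.
rough_center = (pos[0] + len(template) // 2, pos[1] + len(template[0]) // 2)
```

The rough center found this way would only narrow the search; a separate precise localization step would still be needed to refine it.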
Keywords
Gaze tracking method for human eye geometric features
Su Haiming, Hou Zhenjie, Liang Jiuzhen, Xu Yan, Li Xing (Department of Information Science & Engineering, Changzhou University, Changzhou 213164, China)
Abstract
Objective Eye gaze is an input mode with the potential to serve as an efficient computer interface, and eye movement has long been a research hotspot. Gaze tracking can provide valuable information about a person's point of attention. Current methods are mainly model based or regression based. The model-based method extracts facial features and calculates the 3D gaze direction from the geometric relationships among them. However, to obtain good accuracy, this method requires individual calibration, which is difficult and degrades the user experience. The regression-based method instead uses machine learning to map eye appearance characteristics to the gaze direction. Compared with the model-based method, it avoids modeling the complicated eyeball structure and only needs a large amount of collected data. Regression-based approaches can be further divided into feature-based and appearance-based methods. Feature-based regression learns a mapping function from eye features to the gaze direction, whereas appearance-based regression learns the mapping from eye appearance to the gaze direction. The learning algorithms range from traditional support vector regression and random forests to recent deep learning techniques. However, this method requires one or more data sets, which makes the model complicated. Regression-based methods also commonly use additional data to compensate for head movements, and substantial data are needed to learn a good mapping function. To improve gaze tracking accuracy in a 2D environment, a new method based on the geometric features of the human eye is proposed to address the high error rate and long running time of traditional iris localization methods. Method First, the position of the face is located by a face localization algorithm. 
The eye corner points are located using the feature points from facial feature point detection, and the eye region is calculated from the corner points. A traditional iris localization method may take a long time to locate the iris center. To speed up iris center localization, an iris template is built from iris images and used to detect the iris region, which roughly locates the iris center. Second, the iris center is located by a precise iris center localization algorithm. Through facial feature point localization and iris center localization, the eye corners and the iris center are obtained and used as the basic information to describe eye movement vectors. The extracted points comprise only the eye corners and the iris center; thus, angle information is derived from the positional relationship of the points and combined with distance information to form the final eye movement vector. In this study, a neural network model is used to determine the gaze point, with the eye movement vector as its input feature. The mapping to gaze points is then established to realize gaze tracking. Result Videos recorded with a camera serve as the neural network training dataset. In the feature extraction stage, the original data are preprocessed to enhance image quality, which makes iris center extraction more accurate. Results are obtained through feature extraction, training, and testing. In an ordinary experimental lighting environment with a fixed head pose, the recognition rate reaches a maximum of 98.9% and an average of 95.74%. When the head pose changes, the recognition rate of the algorithm changes to some extent, but it can remain stable if stable eye movement features are extracted. 
When the head pose changes within a restricted area, the recognition rate remains high, averaging above 90%. Experimental results show that the proposed method is robust to head pose variation within the restricted area. Conclusion In this study, a neural network is used to map eye images to the gaze point. Hence, the system needs neither multiple cameras and infrared light sources nor camera calibration. Compared with other methods, the system has a simpler structure: it uses only a single webcam, without an auxiliary light source or camera calibration, and locates the iris center through a combination of template matching and precise iris localization. The neural network maps the gaze landing point and calculates the gaze landing area. Relatively stable features are extracted under ordinary lighting. Experiments show that the method performs well when the camera captures a complete head image within a certain range of head pose changes.
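As an illustration of how eye corners and an iris center could be turned into an eye movement vector of distances and angles, a minimal sketch follows. The abstract does not give the exact composition of the vector, so the four features chosen here, the normalization by eye width, and the function name `eye_movement_vector` are all assumptions made for the example.

```python
import math

def wrap_angle(a):
    """Wrap an angle in radians into the range [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def eye_movement_vector(inner_corner, outer_corner, iris_center):
    """Build a small feature vector from three 2D points (x, y).

    Returns [d_inner, d_outer, a_inner, a_outer], where the distances are
    iris-to-corner distances divided by the corner-to-corner distance, and
    the angles are measured at each corner between the corner-to-corner
    line and the line to the iris center. This layout is an assumption.
    """
    ix, iy = inner_corner
    ox, oy = outer_corner
    cx, cy = iris_center

    eye_width = math.hypot(ox - ix, oy - iy)
    if eye_width == 0:
        raise ValueError("eye corners coincide")

    # Distances normalized by eye width to reduce sensitivity to scale.
    d_inner = math.hypot(cx - ix, cy - iy) / eye_width
    d_outer = math.hypot(cx - ox, cy - oy) / eye_width

    # Angles relative to the line joining the two corners.
    a_inner = wrap_angle(math.atan2(cy - iy, cx - ix)
                         - math.atan2(oy - iy, ox - ix))
    a_outer = wrap_angle(math.atan2(cy - oy, cx - ox)
                         - math.atan2(iy - oy, ix - ox))
    return [d_inner, d_outer, a_inner, a_outer]

# Iris centered between the corners, then shifted off the corner line.
centered = eye_movement_vector((0, 0), (4, 0), (2, 0))
shifted = eye_movement_vector((0, 0), (4, 0), (2, 2))
```

A vector of this kind would then serve as the input feature of the neural network that classifies the gaze point; normalizing by eye width is one plausible way to keep the features relatively stable under small changes in head distance.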
Keywords