Dynamic update correlation filter tracking under appearance representation analysis
Abstract
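Objective Research on the two classes of discriminative object tracking methods, those based on correlation filters and those based on Siamese neural networks, has made considerable progress, but the latter are computationally heavy and depend entirely on GPU (graphics processing unit) acceleration. Traditional correlation filter methods update the filter model at a fixed interval and therefore struggle to handle both rapidly changing targets and ordinary targets. To address this problem, a dynamic model update algorithm based on analysis of the target’s appearance state is proposed; it optimizes the computational load and improves tracking accuracy, balancing robust tracking of slowly changing targets with accurate tracking of rapidly changing targets. Method Optical flow histogram features of the target region image are computed from inter-frame information, and a support vector machine classifies them to determine whether the target is in a state of appearance change. A suitable correlation filter update interval is then set dynamically according to the target category and the amplitude of the main optical flow component in the target region image. A foreground-background separation operation in the first frame further strengthens the learning of the target’s appearance representation and improves tracking accuracy. Result In comparative experiments against six other fast tracking algorithms on the OTB100 (object tracking benchmark with 100 sequences) benchmark dataset, the precision and success rate of the proposed algorithm are 86.4% and 64.9%, which are 1.4% and 0.9% higher than those of the second-ranked ECO-HC (efficient convolution operators using hand-crafted features) algorithm. Under the highly challenging conditions of in-plane rotation, occlusion, partial out-of-view, and illumination variation, precision is 3.0%, 4.4%, 5.2%, and 6.0% higher than that of the second-ranked algorithm, and the success rate is 1.9%, 3.1%, 4.9%, and 4.0% higher. The proposed algorithm runs at 32.15 frames/s on a CPU (central processing unit), meeting the real-time requirement of the tracking problem. Conclusion The adaptive model update algorithm proposed here achieves better tracking accuracy while optimizing the computational load and is suitable for engineering deployment and application.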
Keywords
Dynamic update correlation filter tracking based on appearance representation analysis
Qiang Zhuang, Shi Fanhuai (Department of Control Science and Engineering, College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China)
Abstract
Objective Visual object tracking, which has a profound theoretical basis and application value, is one of the basic problems in computer vision research. The technology is widely applied but faces increasingly complex environments. Factors such as scale changes, occlusion, and illumination variation introduce unpredictable interference, so robust, accurate, and fast visual object tracking algorithms still require further research. In recent years, two categories of discriminative methods, one based on discriminative correlation filters and the other on Siamese neural networks, have achieved high accuracy and robustness in tracking. However, tracking methods based on Siamese networks are limited by the heavy computation of a convolutional neural network (CNN) and can run only on high-performance GPUs (graphics processing units). This computing requirement seriously limits their use in practical engineering environments. Tracking methods based on discriminative correlation filters have simple frameworks and can therefore use hand-crafted features to learn and update an object’s representation and achieve real-time tracking on a single CPU (central processing unit). This type of real-time tracking algorithm has been applied successfully on mobile platforms such as unmanned aerial vehicles. Under the traditional correlation filtering framework, updating the correlation filter frame by frame leads to an excessive computational load and impairs real-time performance. The sparse model updating strategy proposed in recent years simply sets a fixed updating interval, which slows the convergence of the tracking model and easily loses the target when the object changes rapidly. Neither of these two updating strategies gives correlation filter trackers the ability to meet the growing application requirements of complex environments.
To improve the correlation filter updating strategy, this study proposes a dynamic updating algorithm based on appearance representation analysis that optimizes computation and improves tracking accuracy. Method First, optical flow features are used to estimate the appearance state of an object. We calculate the dense optical flow of the image in the predicted target region. When the object is simply translating, the image of the target area changes minimally, so the optical flow amplitude at each pixel is small and the flow directions follow no uniform pattern. However, when the object is deforming or being occluded, the affected part generates a considerably larger optical flow, which differs from that of ordinary objects. In this study, optical flow histogram information is extracted by dividing the image into an m×n grid. The average optical flow amplitude and angle of the pixels in each grid cell are computed to form the histogram feature vector. A support vector machine then classifies the feature vector to estimate the object’s current appearance state. After appearance state analysis, the optical flow amplitudes in the object region of the current frame are collected into a statistical histogram with a bin width of 0.5. The updating interval of the filter model is set in accordance with the magnitude of the main optical flow amplitude and the target category, which realizes adaptive updating of the correlation filter. Moreover, a foreground-background separation operation based on the discrete cosine transform is applied in the first frame to obtain accurate labeling information, reduce interference from similar backgrounds, and further optimize the learning of the object representation.
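The feature extraction and interval selection described above can be illustrated with a minimal sketch. It assumes a dense flow field is already available as an array of per-pixel (dx, dy) displacements from any dense optical flow routine. The function names `grid_flow_feature`, `dominant_amplitude`, and `update_interval`, the grid size, and every interval and threshold value are hypothetical choices made for illustration only; the abstract does not give the actual mapping from amplitude and category to interval. The `is_changing` flag stands in for the support vector machine's decision, which is not reproduced here.

```python
import numpy as np


def grid_flow_feature(flow, m=4, n=4):
    """Mean flow amplitude and angle in each cell of an m x n grid.

    flow: array of shape (H, W, 2) with per-pixel (dx, dy) displacements.
    Returns a vector of length 2 * m * n: [amp, angle] for each cell.
    """
    amplitude = np.hypot(flow[..., 0], flow[..., 1])
    angle = np.arctan2(flow[..., 1], flow[..., 0])
    features = []
    row_blocks = zip(np.array_split(amplitude, m, axis=0),
                     np.array_split(angle, m, axis=0))
    for amp_rows, ang_rows in row_blocks:
        cells = zip(np.array_split(amp_rows, n, axis=1),
                    np.array_split(ang_rows, n, axis=1))
        for amp_cell, ang_cell in cells:
            # Plain arithmetic means; the angle mean ignores wrap-around at +/- pi.
            features.extend([amp_cell.mean(), ang_cell.mean()])
    return np.asarray(features)


def dominant_amplitude(flow, bin_width=0.5):
    """Centre of the most populated amplitude bin (bins of width bin_width)."""
    amplitude = np.hypot(flow[..., 0], flow[..., 1]).ravel()
    bins = np.floor(amplitude / bin_width).astype(int)
    counts = np.bincount(bins)
    return (np.argmax(counts) + 0.5) * bin_width


def update_interval(is_changing, main_amplitude,
                    slow_interval=6, fast_interval=2, amp_threshold=1.0):
    """Hypothetical rule: update the filter more often when the appearance is
    classified as changing or the main flow amplitude is large."""
    if is_changing or main_amplitude >= amp_threshold:
        return fast_interval
    return slow_interval


# Toy example: a uniform rightward flow of 0.6 pixels per frame.
flow = np.zeros((8, 8, 2))
flow[..., 0] = 0.6
feature = grid_flow_feature(flow, m=2, n=2)
interval = update_interval(is_changing=False,
                           main_amplitude=dominant_amplitude(flow))
```

In a full tracker, `feature` would be passed to a trained classifier to obtain `is_changing`, and `interval` would decide how many frames elapse before the next filter update.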
Result The algorithm is tested on the OTB100 (object tracking benchmark with 100 sequences) dataset and compared with six fast tracking algorithms: ECO-HC (efficient convolution operators using hand-crafted features), SRDCF (spatially regularized discriminative correlation filter), Staple (sum of template and pixel-wise learners), KCF (kernelized correlation filter), DSST (discriminative scale space tracker), and CSK (circulant structure of tracking-by-detection with kernels). On five typical challenging video sequences, the proposed algorithm achieves higher tracking overlap by adaptively setting the model update interval. It avoids the overfitting of traditional frame-by-frame updating algorithms such as Staple, as well as the tendency of ECO-HC to lose fast-changing objects because of its sparse updating strategy. Comprehensive quantitative analysis on the entire OTB100 dataset shows that the precision and success rate of the proposed algorithm are 86.4% and 64.9%, respectively, the best among the compared fast tracking algorithms that can run on a CPU. Moreover, under highly challenging and complex conditions, including in-plane rotation, occlusion, out of view, and illumination variation, the precision of our algorithm is 3.0%, 4.4%, 5.2%, and 6.0% higher than that of the second-ranked algorithm, and the success rate is 1.9%, 3.1%, 4.9%, and 4.0% higher. In the running speed test on an i7-6850K CPU, the proposed algorithm runs at 32.15 frames per second with a lower computational load than frame-by-frame updating algorithms, thereby meeting the real-time requirement of tracking. Conclusion This study proposed a dynamic updating correlation filter tracking algorithm based on appearance representation analysis.
A series of comparison results shows that the improved algorithm balances robust tracking of slow-changing objects with accurate tracking of fast-changing objects while retaining real-time performance, which makes it suitable for engineering deployment and application.
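For readers unfamiliar with the two scores reported in the results, the following sketch shows how precision and success rate are commonly computed on OTB-style benchmarks. It assumes the usual convention, which this abstract does not spell out: precision is the fraction of frames whose center location error is within 20 pixels, and success rate is the area under the curve of the fraction of frames whose overlap exceeds each threshold between 0 and 1. The function names and the 21-point threshold grid are illustrative assumptions, not the evaluation code used by the authors.

```python
import numpy as np


def center_error(pred, gt):
    """Euclidean distance between box centres; boxes are rows of (x, y, w, h)."""
    pred_center = pred[:, :2] + pred[:, 2:] / 2.0
    gt_center = gt[:, :2] + gt[:, 2:] / 2.0
    return np.linalg.norm(pred_center - gt_center, axis=1)


def overlap(pred, gt):
    """Per-frame intersection over union; boxes are rows of (x, y, w, h)."""
    x1 = np.maximum(pred[:, 0], gt[:, 0])
    y1 = np.maximum(pred[:, 1], gt[:, 1])
    x2 = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
    y2 = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    return inter / union


def precision(pred, gt, threshold=20.0):
    """Fraction of frames with centre error within `threshold` pixels."""
    return float(np.mean(center_error(pred, gt) <= threshold))


def success_auc(pred, gt, thresholds=None):
    """Mean, over overlap thresholds, of the fraction of frames above each one."""
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 21)  # assumed threshold grid
    iou = overlap(pred, gt)
    return float(np.mean([np.mean(iou > t) for t in thresholds]))


# Toy example: one frame tracked perfectly, one frame lost.
gt = np.array([[0.0, 0.0, 10.0, 10.0], [0.0, 0.0, 10.0, 10.0]])
pred = np.array([[0.0, 0.0, 10.0, 10.0], [100.0, 100.0, 10.0, 10.0]])
p = precision(pred, gt)
s = success_auc(pred, gt)
```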
Keywords