Adaptive weighted object tracking algorithm with continuous convolution operator

Luo Huilan, Shi Wu (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000)

Abstract
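Objective In visual tracking, efficient feature representation is the key to robust tracking. Observing that different convolutional layers express different aspects of the target in correlation filter tracking, this paper proposes an adaptive weighted object tracking algorithm combined with a continuous convolution operator. Method To address inaccurate target localization, a continuous convolution operator method is proposed that converts discrete position estimates into continuous ones, making localization more accurate; the feature representations of different convolutional layers are exploited to improve tracking. First, a deep convolutional network is used to extract multi-layer convolutional features, and the magnitude of each layer's correlation response determines the weight of that layer's features in the feature fusion of the next frame, which highlights the dominant features. Then, the correlation filters trained on the different layers are correlated with the extracted features to obtain the final response map; the location of the maximum of the response map gives the position and scale of the target. Result The algorithm is tested against three currently popular object tracking algorithms on the 50 video sequences of the object tracking benchmark (OTB-2013), where it reaches an average tracking success rate of 85.4%. Conclusion The proposed algorithm tracks robustly under illumination variation, scale variation, background clutter, target rotation, occlusion, and complex environments.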
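The adaptive weighting step described in the Method can be illustrated with a minimal sketch. It assumes that each layer's weight is its peak correlation response normalised over all layers, which is only one plausible reading of "response magnitude"; the abstract does not give the exact rule. The function names `update_weights`, `fuse_responses`, and `locate_peak` are hypothetical and are not taken from the paper.

```python
import numpy as np


def update_weights(responses, eps=1e-12):
    """Return one weight per layer, proportional to that layer's peak response.

    Assumption: normalised peak values stand in for the paper's weighting
    rule, which the abstract does not specify.
    """
    peaks = np.array([max(float(r.max()), 0.0) for r in responses])
    total = peaks.sum()
    if total < eps:
        # No layer responds: fall back to uniform weights.
        return np.full(len(responses), 1.0 / len(responses))
    return peaks / total


def fuse_responses(responses, weights):
    """Weighted sum of same-sized per-layer response maps."""
    stacked = np.stack(responses)  # shape (layers, H, W)
    return np.tensordot(weights, stacked, axes=1)  # shape (H, W)


def locate_peak(response):
    """Return the (row, col) index of the maximum of a response map."""
    row, col = np.unravel_index(np.argmax(response), response.shape)
    return int(row), int(col)
```

In a tracking loop, the weights returned by `update_weights` for the current frame would be passed to `fuse_responses` in the next frame, so that layers that responded strongly contribute more to the fused response map.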
Keywords
Adaptive weighted object tracking algorithm with continuous convolution operator

Luo Huilan, Shi Wu (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)

Abstract
Objective In visual tracking, efficient feature representation is the key to robust tracking. In correlation filter tracking, different convolutional layers represent different aspects of the target. Based on this observation, an adaptive weighted object tracking algorithm with a continuous convolution operator is proposed. Method To address inaccurate target localization, a continuous convolution operator method is proposed that converts discrete position estimates into continuous ones, which makes localization more accurate. The feature representations of different convolutional layers are leveraged to improve tracking. Features from different layers of a deep convolutional neural network have different expressive capabilities: shallow features carry rich positional information, whereas deep features carry rich semantic information. Combining them for feature representation and tracking therefore yields better results than using only deep or only shallow features. First, multi-layer convolutional features are extracted with a deep convolutional network. The weight of each layer's features in the fused features of the next frame is determined from the magnitude of that layer's correlation response, which highlights the dominant features and makes the target more distinguishable from the background and from distractors. Then, the correlation filters trained on the different layers are correlated with the extracted features to obtain the final response map. The position of the maximum value in the response map is used to calculate the position and scale of the target. The weights of the convolutional feature layers are updated adaptively according to the correlation filter tracking performance of each layer, so that the expressive capability of the different layers is fully exploited. The representation scheme is adjusted adaptively to the environmental conditions of each frame to improve tracking performance. Result The proposed algorithm is compared with three state-of-the-art tracking algorithms on the 50 video sequences of the object tracking benchmark (OTB-2013) dataset and achieves an average tracking success rate of 85.4%. Conclusion Experimental results show that the proposed algorithm performs well and tracks targets robustly in many complicated situations, such as illumination variation, scale variation, background clutter, object rotation, and occlusion.
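The idea of turning a discrete position estimate into a continuous one can be sketched as follows. This is not the paper's continuous convolution operator, whose formulation the abstract does not give. As a stand-in, the sketch evaluates the trigonometric (Fourier series) interpolant of a discrete response map at fractional positions and searches a fine grid around the discrete peak. The function names, the search radius, and the step size are all assumptions made for illustration.

```python
import numpy as np


def interpolate_response(response, ys, xs):
    """Evaluate the periodic Fourier interpolant of `response` on the grid ys x xs.

    `ys` and `xs` may hold fractional coordinates (in pixels).
    """
    m, n = response.shape
    coeffs = np.fft.fft2(response)
    ky = np.fft.fftfreq(m) * m  # integer frequencies in FFT order
    kx = np.fft.fftfreq(n) * n
    ey = np.exp(2j * np.pi * np.outer(ys, ky) / m)  # (len(ys), m)
    ex = np.exp(2j * np.pi * np.outer(kx, xs) / n)  # (n, len(xs))
    return np.real(ey @ coeffs @ ex) / (m * n)


def refine_peak(response, radius=1.0, step=0.01):
    """Refine the discrete argmax to a sub-pixel (row, col) estimate.

    Assumption: a dense grid search within `radius` pixels of the discrete
    peak; an iterative optimiser could be used instead.
    """
    py, px = np.unravel_index(np.argmax(response), response.shape)
    offsets = np.arange(-radius, radius + step / 2, step)
    ys = py + offsets
    xs = px + offsets
    fine = interpolate_response(response, ys, xs)
    iy, ix = np.unravel_index(np.argmax(fine), fine.shape)
    return float(ys[iy]), float(xs[ix])
```

Because the interpolant is periodic, estimates near the border of the response map wrap around; a tracker would normally keep the target near the centre of the search window, where this does not matter.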
Keywords
