Metric learning tracking combined with phase congruency

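Huo Qirun1,2, Lu Yao1, Liu Yu2, Chao Jinbo1 (1. School of Computer Science, Beijing Institute of Technology, Beijing 100081; 2. College of Information Engineering, Capital Normal University, Beijing 100048)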

Abstract
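Objective In practical applications, object tracking often faces complex conditions such as illumination changes and target deformation. To improve tracking accuracy and stability, a metric learning tracking method based on phase congruency features is proposed. Methods First, phase congruency features are extracted from the target region. Then, combining the advantages of ensemble learning and support vector machines, metric learning is used to judge the similarity between regions and thereby locate the target. The target model and the metric matrix are updated online during tracking, which makes the method adaptive. Results The effectiveness of the algorithm was verified on challenging video sequences with appearance changes, illumination changes, and occlusion, and its tracking success rate and tracking error were compared quantitatively with those of several current mainstream methods. On 4 video sequences, the tracking error of the proposed algorithm averaged 15 pixels and its lowest tracking success rate still reached 80%, outperforming the other algorithms in both tracking accuracy and stability. Conclusion This paper designs and implements a new tracking method based on metric learning that learns a discriminative metric matrix from a small number of training samples. The method places no limit on the dimension of the target features, has an advantage in discrimination in high-dimensional feature spaces, and generalizes well. Under complex conditions including appearance changes, illumination changes, and occlusion, it obtains relatively accurate and stable tracking results.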
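The two quantitative measures named in the Results, tracking error in pixels and tracking success rate, could be computed roughly as in the sketch below. The abstract does not define either measure, so the definitions here are assumptions: error is taken as the Euclidean distance between predicted and ground-truth box centers, and a frame counts as a success when the overlap ratio (intersection over union) exceeds a hypothetical threshold of 0.5. The function names and the `(x, y, w, h)` box layout are illustrative only.

```python
import math
from statistics import mean, pstdev


def center_error(pred, truth):
    """Pixel distance between the centers of two (x, y, w, h) boxes."""
    px, py = pred[0] + pred[2] / 2, pred[1] + pred[3] / 2
    tx, ty = truth[0] + truth[2] / 2, truth[1] + truth[3] / 2
    return math.hypot(px - tx, py - ty)


def overlap_ratio(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def evaluate(preds, truths, threshold=0.5):
    """Return (mean error, std of error, success rate) over a sequence.

    threshold is an assumed success criterion, not taken from the paper.
    """
    errors = [center_error(p, t) for p, t in zip(preds, truths)]
    hits = [overlap_ratio(p, t) > threshold for p, t in zip(preds, truths)]
    return mean(errors), pstdev(errors), sum(hits) / len(hits)


if __name__ == "__main__":
    # Toy boxes for illustration only; these are not data from the paper.
    preds = [(0, 0, 10, 10), (3, 4, 10, 10)]
    truths = [(0, 0, 10, 10), (0, 0, 10, 10)]
    print(evaluate(preds, truths))
```

Reporting the standard deviation alongside the mean gives a rough view of stability as well as accuracy.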
Keywords
Metric learning for tracking utilizing phase congruency

Huo Qirun1,2, Lu Yao1, Liu Yu2, Chao Jinbo1 (1. School of Computer Science, Beijing Institute of Technology, Beijing 100081, China; 2. College of Information Engineering, Capital Normal University, Beijing 100048, China)

Abstract
Objective Object tracking is an important research area in computer vision and has been widely adopted in both military and civilian applications. Improving tracking accuracy and stability in realistic scenarios that involve appearance change, occlusion, and illumination change remains difficult. A tracking method based on the phase congruency transformation and metric learning is presented to address this problem. Methods This study formulates object tracking as a matching task: finding, in each subsequent image frame, the candidate most similar to the target model. This process is largely governed by two factors: the features selected to characterize objects and the distance metric used to determine the closest match in the selected feature space. First, features are extracted by the phase congruency transformation. Then, combining the advantages of ensemble learning and the support vector machine (SVM), we introduce a form of ensemble metric learning that obtains a distance metric matrix from a small number of training samples extracted from the preceding image frames. Most approaches solve for the optimal metric matrix directly, so their computational cost grows sharply as the feature dimension increases. In contrast, our method obtains the projection matrix indirectly by learning multiple projection vectors; thus, it remains simple and efficient even with high-dimensional features. During tracking, candidates are obtained by Markov chain Monte Carlo sampling, and the distance of each candidate from the target model is calculated with the learned metric matrix. The candidate with the smallest distance is regarded as the target. Moreover, the object model and metric matrix are continually updated with new training data extracted during tracking, which makes the method adaptive. Results The effectiveness of the algorithm has been verified on several challenging video sequences that contain a dynamic background, appearance changes, and occlusions. The AEMTrack algorithm proposed in this study yields a clearly smaller mean and standard deviation of location error than three mainstream methods. Together with the quantitative assessment of tracking success rate, the experimental results show that the proposed method is more accurate than several mainstream tracking methods and has good stability. Conclusion This study designs and implements a new tracking method based on metric learning. Using a small amount of training data sampled from the image sequence during tracking, the method learns a metric matrix that tends to maximize the distance between samples of different classes. The metric learning process is decomposed into multiple independent linear SVMs, which can be executed in parallel. The method also reduces dimensionality; thus, it is efficient even in high-dimensional space. New target and background samples are used to update the model during tracking; hence, the algorithm is adaptive. The tracking method generalizes well because it places no limit on the feature dimension of the target. Experimental results show that the proposed method achieves accurate and stable tracking in complex scenes, including those with appearance and illumination changes.
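The matching step described in the Methods, in which each candidate is scored by its distance from the target model under the learned metric and the closest one is kept, can be illustrated with the minimal sketch below. It assumes the metric takes the form M = WᵀW, where each row of W is one learned projection vector; the abstract says only that the projection matrix is built from multiple projection vectors, so this form is an assumption. SVM training, phase congruency feature extraction, and Markov chain Monte Carlo sampling are not shown, and `W`, `metric_distance`, and `best_candidate` are hypothetical names rather than the authors' implementation.

```python
def project(W, x):
    """Project feature vector x onto each projection vector (row of W)."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for w in W]


def metric_distance(W, x, y):
    """Squared distance (x - y)^T M (x - y), with M = W^T W (assumed form)."""
    diff = [a - b for a, b in zip(x, y)]
    return sum(p * p for p in project(W, diff))


def best_candidate(W, target, candidates):
    """Return (index, distance) of the candidate closest to the target model."""
    dists = [metric_distance(W, c, target) for c in candidates]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, dists[idx]


if __name__ == "__main__":
    # Hand-picked projection vectors for illustration; in the method
    # described above they would come from the learned linear SVMs.
    W = [[1.0, 0.0, 0.0],
         [0.0, 2.0, 0.0]]
    target = [0.0, 0.0, 0.0]
    candidates = [[1.0, 1.0, 5.0], [2.0, 0.0, 9.0], [0.0, 0.4, 100.0]]
    print(best_candidate(W, target, candidates))
```

Because the distance is evaluated in the projected space, its cost per candidate scales with the number of projection vectors times the feature dimension, and no full D-by-D metric matrix is ever formed. In the toy data the third feature is ignored by both projection vectors, so the third candidate is chosen even though it is farthest in plain Euclidean distance.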
Keywords
