Survey of object tracking algorithms based on deep learning
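Li Xi1, Zha Yufei2, Zhang Tianzhu3, Cui Zhen4, Zuo Wangmeng5, Hou Zhiqiang6, Lu Huchuan7, Wang Hanzi8 (1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310007; 2. School of Computer Science, Northwestern Polytechnical University, Xi'an 710043; 3. Institute of Automation, Chinese Academy of Sciences, Beijing 100190; 4. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094; 5. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150006; 6. School of Computer Science, Xi'an University of Posts and Telecommunications, Xi'an 710121; 7. School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024; 8. School of Information Science and Engineering, Xiamen University, Xiamen 361001)
Abstract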
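Object tracking is a technique that uses the contextual information of a video or image sequence to model the appearance and motion of a target, and thereby predicts the target's motion state and localizes it. It is an important fundamental problem in computer vision with significant theoretical and practical value, and it is widely applied in intelligent video surveillance, intelligent human-computer interaction, intelligent transportation, and visual navigation systems. The arrival of the big data era and the emergence of deep learning have opened new opportunities for object tracking research. This paper first describes the basic research framework of object tracking and reviews the history of existing trackers from the perspective of the observation model, pointing out that deep learning makes more robust observation models possible. It then introduces deep learning methods suited to object tracking, covering deep discriminative models and deep generative models. Next, it classifies current deep object tracking methods from the perspectives of network structure, network function, and network training, and analyzes them in depth. It further introduces several other deep object tracking methods, including those based on the fusion of classification and regression, reinforcement learning, ensemble learning, and meta-learning. After that, it presents the main datasets suitable for deep object tracking and their evaluation methods. It then analyzes and summarizes recent concrete applications of object tracking, such as mobile tracking systems and systems based on detection and tracking. Finally, it analyzes open problems of deep learning in object tracking, including insufficient training data, real-time tracking, and long-term tracking, and discusses future research directions.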
Keywords
Survey of visual object tracking algorithms based on deep learning
Li Xi1, Zha Yufei2, Zhang Tianzhu3, Cui Zhen4, Zuo Wangmeng5, Hou Zhiqiang6, Lu Huchuan7, Wang Hanzi8 (1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, China; 2. School of Computer Science, Northwestern Polytechnical University, Xi'an 710043, China; 3. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; 4. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; 5. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150006, China; 6. College of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China; 7. School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024, China; 8. School of Information Science and Engineering, Xiamen University, Xiamen 361001, China)
Abstract
Object tracking is a fundamental problem in computer vision. It uses the contextual information in a video or image sequence to model the appearance and motion of one or more targets, and from that model predicts their motion state and locates them. It is widely used in intelligent video surveillance, intelligent human-computer interaction, intelligent transportation, visual navigation systems, and many other areas. With the advent of the big data era and the emergence of deep learning methods, tracking performance has substantially improved. In this paper, we introduce the basic research framework of object tracking and review the history of object tracking from the perspective of the observation model. We point out that deep learning makes a more robust observation model possible. We review the deep learning methods that are suitable for object tracking, covering deep discriminative models and deep generative models. We also classify and analyze the existing deep object tracking methods from the perspectives of network structure, network function, and network training. In addition, we introduce several other deep object tracking methods, including methods based on the fusion of classification and regression, on reinforcement learning, on ensemble learning, and on meta-learning. We present the datasets commonly used for deep-learning-based object tracking and their evaluation methods. We then analyze and summarize recent application scenarios of object tracking, such as mobile tracking systems and systems based on detection and tracking. Finally, we analyze open problems of deep learning in object tracking, including insufficient training data, real-time tracking, and long-term tracking, and outline further research directions for deep object tracking.
Keywords
visual object tracking; deep neural network; correlation filter; deep Siamese network; reinforcement learning; generative adversarial network