Person Re-identification Based on Multi-directional Saliency Weight Learning
Abstract
Objective To address the inconsistency of salient appearance features between matched patches in current person re-identification, a person re-identification algorithm based on multi-directional saliency similarity fusion learning is proposed that is strongly robust to viewpoint and background variations. Method First, manifold ranking is used to estimate the intrinsic saliency of the target, which is fused with inter-class saliency to obtain the saliency of each image patch. Then, according to the four saliency distribution cases of matched patches, the visual similarity between them is established by multi-directional saliency weighted fusion, and a metric learning method based on structural SVM ranking is used to obtain the saliency weight in each direction, forming a comprehensive similarity measure between image pairs. Result Re-identification experiments on two public datasets show that the proposed algorithm obtains a more comprehensive similarity measure than comparable methods, achieves higher re-identification rates, and is unaffected by background variations. On the VIPeR dataset, with a test set of 316 pedestrian image pairs, the rank-1 recognition rate (the proportion of queries whose top-ranked result is the correct match) is 30% and the rank-15 rate (the proportion of queries whose correct match appears in the top 15 results) is 72%, demonstrating practical value. Conclusion Multi-directional saliency weighted fusion provides a fairly complete description of the saliency distribution of image pairs and thus yields a comprehensive similarity measure. The proposed algorithm realizes person re-identification across large-scale, non-overlapping multi-camera views, with high discriminative power and recognition accuracy, and strong robustness to background changes.
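The intra-saliency step described above can be sketched as closed-form manifold ranking over a patch affinity graph. The sketch below is a minimal illustration under assumptions: the names `W`, `y`, and `alpha`, and the Zhou-style closed form, are illustrative stand-ins, not the paper's exact formulation.

```python
import numpy as np

def manifold_ranking(W, y, alpha=0.99):
    """Closed-form manifold ranking as plausibly used for intra-image
    saliency: W is a patch-level affinity matrix, y marks seed patches.
    Returns a ranking score per patch (higher = more salient/relevant)."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt            # symmetric normalization
    n = W.shape[0]
    # f* = (I - alpha * S)^(-1) y, solved directly instead of iterated
    return np.linalg.solve(np.eye(n) - alpha * S, y)
```

Scores diffuse from the seed patches along the affinity graph, so patches strongly connected to the seeds receive higher ranking values.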
Keywords
Person re-identification based on multi-directional saliency metric learning
Chen Ying, Huo Zhonghua (Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China)
Abstract
Objective Person re-identification is important in video surveillance systems because it reduces the human effort required to search for a target among a large number of video sequences. However, the task is difficult because of variations in lighting conditions, background clutter, changes in viewpoint, and differences in pose. To tackle this problem, most studies concentrate on designing feature representations, metric learning methods, or discriminative learning methods. Visual saliency has recently been exploited in discriminative learning because salient regions help humans distinguish targets efficiently. To address the problem of inconsistent saliency properties between matched patches in person re-identification, this study proposes a multi-directional saliency similarity evaluation approach based on metric learning. The proposed method is robust to viewpoint and background variations. Method First, the saliency of each image patch is obtained by fusing intra-saliency, estimated by manifold ranking, with inter-saliency. The visual similarity between matched patches is then established by multi-directional weighted fusion of saliency according to the four saliency distribution cases of the matched patches. The saliency weight in each direction is obtained through metric learning based on structural SVM ranking. Finally, a comprehensive similarity measure between image pairs is formed. Result The proposed method is evaluated on two public benchmark datasets (VIPeR and ETHZ). Experimental results show that it achieves excellent re-identification rates with comprehensive similarity measures compared with similar algorithms, and it is robust to background variations.
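The four-case weighted fusion in the Method section can be sketched as follows. This is a hedged sketch, not the paper's implementation: the threshold `tau`, the exponential similarity kernel, and the case-indexed weight vector `w` are illustrative assumptions.

```python
import numpy as np

def patch_similarity(dist, sal_a, sal_b, w, tau=0.5):
    """Weight the visual similarity exp(-dist) of a matched patch pair
    by one of four direction weights, chosen by which of the two
    patches is salient: neither (case 0), only B (1), only A (2),
    or both (3). `w` is the learned per-case weight vector."""
    sim = np.exp(-dist)                         # visual similarity of the pair
    case = int(sal_a >= tau) * 2 + int(sal_b >= tau)
    return w[case] * sim
```

Summing such weighted patch similarities over all matched patches would give the image-pair similarity; the per-case weights are what the structural-SVM ranking step learns.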
The re-identification results on the VIPeR dataset, with half of the dataset sampled as training data, are quantitatively analyzed: the proposed method achieves a rank-1 recognition rate of 30% (the proportion of queries whose top-ranked result is the correct match) and a rank-15 rate of 72% (the proportion of queries whose correct match appears in the top 15 results), outperforming existing learning-based methods. The proposed method maintains state-of-the-art performance even when the number of training pairs is small. For generalization verification, experiments are also conducted on the ETHZ dataset. Results show that the proposed method outperforms existing feature-design-based and supervised-learning-based methods on all three sequences. Thus, the proposed method is of practical significance. Conclusion The multi-directional weighted fusion of saliency yields a comprehensive description of the saliency distribution of image pairs and thus a comprehensive similarity measure. The proposed method realizes person re-identification across large-scale, non-overlapping multi-camera views. Furthermore, it improves the discriminative power and accuracy of re-identification and has strong robustness to background changes.
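The weight-learning step can be illustrated with a simple margin-based ranking update. This is a loose stand-in for the paper's structural-SVM ranking formulation, shown only to convey the ranking constraint (true matches should score above false matches); the perceptron-style update, `margin`, and learning rate are assumptions.

```python
import numpy as np

def learn_direction_weights(pos_feats, neg_feats, epochs=100, lr=0.01, margin=1.0):
    """Learn non-negative weights over per-direction saliency-similarity
    features so that true match pairs (pos_feats) score above false
    pairs (neg_feats) by a margin. Inputs: arrays of shape (n, d)."""
    d = pos_feats.shape[1]
    w = np.ones(d) / d
    for _ in range(epochs):
        for xp, xn in zip(pos_feats, neg_feats):
            # hinge on the ranking violation score(pos) < score(neg) + margin
            if w @ xp < w @ xn + margin:
                w += lr * (xp - xn)
        w = np.maximum(w, 0.0)  # keep direction weights non-negative
    return w
```

A proper structural SVM would add regularization and solve the constrained problem exactly; this sketch only shows the ranking objective the weights must satisfy.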
Keywords