Person re-identification based on a multishape local-region neural network architecture
Abstract
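Objective Combining the global and local features of pedestrian images has become a basic solution in person re-identification (ReID). Existing methods based on local features mostly focus on locating regions with specific semantics, which increases the learning difficulty and is not robust to image scenes that differ greatly. To address these problems, we modify the network structure and propose a multishape part network (MSPN), which has multiple branches, uses horizontal and vertical strip features as local features, and can be trained end to end.
Method The multibranch design obtains local features of multiple granularities and multiple shapes simultaneously: one branch learns global features, two branches learn horizontal-strip local features at different granularities, and the last branch learns vertical-strip local features. The network no longer learns to locate regions with specific semantics. Instead, the features extracted from an image are split into several horizontal and vertical strips that serve as local features. The shape and number of strips differ across branches, so local feature information of different granularities or shapes is obtained. Because the partition directions differ, the multigranularity, multishape local features alleviate the problem that pedestrians cannot be aligned across images.
Result Comprehensive experiments on mainstream evaluation data sets, including Market-1501, DukeMTMC-ReID, and CUHK03, show that MSPN performs better than existing mainstream methods. On Market-1501, it reaches a mean average precision (mAP) of 84.57% and a rank-1 accuracy of 94.51%.
Conclusion MSPN learns a deep model with stronger discriminative ability and thereby effectively improves the accuracy of person re-identification.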
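For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mAP and rank-1 accuracy are commonly computed from ranked retrieval results. The function names and the toy data are illustrative assumptions, not the authors' evaluation code, and the sketch omits the same-camera gallery filtering that standard ReID protocols usually apply.

```python
def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches[i] is True when the i-th ranked gallery image
    shares the query identity (hypothetical input format).
    """
    hits = 0
    precisions = []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)  # precision at this correct hit
    return sum(precisions) / len(precisions) if precisions else 0.0


def evaluate(all_ranked_matches):
    """Return (mAP, rank-1 accuracy) over a list of per-query rankings."""
    n_queries = len(all_ranked_matches)
    mean_ap = sum(average_precision(r) for r in all_ranked_matches) / n_queries
    # rank-1: fraction of queries whose top-ranked gallery image is correct
    rank1 = sum(1 for r in all_ranked_matches if r and r[0]) / n_queries
    return mean_ap, rank1


# Toy rankings for two queries (made-up data, for illustration only).
toy_rankings = [
    [True, False, True],
    [False, True, False],
]
mean_ap, rank1 = evaluate(toy_rankings)
```

In this toy case the first query is matched at rank 1 and the second is not, so rank-1 accuracy is 0.5, while mAP also rewards the correct matches found further down each ranking.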
Keywords
Multishape part network architecture for person re-identification
Chen Liangyu, Li Weijiang (Department of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
Abstract
Objective Person re-identification (ReID) aims to associate the same pedestrian across multiple cameras. It has attracted rapidly increasing attention in the computer vision community because of its importance for many potential applications, such as video surveillance analysis and content-based image/video retrieval. Person ReID is a challenging task. First, when a single person is captured by different cameras, the illumination conditions, background clutter, occlusion, observable human body parts, and perceived posture of the person can differ dramatically. Second, even within a single camera, these conditions can vary over time as the person moves and engages in different actions (e.g., suddenly taking something out of a bag while walking). Third, a gallery usually consists of diverse images of a single person from multiple cameras, which, given the above factors, produce high intraclass variation that impedes the generalization of learned representations. Fourth, compared with images in problems such as object recognition or detection, images in person ReID benchmarks are usually of lower resolution, which makes it difficult to extract distinctive attributes that distinguish one identity from another. The success of deep convolutional networks has introduced powerful representations with high discrimination and robustness for pedestrian images and has enhanced the performance of ReID. Combining global and local features has become an essential way to improve discriminative performance in person ReID tasks. Previous methods based on local features focused on locating regions with specific predefined semantics, which increased the learning difficulty and was not robust across different scenarios. In this study, a multishape part network (MSPN) that uses horizontal and vertical strip features as local features is designed. The network can be trained end to end.
Method We design the MSPN as a multibranch deep network architecture consisting of one branch for global feature representations and three branches for local feature representations. MSPN no longer learns to locate regions with specific semantics. Instead, the features extracted from images are divided into horizontal and vertical strips. The shape and number of strips differ across branches, so local feature information of different granularities and shapes is obtained. Because the partition directions differ, the network tolerates horizontal and vertical misalignment between the features of different images of the same pedestrian.
Result Comprehensive experiments on mainstream evaluation data sets, including Market-1501, DukeMTMC-ReID, and CUHK03, indicate that our method robustly achieves state-of-the-art performance.
Conclusion A person ReID method based on MSPN, which obtains highly discriminative representations of different pedestrians, is proposed in this study. The performance of person ReID is improved effectively.
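The strip partition described in the Method paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the strip counts or the pooling operation, so the counts (2 and 3 horizontal strips, 2 vertical strips), the use of average pooling, and the function names `strip_features` and `multi_shape_parts` are all hypothetical. Plain nested lists stand in for a C x H x W feature map produced by a backbone network.

```python
from statistics import fmean


def strip_features(fmap, n_strips, axis):
    """Split a C x H x W map into equal strips and average-pool each one.

    axis="h" cuts along the height (horizontal strips);
    axis="w" cuts along the width (vertical strips).
    Returns n_strips vectors, each of length C.
    """
    height, width = len(fmap[0]), len(fmap[0][0])
    size = height if axis == "h" else width
    if size % n_strips != 0:
        raise ValueError("strip count must divide the map size evenly")
    step = size // n_strips
    parts = []
    for s in range(n_strips):
        lo, hi = s * step, (s + 1) * step
        vec = []
        for channel in fmap:
            if axis == "h":
                cells = [v for row in channel[lo:hi] for v in row]
            else:
                cells = [v for row in channel for v in row[lo:hi]]
            vec.append(fmean(cells))  # assumed pooling: mean over the strip
        parts.append(vec)
    return parts


def multi_shape_parts(fmap, horizontal_counts=(2, 3), vertical_count=2):
    """One global branch, two horizontal branches, one vertical branch.

    The default strip counts are assumptions made for illustration.
    """
    parts = {"global": strip_features(fmap, 1, "h")}
    for n in horizontal_counts:
        parts[f"horizontal_{n}"] = strip_features(fmap, n, "h")
    parts[f"vertical_{vertical_count}"] = strip_features(fmap, vertical_count, "w")
    return parts


# Toy 1 x 6 x 4 feature map whose value equals its row index.
toy = [[[float(r)] * 4 for r in range(6)]]
parts = multi_shape_parts(toy)
```

On this toy map the horizontal strips yield different values because the content varies with height, whereas the two vertical strips are identical. This shows how the two partition directions respond to different kinds of spatial variation.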
Keywords
public security surveillance; person re-identification; convolutional neural network (CNN); deep learning; part local feature