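Person re-identification with region block segmentation and fusion
Abstract
Objective Differences in camera viewpoint and imaging quality cause variations in pedestrian pose, image resolution, and illumination, so the same pedestrian can look very different across surveillance videos, which makes person re-identification highly challenging. To improve the recognition rate and address pose variation, a person re-identification algorithm based on region block segmentation and fusion is proposed. Method First, following the structure of the human body, the pedestrian image is divided into three local regions. Then, because each region plays a different role in recognition, different combinations of the GOG (Gaussian of Gaussian), LOMO (local maximal occurrence), and KCCA (kernel canonical correlation analysis) features are used as the features of each region. Next, a distance metric algorithm learns the similarity between corresponding regions, an interference block removal algorithm eliminates invalid interference blocks in the image, and the similarities of the valid blocks are fused. Finally, the global similarity of the pedestrian image pair is fused with the similarities of the local regions to perform person re-identification. Result Extensive experiments on four benchmark datasets, VIPeR, GRID, PRID450S, and CUHK01, give Rank1 (the proportion of queries for which the top-ranked result is the queried person) of 62.85%, 30.56%, 71.82%, and 79.03%, and Rank5 of 86.17%, 51.20%, 91.16%, and 93.60%, respectively. Recognition rates improve markedly, and the method has practical application value. Conclusion The proposed region block segmentation and fusion method removes useless and interfering information from images while retaining and efficiently using the effective information about the pedestrian. To a certain extent, it resolves the appearance differences caused by pose variation and substantially improves the recognition rate.
Keywords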
Person re-identification with region block segmentation and fusion
Jiang Jianguo1,2, Yang Ning1, Qi Meibin1,2, Chen Cuiqun1
(1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; 2. Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei 230009, China)
Abstract
Objective Person re-identification is of great value in multi-target tracking and in target retrieval across multiple cameras. It has therefore received increasing attention in computer vision in recent years. Differences in camera viewing angle and imaging quality lead to variations in pedestrian posture, image resolution, and illumination. These variations make the appearance of the same pedestrian differ considerably across surveillance videos, which severely interferes with person re-identification. To improve the recognition rate and address the posture variation problem, this study proposes a person re-identification algorithm with region block segmentation and fusion based on human body structure information. Method First, according to the distribution of the human body structure, a pedestrian image is divided into three local regions: the head part (the H region), the shoulder-knee part (the SK region), and the leg part (the L region). These local regions are enlarged to the original image size by bilinear interpolation, which enhances the expression of the regions and makes full use of the region information. Second, according to the different roles of each local region in the recognition process, the Gaussian of Gaussian (GOG) feature is extracted from the H and L regions. The GOG feature, the local maximal occurrence (LOMO) feature, and the kernel canonical correlation analysis (KCCA) feature are extracted from the SK region because it contains the most abundant information in a pedestrian image. Extracting several features from the SK region increases the diversity of the region information and strengthens the role of this region in the re-identification process.
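The region split and enlargement described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names and the boundary fractions `head_end` and `knee_end` are hypothetical, because the abstract does not state where the H, SK, and L regions are cut.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resize an (H, W, C) array to (out_h, out_w, C) by bilinear interpolation."""
    in_h, in_w = img.shape[:2]
    # Sample positions in the input grid (corner-aligned).
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :, None]  # horizontal interpolation weights
    img = img.astype(np.float64)
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def split_regions(img, head_end=0.2, knee_end=0.7):
    """Cut an image into H, SK, and L regions and enlarge each to full size.

    head_end and knee_end are assumed fractions of the image height,
    chosen only for illustration.
    """
    h, w = img.shape[:2]
    a = int(round(h * head_end))
    b = int(round(h * knee_end))
    parts = {"H": img[:a], "SK": img[a:b], "L": img[b:]}
    return {name: bilinear_resize(p, h, w) for name, p in parts.items()}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((128, 48, 3))
    for name, region in split_regions(image).items():
        print(name, region.shape)
```

Each returned region has the same size as the input image, so the same feature extractor can be applied to the whole image and to every region.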
Third, the interference block removal (IBR) algorithm is used to eliminate the invalid blocks in the image and fuse the similarities of the effective blocks. Because of differences in posture and viewpoint, some objects may appear in one image and be absent from another image of the same person captured by a different camera. Such objects may cause large changes in the color and texture information of the corresponding body regions, and these changes disturb the recognition process. The regions in which such objects are located are called interference blocks in this study. Observation of their locations shows that interference blocks are distributed from the shoulder to the knee of pedestrians. Therefore, the IBR algorithm operates on the image of the SK region. According to the human body structure distribution, the IBR algorithm horizontally divides the SK region into the chest part (h1 block), the lumbar part (h2 block), and the leg part (h3 block), and vertically divides the region into the left-arm part (v1 block), the torso part (v2 block), and the right-arm part (v3 block). Then, the GOG, LOMO, and KCCA features are extracted from each block. The three features of each block are fed to the similarity measure function to obtain three similarities between the corresponding blocks, and the three similarities of the same block are merged to form the final similarity of that block. Once the final similarities of the six block pairs (h1, h2, h3, v1, v2, v3) are calculated, the similarities of the three horizontal block pairs (h1, h2, h3) are compared to find the block with the smallest similarity, which is taken as the interference block in the horizontal direction. The interference block in the vertical direction is found in the same manner. Removing these two interference blocks eliminates their influence on the overall pedestrian similarity.
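The selection step of the IBR algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `ibr_fuse` is hypothetical, and averaging is assumed both for merging the three per-feature similarities of a block and for fusing the four remaining blocks, since the abstract does not specify either rule. Larger values are taken to mean more similar.

```python
import numpy as np

HORIZONTAL = ("h1", "h2", "h3")
VERTICAL = ("v1", "v2", "v3")

def ibr_fuse(block_sims):
    """Drop one interference block per direction and fuse the rest.

    block_sims maps each block name to its three per-feature similarities
    (GOG, LOMO, KCCA) for one image pair.
    """
    # Merge the three feature similarities of each block (assumed: mean).
    final = {name: float(np.mean(block_sims[name]))
             for name in HORIZONTAL + VERTICAL}
    # The least similar block in each direction is treated as interference.
    h_bad = min(HORIZONTAL, key=final.get)
    v_bad = min(VERTICAL, key=final.get)
    kept = [n for n in HORIZONTAL + VERTICAL if n not in (h_bad, v_bad)]
    # Fuse the four remaining blocks into the SK-region similarity (assumed: mean).
    sk_sim = float(np.mean([final[n] for n in kept]))
    return sk_sim, (h_bad, v_bad)

if __name__ == "__main__":
    sims = {
        "h1": [0.9, 0.8, 0.7], "h2": [0.2, 0.1, 0.3], "h3": [0.6, 0.6, 0.6],
        "v1": [0.5, 0.5, 0.5], "v2": [0.9, 0.9, 0.9], "v3": [0.1, 0.2, 0.3],
    }
    print(ibr_fuse(sims))
```

In this toy input, h2 and v3 have the lowest merged similarities, so they are the two blocks removed before fusion.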
After the interference blocks are removed, the similarities of the remaining four blocks are fused as the similarity of the SK region. Finally, the global similarity of the pedestrian image pair and the similarities of the three local regions (H, L, and SK) are combined to realize person re-identification. Result Extensive experiments are conducted on four benchmark datasets, namely, VIPeR, GRID, PRID450S, and CUHK01. The rank 1 results (the proportion of queries for which the top-ranked search result is the queried person) for the four datasets are 62.85%, 30.56%, 71.82%, and 79.03%, respectively. The corresponding rank 5 results are 86.17%, 51.20%, 91.16%, and 93.60%. The experimental results show considerable improvement in recognition rates on both small and large datasets. Thus, the proposed algorithm offers practical application value. Conclusion Experimental results show that the proposed method can effectively express the image information of pedestrians. Furthermore, under the guidance of human body structure information, the proposed region block segmentation and fusion algorithm removes as much useless and interfering information from images as possible, while preserving the effective information of pedestrians and using it efficiently. To a certain extent, this method resolves the differences in pedestrian appearance caused by changes in posture and greatly improves recognition rates.
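The final fusion and the rank-k measure reported above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, a weighted sum with equal default weights is an assumed fusion rule (the abstract does not give one), and the similarity values are toy numbers, not results from the paper.

```python
import numpy as np

def fuse_similarities(global_sim, h_sim, sk_sim, l_sim,
                      weights=(1.0, 1.0, 1.0, 1.0)):
    """Combine global and regional similarity matrices (assumed: weighted sum).

    Each matrix has shape (num_probes, num_gallery).
    """
    mats = np.stack([global_sim, h_sim, sk_sim, l_sim]).astype(float)
    w = np.asarray(weights, dtype=float).reshape(-1, 1, 1)
    return (w * mats).sum(axis=0)

def rank_k_rate(sim, probe_ids, gallery_ids, k):
    """Fraction of probes whose true identity is among the k most similar gallery items."""
    gallery_ids = np.asarray(gallery_ids)
    probe_ids = np.asarray(probe_ids)
    order = np.argsort(-sim, axis=1)      # most similar gallery item first
    topk_ids = gallery_ids[order[:, :k]]  # identities of the top-k matches
    hits = (topk_ids == probe_ids[:, None]).any(axis=1)
    return float(hits.mean())

if __name__ == "__main__":
    # Toy similarity matrix: 3 probes against 3 gallery images.
    sim = np.array([[0.9, 0.1, 0.3],
                    [0.8, 0.2, 0.5],
                    [0.1, 0.2, 0.7]])
    ids = [0, 1, 2]
    for k in (1, 2, 3):
        print(k, rank_k_rate(sim, ids, ids, k))
```

With `k=1` this gives the rank 1 rate and with `k=5` the rank 5 rate, computed over all probe images.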
Keywords
person re-identification; human structure information; region block segmentation; interference block removal; region block fusion