A survey of deep hashing image retrieval methods
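Liu Ying1,2, Cheng Mei2, Wang Fuping1,2, Li Daxiang1,2, Liu Wei1,2, Fan Jiulun1,2 (1. Key Laboratory of Electronic Information Application Technology for Scene Investigation, Ministry of Public Security, Xi'an 710121; 2. Center for Image and Information Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121)
Abstract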
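With the rapid growth of image and video data on the Internet, traditional image retrieval methods can no longer process massive data efficiently. For large-scale image retrieval, deep hashing, which combines feature hashing with deep learning, has become the prevailing trend. To provide a comprehensive understanding of deep hashing image retrieval methods, this paper organizes and reviews them. Deep hashing methods are divided into unsupervised, semi-supervised, and supervised methods according to whether label information is used. Unsupervised and semi-supervised methods are further divided, by their main research focus, into methods based on convolutional neural networks (CNNs) and methods based on generative adversarial networks (GANs). Supervised methods are further divided, by the type of label information, into methods based on triplet supervision and methods based on pairwise supervision. Within each category, the principles and characteristics of several classical methods are introduced according to the loss functions they use, and their advantages and disadvantages are analyzed. By analyzing and comparing the retrieval performance of various deep hashing methods on the CIFAR-10 and NUS-WIDE datasets, together with test results of deep hashing algorithms on two specialized databases built by the Center for Image and Information Processing (CIIP) at Xi'an University of Posts and Telecommunications, this paper summarizes deep hashing-based retrieval techniques and discusses their future prospects. Although supervised deep hashing methods achieve high retrieval precision, they depend heavily on data labels, so unsupervised deep hashing is attracting more attention. Deep hashing is an effective approach to efficient retrieval over large-scale image data, but technical difficulties remain to be overcome. In view of practical application needs, unsupervised deep hashing algorithms still deserve more research attention.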
Keywords
Deep hashing image retrieval methods
Liu Ying1,2, Cheng Mei2, Wang Fuping1,2, Li Daxiang1,2, Liu Wei1,2, Fan Jiulun1,2 (1. Key Laboratory of Electronic Information Application Technology for Scene Investigation, Ministry of Public Security, Xi'an 710121, China; 2. Center for Image and Information Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121, China)
Abstract
Traditional image retrieval methods struggle to process efficiently the massive amounts of data produced by the rapid growth of image and video data. Feature hashing, which achieves efficient feature compression, fast feature matching, and fast image retrieval, has been introduced to address this issue. Deep learning also has unique advantages in feature extraction and compact description. Deep hashing, which combines feature hashing with deep learning, has therefore become an active research topic in large-scale image retrieval and has attracted increasing attention. This paper reviews the extensive research conducted in recent years on image retrieval using deep hashing. First, deep hashing methods are divided into unsupervised, semi-supervised, and supervised methods according to whether label information is used. Second, unsupervised and semi-supervised deep hashing methods are each further divided into two types: methods based on deep network models and methods based on generative adversarial networks (GANs). Among unsupervised methods based on deep network models, the DeepBit and SADH (similarity-adaptive deep hashing) algorithms are mainly introduced. Among GAN-based unsupervised methods, we illustrate the principles of the HashGAN, BGAN (binary generative adversarial networks), and PGH (progressive generative hashing) algorithms. Among semi-supervised methods, the SSDH (semi-supervised discriminant hashing) algorithm, based on deep models, and the SSGAH (semi-supervised generative adversarial hashing) algorithm, based on generative adversarial networks, are mainly discussed.
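The feature compression and fast matching attributed to hashing above can be illustrated with a minimal sketch. It assumes the common convention of binarizing real-valued network outputs by sign and ranking database items by Hamming distance; the function names and toy feature values are hypothetical and are not taken from any of the surveyed algorithms.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack the signs of a real-valued feature vector into an integer hash code."""
    code = 0
    for value in features:
        # A non-negative output maps to bit 1, a negative output to bit 0.
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: List[int]) -> List[int]:
    """Return database indices sorted by Hamming distance to the query code."""
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))


# Toy 4-dimensional "network outputs" (hypothetical values).
db_features = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.8, 0.3, -0.5, 0.6],
    [0.7, -0.1, -0.3, -0.9],
]
db_codes = [binarize(f) for f in db_features]
query_code = binarize([0.8, -0.4, 0.5, -0.6])
ranking = rank_by_hamming(query_code, db_codes)
```

Each image is reduced to a handful of bits, and comparing two codes costs one XOR and a bit count, which is what makes retrieval over very large databases fast.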
Third, supervised deep hashing algorithms are divided into methods based on triplet labels and methods based on pairwise labels, depending on the type of label information used. Because designing loss functions and controlling quantization errors are central to deep hashing image retrieval, the supervised methods are further classified by loss function. Methods based on pairwise supervision information are classified into those using a square loss function, those using a cross-entropy loss function, and those designing a new loss function. 1) Among methods using the square loss function, CNNH (convolutional neural network hashing) is introduced in detail. 2) Among methods using the cross-entropy loss function, we mainly describe four models: DPSH (deep supervised hashing with pairwise labels), DSDH (deep supervised discrete hashing), HashNet, and HashGAN. 3) The DSH (deep supervised hashing) and DVSQ (deep visual-semantic quantization) algorithms design new loss functions. Methods based on triplet labels are presented in two groups: 1) methods using the triplet loss function, which is itself an adaptation of the hinge loss function, namely NINH (network in network hashing), DRSCH (deep regularized similarity comparison hashing), and DTQ (deep triplet quantization); 2) a method using the triplet entropy loss function, namely DTSH (deep supervised hashing with triplet labels). Because triplet labels require a large amount of image preprocessing, little research has addressed them. After the principles and characteristics of selected classical algorithms are introduced, the advantages and disadvantages of each supervised deep hashing algorithm are analyzed. Fourth, we compare the retrieval performance of the algorithms on two commonly used large-scale datasets, namely, CIFAR-10 and NUS-WIDE.
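To make the loss-function taxonomy above more concrete, the following sketch shows generic textbook forms of a pairwise cross-entropy loss, a hinge-style triplet loss, and a quantization error term. These are assumptions for illustration only: the exact formulations in DPSH, NINH, and the other surveyed algorithms differ in detail, and the function names, the 0.5 scaling of the inner product, and the margin value are hypothetical choices.

```python
import math
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two real-valued code vectors."""
    return sum(a * b for a, b in zip(u, v))


def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two code vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def pairwise_cross_entropy_loss(u_i, u_j, similar: bool) -> float:
    """Negative log-likelihood of a pairwise label under a sigmoid of the scaled inner product."""
    theta = 0.5 * dot(u_i, u_j)  # assumed scaling, for illustration
    s = 1.0 if similar else 0.0
    # log(1 + exp(theta)), computed in a numerically stable way
    softplus = max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))
    return softplus - s * theta


def triplet_hinge_loss(anchor, positive, negative, margin: float = 1.0) -> float:
    """Hinge-style loss: the positive should be closer to the anchor than the negative by a margin."""
    return max(0.0, margin + sq_dist(anchor, positive) - sq_dist(anchor, negative))


def quantization_error(u: Sequence[float]) -> float:
    """Squared gap between real-valued outputs and their sign-binarized codes in {-1, +1}."""
    return sum((x - (1.0 if x >= 0 else -1.0)) ** 2 for x in u)
```

In this sketch, a pair with aligned codes incurs a small loss when labeled similar and a large one when labeled dissimilar, while the quantization term penalizes outputs that sit far from the binary values they will be rounded to.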
We also investigate the performance of the DPSH algorithm on two specialized datasets, namely, CIIP (Center for Image and Information Processing)-CSID (crime scene investigation image database) and CIIP-TPID (tread pattern image dataset), and summarize existing deep hashing-based retrieval technologies. Finally, we discuss the future development of deep hashing-based retrieval algorithms. Hashing has improved image retrieval speed on very-large-scale datasets, but its overall retrieval performance remains low. In recent years, hashing has been combined extensively with deep learning to extract features carrying high-level semantic information. The CNNH algorithm was the first such attempt, and its excellent performance opened a new chapter for hashing-based image retrieval. Deep hashing methods based on pairwise or triplet supervision information have since brought improvements in algorithm structure, hashing functions, loss functions, and quantization error control. However, progress on triplet-based methods is limited by the large amount of image preprocessing they require. Methods based on pairwise label information offer some insight into how to enhance triplet-based methods: for example, NINH improves on the network structure of CNNH, and the DTSH algorithm builds on the algorithm structure of DPSH. Each deep hashing-based image retrieval method has its own advantages in retrieval performance. Existing methods have achieved high retrieval precision, but room for improvement remains in controlling quantization error and learning image representations. Because supervised deep hashing methods depend heavily on data labels while real-world data keeps growing in scale, labeling images one by one incurs high labor and time costs.
Scholars have therefore paid increasing attention to unsupervised deep hashing technologies and have achieved significant performance improvements by combining them with GANs or deep models. Experimental results on the two specialized databases show that the DPSH algorithm performs efficiently on CIIP-CSID and competitively on CIIP-TPID. Deep hashing is an effective method for large-scale image retrieval, but major problems remain unsolved. Given the needs of practical applications, unsupervised deep hashing algorithms require further research attention. Network models and feature learning should also be tailored to dataset characteristics and use cases. The potential applications of deep hashing are wide and include biometrics and multimodal retrieval. The experimental results of the DPSH algorithm on the two specialized databases reveal the need to customize network models and feature-learning algorithms to their use cases. This need makes deep hashing a critical research area in image retrieval with great potential for various specialized industries.
Keywords