哈夫曼编码乘积量化的图像哈希检索方法
摘 要
目的 基于哈希编码的检索方法是图像检索领域中的经典方法。其原理是将原始空间中相似的图片经哈希函数投影、量化后,在汉明空间中得到相近的哈希码。此类方法一般包括两个过程:投影和量化。投影过程大多采用主成分分析法对原始数据进行降维,但不同方法的量化过程差异较大。对于信息量不均衡的数据,传统的图像哈希检索方法采用等长固定编码位数量化的方式,导致出现低编码效率和低量化精度等问题。为此,本文提出基于哈夫曼编码的乘积量化方法。方法 首先,利用乘积量化法对降维后的数据进行量化,以便较好地保持数据在原始空间中的分布情况。然后,采用子空间方差作为衡量信息量的标准,并以此作为编码位数分配的依据。最后,借助于哈夫曼树,给方差大的子空间分配更多的编码位数。结果 在常用公开数据集MNIST、NUS-WIDE和22K LabelMe上进行实验验证,与原始的乘积量化方法相比,所提出方法能平均降低49%的量化误差,并提高19%的平均准确率。在数据集MNIST上,与同类方法的变换编码方法(TC)进行对比,比较了从32 bit到256 bit编码时的训练时间,本文方法的训练时间能够平均缩短22.5 s。结论 本文提出了一种基于多位编码乘积量化的哈希方法,该方法提高了哈希编码的效率和量化精度,在平均准确率、召回率等性能上优于其他同类算法,可以有效地应用到图像检索相关领域。
关键词
Hashing method for image retrieval based on product quantization with Huffman coding
Luan Tingting1,2, Zhu Jihua2, Xu Siyu2, Wang Jiaxing2, Shi Xuan2, Li Yaochen2(1.The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310003, China;2.School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China) Abstract
Objective Hashing method is one of the most popular approaches for content-based image retrieval. The main idea of this approach is to learn the same size of binary codes for each image and then use the Hamming distance to measure the similarity of images. Effective Hashing methods should include at least three properties. First, the learned codes should be short so that large amounts of images can be stored in a small memory. Second, the learned codes should transform images that are perceptually or semantically similar into binary strings with a small Hamming distance. Third, the method should be efficient to learn the parameters of the binary code and encode a new test image. Most Hashing approaches include two important steps to achieve binary coding, namely, projection and quantization. For projection, most Hashing approaches perform principal component analysis (PCA) to reduce the dimensionality of raw data. For quantization, different Hashing approaches may design different strategies. In the quantization stage, most traditional Hashing methods usually allocate the same number of bit to each data subspace for image retrieval. However, information quantities are different in each data subspace. Accordingly, a uniform quantization may result in inefficient codes and high quantization distortion problems, especially when the data have unbalanced information quantities. To address this problem, this study proposes an effective coding method based on product quantization, called Huffman coding. Method Similar to most Hashing approaches, the proposed method utilizes PCA to reduce the dimensionality of raw data in the projection stage. A vector quantization scheme is then carefully designed at the quantization stage. The proposed approach first utilizes product quantization to quantize data after dimensionality reduction to preserve data distribution in the original space. For each subspace, the variance can be directly calculated as the measure of its information quantity. For effectiveness, the subspace with high information quantity should be allocated with a large number of bit for binary coding and vice versa. To achieve this goal, the reciprocal value of the variance proportion can be used to build a Huffman tree, which can then be applied to generate Huffman codes. Accordingly, different bit and values of binary code can be assigned to each subspace. In other words, numerous bit will be allocated to encode subspaces with large variance and few for subspaces with small variance. The variance is easy to calculated, and therefore, the proposed approach is simple and efficient for binary coding. Experimental results illustrate that the Huffman coding method is effective for image retrieval. Result During the experiment, the proposed approach is tested on three public datasets, namely, MNIST, NUS-WIDE, and 22K LabelMe. For each image, a 512D GIST descriptor can be extracted as the input of the Hashing approach. To verify its good performance, the proposed approach is compared with four related approaches:original product quantization method, PCA-based product quantization method, iterative quantization method, and transform coding (TC) method. The experimental results are reported in the form of quantization distortion, mean average precision, recall, and training time. Results show that the average quantization distortion of the proposed approach can be decreased by approximately 49%, and the mean average precision of the retrieval results is increased by approximately 19% compared with the existing method based on product quantization. The training time of the proposed approach is also compared with that of TC from 32 bit to 256 bit on MNIST. The proposed approach can reduce 22.5 s of the training time on average. Conclusion This study proposes Huffman coding for image retrieval in the product quantization stage. According to information quantities, the Huffman-based product quantization scheme can allocate different numbers of bit to each data subspace, which can effectively increase coding efficiency and quantization accuracy. The proposed approach is tested on three public datasets and compared with four related approaches. Experimental results demonstrate that the proposed approach is superior to some state-of-the-art algorithms for image retrieval on mean average precision and recall. The proposed approach does not belong to precise coding methods; thus, our future work will focus on precise Hashing method for effective image retrieval.
Keywords
Hashing image retrieval approximate nearest neighbor search product quantization bit allocation coding efficiency
|