哈希编码结合空间金字塔的图像分类
摘 要
目的 稀疏编码是当前广泛使用的一种图像表示方法,针对稀疏编码及其改进算法计算过程复杂、费时等问题,提出一种哈希编码结合空间金字塔的图像分类算法。方法 首先,提取图像的局部特征点,构成局部特征点描述集。其次,学习自编码哈希函数,将局部特征点表示为二进制哈希编码。然后,在二进制哈希编码的基础上进行K均值聚类生成二进制视觉词典。最后,结合空间金字塔模型,将图像表示为空间金字塔直方图向量,并应用于图像分类。结果 在常用的Caltech-101和Scene-15数据集上进行实验验证,并和目前与稀疏编码相关的算法进行实验对比。与稀疏编码相关的算法相比,本文算法词典学习时间缩短了50%,在线编码速度提高了1.3~12.4倍,分类正确率提高了1%~5%。结论 提出了一种哈希编码结合空间金字塔的图像分类算法,利用哈希编码代替稀疏编码对局部特征点进行编码,并结合空间金字塔模型用于图像分类。实验结果表明,本文算法词典学习时间更短、编码速度更快,适用于在线词典学习和应用。
关键词
Image classification algorithm based on hash codes and space pyramid
Peng Tianqiang1, Li Fang2(1.Department of Computer Science and Engineering, Henan Institute of Engineering, Zhengzhou 451191, China;2.Henan Image Recognition Engineering Center, Zhengzhou 450002, China) Abstract
Objective Sparse coding is widely used to represent images. However, this method and its improved algorithms require complex computation and long running times, among other drawbacks. An image classification algorithm, based on hash codes and space pyramids, is proposed to solve these issues. Method The algorithm consists of four steps. First, extract local feature points from the images. Second, learn binary auto-encoder hashing functions, which map the local feature points into hash codes. Third, perform binary k-means cluster on the binary hash codes and generate the binary visual vocabularies. Finally, combine with a spatial pyramid matching model, and represent the image by the histogram vector of the space pyramid, which is used for image classification. Result In order to verify the efficiency of the proposed algorithm, we used two common datasets, Caltech-101 and Scene-15. The results were compared with state-of-the-art sparse coding algorithms, which showed the time of learning vocabularies of our method was 50% left, the online encoder speed was increased 1.3~12.4 times, and the classification accuracy increased 1%~5%. We also compared the classification performance of different hash encode methods, such as PCA-ITQ and KSH. Conclusion In this paper, a novel image classification algorithm was proposed. The proposed algorithm, encoded local feature points by hash codes rather than sparse coding, and image classification was achieved with a spatial pyramid matching model. Experimental results showed that the proposed algorithm had faster learning vocabulary and encoder speed, and could be used for online vocabulary learning and other online applications.
Keywords
|