A compact representation method for video local feature descriptors
Abstract
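Objective With the rapid development of handheld mobile devices and the arrival of the big-data era, research on and applications of visual search and related tasks centered on multimedia data have attracted wide attention. The compression, storage, and transmission of local feature descriptors play a pivotal role in these applications. To this end, an efficient compact representation of visual local features is proposed within the conventional image/video compression framework, so that conventional content coding can meet a broad range of retrieval and analysis needs. Method To obtain a compact, discriminative, and efficient local feature representation, a multiple-reference prediction mechanism is first introduced; besides removing spatial and temporal redundancy, it exploits the information from video texture coding to remove the redundancy between texture and features. In addition, a new rate-distortion optimization method, rate-accuracy optimization, is proposed to maximize performance in matching/retrieval-based applications. Result In validation experiments on different datasets, compared with the latest video local descriptor compression framework, the proposed method significantly reduces the bits consumed by features while preserving matching and retrieval performance, reaching a compression ratio of about 150:1. Conclusion The proposed method fits the conventional image/video coding framework; by embedding a small amount of feature information in the bitstream, it achieves efficient retrieval performance, making it a new multimedia content coding framework for retrieval and other smart-device applications.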
Keywords
Compact representation of video local feature descriptors
Zhang Xiang, Wang Shiqi, Zhang Xinfeng, Ma Siwei, Gao Wen (Institute of Digital Media, School of Electronic Engineering and Computer Science, Peking University, Beijing 100871, China)
Abstract
Objective The compression, storage, and transmission of local feature descriptors are of great importance in a wide range of image and video applications. In this paper, a hybrid framework that combines conventional image/video compression with compact feature representation is developed to make multimedia content better suited to retrieval- and analysis-driven applications. Method A multiple-reference prediction technique is introduced to obtain a compact, discriminative, and efficient representation of feature descriptors. It removes spatial and temporal redundancy, and also the redundancy between texture and features, by leveraging the information available in video texture coding. Furthermore, a rate-accuracy optimization technique is introduced, which aims to optimize performance in matching/retrieval-based applications. Result In extensive simulations on different test databases, the proposed scheme significantly reduces the bitrate of visual features, reaching a compression ratio of approximately 150:1 while maintaining state-of-the-art matching/retrieval performance. Conclusion The proposed system is efficient and effective as a novel multimedia compression framework for various applications on smart devices.
Keywords
computer science and technology; visual search; video local feature descriptor; scale-invariant feature transform; high efficiency video coding