Hybrid generative/discriminative model for automatic image annotation

Li Zhixin; Shi Zhiping; Zhang Canlong; Wang Jinyan

Published: 2015-05-07
DOI: 10.11834/jig.20150511
2015 | Volume 20 | Number 5

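Li Zhixin¹, Shi Zhiping², Zhang Canlong¹, Wang Jinyan¹ (1. College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004; 2. College of Information Engineering, Capital Normal University, Beijing 100048)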

Abstract

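Objective Because of the "semantic gap" between low-level features and high-level semantics in image retrieval, automatic image annotation has become a key problem. To narrow the semantic gap, an automatic image annotation method that combines generative and discriminative models is proposed. Method In the generative learning stage, images are modeled with a continuous probabilistic latent semantic analysis model, which yields the model parameters and the topic distribution of each image. Taking this topic distribution as the intermediate representation vector of each image turns automatic image annotation into a classification problem based on multi-label learning. In the discriminative learning stage, ensembles of classifier chains are built to learn from the intermediate representation vectors; building the classifier chains also integrates the contextual information among annotation keywords, so higher annotation precision and better retrieval results can be achieved. Result Experiments on two benchmark datasets show that the proposed method reaches an average precision and average recall of 0.28 and 0.32 on the Corel5k dataset and of 0.29 and 0.18 on the IAPR-TC12 dataset, outperforming most current state-of-the-art automatic image annotation methods. In addition, judging from the precision-recall curves, the proposed method also outperforms several typical, representative annotation methods. Conclusion An automatic image annotation method based on a hybrid learning strategy is proposed. It integrates the respective advantages of generative and discriminative models and shows good effectiveness and robustness in semantic image retrieval tasks. The methods and techniques of this paper can be applied not only to image retrieval and recognition but, with appropriate improvements, can also play an important role in cross-media retrieval and data mining.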
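The discriminative stage described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-image topic vectors from the generative stage are already available, uses a toy nearest-centroid base learner as a stand-in (the abstract does not name the base classifier), and omits refinements such as training each chain on a bootstrap sample. All function names, the three-word vocabulary, and the toy data are hypothetical.

```python
import math
import random

def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_binary(feats, targets):
    """Toy base learner (assumption): store the positive and negative centroids.
    Every label needs at least one positive and one negative training image."""
    pos = [f for f, t in zip(feats, targets) if t == 1]
    neg = [f for f, t in zip(feats, targets) if t == 0]
    return centroid(pos), centroid(neg)

def score(model, feat):
    """Score in [0, 1]; above 0.5 means `feat` is nearer the positive centroid."""
    d_pos = math.dist(feat, model[0])
    d_neg = math.dist(feat, model[1])
    return 0.5 if d_pos + d_neg == 0 else d_neg / (d_pos + d_neg)

def fit_chain(X, Y, order):
    """Train one binary model per label; each sees the topic vector plus
    the true values of the labels that precede it in `order`."""
    models = []
    for pos, label in enumerate(order):
        feats = [x + [float(y[k]) for k in order[:pos]] for x, y in zip(X, Y)]
        models.append(fit_binary(feats, [y[label] for y in Y]))
    return models

def chain_scores(models, order, x):
    """Run the chain on one topic vector, feeding each thresholded
    prediction to the later links as an extra feature."""
    scores, extra = [0.0] * len(order), []
    for model, label in zip(models, order):
        scores[label] = score(model, x + extra)
        extra = extra + [1.0 if scores[label] >= 0.5 else 0.0]
    return scores

def fit_ensemble(X, Y, n_chains=5, seed=0):
    """Build several chains, each with its own random label order."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_chains):
        order = list(range(len(Y[0])))
        rng.shuffle(order)
        ensemble.append((fit_chain(X, Y, order), order))
    return ensemble

def annotate(ensemble, x, vocab, threshold=0.5):
    """Average per-label scores over all chains; keep words above `threshold`."""
    totals = [0.0] * len(vocab)
    for models, order in ensemble:
        for k, s in enumerate(chain_scores(models, order, x)):
            totals[k] += s
    return [w for w, t in zip(vocab, totals) if t / len(ensemble) >= threshold]

# Hypothetical toy data: 3-topic distributions and labels over VOCAB.
VOCAB = ["sky", "sea", "tiger"]
X = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.7, 0.1]]
Y = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]]
ensemble = fit_ensemble(X, Y)
print(annotate(ensemble, [0.9, 0.05, 0.05], VOCAB))  # ['sky', 'sea']
print(annotate(ensemble, [0.1, 0.85, 0.05], VOCAB))  # ['tiger']
```

Because each link receives the earlier labels as extra features, co-occurring words such as "sky" and "sea" reinforce each other, which is one way to read the abstract's remark about integrating contextual information among annotation words.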

Keywords

automatic image annotation; probabilistic latent semantic analysis; multi-label learning; classifier chain; image retrieval

Hybrid generative/discriminative model for automatic image annotation

Li Zhixin¹, Shi Zhiping², Zhang Canlong¹, Wang Jinyan¹ (1. College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, China; 2. College of Information Engineering, Capital Normal University, Beijing 100048, China)

Abstract

Objective Given the well-known semantic gap between low-level features and high-level concepts in image retrieval, automatic image annotation has become a crucial issue. To bridge the semantic gap, this paper proposes a hybrid generative/discriminative approach to annotating images automatically. Method In the generative learning stage, images are modeled with a continuous probabilistic latent semantic analysis model, which yields the corresponding model parameters and the topic distribution of each image. Taking this topic distribution as an intermediate representation of each image transforms the image auto-annotation problem into a multi-label classification problem. In the discriminative learning stage, we construct ensembles of classifier chains by learning from these intermediate representations; building the chains also integrates the contextual information among annotation words. The approach can therefore achieve higher annotation accuracy and better retrieval performance. Result Experiments on two benchmark datasets indicate that the average precision and recall of our approach reach 0.28 and 0.32, respectively, on the Corel5k dataset, and 0.29 and 0.18, respectively, on the IAPR-TC12 dataset. These results show that our approach performs better than most state-of-the-art approaches on many evaluation measures. Furthermore, the precision-recall curves show that our approach outperforms several typical, representative approaches. Conclusion This paper presents an image auto-annotation approach based on a hybrid learning strategy, which integrates the advantages of the generative and discriminative models. As a result, the approach is effective and robust in semantic image retrieval. The methods and techniques of this paper are not only usable in the fields of image retrieval and recognition but, after appropriate adaptation, can also play an important role in the fields of cross-media retrieval and data mining.
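The abstract reports average precision and recall without restating how they are computed. One convention common in the annotation literature, assumed here rather than taken from the paper, is to compute precision and recall separately for each vocabulary word over the test images and then average over words. A minimal sketch with hypothetical names and toy data:

```python
def mean_per_word_precision_recall(truth, predicted, vocab):
    """Average per-word precision and recall over `vocab`.

    `truth` and `predicted` are parallel lists of word sets, one per image.
    Assumption: a word that is never predicted (or never present in the
    ground truth) contributes 0 to the corresponding average.
    """
    precisions, recalls = [], []
    for word in vocab:
        n_truth = sum(word in t for t in truth)
        n_pred = sum(word in p for p in predicted)
        n_correct = sum(word in t and word in p for t, p in zip(truth, predicted))
        precisions.append(n_correct / n_pred if n_pred else 0.0)
        recalls.append(n_correct / n_truth if n_truth else 0.0)
    return sum(precisions) / len(vocab), sum(recalls) / len(vocab)

# Hypothetical toy example: four test images, three words.
vocab = ["sky", "sea", "tiger"]
truth = [{"sky", "sea"}, {"tiger"}, {"sky"}, {"sea", "tiger"}]
predicted = [{"sky", "sea"}, {"tiger", "sky"}, {"sky"}, {"sea"}]
mean_p, mean_r = mean_per_word_precision_recall(truth, predicted, vocab)
print(round(mean_p, 3), round(mean_r, 3))  # 0.889 0.833
```

In this toy case "sky" is over-predicted (precision 2/3, recall 1) and "tiger" is under-predicted (precision 1, recall 1/2), so the two averages move in opposite directions, which is why both figures are reported side by side.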

Keywords

automatic image annotation; probabilistic latent semantic analysis; multi-label learning; classifier chain; image retrieval
