Video key frame extraction combining mapping and clustering

Wang Ronggui, Hu Jiangen, Yang Juan, Xue Lixia, Zhang Qingyang (School of Computer and Information, Hefei University of Technology, Hefei 230009)

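Abstract
Objective Video summarization plays an important role in multimedia data processing and computer vision. Clustering-based summarization methods typically use global or local image features to group video frames into clusters and then take a representative key frame from each cluster. However, most of these methods require the number of clusters to be fixed in advance, and the adaptive ones cannot find the cluster centers efficiently. To address this, a key frame selection method based on mapping, clustering, and image density analysis is proposed. Method First, using the differences between images, a measure is proposed that maps each image to a corresponding point in 2D space. The points are then clustered according to their pairwise relative positions and neighborhood density values, and a method is proposed for extracting representative key frames from the video based on the clustering result. Result The proposed measure was tested on the images of the Olivetti face database, and the key frame extraction method was tested on the Open Video database. The key frame extraction method achieved an average precision of 66% and a recall of 74%, and its F value reached 69%, about 11% higher than that of the other methods. Conclusion The proposed method of clustering images after mapping can effectively identify image categories and effectively extract the key frames of a video, which then form the video summary.
Keywords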
Video key frame selection based on mapping and clustering

Wang Ronggui, Hu Jiangen, Yang Juan, Xue Lixia, Zhang Qingyang (School of Computer and Information, Hefei University of Technology, Hefei 230009, China)

Abstract
Objective Growing public interest in access to visual information is driving the development of new technologies for representing, indexing, and retrieving multimedia data. Large image and video libraries need efficient algorithms to support fast browsing and access, and video summarization plays an important role in multimedia data processing and computer vision. Clustering-based summarization methods typically group video frames using global or local image features and then take a representative key frame from each cluster. However, most existing methods require the number of clusters to be fixed in advance, and the adaptive methods are inefficient at finding the cluster centers. Method This paper presents a method for video key frame selection based on mapping and clustering. The differences between images are used to map each image to a corresponding point in 2D space, and the points are then clustered according to their relative positions and neighborhood density. Based on the clustering result, a set of representative frames is selected to constitute the video summary. Result The proposed mapping measure was tested on the Olivetti face database, and the key frame extraction method was tested on the Open Video database. The key frame extraction method achieved an average precision of 66% and a recall of 74%, and its F value reached 69%, about 11% higher than that of the other methods. Conclusion The experimental results show that clustering images after mapping can effectively identify image categories and can be used to obtain the key frames of a video, which then form the video summary.
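The pipeline outlined in the abstract (map frames to 2D points, cluster them by relative position and neighborhood density, take the cluster centers as key frames) can be illustrated with a minimal Python sketch. The abstract does not specify the mapping measure or the clustering rule, so everything concrete here is an assumption for illustration only: Euclidean distance between frame feature vectors stands in for the image difference, classical multidimensional scaling stands in for the 2D mapping, and a density-peaks-style rule stands in for the density clustering. The function names and the parameters `cutoff` and `delta_min` are hypothetical.

```python
import numpy as np


def classical_mds(dist, dim=2):
    """Embed points in `dim` dimensions from a pairwise distance matrix.

    Assumption: classical MDS is a stand-in for the paper's mapping measure.
    """
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    top = np.argsort(vals)[::-1][:dim]
    # Clip tiny negative eigenvalues caused by rounding before the square root.
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


def density_peaks(points, cutoff, delta_min):
    """Cluster 2D points with a density-peaks-style rule (an assumption).

    rho:   Gaussian-kernel neighborhood density of each point.
    delta: distance to the nearest point of higher density.
    A point becomes a cluster center when delta >= delta_min; note that an
    isolated outlier would also qualify under this simple rule.
    """
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    rho = np.exp(-(d / cutoff) ** 2).sum(axis=1) - 1.0  # exclude the point itself
    order = np.argsort(-rho)  # indices by decreasing density
    n = len(points)
    delta = np.empty(n)
    parent = np.full(n, -1)
    delta[order[0]] = d[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        nearest = higher[np.argmin(d[i, higher])]
        delta[i] = d[i, nearest]
        parent[i] = nearest

    labels = np.full(n, -1)
    centers = []
    for rank, i in enumerate(order):
        if rank == 0 or delta[i] >= delta_min:
            labels[i] = len(centers)  # start a new cluster at this center
            centers.append(int(i))
        else:
            labels[i] = labels[parent[i]]  # inherit from the denser neighbor
    return np.array(centers), labels


def select_key_frames(features, cutoff, delta_min):
    """Return (key frame indices, cluster label per frame).

    `features` is an (n_frames, n_features) array; how the features are
    computed from the frames is outside the scope of this sketch.
    """
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    points = classical_mds(dist, dim=2)
    return density_peaks(points, cutoff, delta_min)
```

Calling `select_key_frames(features, cutoff, delta_min)` returns one frame index per detected cluster, so the number of key frames follows from the data rather than being fixed in advance. In this sketch `cutoff` sets the neighborhood scale for the density estimate and `delta_min` sets how far apart two centers must be; both would need tuning for real video features.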
Keywords
