Semantic-audio Content Analysis at High Level

Wei Wei; You Jing; Liu Fengyu; Xu Manwu

DOI: 10.11834/jig.20070125
2007 | Volume 12 | Number 1

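Wei Wei^1,2, You Jing^1, Liu Fengyu^1, Xu Manwu^2 (1. Department of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094; 2. Department of Computer Science, Chengdu University of Information Technology, Chengdu 610225)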

Abstract

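To bridge the semantic gap, a method for extracting high-level semantic concepts from audio is proposed. The method first uses hidden Markov models (HMMs) to build low-level semantic concepts corresponding to analysis windows, namely basic semantic-audio events (BE). The audio signal is then processed frame by frame with the short-time Fourier transform and independent component analysis (ICA) to obtain the observable symbols for the HMMs. Next, Bayesian decision is used to exclude the non-predefined BEs in the audio segment corresponding to a semantic window, and the group of BEs (G_BE) for that semantic window is obtained using the maximum posterior probability given by Bayes' formula as the criterion. Finally, high-level semantic logic definitions describe the link between G_BE and high-level audio semantic concepts; combined with the logic definitions obtained by training on examples, they yield the high-level audio semantic concept (HC) of the semantic window. Experiments show that the method can extract high-level semantic concepts similar to those in human thinking and can bridge the semantic gap to some degree.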
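The maximum-posterior criterion mentioned above can be written out as follows. This is only a sketch of the standard Bayes rule applied to this setting; the symbols are assumptions introduced here, not notation taken from the paper: O is the observation symbol sequence of one analysis window, \lambda_k is the HMM trained for the k-th predefined BE, K is the number of predefined BEs, and W is the set of analysis windows in a semantic window that survive the rejection step. The exact rejection rule is not given in the abstract, so it is left abstract here.

```latex
% Posterior of the k-th BE model given the observations of one analysis window
P(\lambda_k \mid O) = \frac{P(O \mid \lambda_k)\, P(\lambda_k)}
                           {\sum_{j=1}^{K} P(O \mid \lambda_j)\, P(\lambda_j)},
\qquad
\hat{k}(O) = \arg\max_{1 \le k \le K} P(\lambda_k \mid O)

% Group of BEs for a semantic window: a set, so order and repetition are ignored
G_{BE} = \{\, \mathrm{BE}_{\hat{k}(O_w)} : w \in W \,\}
```

Because the denominator is the same for every k, maximizing the posterior is equivalent to maximizing P(O \mid \lambda_k) P(\lambda_k).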

Keywords

semantic-audio content analysis; high-level semantic concept; semantic-video analysis; hidden Markov model

Semantic-audio Content Analysis at High Level

Abstract

To bridge the semantic gap between audio features and high-level semantic concepts, an approach to semantic-audio content analysis is presented in this paper. Hidden Markov models (HMMs) are trained to model basic semantic-audio events (BE). To extract the group of BEs (G_BE) corresponding to a semantic window, Bayesian decision theory is used to eliminate the analysis windows that do not belong to any predefined HMM. Each remaining analysis window within the semantic window is then assigned to a BE class by the criterion of maximum Bayesian posterior probability. Ignoring the order and repetition of the BEs yields G_BE. The logic definition of a high-level audio semantic concept (HC) connects G_BE to HC, and through it HC can be extracted. The experimental results demonstrate that the proposed approach can extract HCs that resemble human thinking and can bridge the semantic gap to some degree.
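The steps in the abstract (reject, classify by maximum posterior, collect G_BE, apply logic definitions) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-window HMM log-likelihoods are taken as given inputs, rejection is modeled as a simple threshold on the best joint log-score, a logic definition is modeled as a set of BEs that must all be present, and every name and number (`cheer`, `whistle`, `goal`, the threshold) is hypothetical.

```python
import math


def classify_window(log_likelihoods, log_priors, reject_threshold):
    """Return the maximum-posterior BE label for one analysis window.

    log_likelihoods maps each BE name to log P(O | HMM of that BE).
    Returns None when the best joint log-score falls below the threshold
    (an assumed stand-in for the Bayesian rejection step).
    """
    scores = {be: ll + log_priors[be] for be, ll in log_likelihoods.items()}
    best = max(scores, key=scores.get)
    if scores[best] < reject_threshold:
        return None
    return best


def extract_gbe(windows, log_priors, reject_threshold):
    """Collect the BE labels of all accepted windows as a set (G_BE)."""
    labels = (classify_window(w, log_priors, reject_threshold) for w in windows)
    # A frozenset discards order and repetition of the BEs.
    return frozenset(label for label in labels if label is not None)


def extract_hc(gbe, logic_definitions):
    """Return every HC whose required BEs are all contained in G_BE."""
    return sorted(hc for hc, required in logic_definitions.items()
                  if required <= gbe)


# Hypothetical example: three predefined BEs with uniform priors.
log_priors = {be: math.log(1 / 3) for be in ("cheer", "whistle", "speech")}
windows = [
    {"cheer": -10.0, "whistle": -30.0, "speech": -25.0},
    {"cheer": -28.0, "whistle": -12.0, "speech": -26.0},
    {"cheer": -11.0, "whistle": -31.0, "speech": -27.0},  # repeated BE
    {"cheer": -90.0, "whistle": -95.0, "speech": -92.0},  # fits no model
]
# Assumed logic definitions: each HC requires all listed BEs.
logic_definitions = {
    "goal": {"cheer", "whistle"},
    "interview": {"speech"},
}

gbe = extract_gbe(windows, log_priors, reject_threshold=-50.0)
hcs = extract_hc(gbe, logic_definitions)
print(sorted(gbe))  # ['cheer', 'whistle']
print(hcs)          # ['goal']
```

Representing G_BE as a set makes "ignoring the order and repetition of BE" automatic, and a subset test is the simplest possible logic definition; definitions learned from examples could be richer than a plain conjunction.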

Keywords

semantic-audio content analysis; high-level semantic concept; semantic-video analysis; HMM
