Hybrid enhanced visual cognition architecture and advances in its key technologies

Wang Peiyuan, Guan Xin (Naval Aviation University, Yantai 264001)

Abstract
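Although intelligent vision systems have certain advantages in processing large-scale information, such as feature detection, extraction, and matching, they remain uncertain and fragile in deep-level cognition. In particular, for visual cognition tasks built on visual perception, the relevant mathematical logic and image processing methods have not achieved a qualitative breakthrough, and intelligent algorithms can hardly replace humans in performing relatively complex operations such as understanding, reasoning, decision making, and learning. To support the further development of intelligent visual perception and cognition technology, this paper summarizes the application status of hybrid enhanced intelligence in the field of visual cognition, presents the basic architecture of hybrid enhanced visual cognition, and reviews the application fields and key technologies that can be included in this architecture. First, on the basis of analyzing the connotation and basic scope of intelligent visual perception, and integrating human visual perception with psychological cognition, the definition, scope, and deepening process of hybrid enhanced visual cognition are discussed, and the different stages of visual information processing are compared; then, on the basis of analyzing the development status of relevant cognitive models, the basic framework of hybrid enhanced visual cognition is constructed. This architecture can not only rely on intelligent algorithms for rapid detection, recognition, understanding, and other processing, thereby exploiting the computational potential of the "machine" to the greatest extent, but can also effectively enhance the accuracy and reliability of system cognition through timely and appropriate human reasoning, prediction, and decision making, thereby giving full play to human cognitive advantages. Second, representative applications that can be included in the architecture, together with their existing problems, are discussed in four fields: hybrid enhanced visual monitoring, visual driving, visual decision making, and visual sharing; the hybrid enhanced visual cognition architecture is identified as a way to make better use of computer capabilities and reduce the human burden of processing information under existing technical conditions. Finally, based on the high-, medium-, and low-level computer vision processing technology system, the macro and micro relationships among some medium- and high-level visual processing technologies in the architecture are analyzed, with a focus on reviewing key technologies such as visual analytics, visual enhancement, visual attention, visual understanding, visual reasoning, interactive learning, and cognitive evaluation. The hybrid enhanced visual cognition architecture helps break through the "weak artificial intelligence" bottleneck in current visual information cognition and will strongly promote the development of intelligent vision systems toward deep human-machine integration. Going forward, more in-depth research is still needed on purely fundamental innovation, efficient human-computer interaction, and flexible connection paths.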
Keywords
Hybrid enhanced visual cognition framework and its key technologies

Wang Peiyuan, Guan Xin (Naval Aviation University, Yantai 264001, China)

Abstract
Although current intelligent vision systems have certain advantages in the feature detection, extraction, and matching of large-scale visual information, their cognition of deep-level visual information remains uncertain and fragile. How to mine and understand the meaning of visual information efficiently and make cognitive decisions is an active research topic in computer vision. Especially for visual cognition tasks based on visual perception, the related mathematical logic and image processing methods have not yet achieved a qualitative breakthrough, owing to the limitations of the Western philosophical system. As a result, the development of intelligent algorithms for computer vision processing has entered a bottleneck period, and it remains difficult for such algorithms to completely replace humans in performing more complex operations such as understanding, reasoning, decision making, and learning. To promote the development of intelligent visual perception and cognition technology, this paper summarizes the application status of hybrid enhanced intelligence in the field of visual cognition, presents the basic framework of hybrid enhanced visual cognition, and reviews the application fields and key technologies that can be included in the framework. First, on the basis of analyzing the connotation and basic scope of intelligent visual perception, human visual perception and psychological cognition are integrated; the definition, scope, and deepening process of hybrid enhanced visual cognition are discussed; the different visual information processing stages are compared and analyzed; and the basic framework of hybrid enhanced visual cognition is then constructed on the basis of an analysis of the development status of relevant cognitive models.
The framework can rely on intelligent algorithms for rapid detection, recognition, understanding, and other processing to maximize the computational potential of the "machine"; it can also effectively enhance the accuracy and reliability of system cognition through timely and appropriate human reasoning, prediction, and decision making, giving full play to human cognitive advantages. Second, representative applications that can be included in the framework, together with their existing problems, are discussed in four fields, namely, hybrid enhanced visual monitoring, hybrid enhanced visual driving, hybrid enhanced visual decision making, and hybrid enhanced visual sharing, and the hybrid enhanced visual cognition framework is identified as a way to make better use of computer capabilities and reduce the pressure on people to process information under existing technical conditions. Finally, based on the high-, medium-, and low-level computer vision processing technology system, the macro and micro relationships among several medium- and high-level visual processing technologies in the hybrid enhanced visual cognition framework are analyzed, focusing on key technologies such as visual analytics, visual enhancement, visual attention, visual understanding, visual reasoning, interactive learning, and cognitive evaluation. This framework will help break through the bottleneck of "weak artificial intelligence" in current visual information cognition and effectively promote the further development of intelligent vision systems toward deep human-computer integration. Going forward, more in-depth research must be carried out on purely fundamental innovation, efficient human-computer interaction, and flexible connection paths.
Keywords
