Just noticeable distortion threshold estimation for natural images by fusing structural and non-structural information

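Xu Chen1, Luo Ting2, Jiang Gangyi1, Yu Mei1, Jiang Qiuping1, Xu Haiyong1,2 (1. Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211; 2. College of Science and Technology, Ningbo University, Ningbo 315211)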

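Abstract
Objective Studies have shown that the just noticeable distortion (JND) threshold of an image is mainly related to the luminance adaptability of the visual system, the contrast mask, the module mask, and the image structure. To better study the influence of image structure on the JND threshold, a sparse representation-based model for separating structural and non-structural information is proposed and applied to JND threshold estimation on natural images, so that the JND threshold model agrees more closely with the human visual system. Method First, an over-complete visual dictionary is obtained with the K-singular value decomposition (K-SVD) algorithm. The over-complete dictionary is then used to sparsely represent and reconstruct the input natural image, which yields the structural layer and the non-structural layer of that image. For these two layers, a structural-layer JND estimation model based on luminance adaptability and contrast mask and a non-structural-layer JND estimation model based on luminance contrast and information uncertainty are designed. Finally, a nonlinear additive model that characterizes the masking effect is used to fuse the JND estimation models of the two components. Result The proposed JND estimation model uses sparse representation to separate the structural and non-structural information of a natural image and then evaluates each component with a JND model suited to its characteristics, which is highly consistent with the mechanism of visual perception. Experimental results show that the proposed JND model can effectively predict the JND threshold of natural images, and the peak signal-to-noise ratio (PSNR) of the contaminated images is 35 dB higher than that of the other three JND models compared. Conclusion Compared with existing models, the proposed model is more consistent with subjective human visual perception and predicts the JND threshold of natural images more effectively.
Keywords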
Just noticeable distortion threshold estimation on natural images using fusion of structured and unstructured information

Xu Chen1, Luo Ting2, Jiang Gangyi1, Yu Mei1, Jiang Qiuping1, Xu Haiyong1,2(1.Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China;2.College of Science and Technology, Ningbo University, Ningbo 315211, China)

Abstract
Objective The Bayesian brain theory of perception, studied in neuroscience, indicates that the human visual system processes input images indirectly. A complete internal inference mechanism actively predicts and interprets the input image information and tries to ignore the uncertain information in the image. In other words, given an input image, the brain does not process all of the incoming visual information. Instead, it actively predicts the gross structure of the image, which carries the certain information (structured information), while uncertain information (unstructured information), such as residual clutter, is ignored in understanding and perceiving the image. To account for the role of structured information in just noticeable distortion (JND) estimation on natural images, a sparse representation-based model for separating structured and unstructured information is proposed and applied to JND threshold estimation. The proposed method agrees closely with the human visual system in terms of the perceived JND threshold. Method First, 90 natural images are selected for dictionary learning. These training images are pre-processed, and each image is divided into non-overlapping 8×8 blocks. The variance of each block is calculated, and the blocks with high variance are selected as training samples. An over-complete dictionary is then learned from these training samples with the classical K-singular value decomposition (K-SVD) algorithm. Next, the input natural image is sparsely represented and reconstructed over the learned dictionary with the orthogonal matching pursuit (OMP) algorithm. Setting an appropriate number of OMP iterations yields the structural layer and the non-structural layer of the input image.
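The block selection and layer separation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the random unit-norm dictionary stands in for the K-SVD-learned dictionary, the `keep_ratio` threshold and all function names are hypothetical, and the non-structural layer is taken to be the residual left after reconstruction with a small number of OMP iterations.

```python
import numpy as np


def split_blocks(image, size=8):
    """Split a 2-D image into non-overlapping size x size blocks (one block per row)."""
    h, w = image.shape
    h, w = h - h % size, w - w % size  # drop incomplete border blocks
    blocks = image[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.swapaxes(1, 2).reshape(-1, size * size)


def select_high_variance(blocks, keep_ratio=0.5):
    """Return the keep_ratio fraction of blocks with the highest variance (assumed rule)."""
    n_keep = max(1, int(len(blocks) * keep_ratio))
    order = np.argsort(blocks.var(axis=1))[::-1]  # descending variance
    return blocks[order[:n_keep]]


def omp(dictionary, signal, n_iter):
    """Orthogonal matching pursuit stopped after n_iter atom selections."""
    coeffs = np.zeros(dictionary.shape[1])
    residual = signal.astype(float)
    support = []
    for _ in range(n_iter):
        correlations = np.abs(dictionary.T @ residual)
        if support:
            correlations[support] = -1.0  # never reselect an atom
        support.append(int(np.argmax(correlations)))
        atoms = dictionary[:, support]
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        solution, *_ = np.linalg.lstsq(atoms, signal, rcond=None)
        residual = signal - atoms @ solution
        coeffs[:] = 0.0
        coeffs[support] = solution
    return coeffs


def separate_layers(image, dictionary, n_iter, size=8):
    """Return (structural, non_structural) layers as rows of flattened blocks."""
    blocks = split_blocks(image, size).astype(float)
    structural = np.empty_like(blocks)
    for i, block in enumerate(blocks):
        structural[i] = dictionary @ omp(dictionary, block, n_iter)
    return structural, blocks - structural  # residual = non-structural layer


# Stand-in data: a random image and a random unit-norm dictionary (assumptions).
rng = np.random.default_rng(0)
dictionary = rng.standard_normal((64, 128))
dictionary /= np.linalg.norm(dictionary, axis=0)
image = rng.integers(0, 256, size=(32, 32)).astype(float)
structural, non_structural = separate_layers(image, dictionary, n_iter=4)
print(structural.shape, non_structural.shape)
```

In this sketch the iteration count controls the split: more iterations move more of each block into the structural layer and shrink the residual.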
Subsequently, separate JND estimation models are designed for the structural and non-structural layers. 1) JND estimation model for the structural layer, based on luminance adaptability and contrast mask. The JND threshold of an image is mainly related to the luminance adaptability of the visual system, the contrast mask, the module mask, and the image structure. Thus, the luminance adaptability function and the contrast mask equation are derived under experimental conditions with regular structure, and the JND model of the structural layer is obtained by fusing the two. 2) JND estimation model for the non-structural layer, based on luminance contrast and information uncertainty. The module mask effect describes how the visibility of a stimulus in the visual system changes because of the interaction or interference among visual stimuli in the input scene. When the visual content is orderly and the background is uniform, the module mask effect is very weak, and a spatial object is easily detected. Conversely, when the visual content is disordered and uncertain, the module mask effect is stronger, that is, the detection of spatial objects is suppressed. The module mask effect is therefore related not only to luminance contrast but also to information uncertainty, and the JND model of the non-structural layer is built on the module mask by combining information uncertainty with luminance contrast. Finally, because the JND of the structural layer and the JND of the non-structural layer overlap, a simple linear sum is not suitable for fusing them, and the overlapping part must be removed. A nonlinear additive model that describes the masking effect between different components is therefore used to fuse the two JND estimates. Result Three existing JND models are selected for comparison.
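The fusion step can be illustrated with a short sketch. The abstract does not give the formula, so the form below is an assumption: a nonlinear additive rule commonly seen in the JND literature, in which the two thresholds are summed and a fraction of the smaller one is subtracted to account for their overlap. The gain value 0.3 and the function names are hypothetical.

```python
def fuse_jnd(jnd_structural, jnd_non_structural, overlap_gain=0.3):
    """Sum two JND thresholds and subtract a share of their overlap.

    The overlap is approximated by the smaller of the two thresholds;
    overlap_gain (assumed value) sets how much of it is removed.
    """
    overlap = min(jnd_structural, jnd_non_structural)
    return jnd_structural + jnd_non_structural - overlap_gain * overlap


def fuse_jnd_maps(structural_map, non_structural_map, overlap_gain=0.3):
    """Apply fuse_jnd pixel by pixel to two equally sized 2-D lists."""
    return [
        [fuse_jnd(s, n, overlap_gain) for s, n in zip(row_s, row_n)]
        for row_s, row_n in zip(structural_map, non_structural_map)
    ]


fused = fuse_jnd_maps([[10.0, 2.0]], [[4.0, 2.0]])
print(fused)
```

With a gain between 0 and 1, the fused threshold is never larger than the plain sum and never smaller than the larger of the two inputs, which is the behavior the overlap argument calls for.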
For a fair comparison, the same noise is injected into the original image under the guidance of each JND model, and the visual quality of the resulting contaminated images is compared. The subjective results show that, for the same injected noise, the proposed JND model guides the noise distribution better than the other JND models and keeps noise away from the regions to which human vision is sensitive. The proposed JND model is also consistent with subjective human visual perception. For a further, objective comparison, the four JND models are compared using the classical peak signal-to-noise ratio (PSNR) on the contaminated Goddess and Lena images. The objective results show that the PSNRs of the proposed model are significantly higher than those of the other three JND models. The proposed JND estimation model uses sparse representation to separate the structured and unstructured information of the input natural image and then calculates the JND threshold according to the characteristics of each component, a process that is consistent with the mechanism of human visual perception. Therefore, the proposed JND estimation model can effectively and accurately predict the JND threshold of natural images. Conclusion Compared with existing models, the proposed JND model predicts the JND threshold of natural images effectively and is more consistent with human visual perception.
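The evaluation procedure above can be sketched as follows. The PSNR definition is the standard one for 8-bit images. The noise model, random ±1 noise scaled by the per-pixel JND threshold, is an assumption borrowed from common practice in JND evaluation, and the function names, the `strength` parameter, and the flat-list image representation are hypothetical choices made for illustration.

```python
import math
import random


def inject_noise(pixels, jnd, strength=1.0, seed=0):
    """Add random +/-1 noise scaled by the per-pixel JND, clipped to [0, 255]."""
    rng = random.Random(seed)
    return [
        min(255.0, max(0.0, p + strength * rng.choice((-1.0, 1.0)) * t))
        for p, t in zip(pixels, jnd)
    ]


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally long pixel lists."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: a flat gray patch with a uniform JND threshold of 3 gray levels.
original = [100.0] * 8
jnd_map = [3.0] * 8
contaminated = inject_noise(original, jnd_map)
print(round(psnr(original, contaminated), 2))
```

In an actual comparison, `strength` would be adjusted per model so that every JND model injects the same amount of noise before the contaminated images are judged.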
Keywords
