Error protection against key frame loss in distributed video coding

Rong Song, Yang Hong, Qing Linbo, Wang Zhengyong (College of Electronic Information Engineering, Sichuan University, Chengdu 610065)

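Abstract
Objective Compared with conventional video coding, distributed video coding offers simple encoding and high robustness to transmission errors, so it is well suited to emerging video services such as unmanned aerial photography and wireless surveillance. In distributed video coding, video frames are alternately divided into key frames and Wyner-Ziv frames. Because of channel fading, interference, and similar factors, key frames, which use conventional intra-frame coding, are far less robust to errors than Wyner-Ziv frames, which are based on channel coding. Whether key frames are transmitted and decoded correctly is decisive for whether Wyner-Ziv frames can be decoded correctly, and therefore affects the compression efficiency and rate-distortion performance of the whole system. To address robust transmission of key frames over heterogeneous networks, this paper proposes a wavelet-domain, quality-scalable protection and transmission scheme for key frames. Method At the encoder, key frames are encoded with both conventional intra-frame video coding and wavelet-domain Wyner-Ziv coding. At the decoder, the erroneous key frame after error concealment serves as the base layer, and the parity bitstream produced by Wyner-Ziv coding serves as the enhancement layers. To improve the layered structure so that the bit rate can adapt to different network conditions, the low-frequency and high-frequency bands from the different levels of the wavelet decomposition are further grouped into different enhancement layers, and the Wyner-Ziv parity data of different layers are transmitted according to the channel conditions. The virtual noise model for key frames under errors is also improved: the decoded and reconstructed bands of the first enhancement layer, together with their corresponding side information, are used to obtain a more realistic estimate of the virtual channel model for the corresponding bands of the second and third enhancement layers. Results For different video sequences with key frame error rates of 1% to 20%, the proposed scheme improves both the subjective quality of the reconstructed video and the overall rate-distortion performance compared with conventional intra-frame error concealment. For example, at a key frame error rate of 5%, transmitting the first enhancement layer raises the peak signal-to-noise ratio (PSNR) of different video sequences by about 2 to 5 dB; additionally transmitting the parity information of the second enhancement layer raises the PSNR by a further 0.5 to 1.6 dB or so; and if the parity information of all three enhancement layers is transmitted, the PSNR essentially reaches that of error-free key frames. Conclusion The proposed scheme effectively addresses the errors that key frames in a distributed video coding system may suffer during transmission over real channels, and its layered transmission adapts to the channel conditions of different networks.
Keywords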
Error protection for key frames in distributed video coding

Rong Song, Yang Hong, Qing Linbo, Wang Zhengyong(College of Electronic Information Engineering, Sichuan University, Chengdu 610065, China)

Abstract
Objective Distributed video coding (DVC) has attracted significant attention from international standardization committees and researchers since the emergence of distributed source coding (DSC), a class of source coding approaches based on the Slepian-Wolf and Wyner-Ziv (WZ) theorems. Owing to its lightweight encoding and high error robustness, DVC is well suited to emerging video services that require low power consumption and low complexity, such as video chat, unmanned aerial photography, and wireless monitoring. However, the bit error ratio of a wireless channel is higher than that of a wired channel because of channel attenuation, multipath interference, mutual interference between frequency bands, and similar effects. In a DVC system, the video source is split alternately into key frames and WZ frames, and the side information, regarded as a noisy version of the current WZ frame, is generated from the adjacent key frames by motion estimation and compensation. Whether the key frames are transmitted and decoded correctly therefore affects the compression efficiency and rate-distortion performance of the whole system. However, key frames, which use traditional intra-frame coding, are far less robust than WZ frames, which are based on channel coding. To address the robust transmission of key frames over heterogeneous networks, this paper presents a quality-scalable protection scheme for key frames in wavelet-domain DVC. Method At the encoder, key frames are encoded simultaneously with traditional HEVC/H.265 (High Efficiency Video Coding) intra-frame coding and wavelet-domain Wyner-Ziv coding. The HEVC bitstreams are transmitted over the wireless channel. For the WZ bitstreams, the information bits are discarded directly and the generated parity bits are stored in a buffer.
To make the bit rate of the system adapt to different network conditions, the low-frequency and high-frequency bands from the different levels of the wavelet decomposition are combined into different enhancement layers. The decoder first determines whether the HEVC bitstream of a key frame has suffered losses. If there is no error, the HEVC bitstream is decoded and reconstructed directly, and the WZ parity bits in the buffer are deleted. Otherwise, error concealment is applied to reconstruct a video frame from the received HEVC bitstream. The reconstructed frame is then used as the side information of the current key frame, and the decoder requests the WZ data of different enhancement layers according to the channel environment. In a DVC system, the difference between the original frame and its corresponding side information roughly obeys a Laplacian distribution. Because the decoder cannot access the original frame, the usual practice is to use the forward reference frame and the side information to obtain the virtual noise model of the current frame. However, if channel capacity is limited and the key frames also contain errors, the parity data of all enhancement layers cannot be sent. As a result, the reconstructed forward reference frame may be of relatively poor quality, and the estimated virtual noise model may deviate substantially from the actual one. This paper therefore improves the virtual noise model for erroneous key frames by exploiting the similarity of the virtual noise models of bands at the same level of the wavelet decomposition. Using the decoded bands of the first enhancement layer and their corresponding side information, a more realistic virtual noise model can be obtained for the second and third enhancement layers.
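The noise-model refinement described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the residual between a decoded band and its side information is zero-mean Laplacian with density f(x) = (alpha / 2) * exp(-alpha * |x|), whose variance is 2 / alpha^2, so alpha can be estimated as sqrt(2 / variance). The function name `estimate_laplacian_alpha` and the `eps` guard are hypothetical.

```python
import numpy as np


def estimate_laplacian_alpha(decoded_band, side_info_band, eps=1e-6):
    """Estimate the Laplacian parameter alpha of the virtual noise.

    Assumes a zero-mean Laplacian residual, so variance = 2 / alpha**2
    and therefore alpha = sqrt(2 / variance).
    """
    decoded = np.asarray(decoded_band, dtype=np.float64)
    side_info = np.asarray(side_info_band, dtype=np.float64)
    residual = decoded - side_info
    # Mean squared residual equals the variance under the zero-mean assumption.
    variance = np.mean(residual ** 2)
    # eps (hypothetical guard) avoids division by zero for identical bands.
    return float(np.sqrt(2.0 / max(variance, eps)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    side_info = rng.normal(0.0, 10.0, size=(64, 64))
    # Synthetic "decoded" band: side information plus Laplacian noise, scale b = 2.
    decoded = side_info + rng.laplace(0.0, 2.0, size=(64, 64))
    print(estimate_laplacian_alpha(decoded, side_info))  # close to 1 / b = 0.5
```

In the spirit of the scheme, an alpha estimated from a decoded first-enhancement-layer band could then be reused as the noise parameter for the second- and third-layer bands at the same decomposition level; how the paper maps bands to one another is not specified here.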
Result To validate the proposed scheme, the luminance components of three video sequences with different motion characteristics (Foreman, Bus, and Coastguard) are simulated. Rate-distortion performance is evaluated over packet loss channels with random packet loss rates (PLR) of 1%, 5%, 10%, and 20%. Experimental results show that, compared with the traditional error concealment method, the proposed scheme effectively improves the rate-distortion performance of the reconstructed video under different channel conditions. Specifically, when the loss rate of key frames is 5% and only the parity data of the first enhancement layer are transmitted, the peak signal-to-noise ratio (PSNR) of the reconstructed video improves by about 2 to 5 dB. If the parity data of the second enhancement layer are also transmitted, the PSNR increases by a further 0.5 to 1.6 dB. If the parity data of all three enhancement layers are transmitted, the decoded video essentially reaches the quality of error-free key frames. When the loss rate is relatively high, such as 20%, the video reconstructed by typical error concealment can hardly meet basic quality requirements. With the parity data of the first enhancement layer transmitted, however, the proposed scheme improves the PSNR by about 4.5 to 8.3 dB. Additionally transmitting the parity data of the second enhancement layer increases the PSNR by a further 2.7 to 4.1 dB, and when the parity data of all three enhancement layers are transmitted, the PSNR increases by another 3.7 to 4.6 dB. In general, different reconstructed video quality can be obtained by transmitting different numbers of enhancement layers.
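For reference, the PSNR figures quoted above follow the standard definition for 8-bit luminance, PSNR = 10 * log10(255^2 / MSE). The sketch below shows that calculation only; the function name `psnr_luma` is hypothetical, and the frames in the usage example are synthetic rather than taken from the test sequences.

```python
import numpy as np


def psnr_luma(reference, reconstructed, peak=255.0):
    """Return the PSNR in dB between two luminance frames of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    if ref.shape != rec.shape:
        raise ValueError("frames must have the same shape")
    mse = np.mean((ref - rec) ** 2)
    if mse == 0.0:
        # Identical frames: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic CIF-sized luminance frame and a noisy reconstruction of it.
    original = rng.integers(0, 256, size=(288, 352)).astype(np.float64)
    noisy = np.clip(original + rng.normal(0.0, 5.0, size=original.shape), 0, 255)
    print(psnr_luma(original, noisy))
```

A PSNR gain between two decoding configurations is then simply the difference of the two values computed against the same original frame.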
Conclusion Experimental results indicate that the proposed error protection scheme for key frames in wavelet-domain DVC improves the robustness of key frames. The proposed framework also improves rate-distortion performance under different channel environments and requirements. However, the scheme relies on a feedback channel, which introduces some decoding delay. Rate estimation at the encoder is therefore a direction for future research.
Keywords
