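Hardware implementation algorithm for image depth estimation
Abstract
Objective In recent years, 3DTV (3-dimension television) and VR (virtual reality) technologies have developed rapidly, but the shortage of 3D content has become a bottleneck for them. To supply more 3D content quickly, existing 2D video must be converted to 3D video. Depth estimation is the key to 2D-to-3D conversion; to meet the demanding real-time requirements of the conversion process, this paper proposes a hardware implementation scheme based on the relative-height depth cue method. Method First, Sobel edge detection is applied to the grayscale image to obtain an edge map; line tracing and depth assignment are then performed on the edge map to complete depth estimation and obtain the depth map. In the hardware implementation, Sobel edge detection uses a five-stage pipeline design and the line trajectories are computed in parallel, making full use of the parallelism of hardware design to improve processing efficiency. In depth estimation, the "energy function" is simplified through equivalent processing, so that the large number of multiplication, division, and exponential operations in the algorithm are reduced to addition, subtraction, and comparison operations, which lowers hardware resource overhead. The design also uses the burst characteristics of SDRAM (synchronous dynamic random access memory) to perform row-column conversion, saving system hardware resources. Result The algorithm was implemented on an FPGA (field programmable gate array), and two images were selected for depth information extraction. The software and hardware processing results of the proposed method were compared with a depth map extraction method based on motion estimation. The results show that the proposed algorithm extracts image depth maps better than the motion estimation method, and that the hardware processing can extract depth information from 2D images with results consistent with software processing. At a clock frequency of 100 MHz, the estimated frame rate reaches 33.18 frame/s. Conclusion The proposed hardware implementation scheme can extract depth information from a single image, and the estimated frame rate is well above the 24 frame/s required for real-time operation in 3D video applications such as 3DTV. The scheme offers good real-time performance and portability and lays a foundation for subsequent video information processing.
Keywords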
Hardware implementation algorithm of image depth estimation
Yang Yuan, Chen Fu (Faculty of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China)
Abstract
Objective In recent years, 3D television (3DTV) and virtual reality (VR) technologies have developed rapidly, but the shortage of 3D resources has become a bottleneck for their development. Existing 2D videos must be converted to 3D videos to provide more 3D resources quickly. Depth estimation is the key step of 2D-to-3D conversion, and hardware implementation is one of the effective ways to meet the real-time requirements of the conversion process. Most depth estimation algorithms, however, are highly complex to implement in hardware. Balancing depth estimation quality against ease of implementation, this study proposes a hardware implementation scheme based on the relative-height depth cue method to achieve high-speed processing while saving hardware resources. Method At the algorithm level, a color image is first converted to grayscale, and an edge map is obtained by applying Sobel edge detection to the grayscale image. Line traces are then obtained by a line tracing algorithm, and the depth map is obtained by assigning depth values along the line trajectories. In the hardware implementation, Sobel edge detection uses a five-stage pipeline design and the line trajectories are calculated in parallel, exploiting the parallelism of hardware design to improve system efficiency. In depth estimation, the energy function is simplified by equivalent processing, so that a large number of multiplication, division, and exponential operations are replaced by addition, subtraction, and comparison operations. More than 2 300 multiplication and division operations and more than 780 exponential operations are eliminated, thereby reducing hardware resource cost. Because line tracing and depth assignment are performed column by column, the edge map information needs to be converted from rows to columns. In this design, SDRAM burst characteristics are used to complete the row-column conversion and save system hardware resources. The hardware implementation scheme is designed in Verilog HDL, a hardware description language.
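As a rough software illustration of the first steps described above (Sobel edge detection followed by the row-to-column conversion that precedes line tracing), the following Python sketch may help. It is not the authors' Verilog design: the function names `sobel_edges` and `rows_to_columns`, the |Gx| + |Gy| magnitude approximation, and the threshold value are all assumptions chosen for illustration, and the in-memory transposition merely stands in for the SDRAM burst-based conversion.

```python
def sobel_edges(gray, threshold=128):
    """Return a binary edge map (0 or 1) for a 2D list of 8-bit gray values.

    Assumption: the gradient magnitude is approximated by |Gx| + |Gy|,
    which needs only additions, subtractions, doublings, and one
    comparison. Border pixels are left at 0.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # 3x3 neighbourhood around (y, x); the centre pixel is unused.
            tl, tc, tr = gray[y - 1][x - 1], gray[y - 1][x], gray[y - 1][x + 1]
            ml, mr = gray[y][x - 1], gray[y][x + 1]
            bl, bc, br = gray[y + 1][x - 1], gray[y + 1][x], gray[y + 1][x + 1]
            # Horizontal gradient: right column minus left column.
            gx = (tr + 2 * mr + br) - (tl + 2 * ml + bl)
            # Vertical gradient: bottom row minus top row.
            gy = (bl + 2 * bc + br) - (tl + 2 * tc + tr)
            magnitude = abs(gx) + abs(gy)
            edges[y][x] = 1 if magnitude >= threshold else 0
    return edges


def rows_to_columns(image):
    """Transpose a row-major image so that each inner list is one column."""
    return [list(column) for column in zip(*image)]


if __name__ == "__main__":
    # A 5x5 test image with a vertical step edge between columns 1 and 2.
    gray = [[0, 0, 255, 255, 255] for _ in range(5)]
    edge_map = sobel_edges(gray)
    for column in rows_to_columns(edge_map):
        print(column)
```

In a pipelined hardware design the neighbourhood fetch, the two gradient sums, the absolute-value addition, and the threshold comparison would each map naturally onto separate stages, but the stage split shown in this sketch is only one plausible reading of the five-stage pipeline mentioned in the abstract.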
Result To verify the feasibility of the hardware implementation method, the study selects two typical images, one of buildings and one of people, and verifies the algorithm on the Altera DE2-115 FPGA platform. The verification procedure is as follows. First, the Verilog HDL design is simulated with Quartus II. A grayscale picture of 1 024×768 pixels is then downloaded to the FPGA through the serial port, and the depth map is estimated by the FPGA. The data are subsequently sent to the PC through the serial port, and the depth map is drawn in MATLAB. Simulation and verification results show that the proposed hardware implementation method extracts the depth of 2D images correctly and that the estimated frame rate reaches 33.18 frame/s at a 100 MHz clock frequency. Finally, the hardware processing result is compared with the software processing result of this method and with a typical motion estimation algorithm, and the peak signal-to-noise ratio (PSNR) after image processing is calculated. Experimental results show that the PSNR of the three methods for the building picture is 13.147, 13.028, and 13.208 4 and that the PSNR of the three methods for the people picture is 11.072 8, 10.94, and 10.980 4. Thus, the proposed algorithm is more effective than the motion estimation method, and the hardware processing extracts depth from 2D images with results consistent with the software processing. Conclusion The proposed hardware implementation can extract depth information from an image. The estimated frame rate exceeds the real-time requirement of 3D video applications, such as 24 frame/s in 3DTV, and the scheme has good real-time performance and portability, thereby establishing a foundation for subsequent video information processing. However, as with methods based on other typical algorithms such as motion estimation, the edges of the extracted depth map in this study remain sharp, with burrs. Future work could apply a digital filter to smooth the depth map and improve its quality.
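The PSNR figures quoted above presumably follow the standard definition, PSNR = 10·log10(peak² / MSE). A minimal Python sketch of that calculation is given below; the function name `psnr`, the 8-bit peak value of 255, and the choice of reference image are assumptions, since the abstract does not state which reference the depth maps were compared against.

```python
import math


def psnr(reference, estimate, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images.

    Assumption: both images are 2D lists of gray values in [0, peak].
    Returns math.inf when the two images are identical.
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of rows")
    squared_error = 0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        if len(ref_row) != len(est_row):
            raise ValueError("images must have the same number of columns")
        for ref_pixel, est_pixel in zip(ref_row, est_row):
            squared_error += (ref_pixel - est_pixel) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    reference = [[10, 20], [30, 40]]
    estimate = [[11, 21], [31, 41]]  # every pixel off by one, so MSE = 1
    print(round(psnr(reference, estimate), 4))
```

A higher PSNR means the estimate is closer to the reference, so values of the same order for two outputs (as reported above for the hardware and software results) indicate similar agreement with whatever reference was used.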
Keywords