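Monocular depth measurement using motion cues
Abstract
Objective Traditional monocular-vision depth measurement methods offer simple equipment, low cost, and fast computation, but they require complex camera calibration and apply only under specific scene conditions. To address this, an object depth measurement method based on motion parallax cues is proposed: feature points are extracted from the images, and the relationship between the feature points and image depth is used to obtain the measurement. Method The two images are segmented to obtain the region containing the object to be measured. The improved scale-invariant feature transform (SIFT) algorithm proposed in this paper is then used to match the two images, and the matching and segmentation results are combined to obtain the matches that belong to the measured object. The Graham scan is used to compute the convex hull of the matched feature points, and the length of the longest line segment on the hull is obtained. Finally, image depth is computed from the basic principles of camera imaging and triangle geometry. Result Experimental results show that the proposed method improves both measurement accuracy and real-time performance. When the object in the image is not occluded, the error between the actual and measured distances is 2.60% and the measurement takes 1.577 s. When the object is partially occluded, the method still achieves good results, with an error of 3.19% and a measurement time of 1.689 s. Conclusion Estimating image depth from feature points in two images is robust to partial occlusion of the object and avoids the complex camera calibration process, which gives the method practical application value.
Keywords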
Monocular depth measurement using motion cues
Wang Wei, Liang Fengmei, Wang Linlin (School of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China)
Abstract
Objective An ordinary camera forms a 2D image by projecting a 3D scene onto the imaging plane. This projection discards the depth information of the scene, which hinders research that depends on it: target detection and tracking, 3D model reconstruction, and intelligent robots in industrial automation all need scene depth. Recovering scene depth is therefore a basic problem in machine vision, and vision-based depth extraction has become an important topic in computer vision research. Among these methods, image depth measurement based on monocular vision has the advantages of simple equipment, low cost, and fast computation, and has become a research hot spot. Traditional monocular depth measurement, however, requires complex camera calibration, so it is hard to operate and difficult to apply in practice. Most traditional methods also rely on specific scene conditions, such as occlusion relations or defocus in the scene, which limits their applicability. To address these limitations, this paper proposes an object depth measurement method based on motion parallax cues: feature points are extracted from the images, the relationship between the feature points and image depth is analyzed, and the image depth is obtained from that relationship. Method The method obtains image depth from the parallax cues generated by camera motion, so it requires two images. The camera is mounted on a movable rail. After the first image is taken, the camera is moved a certain distance along the optical axis, and the second image is taken without adjusting any camera parameter.
First, the two images acquired by moving the camera are segmented to extract the region of interest (ROI) that contains the object to be measured. Second, the improved scale-invariant feature transform (SIFT) algorithm proposed in this paper is used to match the two images. The image segmentation and image matching results are then combined to obtain the matches that belong to the object to be measured. Next, the Graham scanning method is used to obtain the convex hull of the matched feature points in each image, from which the length of the longest line segment on the convex hull is obtained. Finally, the basic principles of camera imaging and triangle geometry are used to calculate the image depth. Result The proposed method is compared with another method, and the results are provided in a table. The experiment is divided into two groups and compares the two methods in two respects: measurement time and precision. The first set of experimental results shows that the proposed method achieves good measurement results in a simple background environment: the error between the actual distance and the measured distance is 2.60%, and the measurement takes 1.577 s. The second set of experiments shows that when partial occlusion occurs in the scene, the error between the actual distance and the measured distance is 3.19%, and the measurement takes 1.689 s. Comparing the two sets of experimental data shows that the method improves on the previous method in both measurement accuracy and measurement time, with the larger gain in measurement time. Conclusion The method estimates image depth by moving the camera and obtaining the lengths of the corresponding line segment in the two images, which avoids the complicated camera calibration process. The improved image matching algorithm reduces the computational complexity, making the method fast and accurate.
The method has research value as a way of obtaining image depth information: it only needs to process two pictures, so its hardware requirements are simple. The measurement process does not require a large amount of scene information, so its scope of application is wide. Because the method measures image depth from feature points on the image, it is not constrained by partial occlusion of the measured object and remains robust. However, the method relies on an image segmentation algorithm, so the segmentation result strongly influences the measurement accuracy. If the captured image contains a complex background, accurate segmentation results and ideal depth measurements are difficult to obtain. Future optimization of the method should therefore focus on adapting it to complex background environments.
Keywords
image depth; monocular vision; motion parallax; improved scale-invariant feature transform (SIFT) algorithm; Graham scanning method
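The last two steps of the pipeline described in the abstract (convex hull of the matched feature points, then depth from the change in segment length) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: an ideal pinhole camera, a camera that moves a known distance d straight toward the object, and a measured segment roughly parallel to the image plane. Under those assumptions the image length of a segment is inversely proportional to depth, so l1 * Z1 = l2 * (Z1 - d), which gives Z1 = l2 * d / (l2 - l1) without needing the focal length. Reading "longest line segment on the convex hull" as the largest distance between two hull vertices is also an assumption, and all function names are hypothetical.

```python
import math
from functools import cmp_to_key
from itertools import combinations


def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def graham_scan(points):
    """Convex hull of 2D points (tuples), counter-clockwise from the lowest point."""
    pts = sorted(set(points), key=lambda p: (p[1], p[0]))
    if len(pts) < 3:
        return pts
    pivot = pts[0]  # lowest point (leftmost on ties) is always on the hull

    def by_angle(a, b):
        # Order by polar angle around the pivot; nearer point first on ties.
        turn = cross(pivot, a, b)
        if turn != 0:
            return -1 if turn > 0 else 1
        da, db = math.dist(pivot, a), math.dist(pivot, b)
        return (da > db) - (da < db)

    hull = [pivot]
    for p in sorted(pts[1:], key=cmp_to_key(by_angle)):
        # Pop points that would make a clockwise or straight turn.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def longest_hull_segment(points):
    """Largest distance between two hull vertices (assumed reading of the text)."""
    hull = graham_scan(points)
    if len(hull) < 2:
        raise ValueError("need at least two distinct points")
    return max(math.dist(a, b) for a, b in combinations(hull, 2))


def depth_from_forward_motion(len_first, len_second, displacement):
    """Depths (Z1, Z2) at the two camera positions from similar triangles.

    Assumes the camera moved `displacement` toward the object, so the
    segment appears longer in the second image (len_second > len_first).
    """
    if len_second <= len_first:
        raise ValueError("segment must appear longer in the second image")
    growth = len_second - len_first
    z_first = len_second * displacement / growth
    z_second = len_first * displacement / growth
    return z_first, z_second
```

In use, the matched feature point coordinates from each image would be passed to `longest_hull_segment`, and the two lengths plus the rail displacement to `depth_from_forward_motion`. The returned depths are in the same unit as the displacement. When the two lengths are close, the denominator is small and the estimate becomes sensitive to matching noise, so a larger camera displacement should give a more stable result.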