Published: 2024-09-04
Tactile-enhanced graph convolutional point cloud super-resolution network

Zhang Chi, Li Jian, Wang Puzheng, Shi Jianhan, Wang Huaiyu, Wang Qin (School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing)

Abstract
Abstract: Objective With the rapid development of 3D scanners and 3D point cloud acquisition technology, 3D point clouds are being applied ever more widely in computer vision, robot guidance, industrial design, and other fields. However, owing to limitations such as sensor resolution, scanning time, and scanning conditions, the acquired point clouds are often sparse and cannot meet the requirements of many application tasks, so upsampling is generally used to obtain dense point clouds. Because the original sparse point cloud lacks detail, upsampling a single low-resolution point cloud usually yields poor results. Method This paper proposes, for the first time, a tactile-enhanced graph convolutional point cloud super-resolution network. Its main idea is to extract tactile features with dynamic graph convolution and fuse them with low-resolution point cloud features to obtain a more accurate high-resolution point cloud. Tactile point clouds are denser and more precise than low-resolution point clouds and are relatively easy to acquire, so fusing them with the original sparse point cloud yields more accurate local features and effectively improves upsampling accuracy. Result We first constructed a 3D vision and touch dataset for point cloud super-resolution (3D Vision and Touch, 3DVT) containing 12,732 samples, of which 70% are used for training and 30% for testing. We then tested and validated on this dataset using the chamfer distance as the evaluation metric. Experimental results show that without tactile information the average chamfer distance of the super-resolved point cloud is 3.009×10^-3; with one round of tactile information fusion it drops to 1.931×10^-3; and with two rounds of tactile information fusion it drops further to 1.916×10^-3, verifying that the proposed network improves point cloud super-resolution. Visualizations of different objects also show that, with tactile assistance, the upsampled point clouds are more uniformly distributed and have smoother edges. Further noise experiments show that, aided by tactile information, the proposed network is more robust to noise. In comparative experiments on the 3DVT dataset, the proposed algorithm reduces the average chamfer distance by 19.22% relative to the latest existing algorithms, achieving the best results. Conclusion The proposed tactile-enhanced graph convolutional point cloud super-resolution network, which uses dynamic graph convolution to extract tactile point cloud features and fuses them with the low-resolution point cloud, effectively improves the quality of the reconstructed high-resolution point cloud and is robust to ambient noise.
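The chamfer distance figures above compare predicted and ground-truth point sets. As a concrete reference, here is a minimal sketch of the symmetric chamfer distance in PyTorch; the squared-distance and per-direction averaging conventions are assumptions of this sketch, since papers and codebases vary on them.

```python
# Minimal sketch of the symmetric chamfer distance used as the evaluation
# metric above. Brute-force O(N*M) pairwise distances; squared distances and
# per-direction averaging are assumed conventions, not confirmed by the paper.
import torch

def chamfer_distance(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """p: (N, 3) predicted points, q: (M, 3) ground-truth points."""
    d2 = torch.cdist(p, q) ** 2          # (N, M) squared Euclidean distances
    # Nearest-neighbor distance in each direction, averaged, then summed.
    return d2.min(dim=1).values.mean() + d2.min(dim=0).values.mean()

if __name__ == "__main__":
    pred = torch.rand(1024, 3)   # stand-in for a super-resolved point cloud
    gt = torch.rand(4096, 3)     # stand-in for the dense ground truth
    print(chamfer_distance(pred, gt).item())
```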

Abstract
Abstract: Objective With the rapid development of 3D scanners and 3D point cloud acquisition technologies, the application of 3D point clouds in computer vision, robot guidance, industrial design, and other fields has become increasingly widespread. As long as a point cloud is sufficiently dense, accurate models can be constructed to meet the demands of various advanced point cloud tasks: accurate point clouds enable better performance in tasks such as semantic segmentation, completion, and classification. However, due to limitations such as sensor resolution, scanning time, and scanning conditions, the acquired point clouds are often sparse. Existing point cloud upsampling methods operate on a single low-resolution point cloud alone, without assistance from any additional modality, and they yield poor results when upsampling highly sparse point clouds at large magnification rates. Meanwhile, tactile information has gradually been adopted in 3D reconstruction, where complete 3D models are reconstructed from multi-modal inputs such as RGB images, depth images, and tactile readings; however, tactile point clouds have not yet been applied to point cloud super-resolution. Method In this study, we propose a tactile-enhanced graph convolutional point cloud super-resolution network that uses dynamic graph convolution to extract tactile features and fuses them with low-resolution point cloud features to obtain more accurate high-resolution point clouds. The network consists of a feature extraction module and an upsampling module. The feature extraction module extracts features from the low-resolution point cloud and the tactile point cloud, while the upsampling module performs feature expansion and coordinate reconstruction to output the high-resolution point cloud. The key to the network lies in extracting features from the tactile point cloud and fusing them with the low-resolution point cloud features. The tactile feature extraction module consists of a multilayer perceptron (MLP) followed by four cascaded dynamic graph convolution layers. The tactile point cloud is first mapped to a high-dimensional space by the MLP for subsequent feature extraction. Each dynamic graph convolution layer consists mainly of a k-nearest neighbors (KNN) search and an edge convolution: the KNN algorithm recomputes the neighbors of each point and constructs the graph structure, effectively aggregating local feature information, while the edge convolution extracts features from each center point and its neighbors. Because the k nearest neighbors of a point vary across network layers, the graph structure is dynamically updated at every layer. Feature extraction for the low-resolution point cloud uses graph convolution with a graph structure that is constructed once by the KNN algorithm and then shared with subsequent layers. After the features of the low-resolution point cloud and the tactile point cloud are fused, the fused features undergo further progressive feature extraction, mainly through densely connected graph convolution modules. A bottleneck layer compresses the features to reduce the computational cost of subsequent layers; two parallel dense graph convolutions extract local features, while a global pooling layer extracts global features. Finally, a feature rearrangement module and a coordinate reconstruction module map the high-dimensional features back to 3D coordinates. Compared with the low-resolution point cloud, the local tactile point cloud is denser and more precise, whereas the low-resolution point cloud is sparser and carries less local information; with the assistance of tactile information, enhanced local features can therefore be obtained.
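To make the KNN-plus-edge-convolution step concrete, the following is a minimal PyTorch sketch of one dynamic graph convolution layer in the DGCNN style described above; the value of k, the channel widths, and the names knn_graph and EdgeConv are illustrative assumptions rather than the paper's implementation. Because the graph is rebuilt from the current features on every call, stacking four such layers mirrors the per-layer dynamic graph update described in the text.

```python
# Minimal sketch of one dynamic graph convolution (KNN + edge convolution)
# layer. k, channel widths, and class names are assumptions of this sketch.
import torch
import torch.nn as nn

def knn_graph(x: torch.Tensor, k: int) -> torch.Tensor:
    """x: (B, N, C) point features -> (B, N, k) neighbor indices."""
    dist = torch.cdist(x, x)                                  # (B, N, N) pairwise distances
    return dist.topk(k + 1, largest=False).indices[..., 1:]  # drop the self-match

class EdgeConv(nn.Module):
    """Edge convolution: edge feature [x_i, x_j - x_i], max-pooled over k neighbors."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 16):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Linear(2 * in_ch, out_ch),
            nn.BatchNorm1d(out_ch),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape
        idx = knn_graph(x, self.k)                 # graph rebuilt from current features
        nbrs = torch.gather(
            x.unsqueeze(1).expand(B, N, N, C), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, C),
        )                                          # (B, N, k, C) neighbor features
        center = x.unsqueeze(2).expand_as(nbrs)    # (B, N, k, C) repeated center points
        edges = torch.cat([center, nbrs - center], dim=-1)             # (B, N, k, 2C)
        out = self.mlp(edges.reshape(-1, 2 * C)).reshape(B, N, self.k, -1)
        return out.max(dim=2).values               # (B, N, out_ch) aggregated features

if __name__ == "__main__":
    feats = torch.rand(2, 512, 64)   # e.g. tactile features after the initial MLP
    print(EdgeConv(64, 64)(feats).shape)           # torch.Size([2, 512, 64])
```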
Result In this study, we constructed 3DVT (3D Vision and Touch), a point cloud super-resolution dataset with tactile information, and trained the network on it. The dataset covers a diverse range of object categories and contains 12,732 samples, of which 70% are used for training and 30% for testing. Using chamfer distance as the evaluation metric, experimental results show that without tactile information the average chamfer distance is 3.009×10^-3; with one instance of tactile information added it decreases to 1.931×10^-3; and with two instances it further decreases to 1.916×10^-3. Tactile point clouds thus enhance the quality of the high-resolution point clouds and serve as an effective auxiliary signal in point cloud super-resolution. Visualizations of different objects demonstrate that, with the assistance of tactile information, the distribution of the upsampled point clouds becomes more uniform and their edges become smoother. Aided by tactile point clouds, the network better fills holes in the point cloud and produces fewer outliers. Quantitative results for chamfer distance and density-aware chamfer distance on different objects confirm the effectiveness of tactile assistance in the super-resolution task, and the improvement is even more pronounced for complex objects. In the noise experiments, at a noise level of 1% the average chamfer distance is 3.132×10^-3 without tactile information versus 1.954×10^-3 with two instances of tactile information; at a noise level of 3%, it is 3.331×10^-3 without tactile information versus 2.001×10^-3 with two instances. These experiments demonstrate that tactile information reduces the impact of noise on the network, indicating strong robustness. In comparative experiments on 3DVT, the proposed algorithm reduces the average chamfer distance by 19.22% relative to the latest existing algorithms, achieving the best results. Conclusion Dynamic graph convolution effectively extracts initial features from tactile point clouds, and these features contain rich local information; fused with the low-resolution features, they effectively assist the point cloud super-resolution task. The proposed tactile-enhanced graph convolutional point cloud super-resolution network uses dynamic graph convolution to extract tactile features and fuses them with low-resolution point cloud features, effectively improving the quality of the high-resolution point clouds while remaining robust to noise. A strength of the method is that it achieves better results by incorporating tactile information without modifying the network architecture. It can provide high-quality point clouds for downstream visual tasks such as point cloud classification and object detection, laying a foundation for the further development and application of point clouds.
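The 1% and 3% settings above can be read as Gaussian jitter scaled to the object size. Below is a hedged sketch of how such a robustness test might be set up; interpreting the noise level as a fraction of the bounding-box diagonal, and the helper name add_noise, are assumptions of this sketch rather than the paper's protocol.

```python
# Hedged sketch of the noise-robustness test: perturb the sparse input with
# Gaussian jitter and re-measure chamfer distance. Scaling the noise by the
# bounding-box diagonal is an assumption, not the paper's stated protocol.
import torch

def add_noise(points: torch.Tensor, level: float) -> torch.Tensor:
    """points: (N, 3); level: e.g. 0.01 for the 1% setting, 0.03 for 3%."""
    diag = (points.max(dim=0).values - points.min(dim=0).values).norm()
    return points + torch.randn_like(points) * (level * diag)

if __name__ == "__main__":
    sparse = torch.rand(256, 3)        # stand-in low-resolution input cloud
    for level in (0.01, 0.03):         # the 1% and 3% noise levels reported above
        noisy = add_noise(sparse, level)
        # The noisy cloud would be super-resolved here, with and without
        # tactile fusion, and chamfer distances compared as in Result.
        print(level, (noisy - sparse).norm(dim=1).mean().item())
```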