Metal part recognition combining deep learning and support vector machine
Abstract
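Objective In vision-guided automatic picking by industrial robots, one of the key technical difficulties is identifying the target region for grasping. Metal parts are especially challenging: unstructured factors such as surface reflection and mutual occlusion under random placement make the picking region hard to identify. This paper therefore proposes a picking region recognition method that combines deep learning with a support vector machine. Method Histogram of oriented gradient (HOG) and local binary pattern (LBP) features of the picking region are extracted separately, the fused features are reduced in dimension by principal component analysis (PCA), and the result is used to train a support vector machine (SVM) classifier. A Mask R-CNN (regions with convolutional neural network) is trained to perform the initial segmentation of the picking region. The SVM then reclassifies the regions identified by Mask R-CNN to reject interference regions. Finally, the mask is computed to complete instance segmentation and thus identify the picking region precisely. Result For randomly placed copper parts, the proposed algorithm was compared with Mask R-CNN alone and with the multi-feature fusion SVM on three metrics: recognition accuracy, false detection rate, and missed detection rate. The proposed algorithm improves recognition accuracy by 7% and 25% over Mask R-CNN and SVM, respectively, while effectively reducing the false and missed detection rates. Conclusion By combining Mask R-CNN and SVM, the proposed algorithm shows a degree of robustness to reflection and occlusion and effectively improves target recognition accuracy.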
Keywords
Metal part recognition based on deep learning and support vector machine
Zheng Jianhong, Bao Guanjun, Zhang Libin, Xun Yi, Chen Jiaoliao (Key Laboratory of Special Purpose Equipment and Advanced Manufacturing Technology, Ministry of Education, Zhejiang University of Technology, Hangzhou 310023, China)
Abstract
Objective Against the background of "machine substitution," robotic visual intelligence is crucial to upgrading the manufacturing industry, and algorithm-guided industrial robots with visual perception are receiving increasing attention in industrial production. One of the most critical difficulties in automatic picking by industrial robots is identifying the target area. The problem is particularly prominent when picking metal parts: unstructured factors, such as reflective surfaces and mutual occlusion under random placement, make the picking area hard to identify. To address these problems, this study proposes a picking region recognition method that combines deep learning with a support vector machine (SVM), exploiting the advantages of each model to improve recognition accuracy. Method The proposed approach builds a new model that combines Mask R-CNN (regions with convolutional neural network features) with an SVM. It comprises feature extraction, multi-feature fusion, SVM classifier training, neural network training, and the combination of the SVM with the deep neural network. First, the local binary pattern (LBP) and histogram of oriented gradient (HOG) features of the picking area are extracted. Interference areas pose a major challenge to identifying the picking area: they are areas that are easily misidentified as the picking area, and they were determined through long-term practice on the assembly line. Directly merging the two features yields a feature matrix that is too large, so we use principal component analysis (PCA) to reduce its dimensions and train the SVM classifier on the reduced feature matrix. The matrix obtained by directly fusing the two features has a size of 7 000×2 692.
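The dimension-reduction step can be illustrated with a minimal sketch of how a number of principal components might be chosen from a cumulative contribution rate. The function name `components_for_rate` and the toy eigenvalues are hypothetical and are not taken from the paper; the sketch only assumes that the eigenvalues of the feature covariance matrix are already available.

```python
from itertools import accumulate


def components_for_rate(eigenvalues, rate):
    """Return the smallest k such that the k largest eigenvalues account for
    at least `rate` of the total variance (cumulative contribution rate)."""
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must lie in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= rate:
            return k
    # Guard against floating-point rounding when rate is 1.0.
    return len(ordered)


# Toy eigenvalues (hypothetical), not values from the paper.
toy_eigenvalues = [50, 30, 10, 6, 4]
k = components_for_rate(toy_eigenvalues, 0.94)
```

With these toy values the first three components cover 90% of the variance and the first four cover 96%, so a 0.94 threshold keeps four components. The same rule, applied to the fused HOG and LBP features, is what determines how many columns survive the reduction.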
Hence, we select a cumulative contribution rate of 94%, at which the recognition accuracy reaches 97.25%; after dimension reduction, the size of the feature matrix is 7 000×231. Next, we complete the initial segmentation of the picking area by training Mask R-CNN; this initial result may still contain interference areas. Mask R-CNN consists roughly of the following parts: feature extraction, the region proposal network (RPN), ROIAlign, and the final output. The feature extraction part is the backbone of the entire network; it extracts important features of the different targets from a large number of training images. We use a pretrained residual network (ResNet101) as the feature extraction network. The RPN uses the feature map to obtain candidate boxes for objects in the original image, which is implemented with anchors. In this study, nine candidate regions are generated for each anchor on the feature map, from three scales (128, 256, and 512 pixels) and three aspect ratios (1:1, 0.5:1, and 1:0.5). ROIAlign pools the area of the feature map that corresponds to each candidate box to a fixed size according to the box coordinates. The final classification and regression results are produced by the fully connected layers, and the object mask is produced by a deconvolution operation. The SVM then performs a secondary classification of the initial segmentation results, which largely eliminates the interference areas. The final instance segmentation is completed by computing the mask of the picking area. Result The multi-feature fusion SVM, Mask R-CNN, and the proposed algorithm are used to detect the picking areas of 500 metal parts. Experimental results show that the proposed algorithm can adapt to picking region recognition.
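The anchor scheme in the Method description above (three scales times three aspect ratios, giving nine candidate regions per anchor) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `make_anchors` is hypothetical, the ratio is read as height divided by width, and each box is assumed to keep an area of scale squared, which is a common convention that the abstract does not confirm.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """Return candidate boxes (x1, y1, x2, y2) centred on (cx, cy).

    Assumptions (not stated in the abstract):
      - ratio = height / width, so 1:1, 0.5:1 and 1:0.5 become 1.0, 0.5, 2.0;
      - every box keeps an area of scale * scale pixels.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)
            anchors.append((cx - width / 2, cy - height / 2,
                            cx + width / 2, cy + height / 2))
    return anchors


# Nine candidate boxes for one anchor position at the image origin.
boxes = make_anchors(0.0, 0.0)
```

The first box is the 128-pixel square centred on the anchor; the remaining eight vary the scale and the aspect ratio around the same centre.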
The proposed algorithm achieves a correct identification rate of 89.40%, a missed detection rate of 7.80%, and a false detection rate of 2.80%. Its correct identification rate is 7.00% and 25.00% higher than those of Mask R-CNN and SVM, respectively; its false detection rate is 7.80% and 18.40% lower than those of Mask R-CNN and SVM, respectively; and its missed detection rate is 6.60% lower than that of SVM. Conclusion The multi-feature fusion SVM classifier reclassifies the recognition results of Mask R-CNN to reject interference regions, and the picking region is then accurately identified by computing the mask. In constructing the training image set, the effects of illumination and of occlusion between parts are fully considered, and the illumination and occlusion conditions are divided into categories and examined; hence, the approach shows a certain robustness in practical applications. Compared with the sliding-window approach used in traditional target recognition, this work identifies the shape of the target area accurately through mask calculation and achieves high recognition accuracy. Moreover, it compensates for the limitations of a single-network framework by adding a multi-feature fusion SVM classifier, which effectively reduces the false detection rate.
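The reported rates can be cross-checked with a short worked calculation. The sketch below assumes that the stated differences are percentage points and that all rates are taken over the 500 test parts; the baseline figures it derives are arithmetic consequences of those assumptions, not numbers quoted from the paper. The variable names are illustrative.

```python
from decimal import Decimal as D

TOTAL_PARTS = 500  # number of metal parts in the experiment

# Rates reported for the proposed algorithm, in percent.
proposed = {"correct": D("89.40"), "missed": D("7.80"), "false": D("2.80")}

# Baseline rates derived from the reported differences,
# assuming the differences are percentage points.
svm = {
    "correct": proposed["correct"] - D("25.00"),
    "missed": proposed["missed"] + D("6.60"),
    "false": proposed["false"] + D("18.40"),
}
mask_rcnn = {
    "correct": proposed["correct"] - D("7.00"),
    "false": proposed["false"] + D("7.80"),
}


def to_counts(rates, total=TOTAL_PARTS):
    """Convert percentage rates to part counts."""
    return {name: int(rate * total / 100) for name, rate in rates.items()}


proposed_counts = to_counts(proposed)
```

Under these assumptions the proposed algorithm's rates correspond to 447 correct, 39 missed, and 14 false detections out of 500. The derived SVM rates (64.40% correct, 14.40% missed, 21.20% false) also sum to 100%, which is consistent with reading the differences as percentage points.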
Keywords
target recognition; multi-feature fusion; support vector machine (SVM); deep learning; instance segmentation