Ground-glass opacity object detection in CT images using an improved Faster R-CNN model
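Yang Shuying1,2, Deng Dongsheng1,2, Zheng Qingchun3 (1. School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384; 2. Key Laboratory of Computer Vision and System, Ministry of Education, Tianjin 300384; 3. School of Mechanical Engineering, Tianjin University of Technology, Tianjin 300384)
Abstract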
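Objective When the Faster R-CNN (faster region convolutional neural network) model is applied to detecting ground-glass opacity in lung computed tomography (CT) images, it fails to detect small targets effectively and runs slowly. To address these problems, improvements to the feature extraction network and the region proposal network (RPN) of Faster R-CNN are proposed. Method A feature pyramid network replaces the feature extraction network of Faster R-CNN and generates a feature pyramid. An RPN based on location mapping generates anchor boxes and computes how close the center of each anchor box is to the center of the ground-truth object (expressed by the parameter "centrality"). Anchor boxes that the RPN judges to be foreground are further refined in position to serve as region proposals, and the foreground/background classification confidence predicted by the RPN is combined with the centrality to rank the region proposals. Non-maximum suppression then selects the regions of interest (RoIs) from the region proposals. The feature regions corresponding to the RoIs are fed into the classification and regression network to obtain the detection results. Result On a dataset of lung CT images from COVID-19 patients, compared with the Faster R-CNN model, the improved model increases recall by 7% and mean average precision (mAP) by 3.9%, and raises the frame rate (frames per second, FPS) from 5 frames/s to 9 frames/s. Introducing the feature pyramid network clearly improves recall and mAP, and the location-mapping-based RPN significantly improves detection speed. Compared with other recently improved object detection models, the improved model retains the high precision of two-stage detectors and narrows the gap with single-stage detectors in detection speed. Conclusion The improved model can effectively detect ground-glass opacity regions in patients' lung CT images, including small targets, and can provide doctors with fast and effective diagnostic assistance.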
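The abstract names the "centrality" parameter but does not give its formula or say how it is combined with the foreground confidence. As a minimal sketch only, the Python below assumes an FCOS-style centerness definition and assumes the combination is a simple product; the function names `centrality` and `ranking_score` are hypothetical and are not taken from the paper.

```python
import math


def centrality(px, py, box):
    """Assumed FCOS-style centerness of point (px, py) relative to a
    ground-truth box (x1, y1, x2, y2).

    Returns 1.0 at the box center, falls toward 0.0 near the border,
    and is 0.0 on the border or outside the box.
    """
    x1, y1, x2, y2 = box
    left, right = px - x1, x2 - px
    top, bottom = py - y1, y2 - py
    if min(left, right, top, bottom) <= 0:
        return 0.0  # point lies on the border or outside the box
    horizontal = min(left, right) / max(left, right)
    vertical = min(top, bottom) / max(top, bottom)
    return math.sqrt(horizontal * vertical)


def ranking_score(fg_confidence, centrality_value):
    """Assumed combination rule: product of the RPN foreground
    confidence and the centrality, used to sort region proposals."""
    return fg_confidence * centrality_value


if __name__ == "__main__":
    gt_box = (0.0, 0.0, 10.0, 10.0)
    for point in [(5.0, 5.0), (2.0, 5.0), (20.0, 5.0)]:
        c = centrality(point[0], point[1], gt_box)
        print(point, round(c, 3), round(ranking_score(0.8, c), 3))
```

Under these assumptions, a proposal whose anchor center sits far from the object center is ranked lower even when its foreground confidence is high, which is the behavior the abstract describes for the sorting step.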
Keywords
Ground-glass opacity target detection in CT scans based on improved Faster R-CNN model
Yang Shuying1,2, Deng Dongsheng1,2, Zheng Qingchun3 (1. School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384, China; 2. Key Laboratory of Computer Vision and System, Ministry of Education, Tianjin 300384, China; 3. School of Mechanical Engineering, Tianjin University of Technology, Tianjin 300384, China)
Abstract
Objective The outbreak of coronavirus disease 2019 (COVID-19) has become a serious public health event of worldwide concern. The key to controlling the spread of the disease is early detection. Computed tomography (CT) is highly sensitive for the early diagnosis of COVID-19, and changes in clinical symptoms coincide in time with changes in lung CT lesions, which makes CT a simple and fast indicator of changes in a patient's condition. Faint ground-glass opacity is common in the early stage of COVID-19 lesions, and the ground-glass opacity gradually increases as the lesion progresses. Manual detection is time consuming and inevitably introduces subjective diagnostic errors. In recent years, deep learning has made great progress in computer vision and has achieved outstanding performance on lung CT scans. In target detection, two-stage methods more readily achieve high precision, and the most representative model is the faster region convolutional neural network (Faster R-CNN). However, as target detection tasks become more diverse and complex, the shortcomings of Faster R-CNN have been exposed. Ground-glass opacity targets vary widely in size, and Faster R-CNN uses only the highest-level feature map to obtain region proposals, which leads to a low recognition rate for small targets. In addition, when the region proposal network of Faster R-CNN is supervised for foreground/background classification, most of the overlap calculations between anchor boxes and the background area are redundant. To address these problems in detecting ground-glass opacity targets in lung CT scans, improvements to the feature extraction network and the region proposal network of the Faster R-CNN model are proposed.
Method First, a feature pyramid network replaces the feature extraction network of Faster R-CNN to generate a feature pyramid. Then, a region proposal network based on location mapping generates anchor boxes and computes how close the center of each anchor box is to the center of the ground-truth object, which is represented by the parameter "centrality". Anchor boxes that the region proposal network judges to be foreground are further refined in position to become region proposals, and the foreground/background classification confidence predicted by the region proposal network is combined with the centrality as the basis for ranking the region proposals. Regions of interest are selected from the region proposals through non-maximum suppression. Finally, the feature regions corresponding to the regions of interest are sent to the classification and regression network to obtain the detection results. The experiments use recall, mean average precision (mAP), and frames per second (FPS) as evaluation metrics to compare the standard Faster R-CNN, Faster R-CNN + FPN, and the proposed model, and to assess the effect of different backbone networks on the proposed model.
Result On the COVID-19 dataset, the experimental results show that, compared with the Faster R-CNN model, the improved model increases recall by 7% and mAP by 3.9%, and raises FPS from 5 to 9.
Conclusion The improved model can effectively detect ground-glass opacity targets in patients' lung CT scans and is suitable for small targets. The improved region proposal network reduces the number of network output parameters, saves computation time, and increases the running speed of the model. Using the feature pyramid network to replace the feature extraction network of Faster R-CNN can serve as a general approach when the sizes of target objects vary widely. Replacing the traditional multi-anchor-box mapping-based region proposal network with the location mapping-based region proposal network can also serve as a reference for accelerating other models.
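The selection of regions of interest by non-maximum suppression can be illustrated with a short sketch. The Python below is a generic greedy NMS, not the authors' implementation; the function names `iou` and `nms` are hypothetical, and the default threshold of 0.7 is an assumption, since the abstract does not state one. The `scores` argument stands for whatever ranking value the region proposal network assigns to each proposal.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.7):
    """Greedy non-maximum suppression.

    Visits proposals from highest to lowest score and keeps a proposal
    only if its IoU with every already-kept proposal does not exceed
    iou_threshold (an assumed value). Returns the kept indices in
    descending score order.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    proposals = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    ranking = [0.9, 0.8, 0.7]
    print(nms(proposals, ranking, iou_threshold=0.5))
```

In this toy input the first two proposals overlap heavily, so with a threshold of 0.5 only the higher-ranked one survives alongside the distant third proposal.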
Keywords
coronavirus disease 2019 (COVID-19); ground-glass opacity; faster region convolutional neural network (Faster R-CNN); feature pyramid network (FPN); region proposal network (RPN); residual neural network (ResNet)