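Automatic segmentation of the primary tumor gross target volume for nasopharyngeal carcinoma radiotherapy
Xue Xudong1,2, Hao Xiaoyu3, Shi Jun3, Ding Yi1, Wei Wei1, An Hong3 (1. Department of Radiation Oncology, Hubei Cancer Hospital, Wuhan 430079; 2. Department of Radiation Oncology, The First Affiliated Hospital of University of Science and Technology of China (Anhui Provincial Hospital), Hefei 230001; 3. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026) Abstract
Objective Radiotherapy is one of the main treatments for nasopharyngeal carcinoma. Accurate segmentation of the tumor target is a key factor in improving tumor control and reducing radiation toxicity, but the commonly used manual contouring is time-consuming and varies between observers. This paper investigates the feasibility of using the Deeplabv3+ convolutional neural network model to automatically segment the primary tumor gross target volume (GTVp) for nasopharyngeal carcinoma radiotherapy. Method An end-to-end auto-segmentation framework is built on the Deeplabv3+ network. The study uses the computed tomography (CT) images and GTVp contours of 150 nasopharyngeal carcinoma patients who received intensity-modulated radiotherapy, 15 of whom are randomly selected as the test set. The Dice similarity coefficient (DSC), Jaccard index (JI), average surface distance (ASD), and Hausdorff distance (HD) are used as evaluation criteria to compare in detail the auto-segmentation results of the Deeplabv3+ and U-Net network models with the manual contours of clinicians. Result For the test-set patients, the mean DSC is 0.76±0.11, the mean JI is 0.63±0.13, the mean ASD is (3.4±2.0) mm, and the mean HD is (10.9±8.6) mm. Compared with the U-Net model, the Deeplabv3+ network model improves the mean DSC and JI by 3%~4% and reduces the mean ASD by 0.4 mm; the HD values show no statistically significant difference. Conclusion Compared with the U-Net model, the Deeplabv3+ network model adopts a new encoder-decoder network and an atrous spatial pyramid network structure, which improves segmentation accuracy. It is expected to improve the efficiency and consistency of GTVp contouring, but auto-segmentation results must be reviewed carefully in clinical practice.
Keywords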
Auto-segmentation of high-risk primary tumor gross target volume for the radiotherapy of nasopharyngeal carcinoma
Xue Xudong1,2, Hao Xiaoyu3, Shi Jun3, Ding Yi1, Wei Wei1, An Hong3 (1. Department of Radiation Oncology, Hubei Cancer Hospital, Wuhan 430079, China; 2. Department of Radiation Oncology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230001, China; 3. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China) Abstract
Objective Nasopharyngeal carcinoma (NPC) is a common head and neck cancer in Southeast Asia and China. In 2018, approximately 129 thousand people were diagnosed with NPC, and approximately 73 thousand people died of it. Radiotherapy has become a standard treatment for NPC patients. Precise radiotherapy relies on the accurate delineation of tumor targets and organs-at-risk (OARs). In radiotherapy practice, these anatomical structures are usually delineated manually by radiation oncologists on a treatment-planning system (TPS). Manual delineation, however, is time-consuming and labor-intensive. It is also subjective and hence prone to interpractitioner variability. NPC target segmentation is particularly challenging because of the substantial interpatient heterogeneity in tumor shape and the poorly defined interface between tumor and normal tissue, which lead to considerable variation in gross tumor volume among physicians. Auto-segmentation methods have the potential to improve contouring accuracy and efficiency, and various methods have been reported. Atlas-based segmentation, however, requires long computation times and often cannot account for large anatomical variations because of the uncertainty of deformable registration. Deep learning has achieved great success in computer science and has been applied to the auto-segmentation of tumor targets and OARs in radiotherapy. Studies have demonstrated that deep learning methods can perform comparably with or even better than manual segmentation for some tumor sites. In this work, we propose a Deeplabv3+ model that automatically segments the high-risk primary tumor gross target volume (GTVp) in NPC radiotherapy. Method The Deeplabv3+ convolutional neural network model uses an encoder-decoder structure and a spatial pyramid pooling module to segment the high-risk primary tumor of NPC patients. 
An improved MobileNetV2 network is used as the network backbone, and atrous and depthwise separable convolutions are used in the encoder and decoder modules. The MobileNetV2 backbone consists of four inverted residual modules that contain strided depthwise separable convolutions, and atrous separable convolution allows feature maps to be extracted at arbitrary resolutions. Batch normalization and ReLU activation are added after each 3×3 depthwise convolution. The decoder module works as follows: the encoder features are first bilinearly upsampled by a factor of 4 and then concatenated with the low-level features from the network backbone that have the same spatial resolution. A 1×1 convolution is applied to the low-level features beforehand to reduce the number of channels. After concatenation, several 3×3 convolutions refine the features, followed by another bilinear upsampling by a factor of 4. Our training and test sets consist of the CT images and manual contours of 150 patients from Anhui Provincial Hospital, collected between January 2016 and May 2019. The CT images have a matrix size of 512×512, a resolution of 0.98 mm, and a slice thickness of 2.5 mm. To delineate the tumor region efficiently, T1-weighted MR images are also acquired and fused with the CT images. GTVp is delineated by experienced radiation oncologists on the CT images in a Pinnacle TPS. Of the 150 patients, 120 are chosen as the training set, 15 as the validation set, and the remaining 15 as the test set. Images are flipped, translated, and randomly rotated to augment the training dataset. Our network is implemented with the Keras toolbox. The input images and ground-truth contours are resized to 512×512 for training. The loss function used in this study is 1-DSC, the Adam optimizer (AdamOptimizer) is used with a learning rate of 0.005, and the weight decay factor is 0.8. 
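The 1-DSC loss can be illustrated with a minimal sketch. The abstract only states that the loss is 1-DSC; the function name `soft_dice_loss`, the flattened-list inputs, and the smoothing term `eps` are assumptions made for illustration and are not taken from the authors' Keras implementation.

```python
from typing import Sequence


def soft_dice_loss(pred: Sequence[float], target: Sequence[float],
                   eps: float = 1e-6) -> float:
    """Return 1 - DSC for flattened per-pixel probabilities and a binary mask.

    `eps` is a hypothetical smoothing term (an assumption) that keeps the
    ratio defined when both the prediction and the mask are empty.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    # Soft intersection: predicted probability mass that falls inside the mask.
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    dsc = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dsc


if __name__ == "__main__":
    mask = [1, 0, 1, 0]
    print(soft_dice_loss([1.0, 0.0, 1.0, 0.0], mask))  # perfect overlap
    print(soft_dice_loss([0.0, 1.0, 0.0, 1.0], mask))  # no overlap
```

Because the loss is computed from overlap rather than per-pixel error, it is not dominated by the large background area that surrounds a small target such as GTVp.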
The performance of the auto-segmentation algorithm is evaluated with the Dice similarity coefficient (DSC), Jaccard index (JI), average surface distance (ASD), and Hausdorff distance (HD). The results are compared with those of the U-Net model. A paired t-test is performed to compare the DSC, JI, ASD, and HD values between the two models. Result The mean DSC value of the 15 NPC patients in the test set is 0.76±0.11, the mean JI value is 0.63±0.13, the mean ASD value is (3.4±2.0) mm, and the mean HD value is (10.9±8.6) mm. Compared with the U-Net model, the Deeplabv3+ network model improves the mean DSC and JI values by 3% and 4%, respectively (0.76±0.11 vs. 0.73±0.13, p<0.001; 0.63±0.13 vs. 0.59±0.14, p<0.001). The mean ASD value is also significantly lower than that of the U-Net model (3.4±2.0 vs. 3.8±3.3 mm, p=0.014). For the HD values, however, no statistically significant difference exists between the two network models (10.9±8.6 vs. 11.1±7.5 mm, p=0.745). The experimental results indicate that the Deeplabv3+ network model outperforms the U-Net model in segmenting the NPC target area. In 2D visualizations of the auto-segmented contours, the Deeplabv3+ results overlap more with the manual contours and are closer to the “ground truth”. The visualizations show that our model can produce refined results. In addition, the average time required to segment a CT image is 16 ms for our model and 14 ms for the U-Net model, both much shorter than the manual contouring time. Conclusion In this study, a Deeplabv3+ convolutional neural network model is proposed to auto-segment the GTVp of NPC patients undergoing radiotherapy. The results show that the auto-segmentations of the Deeplabv3+ network are close to the manual contours drawn by oncologists. This model has the potential to improve the efficiency and consistency of GTVp contouring for NPC patients.
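The four evaluation metrics can be sketched on small 2D binary masks as follows. This is an illustrative sketch, not the authors' evaluation code: the helper names are hypothetical, the boundary is taken as pixels with a 4-connected neighbour outside the region, ASD is assumed to be the symmetric mean of nearest-boundary distances (definitions vary between papers), and `spacing` assumes isotropic pixels (for example 0.98 mm).

```python
import math
from typing import List, Set, Tuple

Point = Tuple[int, int]


def foreground(mask: List[List[int]]) -> Set[Point]:
    """Collect (row, col) coordinates of all non-zero pixels in a 2D mask."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def dice(a: Set[Point], b: Set[Point]) -> float:
    """DSC = 2|A & B| / (|A| + |B|); defined as 1.0 when both sets are empty."""
    total = len(a) + len(b)
    return 2.0 * len(a & b) / total if total else 1.0


def jaccard(a: Set[Point], b: Set[Point]) -> float:
    """JI = |A & B| / |A | B|; defined as 1.0 when both sets are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def boundary(region: Set[Point]) -> Set[Point]:
    """Pixels of the region with at least one 4-connected neighbour outside it."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(r, c) for (r, c) in region
            if any((r + dr, c + dc) not in region for dr, dc in steps)}


def _nearest(src: Set[Point], dst: Set[Point]) -> List[float]:
    """Distance from every point in src to its closest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def surface_distances(a: Set[Point], b: Set[Point],
                      spacing: float = 1.0) -> Tuple[float, float]:
    """Return (ASD, HD) between the boundaries of a and b, scaled by spacing."""
    sa, sb = boundary(a), boundary(b)
    if not sa or not sb:
        raise ValueError("both masks must contain at least one foreground pixel")
    d = _nearest(sa, sb) + _nearest(sb, sa)  # both directions (symmetric)
    return spacing * sum(d) / len(d), spacing * max(d)


if __name__ == "__main__":
    manual = foreground([[1, 1, 0], [1, 1, 0]])  # toy "manual" contour
    auto = foreground([[0, 1, 1], [0, 1, 1]])    # toy "automatic" contour
    print(dice(manual, auto), jaccard(manual, auto))
    print(surface_distances(manual, auto, spacing=0.98))
```

The brute-force nearest-point search is quadratic in the number of boundary pixels, which is acceptable for a toy example; a distance transform or a spatial index would be the usual choice for full CT volumes.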
Keywords
auto-segmentation; radiotherapy; convolutional neural network (CNN); primary tumor gross target volume (GTVp); nasopharyngeal carcinoma (NPC)