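Detail-enhanced medical image segmentation focusing on the global level and intermediate layers
Liu Wei, Zhong Miao, Liu Guangwei, Wang Haonan, Ning Qian (Liaoning Technical University)
Abstract
Objective: With the development of artificial intelligence, deep learning has been widely applied to medical image segmentation. However, existing methods usually fuse features in a top-down or bottom-up manner, which tends to neglect or lose intermediate-layer feature information. In addition, existing methods still do not segment the boundaries of lesion regions finely enough. To address these problems, this paper proposes a detail-enhanced medical image segmentation network focusing on global and intermediate features (DEMS-GIF). Method: First, by paying closer attention to intermediate-layer information and exploiting the Transformer's ability to capture long-range dependencies between different regions, this paper designs a Transformer-based bridge feature fusion module (TBBFF) to improve the model's feature extraction capability. Second, by introducing a reverse attention mechanism and combining it with erosion and dilation operations, this paper proposes an expanded and scaled region enhanced upsampling strategy under reverse attention (ESRU), which enables the model to better capture boundary and detail information. By combining the TBBFF module and the ESRU strategy, DEMS-GIF further improves segmentation accuracy. Result: Comparison experiments and module ablation experiments were conducted on three datasets, CVC-ClinicDB, DDTI, and Kvasir-SEG, to evaluate the proposed DEMS-GIF model, and parameter ablation experiments were conducted on CVC-ClinicDB to understand the effectiveness of each module and internal structure of DEMS-GIF. The experimental results show that DEMS-GIF achieves mIoU values of 94.74%, 84.56%, and 88.46% and Dice values of 94.82%, 82.95%, and 87.44% on the three datasets, respectively. Compared with the original UNet-style channel Transformer network, mIoU improves by 3.73%, 3.4%, and 5.24%, and Dice improves by 4.84%, 5.45%, and 6.82%, respectively. Conclusion: The proposed DEMS-GIF network achieves the best segmentation results among the advanced segmentation methods compared, demonstrating its superiority in medical image segmentation.
Keywords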
Detail-enhanced medical image segmentation focusing on global and intermediate features
Liu Wei, Zhong Miao, Liu Guangwei, Wang Haonan, Ning Qian (Liaoning Technical University)
Abstract
Objective: Medical image segmentation is a crucial and challenging task in medical imaging. Segmentation techniques allow tumors, blood vessels, and other structures to be precisely located and measured, which facilitates early diagnosis and the evaluation of treatment outcomes. Accurate and reliable segmentation can also be used to monitor disease progression, assist physicians in formulating long-term treatment plans, and provide data support for clinical diagnosis and pathological research. Feature fusion captures both detailed and global information by combining multi-level, multi-scale, and multi-modal features, thereby improving segmentation accuracy and robustness. This enhances automated medical image processing, reduces dependence on large amounts of annotated data, and helps doctors make more reliable judgments in diagnosis and treatment. Upsampling strategies restore high-resolution features and detailed information, enhancing the segmentation model's ability to recognize small structures and boundaries. Appropriate upsampling techniques preserve and recover the spatial information of the original image, improving segmentation precision, which aids the precise localization of lesion areas and the extraction of biological features. With the advancement of artificial intelligence, deep learning techniques have been widely applied to medical image segmentation.
However, these deep learning methods often fuse features in a top-down or bottom-up manner, which can neglect or lose intermediate-layer feature information. Moreover, existing methods still segment the boundaries of lesion areas imprecisely, so critical information is omitted when dealing with fine structures and complex backgrounds. To address these issues, this paper proposes a Detail-Enhanced Medical Image Segmentation Network Focusing on Global and Intermediate Features (DEMS-GIF). Method: First, to address the shortcomings of existing feature fusion methods in capturing complex structures and integrating intermediate features, this paper proposes a Transformer-based Bridge Feature Fusion module (TBBFF module). Compared with other feature fusion modules, the TBBFF module focuses more on intermediate feature information and leverages the Transformer's ability to capture long-range dependencies between different regions. This enables the network to better understand the overall structure and contextual information of the image, further enhancing the model's segmentation performance and robustness. Second, to address the excessive smoothness of generated images and the imprecise boundary segmentation of lesion areas in existing methods, this paper proposes an expanded and scaled region enhanced upsampling strategy under reverse attention (ESRU strategy). By incorporating a reverse attention mechanism and combining it with erosion and dilation operations, the model can better capture boundary and detail information in the target regions of medical images, enhancing the integrity and continuity of the segmented regions. This, in turn, improves the accuracy and stability of segmentation.
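The idea of combining reverse attention with erosion and dilation can be illustrated with a small sketch. It is not the ESRU implementation from this paper: the function names, the 3x3 structuring element, the additive reweighting rule, and the parameters `threshold` and `alpha` are all assumptions chosen for illustration. The sketch turns a coarse prediction into per-pixel weights that are large where the prediction is not yet confident foreground (reverse attention) and are boosted further in the band between the dilated and the eroded mask, which is where the boundary lies.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def _morph(mask, reduce_fn):
    """Apply a 3x3 binary morphological operator to a 2D list.

    Pixels outside the image are ignored, so a region touching the
    image border is not eroded along that border.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [mask[y][x]
                      for y in range(max(0, i - 1), min(h, i + 2))
                      for x in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = reduce_fn(window)
    return out


def dilate(mask):
    """Binary dilation: a pixel is 1 if any neighbor is 1."""
    return _morph(mask, max)


def erode(mask):
    """Binary erosion: a pixel is 1 only if all neighbors are 1."""
    return _morph(mask, min)


def boundary_weights(logits, threshold=0.5, alpha=1.0):
    """Hypothetical reweighting map (an assumption, not the paper's formula).

    weight = (1 - p) + alpha * (dilated - eroded), where p is the sigmoid
    of the coarse logit and the mask is p > threshold.
    """
    probs = [[sigmoid(v) for v in row] for row in logits]
    mask = [[1 if p > threshold else 0 for p in row] for row in probs]
    grown, shrunk = dilate(mask), erode(mask)
    return [[(1.0 - p) + alpha * (grown[i][j] - shrunk[i][j])
             for j, p in enumerate(row)]
            for i, row in enumerate(probs)]


# Toy 7x7 prediction: a confident 3x3 "lesion" in the center.
logits = [[4.0 if 2 <= i <= 4 and 2 <= j <= 4 else -4.0 for j in range(7)]
          for i in range(7)]
weights = boundary_weights(logits)
```

In this toy input the interior pixel `weights[3][3]` receives the smallest weight, while pixels in the ring around the lesion edge, such as `weights[2][2]` and `weights[1][1]`, receive an extra `alpha`. In a real network such a map would multiply decoder features during upsampling; that step is omitted here.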
Overall, by integrating the TBBFF module and the ESRU strategy, the DEMS-GIF model effectively extracts image details and global information, preserving the integrity of the segmented regions and further enhancing segmentation accuracy. Result: To evaluate the DEMS-GIF model against other methods, we conducted experiments on three datasets, CVC-ClinicDB, DDTI, and Kvasir-SEG, comparing it with both recent and classic methods. On the CVC-ClinicDB dataset, the DEMS-GIF model achieved mIoU and Dice scores of 94.74% and 94.82%, respectively. Compared with recently proposed segmentation methods, DEMS-GIF's mIoU exceeded that of MSUNet by 3.98%, MBSNet by 4.42%, and SCSONet by 20.13%. As the number of training iterations increased, its training loss decreased significantly and was notably lower than that of ResUNet++ and SCSONet. On the DDTI dataset, the DEMS-GIF model achieved a precision of 85.52% and an mIoU of 84.56%. Among traditional network models, the largest gap was to ResUNet++, with precision and mIoU higher by 13.97% and 10.96%, respectively. Its training loss was also noticeably lower than that of the other models, with the largest difference observed against MFSNet. On the Kvasir-SEG dataset, the DEMS-GIF model achieved mIoU and Dice scores of 88.46% and 87.44%, respectively. Its mIoU was 12.69% higher than that of ResUNet++ and its Dice was 6.82% higher than that of UNet. Compared with other models such as MBSNet, DTA-UNet, and SCSONet, DEMS-GIF's mIoU was higher by 6.77%, 3.84%, and 25.02%, respectively, achieving the best performance. During training, its training loss was significantly lower than that of all other network models except MFSNet. Additionally, we conducted ablation experiments to understand the effectiveness of each module and structure within DEMS-GIF.
The ablation experiments were divided into module ablation and parameter ablation. Module ablation was conducted on the CVC-ClinicDB, DDTI, and Kvasir-SEG datasets, and parameter ablation on the CVC-ClinicDB dataset. Using Res2Net as the baseline network, the module ablation experiments examined the impact of the TBBFF module and the ESRU strategy on model performance. The results showed that combining the TBBFF module with the ESRU strategy allows the network to focus more on lesion areas, resulting in more accurate segmentation. Parameter ablation covered the threshold and adaptive parameters used in reweighting with erosion and dilation operations; it showed the contribution of each parameter in the DEMS-GIF model and identified the optimal parameter values for the model settings. Conclusion: In this paper, we propose a Detail-Enhanced Medical Image Segmentation Network Focusing on Global and Intermediate Features (DEMS-GIF). By integrating the TBBFF module and the ESRU strategy, the network leverages intermediate-layer feature information to integrate features across different scales and levels. It also strengthens the focus on boundary and detail information of lesion areas, achieving efficient extraction and refined processing of image features. This results in more complete and accurate segmentation regions. Experimental results show that the proposed DEMS-GIF network outperforms other advanced segmentation methods, demonstrating its superiority in medical image segmentation.
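The mIoU, Dice, and precision figures quoted in the results can be made concrete with a minimal sketch for binary masks. The abstract does not state how mIoU is averaged, so the version below assumes the mean of the foreground and background IoU on a single image; the function names and the convention of returning 1.0 when a denominator is zero are also assumptions.

```python
def confusion(pred, target):
    """Count true/false positives and negatives for two binary 2D masks."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def _ratio(num, den):
    """Safe division; an empty denominator is treated as a perfect score."""
    return 1.0 if den == 0 else num / den


def dice(pred, target):
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion(pred, target)
    return _ratio(2 * tp, 2 * tp + fp + fn)


def precision(pred, target):
    """Precision: TP / (TP + FP)."""
    tp, fp, _, _ = confusion(pred, target)
    return _ratio(tp, tp + fp)


def mean_iou(pred, target):
    """Mean of foreground IoU and background IoU (assumed definition)."""
    tp, fp, fn, tn = confusion(pred, target)
    iou_fg = _ratio(tp, tp + fp + fn)
    iou_bg = _ratio(tn, tn + fp + fn)
    return (iou_fg + iou_bg) / 2.0


pred = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
target = [[1, 0, 0],
          [0, 1, 1],
          [0, 0, 0]]
scores = (dice(pred, target), precision(pred, target), mean_iou(pred, target))
```

For this toy pair there are 2 true positives, 1 false positive, 1 false negative, and 5 true negatives, so Dice is 4/6, precision is 2/3, and the assumed mIoU is the mean of 2/4 and 5/7.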
Keywords