Diabetic retinopathy grading combined with multi-channel attention

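Gu Tingfei1, Hao Pengyi1, Bai Cong1, Liu Ning2 (1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023; 2. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240)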

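Abstract
Objective Diabetic retinopathy (DR) is a common blinding retinal disease. Patients must be diagnosed and treated at an early stage; otherwise, permanent vision loss results. Whether tiny lesions such as microaneurysms can be detected in retinal images is the key to grading diabetic retinopathy. However, these lesions are so small that general methods struggle to identify them correctly. To address this problem, this paper proposes a fine-grained grading method based on multi-channel attention selection (FGMAS) for grading diabetic retinopathy. Method The method combines fine-grained classification with a multi-channel attention selection mechanism and improves grading accuracy by extracting local features. In addition, considering the relationship between the information content of each layer's channel features and the classification confidence, a ranking loss is introduced to optimize the information content of each layer's channels and thereby obtain more informative local regions. Result Two public retinal datasets (Kaggle and Messidor) are used to evaluate the effectiveness of the proposed fine-grained grading method and multi-channel attention selection mechanism. On the five-class grading task on the Kaggle dataset, FGMAS improves the average of classification accuracy (ACA) by 3.4% to 10.4% over existing methods. In particular, for grade 1 lesions, which have the smallest lesion points, accuracy improves by 11% to 18.9%. FGMAS is also applied to binary classification tasks on the Messidor dataset. For referable/non-referable classification, FGMAS achieves an accuracy (Acc) of 0.912, which is 0.1% to 1.9% higher than existing methods, and an area under the curve (AUC) of 0.962, which is 0.5% to 9.9% higher. For normal/abnormal classification, FGMAS achieves an accuracy of 0.909, which is 2.9% to 8.8% higher than existing methods, and an AUC of 0.950, which is 0.4% to 8.9% higher. These results show that the proposed method outperforms existing methods on both five-class and binary classification. Conclusion The proposed fine-grained grading model combines the fine-grained idea of extracting local regions with a multi-channel attention selection mechanism and obtains relatively accurate grading results.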
Keywords
Diabetic retinopathy grading based on multi-channel attention

Gu Tingfei1, Hao Pengyi1, Bai Cong1, Liu Ning2 (1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China; 2. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)

Abstract
Objective Diabetic retinopathy (DR) is a common blinding retinal disease that cannot be cured in its later stages. Patients must be diagnosed and treated early; otherwise, the disease causes permanent vision loss. The prevalence of diabetic retinopathy in China is extremely high and growing rapidly, and China now has the largest number of patients with diabetic retinopathy of any country. DR is usually diagnosed by analyzing fundus images. Grading diabetic retinopathy with neural networks requires detecting microscopic lesions such as microaneurysms in retinal images. This calls for an attention mechanism that mimics the focus of the human eye and concentrates on informative local regions. However, most existing methods consider attention only in the spatial domain and ignore the information available in the channel domain, which makes small lesions difficult to distinguish. To solve this problem, this paper proposes a fine-grained grading method based on multi-channel attention selection (FGMAS) for the grading of diabetic retinopathy. Method The method combines fine-grained classification with a multi-channel attention selection mechanism. First, a fine-grained classification structure is adopted: by extracting local region features, it improves recognition of the small differences between categories. Then, because different feature layers carry different amounts of information in the channel domain, high-information channels are selected. The model establishes the relationship between information content and classification confidence and obtains the lesion regions that benefit classification. Finally, the local and global features are combined to improve classification accuracy.
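The channel-selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how information content is measured, so mean activation is used here as an assumed stand-in, and the names `channel_scores`, `select_top_channels`, `peak_location`, and `propose_regions` are hypothetical.

```python
from typing import List, Tuple

# A feature map as nested lists: channels x height x width.
FeatureMap = List[List[List[float]]]


def channel_scores(features: FeatureMap) -> List[float]:
    """Score each channel by mean activation (ASSUMED proxy for information content)."""
    scores = []
    for channel in features:
        values = [v for row in channel for v in row]
        scores.append(sum(values) / len(values))
    return scores


def select_top_channels(features: FeatureMap, k: int) -> List[int]:
    """Return the indices of the k highest-scoring channels, best first."""
    scores = channel_scores(features)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def peak_location(channel: List[List[float]]) -> Tuple[int, int]:
    """Return (row, col) of the strongest response in a single channel."""
    best = (0, 0)
    for r, row in enumerate(channel):
        for c, v in enumerate(row):
            if v > channel[best[0]][best[1]]:
                best = (r, c)
    return best


def propose_regions(features: FeatureMap, k: int) -> List[Tuple[int, Tuple[int, int]]]:
    """Pair each selected channel index with the centre of a candidate local region."""
    return [(i, peak_location(features[i])) for i in select_top_channels(features, k)]
```

In a real network the peak locations would be mapped back to image coordinates and cropped into local patches, whose features are then combined with the global features for the final grade.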
In addition, considering the relationship between the channel feature information of each layer and the classification confidence, this study introduces Rank_loss to optimize the channel information of each layer. This loss function encourages regions with higher classification confidence to have higher information content, which yields better classification results. Result Two public retinal datasets (Kaggle and Messidor) are used to evaluate the effectiveness of the proposed fine-grained grading method and multi-channel attention selection mechanism. On the five-class grading task on the Kaggle dataset, FGMAS achieves an average accuracy of 0.577, which is 3.4% to 10.4% higher than that of existing methods. Grade 1 lesions are small and difficult for other methods to distinguish; FGMAS reaches an accuracy of 0.301 on this grade, an improvement of 11% to 18.9% over other methods, for example over the 0.190 obtained by VGGNet with Extra Kernel/LGI (VNXK/LGI). FGMAS is also applied to two binary classification tasks on the Messidor dataset: referable/non-referable and normal/abnormal. In the referable/non-referable task, FGMAS achieves an accuracy of 0.912 and an AUC (area under the curve) of 0.962, which exceed existing methods by 0.1% to 1.9% and 0.5% to 9.9%, respectively. In the normal/abnormal task, it achieves an accuracy of 0.909 and an AUC of 0.950, improvements of 2.9% to 8.8% and 0.4% to 8.9%, respectively, over existing methods. Parameter experiments are also conducted, and the role of each parameter and the optimal parameter selection are analyzed in detail.
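For readers unfamiliar with the reported metric, the following Python sketch shows one common way to compute an average of classification accuracy. The abstract does not spell out the formula, so this assumes ACA is the unweighted mean of the per-class accuracies, which is consistent with a grade 1 accuracy being reported alongside it; the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def per_class_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, float]:
    """Fraction of samples of each true class that were predicted correctly."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in sorted(total)}


def average_classification_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class accuracies (ASSUMED definition of ACA)."""
    acc = per_class_accuracy(y_true, y_pred)
    return sum(acc.values()) / len(acc)
```

Under this definition every grade contributes equally regardless of how many images it has, so a rare grade that is classified poorly lowers the score as much as a common one would.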
Conclusion This study proposes a fine-grained grading model that combines fine-grained classification with a multi-channel attention model. Rank_loss combines the ranking result with the information content of every layer and is used to obtain local feature regions that benefit classification. The experimental results show that the model performs well on both five-class and binary classification tasks.
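The idea behind Rank_loss, that regions with higher classification confidence should also carry higher information content, can be sketched as a pairwise hinge ranking loss. This is an illustrative assumption in the style of common learning-to-rank losses, not the formula from the paper; the function name `rank_loss`, its arguments, and the `margin` parameter are hypothetical.

```python
from typing import Sequence


def rank_loss(information: Sequence[float],
              confidence: Sequence[float],
              margin: float = 1.0) -> float:
    """Pairwise hinge loss (ASSUMED form).

    For every pair (i, j) where region j has higher classification confidence
    than region i, penalize the pair unless the information score of j exceeds
    that of i by at least `margin`.
    """
    loss = 0.0
    n = len(information)
    for i in range(n):
        for j in range(n):
            if confidence[i] < confidence[j]:
                loss += max(0.0, margin - (information[j] - information[i]))
    return loss
```

When the information scores are ordered the same way as the confidences and separated by at least the margin, the loss is zero; any inversion contributes a positive penalty, which pushes the two orderings to agree during training.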
Keywords
