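Double-tier multiple instance learning model for histopathology image classification
Abstract
Objective Analysis of histopathology whole slide images (WSIs) is the gold standard for pathological diagnosis. WSIs have gigapixel resolution and usually lack pixel-level annotations. Weakly supervised multiple instance learning is the mainstream approach to WSI analysis, and its key challenge is to accurately identify, among a large number of instances, the key instances that trigger the class prediction. Previous WSI analysis methods were mainly designed under the independent and identically distributed assumption, ignoring the correlation among instances and the heterogeneity of tumors. To address these problems, a new double-tier multiple instance learning model is proposed. Method Specifically, the proposed model consists of an adaptive feature miner and a dual-path cross-detection module connected in cascade. First, the adaptive feature miner in Tier 1 retrieves discriminative features in the bag and generates reliable internal queries for the subsequent aggregation of instance features. Then, the dual-path cross-detection module in Tier 2 models the correlation between the internal queries and the instances and aggregates all instances in the bag to produce the final bag-level representation. In addition, the self-supervised contrastive learning method SimCLR is introduced in the feature extraction stage to generate high-quality instance features. Result The proposed model was evaluated on two publicly available datasets, CAMELYON-16 and TCGA (the cancer genome atlas) lung cancer, and compared with six classical multiple instance learning models; the results show that the proposed model performs best. In terms of accuracy, the proposed method reached 95.35% and 91.87% on CAMELYON-16 and TCGA lung cancer, respectively, which is 2.33% and 0.96% higher than the best of the compared methods. Conclusion The proposed model can effectively mine the internal feature information of histopathology images and significantly improves detection accuracy, which indicates its effectiveness for pathological diagnosis; it can also accurately localize lesion regions and has high application value in computer-aided pathological diagnosis.
Keywords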
Double-tier multiple instance learning model for histopathology image classification
Lu Hao1, Chen Jinling1, Chen Jie1, Chen Baihe1, Tang Zhuowei2
(1. School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu 610500, China; 2. Mianyang Central Hospital, Mianyang 621000, China)
Abstract
Objective Whole slide imaging, which scans a complete microscope slide and converts it into a digital whole slide image (WSI), is an efficient technique for visualizing tissue sections in disease diagnosis, medical education, and pathological research. Analysis of histopathology WSIs is the gold standard for pathological diagnosis. However, analyzing pathological WSIs is tedious and time-consuming, and the diagnostic result is easily influenced by personal experience. The increasing use of WSIs in histopathology has allowed digital pathology to greatly improve pathologists' workflow and diagnostic decision-making, but it has also stimulated the need for computer-aided diagnostic tools for WSIs. Many researchers have therefore begun exploring the application of deep learning to pathological image analysis. WSIs have gigapixel resolution and usually lack pixel-level annotations, whereas existing deep learning techniques were developed for conventional small images; applying these techniques directly to WSI analysis is therefore not feasible. Weakly supervised multiple instance learning (MIL) is a powerful method for analyzing WSIs, and its key challenge is to effectively discover, among massive numbers of instances, the crucial instances that trigger the prediction and to summarize valuable information from different instances. Previous methods were primarily designed under the independent and identically distributed (i.i.d.) hypothesis, disregarding the relationships among instances and the heterogeneity of tumors. To solve these problems, a novel double-tier MIL (DT-MIL) model is proposed. Method The proposed method consists of three parts: 1) pre-processing of WSIs, 2) convolutional neural network (CNN)-based feature encoding, and 3) feature fusion of instance embeddings.
First, WSIs are cropped into fixed-size image patches using a sliding window strategy; invalid background regions are filtered out and only the foreground areas containing pathological tissue are retained. Second, the CNN-based feature encoder encodes the image patches into fixed-length feature embeddings. Lastly, the proposed DT-MIL model is deployed in the feature fusion part. DT-MIL contains two MIL models in series. The Tier-1 MIL model, also called the adaptive feature miner, generates negative and positive internal queries. The Tier-2 MIL model consists of a deep non-linear module and a double-detection cross-attention module. The former maps the instance features in the bag, while the latter generates a bag-level representation for final classification. In particular, the Tier-1 adaptive feature miner applies the idea of Grad-CAM to provide a reliable probability distribution over instances under the AB-MIL framework. Highly reliable features are then retrieved and aggregated to generate an internal query for each subclass. Moreover, the adaptive feature miner flexibly selects K discriminative instances to generate reliable internal queries, which mitigates the constraints of tumor heterogeneity on model performance and avoids introducing false information. In addition, the adaptive feature miner considers both positive and negative instances to prevent a biased decision boundary. Tier-2 aims to produce a robust bag-level representation for the subsequent classifier by simultaneously modeling the relationships among the positive query, the negative query, and the instances in the bag. Aggregating all instances in this way supplements the feature information and keeps the model sensitive to both positive and negative instances. Consequently, the model is prevented from being biased against negative instances, and its robustness is improved.
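The two-tier aggregation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several simplifications are assumed: the per-instance scores are taken as given rather than derived with Grad-CAM under AB-MIL, K is fixed rather than chosen adaptively, each query is a plain mean of the selected instances, and Tier 2 is reduced to scaled dot-product attention whose two outputs are concatenated. The function names `tier1_queries` and `tier2_bag_embedding` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def tier1_queries(H, scores, k):
    """Hypothetical Tier-1 step: average the k highest-scoring instances into a
    positive query and the k lowest-scoring instances into a negative query.
    (Assumption: `scores` are given; the paper derives them differently.)"""
    order = np.argsort(scores)            # ascending order of instance scores
    neg_q = H[order[:k]].mean(axis=0)     # k least suspicious instances
    pos_q = H[order[-k:]].mean(axis=0)    # k most suspicious instances
    return pos_q, neg_q

def tier2_bag_embedding(H, pos_q, neg_q):
    """Hypothetical Tier-2 step: both queries attend over all instances, and the
    two attended summaries are concatenated into one bag-level vector."""
    Q = np.stack([pos_q, neg_q])                            # (2, d)
    attn = softmax(Q @ H.T / np.sqrt(H.shape[1]), axis=1)   # (2, n), rows sum to 1
    return (attn @ H).reshape(-1)                           # (2 * d,)

# Toy bag: 100 instance embeddings of dimension 16 with random scores.
rng = np.random.default_rng(0)
H = rng.normal(size=(100, 16))
scores = rng.random(100)
pos_q, neg_q = tier1_queries(H, scores, k=5)
z = tier2_bag_embedding(H, pos_q, neg_q)
print(z.shape)  # (32,)
```

Because both queries attend over every instance, the bag vector keeps information from the positive and the negative side, which is the property the text credits for avoiding a biased decision boundary.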
An in-domain feature encoder pretrained with the self-supervised contrastive learning framework SimCLR is also introduced into the proposed model to generate more robust feature embeddings. Result Comparison and ablation experiments were performed on two publicly available datasets, namely, CAMELYON-16 and TCGA lung cancer. First, we compared the proposed model with six classical MIL models. Experimental results show that the proposed model performs best and achieves significant improvements in accuracy, precision, and recall. On the CAMELYON-16 dataset, testing accuracy, precision, and recall for binary tumor classification reached 95.35%, 95.91%, and 94.27%, respectively. On the TCGA lung cancer dataset, testing accuracy, precision, and recall for cancer subtype classification reached 91.87%, 91.92%, and 91.83%, respectively. The accuracy of the proposed method is 2.33% and 0.96% higher than that of the best compared method on the CAMELYON-16 and TCGA lung cancer datasets, respectively. Second, we conducted ablation experiments on the proposed model to verify the effectiveness of its key components. Experimental results show that sequentially adding the feature extractor, adaptive feature miner, and dual-path cross-detection module improved the accuracy of the model by 31.78%, 3.1%, and 0.78%, respectively. Lastly, we compared the proposed adaptive feature miner with traditional K-means clustering and with aggregation of the Top-K instances. Experimental results indicate that the adaptive feature miner can flexibly extract discriminative features and thereby generate optimal internal queries. Conclusion The proposed DT-MIL model simultaneously considers the correlation between instances and tumor heterogeneity. It can better mine the internal feature information of histopathological images and significantly improve detection accuracy.
These results demonstrate the effectiveness of the proposed model for pathological diagnosis and for accurately locating lesion regions, which gives it high application value in pathology-assisted diagnostic scenarios.
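The SimCLR pretraining mentioned in the Method part optimizes a contrastive objective, the normalized temperature-scaled cross-entropy (NT-Xent) loss, over pairs of augmented views of the same patch. The following NumPy sketch shows only that loss on precomputed embeddings. The function name `nt_xent_loss`, the temperature of 0.5, and the toy data are assumptions made for illustration; the encoder, the augmentations, and the training settings used in the paper are not reproduced here.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for n pairs of embeddings; row i of z1 and row i of z2 are
    two views of the same patch. (Assumption: temperature 0.5 is illustrative.)"""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)                  # (2n, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)      # unit norm, so dot product = cosine
    sim = z @ z.T / temperature                           # (2n, 2n) scaled similarities
    np.fill_diagonal(sim, -np.inf)                        # a view is never its own candidate
    # Row i (< n) has its positive at i + n; row i + n has its positive at i.
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), targets].mean()

# Toy usage: the second view is a slightly perturbed copy of the first.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 16))
view_b = view_a + 0.05 * rng.normal(size=(8, 16))
print(float(nt_xent_loss(view_a, view_b)))
```

The loss is low when the two views of a patch are closer to each other than to the other patches in the batch, which is what pushes the encoder toward augmentation-invariant instance features.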
Keywords
multiple instance learning (MIL); histopathological image; self-supervised contrastive learning; weakly supervised learning; deep learning