Advances in deep medical image registration: towards unsupervised learning

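Ma Lufan1, Luo Feng1, Yan Jiangpeng1, Xu Zhe1,2, Luo Jie2, Li Xiu1 (1. Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China; 2. Harvard Medical School, Boston 02115, USA)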

Abstract
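In image-assisted diagnosis and treatment scenarios such as disease diagnosis, surgical guidance and radiotherapy, registering images acquired at different times, in different modalities or on different devices through an appropriate spatial transformation is an essential processing step. With the rapid development of deep learning, deep learning-based medical image registration has attracted wide attention from researchers thanks to its short runtime and high accuracy. This paper comprehensively surveys the literature on deep medical image registration published from 2015 to 2019, systematically analyses the latest research progress in the field, and shows the overall trend of deep registration algorithms from iterative optimization to one-step prediction and from supervised to unsupervised learning. Specifically, after defining the deep medical image registration problem and introducing how registration research is categorized, the paper takes the amount of supervision used during network training as the classification criterion and divides deep medical image registration into fully supervised, dual-supervised and weakly supervised, and unsupervised methods. Fully supervised methods obtain an approximate gold standard as supervision through random transformations, traditional algorithms or model-based generation; dual-supervised and weakly supervised methods reduce the dependence on the gold standard by introducing other supervision such as image similarity loss and label similarity loss; unsupervised methods remove the need for annotated data altogether and supervise network training with image similarity loss and regularization loss only. Unsupervised medical image registration algorithms have now become the research focus of the field, achieving registration accuracy comparable to or even higher than that of supervised and traditional methods without requiring costly annotations. On this basis, the paper further discusses four possible future challenges for medical image registration research, in the hope of providing directions for more accurate and more efficient deep medical image registration algorithms and of promoting the clinical adoption of deep medical image registration technology.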
Keywords
Deep-learning based medical image registration pathway: towards unsupervised learning

Ma Lufan1, Luo Feng1, Yan Jiangpeng1, Xu Zhe1,2, Luo Jie2, Li Xiu1(1.Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China;2.Harvard Medical School, Boston 02115, USA)

Abstract
Medical image registration (MIR) aims to find the optimal spatial transformation that aligns the anatomical structures of a pair of medical images. It is a key step in clinical applications such as disease diagnosis, surgical guidance and radiation therapy. MIR is commonly categorized into inter-/intra-patient registration, uni-/multi-modal registration and rigid/non-rigid registration. Following the success of deep learning in tasks such as image classification, deep learning-based (DL-based) MIR methods have been developed. DL-based MIR has demonstrated substantial improvement in computational efficiency and task-specific registration accuracy over traditional iterative registration approaches, so a systematic literature review of DL-based MIR is of benefit to the field. This review traces the development of DL-based MIR from iterative optimization to one-step prediction and from supervised learning to unsupervised learning. According to the amount of supervision used to train the network, DL-based MIR methods are classified into fully supervised, dual-supervised, weakly supervised and unsupervised approaches, and each category is reviewed systematically. First, fully supervised methods are reviewed as the initial attempts to overcome the time-consuming, low-inference-speed nature of deep iterative registration algorithms (deep similarity-based registration and reinforcement learning-based registration). One-step fully supervised registration predicts the final transformation directly. The lack of training datasets with ground-truth transformations is the main obstacle to training a fully supervised registration network. Most researchers therefore generate synthesized transformations in one of three ways: 1) random augmentation-based generation; 2) traditional registration-based generation; 3) model-based generation.
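The first of these strategies, random augmentation-based generation, can be illustrated with a minimal NumPy sketch. It is not taken from any of the reviewed methods: the helper names (`sample_bilinear`, `warp`, `random_smooth_field`, `make_training_pair`), the 2D setting, the coarse 4x4 control grid and the `max_shift` parameter are all assumptions made for illustration. The idea is only that a randomly drawn, known displacement field is applied to an image, so the field itself can serve as a synthetic ground truth.

```python
import numpy as np


def sample_bilinear(image, y, x):
    """Sample a 2D array at float coordinates (y, x), clamping to the border."""
    h, w = image.shape
    y = np.clip(y, 0, h - 1)
    x = np.clip(x, 0, w - 1)
    y0 = np.floor(y).astype(int)
    x0 = np.floor(x).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom


def warp(image, disp):
    """Warp a 2D image: output[y, x] = image[y + disp[0], x + disp[1]]."""
    h, w = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return sample_bilinear(image, ys + disp[0], xs + disp[1])


def random_smooth_field(shape, max_shift, grid=4, rng=None):
    """Draw random offsets on a coarse grid and upsample them bilinearly."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = shape
    coarse = rng.uniform(-max_shift, max_shift, size=(2, grid, grid))
    ys, xs = np.meshgrid(np.linspace(0, grid - 1, h),
                         np.linspace(0, grid - 1, w), indexing="ij")
    return np.stack([sample_bilinear(c, ys, xs) for c in coarse])


def make_training_pair(image, max_shift=3.0, rng=None):
    """Return (image, warped image, displacement used as synthetic label)."""
    disp = random_smooth_field(image.shape, max_shift, rng=rng)
    return image, warp(image, disp), disp
```

A call such as `make_training_pair(img, max_shift=2.0, rng=np.random.default_rng(0))` yields an image pair plus the displacement field that relates them. In practice 3D volumes and more realistic deformation models would be used, and synthetic fields of this kind may not reflect real anatomical motion.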
Next, dual-supervised and weakly supervised registration methods, which serve as transitional techniques between fully supervised and unsupervised methods, alleviate the reliance on ground truth compared with fully supervised approaches. Dual-supervised registration frameworks add an image similarity metric to supervise the training. Weakly supervised registration replaces the ground-truth transformation with anatomical labels of interest (solid organs, vessels, ducts, structure boundaries and other subject-specific ad hoc landmarks). Supervised by label similarity, a label-driven registration network directly estimates the transformation for a pair of fixed and moving images. DL-based medical image registration has since gradually moved towards end-to-end unsupervised learning, which avoids the need to acquire the ground-truth transformations and segmentation labels required by supervised methods. Unsupervised registration frameworks use the spatial transformer network (STN) to warp the moving image, so that the image similarity loss can be computed during training even though the true transformation is unknown. The latest developments and applications of DL-based unsupervised registration methods are summarized in terms of loss functions and network architectures. DL-based unsupervised registration algorithms have also been re-implemented on liver CT (computed tomography) scan datasets, and the results are analysed against a baseline model.
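The supervision signals named above can be written down compactly. The NumPy sketch below is illustrative only and assumes specific choices the abstract does not fix: mean squared error as the image similarity, squared forward differences of the displacement field as the regularizer, a soft Dice loss as the label similarity, and a weight `lam`. The argument `warped` stands for the moving image after it has been resampled by the STN, and all function names are hypothetical.

```python
import numpy as np


def mse_similarity(fixed, warped):
    """Image similarity loss: mean squared intensity difference."""
    return float(np.mean((fixed - warped) ** 2))


def smoothness(disp):
    """Regularization loss on a (2, H, W) displacement field.

    Sums the mean squared forward differences along rows and columns,
    so a constant (pure translation) field costs nothing.
    """
    dy = np.diff(disp, axis=1)
    dx = np.diff(disp, axis=2)
    return float(np.mean(dy ** 2) + np.mean(dx ** 2))


def dice_loss(label_fixed, label_warped, eps=1e-6):
    """Label similarity loss (weak supervision): 1 - soft Dice overlap."""
    inter = np.sum(label_fixed * label_warped)
    total = np.sum(label_fixed) + np.sum(label_warped)
    return float(1.0 - (2.0 * inter + eps) / (total + eps))


def unsupervised_loss(fixed, warped, disp, lam=0.01):
    """Image similarity plus weighted smoothness; no labels or ground truth."""
    return mse_similarity(fixed, warped) + lam * smoothness(disp)
```

An actual training loop would compute the same quantities in an automatic differentiation framework so that gradients can flow back through the warping step; NumPy is used here only to keep the arithmetic visible. Other similarity measures, such as normalized cross-correlation or mutual information, are common alternatives to the mean squared error assumed here.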
Finally, open challenges and possible directions are outlined: 1) constructing more robust similarity metrics and more effective regularization terms to deal with multi-modality MIR; 2) quantifying the confidence of the registration results of various DL-based models, or integrating domain knowledge into current data-driven networks; 3) designing more efficient networks with fewer parameters (e.g., 3D convolution factorization, capsule network architecture).
Keywords
