
Deep neural networks for medical image super-resolution


Type

Thesis

Authors

Zhu, Jin 

Abstract

Super-resolution plays an essential role in medical imaging because it provides an alternative way to achieve high spatial resolution with no extra acquisition cost. In the past decades, the rapid development of deep neural networks has enabled high reconstruction fidelity and photo-realistic super-resolution image generation. However, challenges remain in the medical domain, requiring novel network architectures, training strategies, and SR image evaluation techniques. This dissertation concentrates on backbone networks for supervised single-image super-resolution tasks on various medical images with challenging magnification scales. Besides incorporating widespread methods designed for natural images, I explore progressive learning, adversarial learning and meta-learning in end-to-end frameworks based on convolutional neural networks, generative adversarial networks and vision transformers for robust medical image super-resolution. In addition to general image quality assessments, task-specific objective and subjective evaluation metrics are implemented for comprehensive comparisons. Specifically, the proposed approaches follow three directions, achieving state-of-the-art performance on diverse medical image modalities.

First, I implement progressive and adversarial learning for perceptually realistic texture generation in super-resolution tasks with challenging magnification scales (e.g. 4×). I present a CNN-based multi-scale super-resolution image generator that decomposes the complex mapping problem into simpler sub-problems to avoid over-smoothing structural information and introducing non-realistic high-frequency textures in super-resolved images. Moreover, it involves a lesion-focused training strategy and an advanced adversarial loss based on the Wasserstein distance for more efficient and stable training. The proposed method dramatically improves the perceptual quality of generated images: in experiments on brain and cardiac magnetic resonance images, experienced radiologists assigned the generated images subjective scores comparable to those of ground-truth high-resolution images. It was competitive with the state of the art in perceptual quality for medical image super-resolution in 2019 and pioneered GAN-based medical image research with lasting influence.
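The Wasserstein-distance-based adversarial loss mentioned above can be illustrated with a minimal sketch. This is not the dissertation's implementation; the function names are illustrative, and the sketch shows only the standard WGAN objective (critic maximises the score gap between real high-resolution and super-resolved images; the generator maximises the critic's score on its outputs):

```python
import numpy as np

def critic_wasserstein_loss(d_real, d_fake):
    """Critic loss: minimising this widens the gap between critic scores
    on real HR images and on super-resolved images, approximating the
    negative Wasserstein distance between the two distributions."""
    return float(np.mean(d_fake) - np.mean(d_real))

def generator_adversarial_loss(d_fake):
    """Generator loss: pushes the critic's score on super-resolved
    images upward, i.e. toward the 'real' end of the scale."""
    return float(-np.mean(d_fake))
```

In practice this objective is combined with a Lipschitz constraint on the critic (weight clipping or a gradient penalty), which is what stabilises training relative to the original GAN loss.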

Second, I introduce meta-learning and transfer learning to GANs for efficient and robust medical image super-resolution with arbitrary scales (e.g. (1, 4]). In a post-upsampling framework, I implement a lightweight network based on EDSR for efficient low-resolution feature extraction and a weight prediction module for scale-free feature map upsampling. Compared with existing SISR networks, this framework supports non-integer magnification without the adverse effects of pre-/post-processing. Specifically, this approach achieves reconstruction accuracy and objective perceptual quality comparable to state-of-the-art methods with far fewer parameters. Additionally, I robustly transfer the SR model pre-trained on one medical image dataset (i.e. brain MRI) to various new medical modalities (e.g. chest CT and cardiac MR) with a few fine-tuning steps. Moreover, exhaustive ablation studies are conducted to discuss the perception-distortion trade-off and to illustrate the impacts of residual block connections, hyper-parameters, loss components and adversarial loss variants on medical image super-resolution performance.
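The core of scale-free upsampling with weight prediction is a coordinate projection: each high-resolution pixel is mapped back to a source low-resolution pixel plus a fractional offset, and a small network predicts the upsampling filter from that offset and the inverse scale. A minimal 1-D sketch of the projection step (the function name and rounding are illustrative, not the dissertation's code):

```python
import math

def project_coordinates(out_size, scale):
    """For each output (HR) index, return the source LR index and the
    fractional offset; together with 1/scale, these would form the input
    to a weight-prediction network in a Meta-SR-style upsampler."""
    pairs = []
    for i in range(out_size):
        src = i / scale          # continuous position in LR space
        lr_idx = math.floor(src) # nearest LR pixel to the left
        offset = src - lr_idx    # fractional position within that pixel
        pairs.append((lr_idx, round(offset, 4)))
    return pairs
```

Because the projection is defined for any real-valued `scale`, the same trained module serves integer and non-integer magnifications alike, which is what removes the need for separate per-scale networks or interpolation pre-/post-processing.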

Finally, I propose an efficient vision transformer with residual dense connections and local feature fusion to achieve superior single-image super-resolution performance on medical modalities. Due to the improved information flow, this CNN-transformer hybrid model has advanced representation capability with lower training computation requirements. Meanwhile, I implement a general-purpose perceptual loss with manual control of the desired image quality improvements by incorporating prior knowledge from medical image segmentation. Compared with state-of-the-art methods on four public medical image datasets, the proposed method achieves the best PSNR scores on six of seven modalities with only 38% of the parameters of SwinIR (the most recent SOTA method). On the other hand, the segmentation-based perceptual loss increases PSNR by 0.14 dB on average for prevalent super-resolution networks without extra training cost. Additionally, I discuss potential factors behind the superior performance of vision transformers over CNNs and GANs, as well as the impacts of network and loss function components, in a comprehensive ablation study.

In conclusion, this dissertation presents my research contributions in applying deep neural networks to robust medical image super-resolution tasks, including efficient network architectures, broadly applicable training techniques, and clinically meaningful image quality evaluation. At the time of publication, the proposed approaches achieved state-of-the-art performance on various public and private medical image datasets in simulation experiments. These algorithms could potentially be applied in hospitals for advanced clinical workflows with proper case-specific modifications and supplementary techniques. Moreover, the novel methods and findings in super-resolution may also benefit other low-level image processing tasks, while the discussions and ablation studies suggest exciting directions for future research.

Date

2023-03-01

Advisors

Lio, Pietro

Keywords

Convolutional neural networks, deep neural networks, generative adversarial networks, image processing, machine learning, medical image analysis, super-resolution, vision transformers

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

This thesis was supported by the China Scholarship Council (201708060173).