Quality assessment of anatomical MRI images from generative adversarial networks: Human assessment and image quality metrics.

Treder, Matthias S 
Codrai, Ryan 
Tsvetanov, Kamen A. https://orcid.org/0000-0002-3178-6363

BACKGROUND: Generative Adversarial Networks (GANs) can synthesize brain images from image or noise input. So far, the gold standard for assessing the quality of the generated images has been human expert ratings. However, due to limitations of human assessment in terms of cost, scalability, and the limited sensitivity of the human eye to more subtle statistical relationships, a more automated approach towards evaluating GANs is required.

NEW METHOD: We investigated to what extent visual quality can be assessed using image quality metrics and we used group analysis and spatial independent components analysis to verify that the GAN reproduces multivariate statistical relationships found in real data. Reference human data was obtained by recruiting neuroimaging experts to assess real Magnetic Resonance (MR) images and images generated by a GAN. Image quality was manipulated by exporting images at different stages of GAN training.

RESULTS: Experts were sensitive to changes in image quality as evidenced by ratings and reaction times, and the generated images reproduced group effects (age, gender) and spatial correlations moderately well. We also surveyed a number of image quality metrics. Overall, Fréchet Inception Distance (FID), Maximum Mean Discrepancy (MMD) and Naturalness Image Quality Evaluator (NIQE) showed sensitivity to image quality and good correspondence with the human data, especially for lower-quality images (i.e., images from early stages of GAN training). However, only a Deep Quality Assessment (QA) model trained on human ratings was able to reproduce the subtle differences between higher-quality images.

CONCLUSIONS: We recommend a combination of group analyses, spatial correlation analyses, and both distortion metrics (FID, MMD, NIQE) and perceptual models (Deep QA) for a comprehensive evaluation and comparison of brain images produced by GANs.
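As a rough illustration of one of the metrics named in the abstract, the sketch below computes a biased estimate of the squared Maximum Mean Discrepancy (MMD) between two sets of feature vectors. It is not the authors' implementation: the function names, the Gaussian kernel, and the bandwidth `sigma` are assumptions made for this example, and the toy vectors stand in for features that would in practice be extracted from real and generated images.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(real, generated, sigma=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors.

    `sigma` is an assumed kernel bandwidth; in practice it has to be chosen
    to match the scale of the features.
    """
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(real, real)
            + mean_kernel(generated, generated)
            - 2.0 * mean_kernel(real, generated))


# Toy (hypothetical) feature vectors, not data from the study.
real_features = [[0.0, 0.0], [1.0, 0.0]]
near_features = [[0.1, 0.0], [1.1, 0.0]]   # close to the real set
far_features = [[5.0, 5.0], [6.0, 5.0]]    # far from the real set

print("MMD^2, near set:", mmd_squared(real_features, near_features))
print("MMD^2, far set: ", mmd_squared(real_features, far_features))
```

A value near zero means the two sets are hard to tell apart under the chosen kernel, and larger values indicate a greater mismatch between the distributions.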

Ageing, Deep learning, GAN, Generative Adversarial Network, Generative models, MRI, Machine learning, Quality assessment, Benchmarking, Brain, Humans, Image Processing, Computer-Assisted, Magnetic Resonance Imaging, Signal-To-Noise Ratio
Journal Title
J Neurosci Methods
Publisher
Elsevier BV
Guarantors of Brain (Unknown)
Medical Research Council (MR/J009482/1)
Medical Research Council (MR/M008983/1)
Medical Research Council (MC_U105597119)
Medical Research Council (MC_UU_00005/12)