Quality assessment of anatomical MRI images from generative adversarial networks: Human assessment and image quality metrics.
Treder, Matthias S
Tsvetanov, Kamen A
J Neurosci Methods
Treder, M. S., Codrai, R., & Tsvetanov, K. A. (2022). Quality assessment of anatomical MRI images from generative adversarial networks: Human assessment and image quality metrics. J Neurosci Methods, (109579), 109579-109579. https://doi.org/10.1016/j.jneumeth.2022.109579
BACKGROUND: Generative Adversarial Networks (GANs) can synthesize brain images from image or noise input. So far, the gold standard for assessing the quality of the generated images has been human expert ratings. However, due to limitations of human assessment in terms of cost, scalability, and the limited sensitivity of the human eye to more subtle statistical relationships, a more automated approach towards evaluating GANs is required.

NEW METHOD: We investigated to what extent visual quality can be assessed using image quality metrics and we used group analysis and spatial independent components analysis to verify that the GAN reproduces multivariate statistical relationships found in real data. Reference human data was obtained by recruiting neuroimaging experts to assess real Magnetic Resonance (MR) images and images generated by a GAN. Image quality was manipulated by exporting images at different stages of GAN training.

RESULTS: Experts were sensitive to changes in image quality as evidenced by ratings and reaction times, and the generated images reproduced group effects (age, gender) and spatial correlations moderately well. We also surveyed a number of image quality metrics. Overall, Fréchet Inception Distance (FID), Maximum Mean Discrepancy (MMD) and Naturalness Image Quality Evaluator (NIQE) showed sensitivity to image quality and good correspondence with the human data, especially for lower-quality images (i.e., images from early stages of GAN training). However, only a Deep Quality Assessment (QA) model trained on human ratings was able to reproduce the subtle differences between higher-quality images.

CONCLUSIONS: We recommend a combination of group analyses, spatial correlation analyses, and both distortion metrics (FID, MMD, NIQE) and perceptual models (Deep QA) for a comprehensive evaluation and comparison of brain images produced by GANs.
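
As an illustration of one of the distribution-level metrics named in the abstract, the following is a minimal sketch of a squared Maximum Mean Discrepancy (MMD) estimate between two sets of feature vectors. It is not the authors' implementation: the Gaussian (RBF) kernel, the bandwidth `sigma`, the biased estimator, the function names, and the synthetic random vectors standing in for image features are all assumptions made for this example.

```python
import numpy as np


def rbf_kernel(x, y, sigma):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    # Pairwise squared Euclidean distances, clipped at zero to absorb
    # small negative values caused by floating-point rounding.
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma**2))


def mmd2_biased(real, generated, sigma=1.0):
    """Biased estimate of squared MMD between two samples (assumed RBF kernel)."""
    k_rr = rbf_kernel(real, real, sigma)
    k_gg = rbf_kernel(generated, generated, sigma)
    k_rg = rbf_kernel(real, generated, sigma)
    return k_rr.mean() + k_gg.mean() - 2.0 * k_rg.mean()


# Synthetic stand-ins for image feature vectors (hypothetical data).
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(200, 16))
similar = rng.normal(0.0, 1.0, size=(200, 16))   # same distribution as `real`
shifted = rng.normal(1.0, 1.0, size=(200, 16))   # mean-shifted distribution

mmd_similar = mmd2_biased(real, similar, sigma=4.0)
mmd_shifted = mmd2_biased(real, shifted, sigma=4.0)
print(f"MMD^2 (same distribution):    {mmd_similar:.4f}")
print(f"MMD^2 (shifted distribution): {mmd_shifted:.4f}")
```

In this sketch a larger value indicates a greater mismatch between the two samples, so the mean-shifted sample should score higher than the sample drawn from the same distribution as the reference set. In practice the bandwidth and the feature space the metric is computed in both affect the result and would need to be chosen for the data at hand.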
Guarantors of Brain (Unknown)
External DOI: https://doi.org/10.1016/j.jneumeth.2022.109579
This record's URL: https://www.repository.cam.ac.uk/handle/1810/335839
All Rights Reserved
Licence URL: http://www.rioxx.net/licenses/all-rights-reserved