
Concept Distillation in Graph Neural Networks

Accepted version
Peer-reviewed

Abstract

The opaque reasoning of Graph Neural Networks (GNNs) induces a lack of human trust. Existing graph network explainers attempt to address this issue by providing post-hoc explanations; however, they fail to make the model itself more interpretable. To fill this gap, we introduce the Concept Distillation Module, the first differentiable concept-distillation approach for graph networks. The proposed approach is a layer that can be plugged into any graph network to make it explainable by design, by first distilling graph concepts from the latent space and then using these to solve the task. Our results demonstrate that this approach allows graph networks to: (i) attain model accuracy comparable with their equivalent vanilla versions, (ii) distill meaningful concepts, achieving 4.8% higher concept completeness and 36.5% lower purity scores on average, (iii) provide high-quality concept-based logic explanations for their predictions, and (iv) support effective interventions at test time, which can increase human trust as well as improve model performance.
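The pipeline the abstract describes (distill concepts from the latent space, then solve the task from them) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the nearest-prototype concept assignment, and the mean-pooled concept activations are all assumptions made for the example.

```python
# Hypothetical sketch of a concept-distillation step for a GNN.
# Names and the prototype-based assignment are illustrative assumptions,
# not the paper's actual module. Given node embeddings produced by any
# graph network, we (1) assign each node to its nearest concept prototype
# and (2) pool node-level concepts into a graph-level activation vector,
# from which a downstream (interpretable) predictor would solve the task.

def distill_concepts(embeddings, prototypes):
    """Assign each node embedding to the index of its nearest prototype."""
    concepts = []
    for e in embeddings:
        dists = [sum((ei - pi) ** 2 for ei, pi in zip(e, p))
                 for p in prototypes]
        concepts.append(dists.index(min(dists)))
    return concepts

def concept_activations(concepts, n_concepts):
    """Pool node-level concept assignments into graph-level activations."""
    acts = [0.0] * n_concepts
    for c in concepts:
        acts[c] += 1.0
    total = len(concepts) or 1
    return [a / total for a in acts]

# Toy usage: three 2-D node embeddings, two learned concept prototypes.
embeddings = [(0.1, 0.0), (0.2, 0.1), (0.9, 1.0)]
prototypes = [(0.0, 0.0), (1.0, 1.0)]
concepts = distill_concepts(embeddings, prototypes)      # [0, 0, 1]
acts = concept_activations(concepts, n_concepts=2)       # [2/3, 1/3]
```

Because the task is solved from the concept activations rather than the raw latent space, a prediction can be traced back to which concepts fired, which is what enables the logic explanations and test-time interventions mentioned above.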

Description

Is Part Of

Communications in Computer and Information Science

Book type

Edited collection

Publisher

Springer Nature Switzerland

ISBN

978-3-031-44069-4

Rights and licensing

Except where otherwise noted, this item's license is described as All Rights Reserved

Sponsorship

European Research Council