Enhancing Interpretability: The Role of Concept-based Explanations Across Data Types


Type

Thesis

Change log

Authors

Kazhdan, Dmitry 

Abstract

Deep Neural Networks (DNNs) have achieved remarkable performance on a range of tasks. Unfortunately, they have been shown to be black boxes whose behaviour cannot be understood directly. Thus, a key step to further empowering DNN-based approaches is improving their explainability, which is addressed by the field of Explainable AI (XAI). Recently, Concept-based Explanations (CbEs) have emerged as a powerful new XAI paradigm, providing model explanations in terms of human-understandable units, rather than individual features, pixels, or characters. Despite their numerous applications, the existing literature lacks precise answers to what concepts are, where they can be applied, and how they should be applied. In this thesis, we give more precise answers to these questions.

Firstly, we unify and extend existing definitions of CbEs in Computer Vision using our Concept Decomposition (ConD) framework, and use it to formulate our novel Concept-based Model Extraction (CME) framework. CME provides a way of extracting Concept-based Models from pre-trained vanilla Convolutional Neural Networks in a semi-supervised fashion, requiring vastly fewer concept annotations. Following our work on CME, we explore how similar approaches transfer to the temporal data modality. In particular, we introduce our Model Explanations via Model Extraction (MEME) framework, a first-of-its-kind temporal CbE framework capable of extracting Concept-based Models from Recurrent Neural Networks (RNNs). Using MEME, we demonstrate how the benefits of CbEs may be harnessed in the temporal data domain. Equipped with a better understanding of what CbEs are, we proceed to explore their applications to graph data. In particular, we automate the verification and inspection of graph concepts by leveraging the inherent structure of the underlying graph data. To achieve this, we introduce the Graph Concept Interpretation (GCI) framework, which can be used to extract and quantitatively evaluate concepts from Graph Neural Networks (GNNs).

Overall, this thesis introduces a range of approaches enabling: (i) novel concept definitions, (ii) novel concept applications, and (iii) novel concept evaluations. Consequently, these approaches advance the field of concept-based XAI, bringing us closer to making DNNs more transparent, trustworthy, and interpretable.

Description

Date

2023-09-29

Advisors

Jamnik, Mateja
Lio, Pietro

Keywords

Artificial Intelligence, Concept Explanations, Deep Learning, Explainable AI, Safe AI, XAI

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
Sponsorship

Engineering and Physical Sciences Research Council (EPSRC) (2281344)
EPSRC iCASE studentship, funded by EPSRC & GSK