Understanding Inter-Concept Relationships in Concept-Based Models

Accepted version
Peer-reviewed

Abstract

Concept-based explainability methods provide insight into deep learning systems by constructing explanations using human-understandable concepts. While the literature on human reasoning demonstrates that we exploit relationships between concepts when solving tasks, it is unclear whether concept-based methods incorporate the rich structure of inter-concept relationships. We analyse the concept representations learnt by concept-based models to understand whether these models correctly capture inter-concept relationships. First, we empirically demonstrate that state-of-the-art concept-based models produce representations that lack stability and robustness, and such methods fail to capture inter-concept relationships. Then, we develop a novel algorithm which leverages inter-concept relationships to improve concept intervention accuracy, demonstrating how correctly capturing inter-concept relationships can improve downstream tasks.

Journal Title

Proceedings of Machine Learning Research

Conference Name

The Forty-First International Conference on Machine Learning (ICML 2024)

Journal ISSN

2640-3498

Volume Title

235

Publisher

OpenReview.net

Rights and licensing

Except where otherwise noted, this item's license is described as All Rights Reserved