Attention-based representation learning on graphs
Abstract
As data that does not conform to simple, regular structures such as images or text becomes more readily available, the field of representation learning has continued to evolve through approaches that seek to describe, understand, and even unify deep learning strategies for data such as sets, grids, and graphs. A remarkably successful application of this field of geometric deep learning is learning on graphs: abstractions that represent relationships between items of a set and naturally describe real-world phenomena such as social, biological, or transportation networks. Recently, breakthroughs in graph learning have translated into impactful applications such as protein structure prediction, the inverse problem of designing an amino acid sequence that folds to a target structure, and even Earth-scale weather forecasting. Despite these achievements, there is uncertainty regarding the balance of algorithmic complexity, computational resource utilisation, and task performance, with few graph methods performing consistently well across multiple datasets, benchmarks, and settings.
This rapid growth of the field brings new challenges while exposing shortcomings of existing methodologies. In this dissertation, I investigate several pertinent aspects of deep learning, including graph neural networks, graph transformers, and transfer learning in the graph domain. Conceptually, the main limitation being addressed is the fixed and handcrafted nature of graph learning operators. Instead, I propose attention as a universal mechanism capable of augmenting and even superseding current architectural choices. My first contribution targets graph neural networks and consists of replacing classical readout functions with neural network-based adaptive readouts, in particular attention-based pooling. Second, I study transfer learning in the context of high-throughput screening funnels specific to early-stage drug discovery. In this setting, I demonstrate empirically that classical readouts are unable to model molecular data at the multi-million scale, and show that adaptive readouts unlock the transfer learning potential of graph neural networks. Finally, motivated by these conclusions and by recent advances in efficient and exact attention, I propose an end-to-end attention-based framework for learning on graphs: edge set attention, which operates directly on sets of edges, is simpler than message passing and graph transformers, and achieves state-of-the-art results. The findings and advances proposed in this thesis have been empirically validated across hundreds of experiments, consistently outperforming conventional approaches and supporting the hypotheses put forward here.
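To make the central idea of an attention-based readout concrete, the sketch below shows one possible form: node embeddings are scored by a small gating network and combined into a single graph-level vector via a softmax-weighted sum. This is a minimal illustration assuming PyTorch; the class and layer names here are hypothetical and do not reproduce the specific architectures studied in the thesis.

```python
import torch
import torch.nn as nn

class AttentionReadout(nn.Module):
    """Pool node embeddings into one graph embedding with learned attention."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(dim, 1)    # scores each node's importance
        self.proj = nn.Linear(dim, dim)  # transforms node features before pooling

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (num_nodes, dim) node embeddings for a single graph
        weights = torch.softmax(self.gate(h), dim=0)  # (num_nodes, 1), sums to 1
        return (weights * self.proj(h)).sum(dim=0)    # weighted sum -> (dim,)

# Usage: pool five 64-dimensional node embeddings into one graph vector.
h = torch.randn(5, 64)
readout = AttentionReadout(64)
graph_embedding = readout(h)  # shape: (64,)
```

Unlike fixed sum, mean, or max readouts, the pooling weights here are learned jointly with the rest of the network, which is what allows the readout to adapt to the task and dataset at hand.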
