Neuro-symbolic fact verification
Abstract
Fact-checking, the process of assessing the veracity of claims, is time-consuming: a single claim can take hours or days to check, which incentivises the development of computational methods to automate (parts of) the process. In natural language processing, this challenge has been instantiated as fact verification, typically modelled by systems that extract textual evidence from a knowledge source and reason about a claim’s veracity via neural entailment systems. However, these systems reason in an inherently opaque way, suffer from robustness issues, and fail to capture well-formalised semantic concepts like monotonicity.
To address these issues, this thesis explores neuro-symbolic methods for fact verification, which integrate symbolic systems with neural representations. We focus in particular on natural logic, a framework of compositional entailment that operates directly on natural language by capturing the set-theoretic relations between parts of a claim and the textual evidence. As a logical system designed to identify valid inferences via deterministic proofs, it is particularly suited to fact verification, where a claim needs to be entailed by evidence, and it guarantees explainability properties such as faithfulness and actionability.
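To make the idea of a deterministic natural logic proof concrete, the following is a minimal sketch and not the implementation developed in the thesis. It assumes that a proof is a sequence of natural logic operators, one per aligned claim/evidence span pair, and that the verdict is read off a small finite-state automaton. The operator names, the `Verdict` enum, the `TRANSITIONS` table, and `run_proof` are all hypothetical names chosen for illustration; the transition table follows a commonly used simplification of natural logic inference and is an assumption here.

```python
from enum import Enum


class Verdict(Enum):
    SUPPORTS = "SUPPORTS"
    REFUTES = "REFUTES"
    NOT_ENOUGH_INFO = "NOT ENOUGH INFO"


# Hypothetical operator labels for the set-theoretic relation between
# a claim span and the evidence span it is aligned to.
EQUIVALENCE = "equivalence"
FORWARD_ENTAILMENT = "forward_entailment"
REVERSE_ENTAILMENT = "reverse_entailment"
NEGATION = "negation"
ALTERNATION = "alternation"
INDEPENDENCE = "independence"

OPERATORS = {
    EQUIVALENCE, FORWARD_ENTAILMENT, REVERSE_ENTAILMENT,
    NEGATION, ALTERNATION, INDEPENDENCE,
}

# Assumed transition table (illustrative). Any operator not listed for a
# state leads to NOT_ENOUGH_INFO, which is treated as absorbing.
TRANSITIONS = {
    Verdict.SUPPORTS: {
        EQUIVALENCE: Verdict.SUPPORTS,
        FORWARD_ENTAILMENT: Verdict.SUPPORTS,
        NEGATION: Verdict.REFUTES,
        ALTERNATION: Verdict.REFUTES,
    },
    Verdict.REFUTES: {
        EQUIVALENCE: Verdict.REFUTES,
        REVERSE_ENTAILMENT: Verdict.REFUTES,
        NEGATION: Verdict.SUPPORTS,
    },
}


def run_proof(operators):
    """Run the automaton over a sequence of operators and return the verdict."""
    state = Verdict.SUPPORTS  # start state: the claim is assumed entailed
    for op in operators:
        if op not in OPERATORS:
            raise ValueError(f"unknown natural logic operator: {op!r}")
        if state is Verdict.NOT_ENOUGH_INFO:
            break  # absorbing state, no later operator can change it
        state = TRANSITIONS[state].get(op, Verdict.NOT_ENOUGH_INFO)
    return state


if __name__ == "__main__":
    # Two aligned spans: one equivalent, one mutually exclusive.
    print(run_proof([EQUIVALENCE, ALTERNATION]).value)
```

Because the verdict is a deterministic function of the operator sequence, changing a single operator changes the outcome in a traceable way, which is the sense in which such a proof can serve as a faithful explanation of the prediction.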
The first contribution of this thesis is the development of FEVEROUS, a large-scale dataset that requires complex reasoning, such as arithmetic or multi-hop reasoning, over retrieved textual and tabular evidence, to incentivise the development of neuro-symbolic methods. We then explore means of combining natural logic as a symbolic reasoning framework with advances in autoregressive language modelling to improve the explainability, robustness, and generalisability of fact verification systems. We propose systems that (i) integrate natural logic as a dynamic and transparent stopping criterion for autoregressive multi-hop document retrieval; (ii) obviate the need for large-scale annotated data for training natural logic proof systems; and (iii) extend natural logic to tabular evidence and arithmetic computations, thus addressing key challenges encountered in the verification of complex claims. Finally, we unify these three contributions into a single natural logic-based fact verification system, as a step towards reasoning over textual and tabular evidence while satisfying important explainability desiderata.
