Evaluating adversarial attacks against multiple fact verification systems
Publication Date
2020-01-01
Journal Title
EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference
ISBN
9781950737901
Publisher
Association for Computational Linguistics
Pages
2944-2953
Type
Conference Object
This Version
VoR
Citation
Thorne, J., Vlachos, A., Christodoulopoulos, C., & Mittal, A. (2020). Evaluating adversarial attacks against multiple fact verification systems. EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference, 2944-2953. https://doi.org/10.17863/CAM.52938
Abstract
© 2019 Association for Computational Linguistics
Automated fact verification has been progressing owing to advancements in modeling and the availability of large datasets. Due to the nature of the task, it is critical to understand the vulnerabilities of these systems against adversarial instances designed to make them predict incorrectly. We introduce two novel scoring metrics, attack potency and system resilience, which take into account the correctness of the adversarial instances, an aspect often ignored in adversarial evaluations. We consider six fact verification systems from the recent Fact Extraction and VERification (FEVER) challenge: the four best-scoring ones and two baselines. We evaluate adversarial instances generated by a recently proposed state-of-the-art method, a paraphrasing method, and rule-based attacks devised for fact verification. We find that our rule-based attacks have higher potency, and that while the rankings among the top systems changed, they exhibited higher resilience than the baselines.
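As a rough illustration of how correctness-aware metrics of this kind could be formalized (the exact definitions are given in the paper; the notation below is an assumption for illustration only), let A be the set of attacks, S the set of systems, c_a the fraction of correctly constructed instances produced by attack a, and acc_s(a) the accuracy of system s on those correct instances. A potency-style score for an attack and a resilience-style score for a system might then take the form:

\[
\text{Potency}(a) \;=\; \bar{c}_a \cdot \frac{1}{|S|} \sum_{s \in S} \big(1 - \text{acc}_s(a)\big)
\]
\[
\text{Resilience}(s) \;=\; 1 - \frac{1}{|A|} \sum_{a \in A} \bar{c}_a \cdot \big(1 - \text{acc}_s(a)\big)
\]

Weighting by \(\bar{c}_a\) captures the abstract's point that an attack only gets credit for errors induced by well-formed adversarial instances; refer to the paper for the metrics actually used.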
Identifiers
This record's DOI: https://doi.org/10.17863/CAM.52938
This record's URL: https://www.repository.cam.ac.uk/handle/1810/305856