Leveraging type descriptions for zero-shot named entity recognition and classification
Publication Date
2021
Journal Title
ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
Conference Name
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
ISBN
9781954085527
Publisher
Association for Computational Linguistics
Pages
1516-1528
Type
Conference Object
This Version
VoR
Citation
Aly, R., Vlachos, A., & McDonald, R. (2021). Leveraging type descriptions for zero-shot named entity recognition and classification. ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference, 1516-1528. https://doi.org/10.18653/v1/2021.acl-long.120
Abstract
A common issue in real-world applications of named entity recognition and classification (NERC) is the absence of annotated data for target entity classes during training. Zero-shot learning approaches address this issue by learning models that can transfer information from observed classes in the training data to unseen classes. This paper presents the first approach for zero-shot NERC, introducing a novel architecture that leverages the fact that textual descriptions for many entity classes occur naturally. Our architecture addresses the zero-shot NERC specific challenge that the not-an-entity class is not well defined, since different entity classes are considered in training and testing. For evaluation, we adapt two datasets, OntoNotes and MedMentions, emulating the difficulty of real-world zero-shot learning by testing models on the rarest entity classes. Our proposed approach outperforms baselines adapted from machine reading comprehension and zero-shot text classification. Furthermore, we assess the effect of different class descriptions for this task.
Sponsorship
European Commission Horizon 2020 (H2020) ERC (865958)
Embargo Lift Date
2100-01-01
Identifiers
External DOI: https://doi.org/10.18653/v1/2021.acl-long.120
This record's URL: https://www.repository.cam.ac.uk/handle/1810/330478