Keep the Primary, Rewrite the Secondary: A Two-Stage Approach for Paraphrase Generation
Published version
Peer-reviewed
Authors
Su, Yixuan https://orcid.org/0000-0002-1472-7791
Vandyke, D
Baker, S
Wang, Y
Collier, Nigel https://orcid.org/0000-0002-7230-4164
Abstract
Paraphrase generation is an important and challenging NLG problem. In this work, we propose a new Identification-then-Aggregation (IA) framework to tackle this task. In the identification step, the input tokens are sorted into two groups by a novel Primary/Secondary Identification (PSI) algorithm. In the aggregation step, these groups are separately encoded before being aggregated by a custom-designed decoder, which autoregressively generates the paraphrased sentence. In extensive experiments on two benchmark datasets, we demonstrate that our model outperforms previous studies by a notable margin. We also show that the proposed approach can generate paraphrases in an interpretable and controllable way.
Journal Title
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Conference Name
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Publisher
Association for Computational Linguistics
Rights
Publisher's own licence