
Abstractive spoken document summarization using hierarchical model with multi-stage attention diversity optimization

Accepted version
Peer-reviewed

Type

Conference Object

Change log

Authors

Gales, MJF 
Wang, L 

Abstract

Abstractive summarization is a standard task for written documents, such as news articles. Applying summarization schemes to spoken documents is more challenging, especially in situations involving human interaction, such as meetings. Here, utterances tend not to form complete sentences and sometimes contain little information. Moreover, speech disfluencies will be present, as will recognition errors when automated transcription is used. For current attention-based sequence-to-sequence summarization systems, these additional challenges can yield a poor attention distribution over the spoken document's words and utterances, impacting performance. In this work, we propose a multi-stage method based on a hierarchical encoder-decoder model that explicitly models the utterance-level attention distribution at training time and enforces diversity at inference time using a unigram diversity term. Furthermore, multitask learning tasks, including dialogue act classification and extractive summarization, are incorporated. The performance of the system is evaluated on the AMI meeting corpus. The inclusion of both the training and inference diversity terms improves performance, outperforming current state-of-the-art systems in terms of ROUGE scores. Additionally, the impact of ASR errors, as well as performance on the multitask learning tasks, is evaluated.
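
The abstract mentions enforcing diversity at inference time with a unigram diversity term. The record does not specify the exact formulation, so the sketch below is only a rough illustration of the general idea: rescoring beam-search hypotheses with an entropy-based unigram diversity bonus. All function names, the weighting, and the use of entropy are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: an inference-time unigram diversity penalty for
# beam-search summaries. The formulation, weight, and names are assumptions,
# not taken from the paper.
from collections import Counter
from math import log


def unigram_diversity_bonus(tokens, smoothing=1e-6):
    """Entropy of the hypothesis' unigram distribution.

    Repeated words lower the entropy, so repetitive summaries receive a
    smaller bonus than lexically diverse ones.
    """
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    entropy = 0.0
    for c in counts.values():
        p = c / total
        entropy -= p * log(p + smoothing)
    return entropy


def rescore_beam(hypotheses, diversity_weight=0.5):
    """Combine the model log-probability with the diversity bonus.

    `hypotheses` is a list of (token_list, model_log_prob) pairs;
    `diversity_weight` is a hypothetical tuning parameter.
    """
    rescored = []
    for tokens, log_prob in hypotheses:
        score = log_prob + diversity_weight * unigram_diversity_bonus(tokens)
        rescored.append((tokens, score))
    return sorted(rescored, key=lambda x: x[1], reverse=True)


if __name__ == "__main__":
    beams = [
        (["the", "meeting", "discussed", "the", "remote", "design"], -3.2),
        (["the", "the", "the", "meeting", "meeting", "design"], -3.0),
    ]
    best_tokens, best_score = rescore_beam(beams)[0]
    print(" ".join(best_tokens), round(best_score, 3))
```

In this toy example the slightly less probable but less repetitive hypothesis is preferred once the diversity bonus is added, which is the behaviour the inference-time term is intended to encourage.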

Description

Keywords

abstractive spoken document summarization, hierarchical model, attention diversity, multitask learning

Journal Title

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Conference Name

Interspeech 2020

Journal ISSN

2308-457X
1990-9772

Volume Title

2020-October

Publisher

ISCA

Rights

All rights reserved

Sponsorship

Cambridge Assessment (Unknown)