Unsupervised Declarative Knowledge Induction for Constraint-Based Learning of Information Structure in Scientific Documents

Guo, Yufan; Reichart, Roi; Korhonen, Anna

Repository URI

https://www.repository.cam.ac.uk/handle/1810/247825

Files

Guo et al 2015 Transactions of Association for Computational Linguistics.pdf (294.76 KB)

Type

Article

Authors

Guo, Yufan

Reichart, Roi

Korhonen, Anna

Abstract

Inferring the information structure of scientific documents is useful for many NLP applications. Existing approaches to this task require substantial human effort. We propose a framework for constraint learning that reduces human involvement considerably. Our model uses topic models to identify latent topics and their key linguistic features in input documents, induces constraints from this information, and maps sentences to their dominant information structure categories through a constrained unsupervised model. When the induced constraints are combined with a fully unsupervised model, the resulting model challenges existing lightly supervised feature-based models as well as unsupervised models that use manually constructed declarative knowledge. Our results demonstrate that useful declarative knowledge can be learned from data with very limited human involvement.

Journal Title

Transactions of the Association for Computational Linguistics

Journal ISSN

2307-387X

Volume Title

3

Publisher

Association for Computational Linguistics

Publisher URL

https://ie.technion.ac.il/%20roiri/papers/dec-knowledge-learning-tacl.pdf

Rights

Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales

Collections

Scholarly Works - Theoretical and Applied Linguistics
Symplectic mapped items for data match