Citation block determination using textual coherence

Published version
Peer-reviewed

Authors

Kaplan, D 
Tokunaga, T 

Abstract

Detecting the boundaries of citations in the running text of research papers is an important task for research paper summarisation, idea attribution, sentiment analysis, and other citation-based analysis research. Detecting non-explicit citing sentences has recently garnered some attention, but work on it is still in its infancy. We define this task as citation block determination (CBD). In this paper we propose and investigate the effects of various types of textual coherence on CBD, positing that coherence is a crucial aspect of identifying citation blocks because it is fundamental to the composition of citations themselves. We demonstrate promising results: our method outperforms the previous state of the art on F1 by a large margin, improving both precision and recall. We further provide an in-depth error analysis and a discussion of why this is the case.

Keywords

citation block determination, citation analysis, citations, research paper summarisation, textual coherence, natural language processing, information extraction

Journal Title

Journal of Information Processing

Journal ISSN

0387-5806
1882-6652

Volume Title

24

Publisher

Information Processing Society of Japan

Sponsorship

This work has been funded in part by the Microsoft 2008 WEBSCALE grant, as well as by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior National Business Center (DoI/NBC) contract number D11PC20153. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DoI/NBC, or the U.S. Government.