The limits of annotation in machine learning a documents Hohfeldian legal entities

Izzidien, Ahmed

doi:10.33774/coe-2021-dqwvg

Published version

Peer-reviewed

Repository URI

https://www.repository.cam.ac.uk/handle/1810/331062

Repository DOI

https://doi.org/10.17863/CAM.78507

Files

Published version (286.92 KB)

Type

Conference Object

Authors

Izzidien, Ahmed

https://orcid.org/0000-0002-0929-8064

Abstract

Natural language processing (NLP) summarisers aim to capture the essential elements of a document. Yet the ontological character of a summary can be domain-specific. In legal analysis, the Hohfeldian matrix is used to summarise the principal legal relations between agents, such as individuals and organisations. We test a limit of using machine learning (ML) to detect such agents. Training on our 2,400 hand-labelled annotations yields F1 = 80.1. Extrapolating from this result suggests that over one million annotations would be required to capture all the agents mentioned in a document. This calls into question the feasibility of such an approach, which cannot include every agent who is party to a legal relation. Such complete capture is an essential criterion of fair ML and accurate legal summaries. An alternative approach based on hypernymy is suggested.

Publisher DOI

https://doi.org/10.33774/coe-2021-dqwvg

Rights

Attribution 4.0 International

Collections

University of Cambridge Research Outputs (Articles and Conferences)
