The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal

Accepted version
Peer-reviewed

Abstract

This paper explores the intersection of technological innovation and access to justice by developing a benchmark for predicting case outcomes in the UK Employment Tribunal (UKET). To address the challenge of extensive manual annotation, the study employs a large language model (LLM) for automatic annotation, resulting in the creation of the CLC-UKET dataset. The dataset consists of approximately 19,000 UKET cases and their metadata. Comprehensive legal annotations cover facts, claims, precedent references, statutory references, case outcomes, reasons and jurisdiction codes. Facilitated by the CLC-UKET data, we examine a multi-class case outcome prediction task in the UKET. Human predictions are collected to establish a performance reference for model comparison. Empirical results from baseline models indicate that finetuned transformer models outperform zero-shot and few-shot LLMs on the UKET prediction task. The performance of zero-shot LLMs can be enhanced by integrating task-related information into few-shot examples. We hope that the CLC-UKET dataset, along with human annotations and empirical findings, can serve as a valuable benchmark for employment-related dispute resolution.

Conference Name

NLLP: 6th Natural Legal Language Processing Workshop

Rights and licensing

Except where otherwise noted, this item's license is described as All Rights Reserved
Sponsorship
Cambridge Centre for Data-Driven Discovery and Accelerate Programme for Scientific Discovery, made possible by a donation from Schmidt Futures.