The reliabilities of three potential methods of capturing expert judgement in determining grade boundaries

Published version
Peer-reviewed

Abstract

In England there is a strong public expectation that qualification standards should remain constant over time. At each examination session, awarding bodies must therefore determine the grade boundaries for their examinations that equate to those of previous sessions. We investigated the reliabilities of three methods for capturing the expert judgement of professional examiners who are responsible for maintaining year-on-year examination standards. The methods were those used in: traditional (current) awarding; Thurstone pairs; and rank ordering.

In the context of setting grade boundaries in AS level Biology and GCSE English, we conducted a three-way comparison of the intra-method and inter-method reliabilities of the three methods. For each subject, three mutually exclusive sets of examination scripts were created, which were matched for mark. Three groups of ten 'judges' (examiners, matched for experience of the methods) made judgements using each of the three methods on a different set of scripts. For both subjects, the traditional awarding and Thurstone pairs methods generated very similar boundary marks, except at the biology A/B grade boundary. The boundary marks generated by rank ordering were all on the lenient side for biology, whereas for the English C/D grade boundary they were on the severe side.

Journal Title

Research Matters

Publisher

Research Division, Cambridge University Press & Assessment

Rights and licensing

Except where otherwise noted, this item's license is described as All Rights Reserved