Clinical records anonymisation and text extraction (CRATE): an open-source software system.

Cardinal, Rudolf N 

BACKGROUND: Electronic medical records contain information of value for research, but contain identifiable and often highly sensitive confidential information. Patient-identifiable information cannot in general be shared outside clinical care teams without explicit consent, but anonymisation/de-identification allows research uses of clinical data without explicit consent.

RESULTS: This article presents CRATE (Clinical Records Anonymisation and Text Extraction), an open-source software system with separable functions: (1) it anonymises or de-identifies arbitrary relational databases, with sensitivity and precision similar to previous comparable systems; (2) it uses public secure cryptographic methods to map patient identifiers to research identifiers (pseudonyms); (3) it connects relational databases to external tools for natural language processing; (4) it provides a web front end for research and administrative functions; and (5) it supports a specific model through which patients may consent to be contacted about research.

CONCLUSIONS: Creation and management of a research database from sensitive clinical records with secure pseudonym generation, full-text indexing, and a consent-to-contact process is possible and practical using entirely free and open-source software.
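
To illustrate function (2) of the abstract, the following is a minimal sketch of how a patient identifier might be mapped to a research identifier using a public cryptographic method. The abstract does not say which algorithm CRATE uses, so HMAC-SHA-256 is an assumption made here for illustration; the function name `pseudonymise`, the example key, and the example identifier are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Map a patient identifier to a research identifier (pseudonym).

    Assumption: a keyed hash (HMAC-SHA-256) stands in for whatever
    method the real system uses. The same identifier and key always
    give the same pseudonym; without the key, the mapping cannot be
    recomputed from the identifier alone.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and identifier, for illustration only.
key = b"example-secret-key-held-by-the-data-controller"
rid = pseudonymise("9876543210", key)
print(rid)
```

In a sketch like this, the secret key would stay with the organisation holding the identifiable records, so that researchers see only the pseudonym while the mapping remains reproducible for those who hold the key.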

Anonymisation, Clinical informatics, De-identification, Electronic medical records, Open-source software, Pseudonymisation, Psychiatry, Data Anonymization, Electronic Health Records, Research, Software
Journal Title
BMC Med Inform Decis Mak
Publisher
Springer Science and Business Media LLC
The project was funded in part by the UK National Institute for Health Research Cambridge Biomedical Research Centre. The work was conducted within the Behavioural and Clinical Neuroscience Institute, University of Cambridge, supported by the Wellcome Trust and the UK Medical Research Council.