Evaluation of a Concept Mapping Task Using Named Entity Recognition and Normalization in Unstructured Clinical Text.
Publication Date
2020-12
Journal Title
J Healthc Inform Res
ISSN
2509-4971
Publisher
Springer Science and Business Media LLC
Volume
4
Issue
4
Pages
395-410
Language
en
Type
Article
This Version
VoR
Citation
Trivedi, S., Gildersleeve, R., Franco, S., Kanter, A. S., & Chaudhry, A. (2020). Evaluation of a Concept Mapping Task Using Named Entity Recognition and Normalization in Unstructured Clinical Text. J Healthc Inform Res, 4 (4), 395-410. https://doi.org/10.1007/s41666-020-00079-z
Abstract
In this pilot study, we explore the feasibility and accuracy of using a query in a commercial natural language processing engine in a named entity recognition and normalization task to extract a wide spectrum of clinical concepts from free text clinical letters. Editorial guidance developed by two independent clinicians was used to annotate sixty anonymized clinic letters to create the gold standard. Concepts were categorized by semantic type, and labels were applied to indicate contextual attributes such as negation. The natural language processing (NLP) engine was Linguamatics I2E version 5.3.1, equipped with an algorithm for contextualizing words and phrases and an ontology of terms from Intelligent Medical Objects to which those tokens were mapped. Performance of the engine was assessed on a training set of the documents using precision, recall, and the F1 score, with subset analysis for semantic type, accurate negation, exact versus partial conceptual matching, and discontinuous text. The engine underwent tuning, and the final performance was determined for a test set. On the test set, the F1 score was 0.81 and 0.84 under strict and relaxed criteria, respectively, when appropriate negation was not required, and 0.75 and 0.77 when it was. F1 scores were higher when concepts were derived from continuous text only. This pilot study showed that a commercially available NLP engine delivered good overall results for identifying a wide spectrum of structured clinical concepts. Such a system holds promise for extracting concepts from free text to populate problem lists or for data mining projects.
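As a rough illustration of the evaluation metrics described in the abstract, the sketch below computes precision, recall, and F1 for a concept-mapping task under strict (exact span match) and relaxed (any span overlap) criteria. The span data, function names (prf1, overlaps, score), and the specific definition of the relaxed criterion are hypothetical assumptions for illustration, not the paper's actual matching rules or data.

```python
# Minimal sketch of precision/recall/F1 scoring for concept extraction.
# Gold and predicted character spans are hypothetical; the paper's exact
# strict vs. relaxed matching rules may differ from the ones assumed here.

def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from raw match counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

def overlaps(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two character spans overlap at all (assumed relaxed criterion)."""
    return a[0] < b[1] and b[0] < a[1]

def score(gold: list[tuple[int, int]], pred: list[tuple[int, int]], strict: bool):
    """Score predicted spans against gold spans.

    strict=True  -> a prediction counts only if it equals a gold span exactly.
    strict=False -> any character overlap with an unmatched gold span counts.
    """
    matched_gold: set[int] = set()
    tp = 0
    for p in pred:
        hit = next((i for i, g in enumerate(gold)
                    if i not in matched_gold
                    and (p == g if strict else overlaps(p, g))), None)
        if hit is not None:
            matched_gold.add(hit)
            tp += 1
    fp = len(pred) - tp
    fn = len(gold) - len(matched_gold)
    return prf1(tp, fp, fn)

# Hypothetical spans: gold annotations vs. engine output for one letter.
gold_spans = [(0, 12), (30, 45), (60, 72)]
pred_spans = [(0, 12), (31, 45), (80, 90)]
print("strict :", score(gold_spans, pred_spans, strict=True))
print("relaxed:", score(gold_spans, pred_spans, strict=False))
```

Under these assumptions the same engine output scores higher on the relaxed criterion than on the strict one, which mirrors the pattern reported in the abstract (0.84 vs. 0.81 without negation handling).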
Keywords
Annotation, Clinical letters, Gold standard, Named entity recognition, Natural language processing, Text mining
Sponsorship
Cambridge NIHR Biomedical Research Centre (RCAG/929 and RCAG/030)
Identifiers
s41666-020-00079-z, 79
External DOI: https://doi.org/10.1007/s41666-020-00079-z
This record's URL: https://www.repository.cam.ac.uk/handle/1810/329549
Rights
Licence:
http://creativecommons.org/licenses/by/4.0/