Normalising Medical Concepts in Social Media Texts by Learning Semantic Representation
Accepted version
Peer-reviewed
Repository URI
Repository DOI
Change log
Authors
Abstract
Automatically recognising medical con- cepts mentioned in social media messages (e.g. tweets) enables several applications for enhancing health quality of people in a community, e.g. real-time monitoring of infectious diseases in population. How- ever, the discrepancy between the type of language used in social media and med- ical ontologies poses a major challenge. Existing studies deal with this challenge by employing techniques, such as lexi- cal term matching and statistical machine translation. In this work, we handle the medical concept normalisation at the se- mantic level. We investigate the use of neural networks to learn the transition be- tween layman’s language used in social media messages and formal medical lan- guage used in the descriptions of medi- cal concepts in a standard ontology. We evaluate our approaches using three differ- ent datasets, where social media texts are extracted from Twitter messages and blog posts. Our experimental results show that our proposed approaches significantly and consistently outperform existing effective baselines, which achieved state-of-the-art performance on several medical concept normalisation tasks, by up to 44%.