Improved Semantic Representation for Domain-Specific Entities
Proceedings of the 4th BioNLP Shared Task Workshop
Association for Computational Linguistics
Pilehvar, M., & Collier, N. (2016). Improved Semantic Representation for Domain-Specific Entities. Proceedings of the 4th BioNLP Shared Task Workshop. https://doi.org/10.17863/CAM.4496
Most existing corpus-based approaches to semantic representation suffer from inaccurate modeling of domain-specific lexical items which either have low frequencies or are non-existent in open-domain corpora. We put forward a technique that improves word embeddings in specific domains by first transforming a given lexical item to a sorted list of representative words and then modeling the item by combining the embeddings of these words. Our experiments show that the proposed technique can significantly improve some of the recent word embedding techniques while modeling a set of lexical items in the biomedical domain, i.e., phenotypes.
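The two-step idea in the abstract (map a lexical item to a sorted list of representative words, then combine their embeddings) can be illustrated with a minimal sketch. The abstract does not say how the list is obtained or how the embeddings are combined, so everything here is an assumption for illustration: the function name `combine_embeddings`, the exponential rank-decay weighting, the `decay` parameter, and the toy two-dimensional vectors are all hypothetical and should not be read as the authors' method.

```python
from typing import Dict, List, Sequence


def combine_embeddings(
    ranked_words: Sequence[str],
    embeddings: Dict[str, List[float]],
    decay: float = 0.5,
) -> List[float]:
    """Rank-weighted average of the embeddings of representative words.

    ASSUMPTION: the word at rank r (0-based) gets weight decay ** r, so
    earlier words in the sorted list contribute more. The source abstract
    does not specify the actual combination scheme.
    """
    total: List[float] = []
    weight_sum = 0.0
    for rank, word in enumerate(ranked_words):
        vec = embeddings.get(word)
        if vec is None:
            # Skip representative words that have no embedding.
            continue
        weight = decay ** rank
        if not total:
            total = [0.0] * len(vec)
        for i, value in enumerate(vec):
            total[i] += weight * value
        weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("none of the representative words has an embedding")
    return [component / weight_sum for component in total]


# Hypothetical toy data: a phenotype term that is rare in open-domain text
# is represented through more frequent words that do have embeddings.
toy_embeddings = {
    "small": [1.0, 0.0],
    "head": [0.0, 1.0],
}
phenotype_vector = combine_embeddings(["small", "head"], toy_embeddings)
print(phenotype_vector)
```

A weighted average is used here only because it keeps the result in the same space as the word embeddings; any other order-sensitive combination could be substituted in the same place.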
The authors gratefully acknowledge the support of the MRC grant No. MR/M025160/1 for PheneBank.
MEDICAL RESEARCH COUNCIL (MR/M025160/1)
This record's DOI: https://doi.org/10.17863/CAM.4496
This record's URL: https://www.repository.cam.ac.uk/handle/1810/260268
Attribution 4.0 International
Licence URL: http://creativecommons.org/licenses/by/4.0/