Improving a Fundamental Measure of Lexical Association
Authors
Recchia, Gabriel
Nulty, Paul
Journal Title
Proceedings of the 39th Annual Meeting of the Cognitive Science Society
Conference Name
39th Annual Meeting of the Cognitive Science Society
Type
Conference Object
Citation
Recchia, G., & Nulty, P. Improving a Fundamental Measure of Lexical Association. Proceedings of the 39th Annual Meeting of the Cognitive Science Society. https://doi.org/10.17863/CAM.30302
Abstract
Pointwise mutual information (PMI), a simple measure of lexical association, is part of several algorithms used as models of lexical semantic memory. Typically, it is used as a component of more complex distributional models rather than in isolation. We show that when two simple techniques are applied—(1) down-weighting co-occurrences involving low-frequency words in order to address PMI’s so-called “frequency bias,” and (2) defining co-occurrences as counts of “events in which instances of word1 and word2 co-occur in a context” rather than “contexts in which word1 and word2 co-occur”—then PMI outperforms default parameterizations of word embedding models in terms of how closely it matches human relatedness judgments. We also identify which down-weighting techniques are most helpful. The results suggest that simple measures may be capable of modeling certain phenomena in semantic memory, and that complex models which incorporate PMI might be improved with these modifications.
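The two modifications described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `pair_counts` and `pmi`, the `per_instance` flag, and the `alpha` parameter are hypothetical, introduced only for this illustration. Two assumptions are made. First, the "events" definition is read as counting every pairing of a word1 instance with a word2 instance inside a context, while the "contexts" definition counts each context at most once. Second, down-weighting is shown using context-distribution smoothing (raising the second word's marginal count to a power alpha below 1), which is one known remedy for PMI's frequency bias and may differ from the techniques the paper evaluates.

```python
import math
from collections import Counter
from itertools import combinations


def pair_counts(contexts, per_instance=True):
    """Symmetric co-occurrence counts over a list of tokenized contexts.

    per_instance=True  (assumed reading of the "events" definition):
        every pairing of a w1 token with a w2 token in a context counts once.
    per_instance=False ("contexts" definition):
        a context containing both words counts once, however often they occur.
    """
    counts = Counter()
    for tokens in contexts:
        local = Counter(tokens)
        for w1, w2 in combinations(sorted(local), 2):
            n = local[w1] * local[w2] if per_instance else 1
            counts[(w1, w2)] += n
            counts[(w2, w1)] += n
    return counts


def pmi(counts, w1, w2, alpha=1.0):
    """PMI (log base 2) of w1 and w2 from symmetric pair counts.

    alpha < 1 smooths the marginal of w2, which raises the estimated
    probability of rare words and so lowers their PMI (illustrative choice).
    """
    joint = counts[(w1, w2)]
    if joint == 0:
        return float("-inf")
    total = sum(counts.values())
    # Marginal count of each word, summed over all of its co-occurrences.
    row = Counter()
    for (a, _), n in counts.items():
        row[a] += n
    p_joint = joint / total
    p_w1 = row[w1] / total
    p_w2 = row[w2] ** alpha / sum(n ** alpha for n in row.values())
    return math.log2(p_joint / (p_w1 * p_w2))


# Toy usage: three tiny contexts, compared under both count definitions.
contexts = [["cat", "cat", "dog"], ["cat", "fish"], ["dog", "bone"]]
by_event = pair_counts(contexts, per_instance=True)
by_context = pair_counts(contexts, per_instance=False)
print(pmi(by_event, "cat", "fish"), pmi(by_event, "cat", "fish", alpha=0.75))
print(pmi(by_context, "cat", "dog"))
```

With alpha set to 1.0 the function reduces to ordinary PMI and is symmetric in its two words. With alpha below 1 it is no longer symmetric, because only the second word's marginal is smoothed.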
Sponsorship
Cambridge Centre for Digital Knowledge
Funder references
Foundation for the Future
Identifiers
External DOI: https://doi.org/10.17863/CAM.30302
This record's URL: https://www.repository.cam.ac.uk/handle/1810/282939
Rights
Licence:
http://www.rioxx.net/licenses/all-rights-reserved