Improving a Fundamental Measure of Lexical Association

Accepted version


Conference Object


Recchia, GL 
Nulty, Paul 


Pointwise mutual information (PMI), a simple measure of lexical association, is part of several algorithms used as models of lexical semantic memory. Typically, it is used as a component of more complex distributional models rather than in isolation. We show that when two simple techniques are applied—(1) down-weighting co-occurrences involving low-frequency words in order to address PMI’s so-called “frequency bias,” and (2) defining co-occurrences as counts of “events in which instances of word1 and word2 co-occur in a context” rather than “contexts in which word1 and word2 co-occur”—then PMI outperforms default parameterizations of word embedding models in terms of how closely it matches human relatedness judgments. We also identify which down-weighting techniques are most helpful. The results suggest that simple measures may be capable of modeling certain phenomena in semantic memory, and that complex models which incorporate PMI might be improved with these modifications.
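
To make the two counting schemes concrete, the sketch below computes PMI in its standard form, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), under both definitions of co-occurrence described in the abstract. It is an illustration only, not the authors' implementation. The function names, the toy contexts, the choice of normalizer, and the particular discount factor are all assumptions introduced here; the abstract does not say which down-weighting techniques were evaluated or how counts were normalized.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(contexts, event_based):
    """Count words and word pairs over a list of tokenized contexts.

    event_based=False: a pair scores 1 per context in which both words appear.
    event_based=True:  a pair scores once per pairing of an instance of word1
                       with an instance of word2 inside the same context.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in contexts:
        freq = Counter(tokens)
        for word, n in freq.items():
            word_counts[word] += n if event_based else 1
        # sorted() gives each unordered pair a single canonical key
        for w1, w2 in combinations(sorted(freq), 2):
            pair_counts[(w1, w2)] += freq[w1] * freq[w2] if event_based else 1
    return word_counts, pair_counts


def pmi(c_xy, c_x, c_y, total):
    """Standard PMI from raw counts; `total` is the normalizer (assumed here)."""
    return math.log2(c_xy * total / (c_x * c_y))


def discounted_pmi(c_xy, c_x, c_y, total):
    """PMI scaled by an illustrative discount that shrinks low-count scores.

    The discount is an assumed example of down-weighting, not necessarily
    one of the techniques compared in the paper.
    """
    m = min(c_x, c_y)
    discount = (c_xy / (c_xy + 1)) * (m / (m + 1))
    return discount * pmi(c_xy, c_x, c_y, total)


# Hypothetical toy data: each inner list is one context.
contexts = [
    ["dog", "barks", "dog", "runs"],
    ["dog", "cat"],
    ["cat", "sleeps"],
]

ctx_words, ctx_pairs = cooccurrence_counts(contexts, event_based=False)
evt_words, evt_pairs = cooccurrence_counts(contexts, event_based=True)

# "dog" occurs twice in the first context, so the two schemes disagree.
print(ctx_pairs[("barks", "dog")], evt_pairs[("barks", "dog")])
```

In the first toy context "dog" appears twice, so the context-based scheme counts the pair ("barks", "dog") once while the event-based scheme counts it twice. The discount multiplies the score by a factor below 1 that approaches 1 as counts grow, so pairs involving rare words are pulled toward zero.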



Journal Title

Proceedings of the 39th Annual Meeting of the Cognitive Science Society

Conference Name

39th Annual Meeting of the Cognitive Science Society

