Improving a Fundamental Measure of Lexical Association

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Recchia, GL 
Nulty, Paul 

Abstract

Pointwise mutual information (PMI), a simple measure of lexical association, is part of several algorithms used as models of lexical semantic memory. Typically, it is used as a component of more complex distributional models rather than in isolation. We show that when two simple techniques are applied—(1) down-weighting co-occurrences involving low-frequency words in order to address PMI’s so-called “frequency bias,” and (2) defining co-occurrences as counts of “events in which instances of word1 and word2 co-occur in a context” rather than “contexts in which word1 and word2 co-occur”—then PMI outperforms default parameterizations of word embedding models in terms of how closely it matches human relatedness judgments. We also identify which down-weighting techniques are most helpful. The results suggest that simple measures may be capable of modeling certain phenomena in semantic memory, and that complex models which incorporate PMI might be improved with these modifications.
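
The two ways of defining a co-occurrence can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a context is a list of tokens, the "events" count is read as the number of (word1 instance, word2 instance) pairs within a context, and PMI is the standard log2 ratio of joint to independent probability estimated from raw counts. The names `count_cooccurrences`, `pmi`, and `toy_contexts` are hypothetical, and the down-weighting of low-frequency words is not shown.

```python
import math
from collections import Counter


def count_cooccurrences(contexts, word1, word2, mode="contexts"):
    """Count co-occurrences of word1 and word2 under one of two conventions.

    mode="contexts": each context containing both words adds 1.
    mode="events":   each context adds (instances of word1) * (instances of word2).
                     This pairwise reading of "events" is an assumption.
    """
    if mode not in ("contexts", "events"):
        raise ValueError("mode must be 'contexts' or 'events'")
    total = 0
    for tokens in contexts:
        counts = Counter(tokens)
        n1, n2 = counts[word1], counts[word2]
        if n1 == 0 or n2 == 0:
            continue  # the words do not co-occur in this context
        total += 1 if mode == "contexts" else n1 * n2
    return total


def pmi(joint, count1, count2, total):
    """Standard PMI in bits: log2( p(w1, w2) / (p(w1) * p(w2)) ),
    with each probability estimated as count / total."""
    return math.log2((joint * total) / (count1 * count2))


# Hypothetical toy data: three contexts, each a list of tokens.
toy_contexts = [
    ["dog", "dog", "cat"],
    ["dog", "bone"],
    ["cat", "cat", "dog", "dog"],
]

by_context = count_cooccurrences(toy_contexts, "dog", "cat", mode="contexts")
by_event = count_cooccurrences(toy_contexts, "dog", "cat", mode="events")
```

In this toy data the context-based count is 2, while the event-based count is 6 (2 × 1 pairs in the first context plus 2 × 2 in the third), so repeated mentions within a context contribute more under the second definition. How the marginal counts and the total should be normalized under the event-based definition is left open here, since the abstract does not specify it.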

Journal Title

Proceedings of the 39th Annual Meeting of the Cognitive Science Society

Conference Name

39th Annual Meeting of the Cognitive Science Society

Sponsorship

Foundation for the Future
Cambridge Centre for Digital Knowledge