The informativeness of linguistic unit boundaries

Published version
Peer-reviewed

Type

Article

Authors

Geertzen, J 
Blevins, JP 
Milin, P 

Abstract

Contemporary models of structural analysis tend to operate with discrete units at different linguistic levels. There is, however, considerable debate regarding the choice of units and the validity of the cues that guide their demarcation. At the level of grammatical analysis, this debate focuses largely on the status of words vs sub-word units and on the generality of the linguistic properties that mark each type of unit. This paper suggests that the status of a unit type can be evaluated in terms of its informativity. A measure of informativity is obtained by assessing the influence that different unit boundary types have on text compressibility. The results obtained from this initial study support a pair of general conclusions. The first is that unit boundaries primarily reflect a statistical structure, and that the typological variability of linguistic cues reflects the fact that they serve a secondary reinforcing function. The second is that word boundaries are the most informative boundary type, and that the demarcation of words provides the most informative description of the regular patterns in a language.

Journal Title

Italian Journal of Linguistics

Journal ISSN

1120-2726

Volume

28

Publisher

Pacini Editore

Publisher DOI