On the Optimality of the Lexicon
Repository URI
Repository DOI
Change log
Authors
Abstract
The principle of least effort posits that a pressure towards communicative efficiency shapes natural languages. In this thesis, we investigate the existence, the nature, and the impact of such a pressure in natural languages' lexicons. We investigate the existence of this pressure by (i) estimating what optimal word lengths would be using coding theory, (ii) proposing pressure-free baselines and estimating their word lengths, and then (iii) comparing natural lexicons to both these optimal and pressure-free artificial lexicons. We investigate the nature of this pressure by comparing multiple ways in which communicative efficiency can be operationalised; we formalise it as either a pressure to shorten utterances, or a pressure to keep information rates as close as possible to an unknown communication channel capacity. Finally, we study the impact of this pressure on cross-linguistic differences in word lengths and on the ratio of homophones in natural languages. Overall, our results support a Zipfian view of communicative efficiency, in which lexicons are pressured towards having utterances that are as short as possible. Our results, however, also highlight the existence of competing constraints and pressures in how lexicons are structured: (i) a language's phonotactic complexity seems to bottleneck the extent to which economy of expression can optimise a lexicon, and (ii) a pressure for clarity seems to keep the ratio of homophones in a language close to chance.