| Abstract |
|---|
| A longstanding debate in semiotics centers on the relationship between linguistic signs and their corresponding semantics: is there an arbitrary relationship between a word form and its meaning, or does some systematic pattern pervade? For instance, does the character bigram *gl* bear any systematic relationship to the meaning of words like *glisten*, *gleam*, and *glow*? In this work, we offer a holistic quantification of the systematicity of the sign using mutual information and recurrent neural networks. We employ these in a data-driven and massively multilingual approach to the question, examining 106 languages. We find a statistically significant reduction in entropy when modeling a word form conditioned on its semantic representation. Encouragingly, we also recover well-attested English examples of systematic affixes. We conclude with the meta-point: our approximate effect size (measured in bits) is quite small; despite some amount of systematicity between form and meaning, an arbitrary relationship and its resulting benefits dominate human language. |
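The abstract's core quantity, the entropy reduction from conditioning form on meaning, is the mutual information H(form) − H(form | meaning). A minimal count-based sketch of that idea is below; the toy corpus, bigram features, and meaning classes are invented for illustration, and the paper itself estimates these quantities with recurrent neural models over 106 languages, not plug-in counts.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of an empirical distribution given as counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Toy (form-feature, meaning-class) pairs; purely illustrative data.
pairs = [("gl", "light"), ("gl", "light"), ("gl", "light"),
         ("sn", "nose"), ("sn", "nose"),
         ("ba", "light"), ("ba", "nose"), ("ba", "other")]

# Unconditional entropy H(form).
h_form = entropy(Counter(f for f, _ in pairs))

# Conditional entropy H(form | meaning): average over meaning classes.
h_cond = 0.0
for m in {m for _, m in pairs}:
    sub = [f for f, mm in pairs if mm == m]
    h_cond += len(sub) / len(pairs) * entropy(Counter(sub))

# Mutual information in bits; a value above zero signals systematicity.
mi = h_form - h_cond
print(round(mi, 3))  # prints 0.811
```

A tiny effect size in bits, as the abstract reports for real languages, would show up here as `mi` close to zero even when it is statistically distinguishable from zero.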
| Year | Venue | DocType | Volume | Citations | PageRank |
|---|---|---|---|---|---|
| 2019 | Meeting of the Association for Computational Linguistics | Journal | abs/1906.05906 | 0 | 0.34 |
| References | Authors |
|---|---|
| 0 | 5 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Tiago Pimentel | 1 | 0 | 1.69 |
| Arya McCarthy | 2 | 2 | 5.11 |
| Damián E. Blasi | 3 | 0 | 0.34 |
| Brian Roark | 4 | 479 | 48.82 |
| Ryan Cotterell | 5 | 3 | 6.13 |