Title
Critical Behavior from Deep Dynamics: A Hidden Dimension in Natural Language
Abstract
We show that in many data sequences, from texts in different languages to melodies and genomes, the mutual information between two symbols decays roughly like a power law in the number of symbols between them. In contrast, we prove that Markov/hidden Markov processes generically exhibit exponential decay in their mutual information, which explains why natural languages are poorly approximated by Markov processes. We present a broad class of models that naturally reproduce this critical behavior. They all involve dynamics of a recursive nature, as can be approximately implemented by tree-like or recurrent neural networks. This model class captures the essence of probabilistic context-free grammars as well as recursive self-reproduction in physical phenomena such as turbulence and cosmological inflation. We derive an analytic formula for the asymptotic power law and elucidate our results in a statistical physics context: one-dimensional shallow models (such as Markov models or regular grammars) will fail to model natural language because they cannot exhibit criticality, whereas deep models with one or more hidden dimensions representing levels of abstraction or scale can potentially succeed.
Year
2016
Venue
arXiv: Disordered Systems and Neural Networks
Field
Statistical physics, Markov process, Quantum mechanics, Markov model, Computer science, Markov chain, Algorithm, Natural language, Probabilistic logic, Hidden Markov model, Recursion, Hidden semi-Markov model
Volume
abs/1606.06737
Citations
5
PageRank
0.45
References
0
Authors
2
Name            Order  Citations  PageRank
Henry W. Lin    1      70         4.05
Max Tegmark     2      160        16.27