Title
Efficient Computing of Stochastic Complexity.
Abstract
Stochastic complexity of a data set is defined as the shortest possible code length for the data ob- tainable by using some fixed set of models. This measure is of great theoretical and practical im- portance as a tool for tasks such as model selec- tion or data clustering. Unfortunately, comput- ing the modern version of stochastic complexity, defined as the Normalized Maximum Likelihood (NML) criterion, requires computing a sum with an exponential number of terms. Therefore, in order to be able to apply the stochastic complex- ity measure in practice, in most cases it has to be approximated. In this paper, we show that for some interesting and important cases with multi- nomial data sets, the exponentiality can be re- moved without loss of accuracy. We also intro- duce a new computationally efficient approxima- tion scheme based on analytic combinatorics and assess its accuracy, together with earlier approx- imations, by comparing them to the exact form. The results suggest that due to its accuracy and efficiency, the new sharper approximation will be useful for a wide class of problems with discrete data.
Year
Venue
Keywords
2003
AISTATS
analytic combinatorics,data clustering
DocType
Citations 
PageRank 
Conference
3
0.64
References 
Authors
12
5
Name
Order
Citations
PageRank
Petri Kontkanen125530.36
Wray L. Buntine22271579.82
Petri Myllymaki3699.84
Jorma Rissanen41665798.14
Henry Tirri5691145.38