Abstract |
---|
We consider the problem of learning mixtures of distributions via spectral methods and derive a characterization of when such methods are useful. Specifically, given a mixture-sample, let $\bar\mu_{i}, \bar C_{i}, \bar w_{i}$ denote the empirical mean, covariance matrix, and mixing weight of the samples from the $i$-th component. We prove that a very simple algorithm, namely spectral projection followed by single-linkage clustering, properly classifies every point in the sample provided that each pair of means $\bar\mu_{i}, \bar\mu_{j}$ is well separated, in the sense that $\|\bar\mu_{i} - \bar\mu_{j}\|^{2}$ is at least $\|\bar C_{i}\|_{2}(1/\bar w_{i}+1/\bar w_{j})$ plus a term that depends on the concentration properties of the distributions in the mixture. This second term is very small for many distributions, including Gaussians, log-concave distributions, and many others. As a result, we get the best known bounds for learning mixtures of arbitrary Gaussians in terms of the required mean separation. At the same time, we prove that there are many Gaussian mixtures $\{(\mu_{i}, C_{i}, w_{i})\}$ such that each pair of means is separated by $\|C_{i}\|_{2}(1/w_{i}+1/w_{j})$, yet upon spectral projection the mixture collapses completely, i.e., all means and covariance matrices in the projected mixture are identical. |
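The algorithm named in the abstract (spectral projection followed by single-linkage clustering) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name `spectral_single_linkage` and the naive merge loop are assumptions made for the sketch; it projects the centered sample onto its top-$k$ right singular vectors and then agglomerates clusters by single linkage until $k$ groups remain.

```python
import numpy as np

def spectral_single_linkage(X, k):
    """Sketch of spectral projection + single-linkage clustering.

    X : (n, d) array of mixture samples; k : number of components.
    Returns an (n,) array of cluster labels.
    """
    # Spectral step: center the data and project onto the
    # top-k right singular vectors (the best rank-k subspace).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Xc @ Vt[:k].T  # (n, k) projected sample

    # Single-linkage step: repeatedly merge the pair of clusters
    # containing the closest inter-cluster pair of points.
    n = len(P)
    labels = np.arange(n)
    D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    for _ in range(n - k):
        i, j = divmod(int(np.argmin(D)), n)   # closest cross-cluster pair
        labels[labels == labels[j]] = labels[i]
        # Mask intra-cluster distances so argmin only sees cross-cluster pairs.
        D[labels[:, None] == labels[None, :]] = np.inf
    return labels
```

The naive $O(n^3)$ merge loop is chosen for clarity; the point is only that once the projected means are well separated in the sense of the separation condition above, the smallest cross-cluster distance always joins points from the same component until exactly $k$ clusters remain.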
Year | DOI | Venue |
---|---|---|
2005 | 10.1007/11503415_31 | COLT |
Keywords | Field | DocType |
covariance matrix,gaussian mixture,spectral projection,spectral method,empirical mean,projected mixture,arbitrary gaussians,spectral learning,required mean separation,bar c,bar w,singular value decomposition,k means | Discrete mathematics,Singular value decomposition,Sample mean and sample covariance,Matrix (mathematics),Gaussian,Spectral method,Mathematics,Covariance | Conference |
Volume | ISSN | ISBN |
3559 | 0302-9743 | 3-540-26556-2 |
Citations | PageRank | References |
90 | 5.90 | 5 |
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Dimitris Achlioptas | 1 | 2037 | 174.89 |
Frank McSherry | 2 | 4289 | 288.94 |