Abstract |
---|
Learning the structure of dependencies among multiple random variables is a problem of considerable theoretical and practical interest. In practice, score optimisation with multiple restarts is a surprisingly successful strategy, yet the conditions under which it is well founded are poorly understood. In this paper, we prove that the problem of identifying the structure of a Bayesian network via regularised score optimisation can be recast, in expectation, as a submodular optimisation problem, thus guaranteeing optimality with high probability. This result both explains the practical success of optimisation heuristics and suggests a way to improve on such algorithms by artificially simulating multiple data sets via a bootstrap procedure. We show on several synthetic data sets that the resulting algorithm yields better recovery performance than the state of the art, and illustrate in a real cancer genomics study how such an approach can lead to valuable practical insights. |
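The abstract combines two ingredients: greedy score optimisation with a submodular objective (which enjoys the classical (1 − 1/e) approximation guarantee for monotone submodular functions) and bootstrap resampling to simulate multiple data sets. The paper's actual algorithm is not reproduced here; the following is only a minimal sketch of those two ingredients under hypothetical names, with a toy set-cover score standing in for a regularised network score.

```python
import random

def bootstrap_resamples(data, n_resamples, seed=0):
    """Nonparametric bootstrap (hypothetical helper): draw datasets of
    the same size as `data`, sampling rows with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in data] for _ in range(n_resamples)]

def greedy_maximise(score, candidates, budget):
    """Greedy selection for a monotone submodular set function:
    repeatedly add the candidate with the largest marginal gain."""
    selected = set()
    for _ in range(budget):
        gains = {c: score(selected | {c}) - score(selected)
                 for c in candidates if c not in selected}
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] <= 0:  # no candidate improves the score
            break
        selected.add(best)
    return selected

# Toy submodular score: coverage of a small universe (purely illustrative).
groups = {"a": {1, 2}, "b": {2, 3}, "c": {4}}

def coverage(S):
    return len(set().union(*(groups[g] for g in S))) if S else 0

print(sorted(greedy_maximise(coverage, groups, 2)))
```

In the paper's setting the set elements would be candidate edges of a Bayesian network and the score an expected regularised data likelihood; here the coverage function merely illustrates why greedy selection is a reasonable strategy once the objective is known to be submodular.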
Year | Venue | Field |
---|---|---|
2017 | arXiv: Learning | Multiple data, Mathematical optimization, Random variable, Submodular set function, Heuristics, Bayesian network, Artificial intelligence, Synthetic data sets, Maximization, Mathematics, Machine learning, Bootstrapping (statistics)
DocType | Volume | Citations |
---|---|---|
Journal | abs/1706.02386 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 11 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Giulio Caravagna | 1 | 156 | 16.46 |
Daniele Ramazzotti | 2 | 36 | 10.54 |
Guido Sanguinetti | 3 | 772 | 57.09 |