Title
Communication-Efficient Distributed Learning of Discrete Distributions
Abstract
We initiate a systematic investigation of distribution learning (density estimation) when the data is distributed across multiple servers. The servers must communicate with a referee, and the goal is to estimate the underlying distribution with as few bits of communication as possible. We focus on non-parametric density estimation of discrete distributions with respect to the l1 and l2 norms. We provide the first non-trivial upper and lower bounds on the communication complexity of this basic estimation task in various settings of interest. Specifically, our results include the following:

1. When the unknown discrete distribution is unstructured and each server has only one sample, we show that any blackboard protocol (i.e., any protocol in which servers interact arbitrarily using public messages) that learns the distribution must essentially communicate the entire sample.

2. For the case of structured distributions, such as k-histograms and monotone distributions, we design distributed learning algorithms that achieve significantly better communication guarantees than the naive ones, and we obtain tight upper and lower bounds in several regimes.

Our distributed learning algorithms run in near-linear time and are robust to model misspecification. Our results provide insights into the interplay between structure and communication efficiency for a range of fundamental distribution estimation tasks.
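For intuition, here is a minimal simulation (not from the paper) of the naive baseline the abstract refers to: each server forwards its single sample verbatim to the referee, who outputs the empirical distribution. The distribution p, the sizes n and m, and the helper name naive_protocol_l1_error are illustrative assumptions; result 1 above states that for unstructured distributions, no blackboard protocol can do substantially better than this in terms of communication.

```python
import numpy as np

def naive_protocol_l1_error(p, m, rng):
    """Simulate the naive protocol: each of m servers holds one sample
    from p (a distribution over n symbols) and forwards it verbatim,
    costing about log2(n) bits per server. The referee outputs the
    empirical distribution; return its l1 distance from the true p."""
    n = len(p)
    samples = rng.choice(n, size=m, p=p)        # one sample per server
    counts = np.bincount(samples, minlength=n)
    p_hat = counts / m                          # referee's empirical estimate
    return float(np.abs(p_hat - p).sum())

rng = np.random.default_rng(0)
n, m = 100, 50_000                              # illustrative sizes, not from the paper
p = rng.dirichlet(np.ones(n))                   # an arbitrary "unstructured" distribution
err = naive_protocol_l1_error(p, m, rng)
bits = m * int(np.ceil(np.log2(n)))             # total communication of the naive protocol
print(f"l1 error {err:.3f} using roughly {bits} bits")
```

The structured cases in result 2 (k-histograms, monotone distributions) are precisely where the paper improves on this baseline's communication cost.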
Year
2017
Venue
NIPS
Field
Density estimation, Mathematical optimization, Upper and lower bounds, Computer science, Server, Theoretical computer science, Distributed learning, Communication complexity, Probability distribution, Monotone polygon
DocType
Conference
Citations
1
PageRank
0.35
References
0
Authors
6
Name                 Order  Citations  PageRank
Ilias Diakonikolas   1      7766       4.21
Elena Grigorescu     2      1922       4.75
Jerry Li             3      2292       2.67
Abhiram Natarajan    4      4          0.74
Krzysztof Onak       5      4932       5.90
Ludwig Schmidt       6      6843       1.03