Title
Data-Importance Aware User Scheduling for Communication-Efficient Edge Machine Learning
Abstract
With the prevalence of intelligent mobile applications, edge learning is emerging as a promising technology for powering fast intelligence acquisition for edge devices from distributed data generated at the network edge. One critical task of edge learning is to efficiently utilize the limited radio resource to acquire data samples for model training at an edge server. In this paper, we develop a novel user scheduling algorithm for data acquisition in edge learning, called (data) importance-aware scheduling. A key feature of this scheduling algorithm is that it takes into account the informativeness of data samples, in addition to communication reliability. Specifically, the scheduling decision is based on a data importance indicator (DII), elegantly incorporating two “important” metrics from communication and learning perspectives, i.e., the signal-to-noise ratio (SNR) and data uncertainty. We first derive an explicit expression for this indicator targeting the classic classifier of support vector machine (SVM), where the uncertainty of a data sample is measured by its distance to the decision boundary. Then, the result is extended to convolutional neural networks (CNNs) by replacing the distance-based uncertainty measure with the entropy.
As demonstrated via experiments using real datasets, the proposed importance-aware scheduling can exploit the two-fold multi-user diversity, namely the diversity in both the multiuser channels and the distributed data samples. This leads to faster model convergence than the conventional scheduling schemes that exploit only a single type of diversity.
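The two uncertainty measures named in the abstract, and the idea of weighing them against channel quality, can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not state the explicit DII expression, so the combination rule in `pick_user` (uncertainty multiplied by a reliability factor `snr / (1 + snr)`) is a hypothetical stand-in, and all function names are assumptions made for this example.

```python
import numpy as np


def distance_to_boundary(w, b, x):
    """Distance from sample x to the SVM hyperplane w.x + b = 0.

    A smaller distance means a more uncertain sample.
    """
    return float(abs(np.dot(w, x) + b) / np.linalg.norm(w))


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution.

    A larger entropy means a more uncertain sample.
    """
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))


def pick_user(uncertainties, snrs):
    """Pick the user with the largest hypothetical importance score.

    ASSUMPTION: score = uncertainty * snr / (1 + snr). This rule is
    illustrative only and is NOT the DII derived in the paper.
    `uncertainties` must be non-negative; `snrs` are linear (not dB).
    """
    u = np.asarray(uncertainties, dtype=float)
    s = np.asarray(snrs, dtype=float)
    scores = u * s / (1.0 + s)
    return int(np.argmax(scores))


if __name__ == "__main__":
    # User 0 has the most uncertain sample but a poor channel,
    # user 1 has the best channel but an uninformative sample,
    # user 2 is reasonably good on both counts.
    uncertainties = [0.69, 0.10, 0.60]
    snrs = [0.1, 100.0, 10.0]
    print(pick_user(uncertainties, snrs))  # prints 2
```

In this toy case, scheduling by uncertainty alone would pick user 0 and scheduling by SNR alone would pick user 1, while the combined score picks user 2. For an SVM, a non-negative uncertainty could be obtained from the distance by a decreasing map such as `1 / (1 + distance)`, which is again an assumption of this sketch.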
Year: 2021
DOI: 10.1109/TCCN.2020.2999606
Venue: IEEE Transactions on Cognitive Communications and Networking
Keywords: Scheduling, resource management, image classification, multiuser channels, data acquisition
DocType: Journal
Volume: 7
Issue: 1
ISSN: 2332-7731
Citations: 4
PageRank: 0.42
References: 0
Authors: 4
Name            Order   Citations   PageRank
Dongzhu Liu     1       25          2.53
Guangxu Zhu     2       343         24.03
Jun Zhang       3       3772        190.36
Kaibin Huang    4       3155        182.06