Title
Bring Your Own Learner: A Cloud-Based, Data-Parallel Commons for Machine Learning
Abstract
We introduce FCUBE, a cloud-based framework that enables machine learning researchers to contribute their learners to its community-shared repository. FCUBE exploits data parallelism in lieu of algorithmic parallelization to allow its users to efficiently tackle large data problems automatically. It passes random subsets of data generated via resampling to multiple learners that it executes simultaneously and then it combines their model predictions with a simple fusion technique. It is an example of what we have named a Bring Your Own Learner model. It allows multiple machine learning researchers to contribute algorithms in a plug-and-play style. We contend that the Bring Your Own Learner model signals a design shift in cloud-based machine learning infrastructure because it is capable of executing anyone's supervised machine learning algorithm. We demonstrate FCUBE executing five different learners contributed by three different machine learning groups on a 100 node deployment on Amazon EC2. They collectively solve a publicly available classification problem trained with 11 million exemplars from the Higgs dataset.
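The data-parallel scheme the abstract describes (resample the data, train several learners at once, fuse their predictions) can be sketched as below. This is a minimal illustration and not FCUBE's actual code. The abstract does not name the fusion technique, so majority voting is assumed here. Local threads stand in for cloud nodes, and the two toy learners and all function names are hypothetical.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def centroid_learner(rows, labels):
    """Hypothetical contributed learner: nearest class centroid on 1-D inputs."""
    sums, counts = Counter(), Counter()
    for x, y in zip(rows, labels):
        sums[y] += x
        counts[y] += 1
    centroids = {y: sums[y] / counts[y] for y in counts}
    return lambda x: min(centroids, key=lambda y: abs(x - centroids[y]))


def threshold_learner(rows, labels):
    """Hypothetical contributed learner: split at the sample mean."""
    cut = sum(rows) / len(rows)
    above = [y for x, y in zip(rows, labels) if x > cut]
    below = [y for x, y in zip(rows, labels) if x <= cut]
    hi = Counter(above).most_common(1)[0][0] if above else labels[0]
    lo = Counter(below).most_common(1)[0][0] if below else labels[0]
    return lambda x: hi if x > cut else lo


def train_on_subset(learner, rows, labels, fraction, seed):
    """Draw a random subset without replacement and fit one learner on it."""
    rng = random.Random(seed)
    k = max(1, int(fraction * len(rows)))
    idx = rng.sample(range(len(rows)), k)
    return learner([rows[i] for i in idx], [labels[i] for i in idx])


def train_ensemble(learners, rows, labels, fraction=0.5, seed=0):
    """Fit every learner on its own subset; threads stand in for cloud nodes."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(train_on_subset, learner, rows, labels, fraction, seed + i)
            for i, learner in enumerate(learners)
        ]
        return [f.result() for f in futures]


def fuse(models, x):
    """Assumed fusion rule: majority vote over the models' predictions."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy 1-D data: class 0 lies in [0, 1.9], class 1 lies in [8, 9.9].
    rows = [i / 10 for i in range(20)] + [8 + i / 10 for i in range(20)]
    labels = [0] * 20 + [1] * 20
    models = train_ensemble([centroid_learner, threshold_learner, centroid_learner],
                            rows, labels)
    print(fuse(models, 0.5), fuse(models, 9.5))
```

Any callable that takes `(rows, labels)` and returns a prediction function can be passed to `train_ensemble`, which is the plug-and-play property the abstract attributes to the Bring Your Own Learner model.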
Year: 2015
DOI: 10.1109/MCI.2014.2369892
Venue: Computational Intelligence Magazine, IEEE
Keywords: cloud computing, data handling, learning (artificial intelligence), parallel processing, fcube, higgs dataset, bring your own learner model, community shared repository, data parallel commons, data problems, fusion technique, supervised machine learning algorithm
Field: Online machine learning, Semi-supervised learning, Active learning (machine learning), Computer science, Exploit, Data parallelism, Artificial intelligence, Resampling, Machine learning, Cloud computing, Commons
DocType: Journal
Volume: 10
Issue: 1
ISSN: 1556-603X
Citations: 8
PageRank: 0.93
References: 13
Authors: 4
Name                   Order  Citations  PageRank
Ignacio Arnaldo        1      81         7.69
Kalyan Veeramachaneni  2      716        61.50
Andrew Song            3      8          3.98
Una-May O'Reilly       4      8          1.27