Title
Dealing With "Very Large" Datasets: An Overview Of A Promising Research Line: Distributed Learning
Abstract
Traditionally, a bottleneck preventing the development of more intelligent systems was the limited amount of data available. Nowadays, however, in many domains of machine learning the datasets are so large that the limiting factor is the inability of learning algorithms to use all the data within a reasonable time. To handle this problem, a new field in machine learning has emerged: large-scale learning, where learning is limited by computational resources rather than by the availability of data. Moreover, in many real applications, "very large" datasets are naturally distributed, and it is necessary to learn locally at each of the workstations where the data are generated. However, the great majority of well-known learning algorithms do not provide an admissible solution to both problems: learning from "very large" datasets and learning from distributed data. In this context, distributed learning seems to be a promising line of research for dealing with both situations, since "very large" concentrated datasets can be partitioned among several workstations. This paper provides some background on distributed environments as well as an overview of distributed learning for dealing with "very large" datasets.
Year
2011
Venue
ICAART 2011: PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1
Keywords
Machine learning, Large-scale learning, Very large dataset, Data fragmentation, Distributed learning
Field
Data science, Data mining, Computer science, Distributed learning
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
3
Name                       Order  Citations  PageRank
Diego Peteiro-Barral       1      70         9.07
Bertha Guijarro-Berdiñas   2      296        34.36
Beatriz Pérez-Sánchez      3      95         14.03