Title
Large scale support vector regression for aviation safety
Abstract
Regression problems on massive data sets are ubiquitous in many application domains including the Internet, earth and space sciences, and aviation. Support vector regression (SVR) is a popular technique for modeling the input-output relations of a set of variables under the added constraint of maximizing the margin, thereby leading to a very generalizable and regularized model. However, for a dataset with m training points, it is challenging to build SVR models due to the O(m^3) cost involved in building them. In this paper we propose ParitoSVR, a parallel iterated optimizer for Support Vector Regression in the primal that can be deployed over a network of machines, where each machine iteratively solves a small (sub-)problem based only on the data observed locally and these solutions are then combined to form the solution to the global problem. Our proposed method is based on the Alternating Direction Method of Multipliers (ADMM) optimization technique. Unlike many other existing techniques, ParitoSVR is provably convergent to the results obtained from the centralized algorithm, where the optimization has access to the entire data set. The experimental results show that the algorithm is scalable both with respect to accuracy and time to convergence. We use ParitoSVR to identify flights having anomalous fuel consumption from a large fleet-wide commercial aviation database containing thousands of flights. Along with the algorithmic contributions, this paper also describes the process of deployment of the ADMM-based SVR method on a multicore architecture, namely, the NASA Pleiades supercomputing infrastructure. We have been successful in running ParitoSVR on millions of training data points and hundreds of compute nodes.
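The scheme described in the abstract (local sub-problems solved per machine, then combined into a global solution) can be illustrated with a generic consensus ADMM loop. The following is a minimal sketch only, not the ParitoSVR implementation: it assumes a linear model, a squared epsilon-insensitive loss (chosen here so that plain gradient steps suffice), an approximate local solve by a fixed number of gradient steps, and workers simulated sequentially in one process. All function names and parameter values are hypothetical.

```python
import random


def local_grad(w, X, y, eps, C):
    """Gradient of C * sum(max(0, |x.w - y| - eps)^2) on one worker's data."""
    g = [0.0] * len(w)
    for xj, yj in zip(X, y):
        r = sum(a * b for a, b in zip(xj, w)) - yj
        slack = abs(r) - eps
        if slack > 0.0:  # only points outside the epsilon tube contribute
            coef = 2.0 * C * slack * (1.0 if r > 0 else -1.0)
            for k, a in enumerate(xj):
                g[k] += coef * a
    return g


def local_update(w, X, y, z, u, eps, C, rho, steps=20):
    """Approximately minimize local loss + (rho/2)*||w - z + u||^2."""
    # Upper bound on the gradient's Lipschitz constant: 2C*trace(X^T X) + rho.
    lip = 2.0 * C * sum(a * a for xj in X for a in xj) + rho
    for _ in range(steps):
        g = local_grad(w, X, y, eps, C)
        w = [wk - (gk + rho * (wk - zk + uk)) / lip
             for wk, gk, zk, uk in zip(w, g, z, u)]
    return w


def consensus_admm_svr(shards, dim, eps=0.01, C=1.0, rho=1.0, iters=150):
    """Consensus ADMM with a global regularizer (1/2)*||z||^2 (assumed form)."""
    n = len(shards)
    z = [0.0] * dim
    ws = [[0.0] * dim for _ in range(n)]
    us = [[0.0] * dim for _ in range(n)]
    for _ in range(iters):
        # Each worker uses only its own shard; this step is what would run in parallel.
        ws = [local_update(w, X, y, z, u, eps, C, rho)
              for w, (X, y), u in zip(ws, shards, us)]
        # Combine: closed-form minimizer of the regularizer plus the consensus penalty.
        avg = [sum(w[k] + u[k] for w, u in zip(ws, us)) / n for k in range(dim)]
        z = [n * rho * a / (1.0 + n * rho) for a in avg]
        # Scaled dual update drives each local w toward the shared z.
        us = [[uk + wk - zk for uk, wk, zk in zip(u, w, z)]
              for w, u in zip(ws, us)]
    return z, ws


def make_shards(w_true, n_workers=4, n_local=25, seed=0):
    """Synthetic noise-free linear data, split across hypothetical workers."""
    rng = random.Random(seed)
    shards = []
    for _ in range(n_workers):
        X = [[rng.uniform(-1.0, 1.0) for _ in w_true] for _ in range(n_local)]
        y = [sum(a * b for a, b in zip(xj, w_true)) for xj in X]
        shards.append((X, y))
    return shards


if __name__ == "__main__":
    z, _ = consensus_admm_svr(make_shards([1.5, -0.7]), dim=2)
    print("consensus weights:", z)
```

In this sketch the only quantities exchanged between a worker and the combiner are the local weight and dual vectors, never the raw data, which is the property that makes the approach suitable for a network of machines.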
Year
2015
DOI
10.1109/BigData.2015.7363851
Venue
Big Data
Keywords
distributed optimization, support vector regression, aviation
Field
Convergence (routing), Data modeling, Data mining, Data set, Supercomputer, Computer science, Support vector machine, Artificial intelligence, Relevance vector machine, Machine learning, Commercial aviation, Scalability
DocType
Conference
Citations
0
PageRank
0.34
References
10
Authors
4
Name               Order  Citations  PageRank
Kamalika Das       1      168        13.46
Kanishka Bhaduri   2      289        18.96
Bryan L. Matthews  3      80         6.15
Nikunj C. Oza      4      694        54.32