Title
Efficient Parallel Nonnegative Least Squares on Multicore Architectures
Abstract
We parallelize a version of the active-set iterative algorithm derived from the original works of Lawson and Hanson [Solving Least Squares Problems, Prentice-Hall, 1974] on multicore architectures. This algorithm requires the solution of an unconstrained least squares problem in every step of the iteration for a matrix composed of the passive columns of the original system matrix. To achieve improved performance, we use parallelizable procedures to efficiently update and downdate the $QR$ factorization of the matrix at each iteration, to account for inserted and removed columns. We use a reordering strategy of the columns in the decomposition to reduce computation and memory access costs. We consider graphics processing units (GPUs) as a new mode for efficient parallel computations and compare our implementations to that of multicore CPUs. Both synthetic and nonsynthetic data are used in the experiments.
Year
DOI
Venue
2011
10.1137/100799083
SIAM J. Scientific Computing
Keywords
Field
DocType
new mode,squares problems,memory access cost,efficient parallel computation,original system matrix,improved performance,active-set iterative algorithm,parallel nonnegative,multicore architectures,original work,multicore architecture,multicore cpus,deconvolution,multicore
Least squares,Mathematical optimization,Iterative method,Computer science,Matrix (mathematics),CUDA,Parallel computing,Non-negative matrix factorization,Factorization,Graphics processing unit,Multi-core processor
Journal
Volume
Issue
ISSN
33
5
1064-8275
Citations 
PageRank 
References 
9
0.73
5
Authors
2
Name
Order
Citations
PageRank
Yuancheng Luo1285.10
Ramani Duraiswami21721161.98