Abstract | ||
---|---|---|
We parallelize a version of the active-set iterative algorithm derived from the original works of Lawson and Hanson [Solving Least Squares Problems, Prentice-Hall, 1974] on multicore architectures. This algorithm requires the solution of an unconstrained least squares problem in every step of the iteration for a matrix composed of the passive columns of the original system matrix. To achieve improved performance, we use parallelizable procedures to efficiently update and downdate the $QR$ factorization of the matrix at each iteration, to account for inserted and removed columns. We use a reordering strategy of the columns in the decomposition to reduce computation and memory access costs. We consider graphics processing units (GPUs) as a new mode for efficient parallel computations and compare our implementations to that of multicore CPUs. Both synthetic and nonsynthetic data are used in the experiments. |
Year | DOI | Venue |
---|---|---|
2011 | 10.1137/100799083 | SIAM J. Scientific Computing |
Keywords | Field | DocType |
new mode,squares problems,memory access cost,efficient parallel computation,original system matrix,improved performance,active-set iterative algorithm,parallel nonnegative,multicore architectures,original work,multicore architecture,multicore cpus,deconvolution,multicore | Least squares,Mathematical optimization,Iterative method,Computer science,Matrix (mathematics),CUDA,Parallel computing,Non-negative matrix factorization,Factorization,Graphics processing unit,Multi-core processor | Journal |
Volume | Issue | ISSN |
33 | 5 | 1064-8275 |
Citations | PageRank | References |
9 | 0.73 | 5 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yuancheng Luo | 1 | 28 | 5.10 |
Ramani Duraiswami | 2 | 1721 | 161.98 |