Abstract |
---|
This paper focuses on reducing the execution time of video compression algorithms based on the 3D wavelet transform. We present several optimizations that the compiler could not apply because of the characteristics of the algorithm. First, we use the Streaming SIMD Extensions (SSE) for some of the dimensions of the sequence (y and time) in order to reduce the number of floating-point instructions by exploiting Data Level Parallelism. Then, we apply loop unrolling and data prefetching to critical parts of the code. Finally, the algorithm is vectorized by columns, allowing the use of SIMD instructions for the y dimension. Results show a speedup of up to 1.54 over a version compiled with the maximum optimizations of the Intel C/C++ compiler. Our experiments also show that allowing the compiler to perform some of these optimizations (e.g., automatic code vectorization) causes a performance slowdown, which demonstrates the effectiveness of our optimizations. |
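
The column-wise vectorization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `haar_y_step`, the row-major frame layout, and the choice of a Haar-style average/difference filter are all assumptions made for illustration (the abstract does not name the wavelet filter). The idea shown is that, when filtering along y, four horizontally adjacent pixels of the same two rows sit contiguously in memory, so one SSE instruction can process four columns at once. A scalar fallback is included for targets without SSE.

```c
#include <assert.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_loadu_ps, ... */
#endif

/* Hypothetical helper (not from the paper): one Haar-style analysis step
 * along the y dimension of a row-major frame of size width x height.
 * For each pair of rows (2r, 2r+1) it writes
 *   low[r][x]  = (a + b) / 2   and   high[r][x] = (a - b) / 2.
 * Assumes width is a multiple of 4 and height is even. */
static void haar_y_step(const float *in, float *low, float *high,
                        int width, int height)
{
    assert(width % 4 == 0 && height % 2 == 0);
#if defined(__SSE__)
    const __m128 half = _mm_set1_ps(0.5f);
#endif
    for (int r = 0; r < height / 2; ++r) {
        const float *a = in + (2 * r) * width;      /* even row */
        const float *b = in + (2 * r + 1) * width;  /* odd row  */
        float *lo = low + r * width;
        float *hi = high + r * width;
        /* Each iteration handles four adjacent columns. */
        for (int x = 0; x < width; x += 4) {
#if defined(__SSE__)
            __m128 va = _mm_loadu_ps(a + x);
            __m128 vb = _mm_loadu_ps(b + x);
            _mm_storeu_ps(lo + x, _mm_mul_ps(_mm_add_ps(va, vb), half));
            _mm_storeu_ps(hi + x, _mm_mul_ps(_mm_sub_ps(va, vb), half));
#else
            /* Scalar fallback computing the same four results. */
            for (int k = 0; k < 4; ++k) {
                lo[x + k] = (a[x + k] + b[x + k]) * 0.5f;
                hi[x + k] = (a[x + k] - b[x + k]) * 0.5f;
            }
#endif
        }
    }
}
```

Calling `haar_y_step(in, low, high, width, height)` fills `low` and `high`, each holding `height / 2` rows of `width` floats. Unaligned loads and stores are used here so the sketch makes no assumption about buffer alignment; an optimized version would likely align the buffers and could add the loop unrolling and prefetching that the abstract describes.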
Year | DOI | Venue |
---|---|---|
2003 | 10.1109/EMPDP.2003.1183565 | ELEVENTH EUROMICRO CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING, PROCEEDINGS |
Keywords | Field | DocType |
data level parallelism,video compression,parallel processing,data compression,floating point,wavelet transform,wavelet transforms,transform coding | SSE2,Computer science,Floating point,Parallel computing,Vectorization (mathematics),SIMD,Compiler,Data parallelism,Streaming SIMD Extensions,Loop unrolling | Conference |
Citations | PageRank | References |
7 | 0.56 | 16 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Gregorio Bernabé | 1 | 106 | 12.32 |
J. M. García | 2 | 588 | 58.90 |
José González | 3 | 526 | 35.85 |