Title
16-Bit FP Sub-Word Parallelism to Facilitate Compiler Vectorization and Improve Performance of Image and Media Processing
Abstract
We consider the implementation of 16-bit floating-point instructions on a Pentium 4 and a PowerPC G5 for image and media processing. By measuring the execution time of benchmarks with these new simulated instructions, we show that a significant speed-up is obtained compared to 32-bit FP versions. For image processing, the speed-up comes both from doubling the number of operations per SIMD instruction and from the better cache behavior with byte storage. For data stream processing with arrays of structures, the speed-up mainly comes from the wider SIMD instructions.
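The arithmetic behind "doubling the number of operations per SIMD instruction" can be sketched as follows. This is only an illustration, not the authors' benchmark code. It assumes 128-bit SIMD registers (the width of SSE2 on the Pentium 4 and AltiVec on the PowerPC G5) and uses IEEE 754 binary16 as a stand-in for the 16-bit format, which may differ from the format used in the paper. The names `lanes_per_register` and `roundtrip_half` are hypothetical helpers introduced here.

```python
import struct

# Assumption: 128-bit SIMD registers (SSE2 / AltiVec width).
SIMD_REGISTER_BITS = 128


def lanes_per_register(element_bits: int) -> int:
    """Number of elements of the given width that fit in one SIMD register."""
    return SIMD_REGISTER_BITS // element_bits


def roundtrip_half(value: float) -> float:
    """Encode a value as IEEE 754 binary16 (2 bytes) and decode it again."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


lanes_fp32 = lanes_per_register(32)
lanes_fp16 = lanes_per_register(16)

# Every 8-bit pixel value survives conversion to binary16 unchanged,
# so an image stored as bytes can be widened to 16-bit FP without loss.
bytes_exact = all(roundtrip_half(float(p)) == float(p) for p in range(256))

print(lanes_fp32, lanes_fp16, bytes_exact)  # prints: 4 8 True
```

Under these assumptions, one register holds eight 16-bit values instead of four 32-bit values, and source pixels kept as bytes take a quarter of the memory of 32-bit floats, which is consistent with the cache effect the abstract describes.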
Year
2004
DOI
10.1109/ICPP.2004.1
Venue
ICPP
Keywords
improve performance, facilitate compiler vectorization, 16-bit floating point instruction, wider simd instruction, media processing, powerpc g5, significant speed-up, image processing, simd instruction, 16-bit fp sub-word parallelism, data stream processing, byte storage, 32-bit fp version, instruction sets, parallel processing, floating point arithmetic, floating point
Field
Central processing unit, Cache, Instruction set, Computer science, Parallel computing, Image processing, SIMD, Pentium, Digital image processing, Computer hardware, PowerPC
DocType
Conference
ISSN
0190-3918
ISBN
0-7695-2197-5
Citations
1
PageRank
0.44
References
6
Authors
2
Name                Order    Citations    PageRank
Daniel Etiemble     1        300          42.43
Lionel Lacassagne   2        127          23.17