Title
Overflow controlled SIMD arithmetic
Abstract
Although "SIMD within a register" parallel architectures have existed for almost 10 years, automatic optimizations for them are not yet well developed. Because most optimizations for SIMD architectures are transplanted from traditional vectorization techniques, many special features of SIMD architectures, such as packed operations, have not been thoroughly considered. Since operands are tightly packed within a register, there is no spare space to indicate overflow. To maintain the accuracy of automatically SIMDized programs, the operands must be unpacked to leave enough space for interim overflow, which introduces considerable overhead. Furthermore, the instructions that handle interim overflows can sometimes prevent other optimizations. In this paper, a new technique, OCSA (overflow controlled SIMD arithmetic), is proposed to reduce the negative effects of interim overflow handling and to eliminate the interference caused by interim overflows. We have applied our algorithm to Berkeley's multimedia benchmarks. The experimental results show that the OCSA algorithm can significantly improve the performance of ADPCM-Decoder (110%), MESA-Reflect (113%) and DJVU-Encoder (106%).
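The interim-overflow problem described in the abstract can be illustrated with a small sketch. It is not the OCSA algorithm, whose details the abstract does not give. It only shows, under assumed parameters (four unsigned 8-bit lanes emulated in a Python integer, hypothetical helper names such as `packed_add_u8`, `average_narrow` and `average_widened`), why a packed add loses the carry of an intermediate sum and why widening the lanes before the operation preserves it at the cost of extra work.

```python
LANES = 4        # assumed: four lanes per word
LANE_BITS = 8    # assumed: unsigned 8-bit operands


def pack(values, lane_bits=LANE_BITS):
    """Pack unsigned integers into one word, lane 0 in the lowest bits."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * lane_bits)
    return word


def unpack(word, lanes=LANES, lane_bits=LANE_BITS):
    """Split a word back into a list of lane values."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask for i in range(lanes)]


def packed_add_u8(x, y):
    """Lane-wise wrapping add of four 8-bit lanes, as a packed add would do.

    The low 7 bits of each lane are added without crossing lane borders,
    then the top bit is fixed up with XOR; the carry out of each lane is lost.
    """
    low7 = 0x7F7F7F7F
    high = 0x80808080
    return ((x & low7) + (y & low7)) ^ ((x ^ y) & high)


def average_narrow(a, b):
    """(a + b) >> 1 computed entirely in 8-bit lanes: the interim sum wraps."""
    s = packed_add_u8(pack(a), pack(b))
    return unpack((s >> 1) & 0x7F7F7F7F)


def average_widened(a, b):
    """Same computation after unpacking into 16-bit lanes.

    Each 16-bit lane holds at most 255 + 255 = 510, so no carry crosses
    a lane border and the interim sum is exact.
    """
    s = pack(a, 16) + pack(b, 16)
    halved = (s >> 1) & 0x7FFF7FFF7FFF7FFF
    return unpack(halved, lanes=LANES, lane_bits=16)


a = [200, 100, 255, 10]
b = [100, 50, 255, 20]
print(average_narrow(a, b))   # lanes whose sum exceeds 255 come out wrong
print(average_widened(a, b))  # exact averages, at the cost of wider lanes
```

The widened version needs twice the register space for the same number of operands, which is the kind of overhead the abstract attributes to unpacking.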
Year
2004
DOI
null
Venue
Lecture Notes in Computer Science
Keywords
spare space, interim overflow handling, SIMD architecture, packed operation, enough space, automatic SIMDized program, SIMD arithmetic, automatic optimizations, OCSA algorithm, interim overflow
Field
Spare part, Computer science, Parallel computing, Operand, SIMD, Vectorization (mathematics), Arithmetic, Compiler, Interference (wave propagation), Interim, Automatic programming, Distributed computing
DocType
Conference
Volume
3602
Issue
null
ISSN
null
ISBN
3-540-28009-X
Citations
1
PageRank
0.36
References
7
Authors
5
Name              Order  Citations  PageRank
Jiahua Zhu        1      6          1.16
Hong-Jiang ZHANG  2      17378      1393.22
Hui Shi           3      1          0.36
Binyu Zang        4      984        62.75
Chuan-qi Zhu      5      240        39.42