Sabrewing: A lightweight architecture for combined floating-point and integer arithmetic - Citegraph

Paper Info

Title
Sabrewing: A lightweight architecture for combined floating-point and integer arithmetic

Abstract
In spite of the fact that floating-point arithmetic is costly in terms of silicon area, the joint design of hardware for floating-point and integer arithmetic is seldom considered. While components like multipliers and adders can potentially be shared, floating-point and integer units in contemporary processors are practically disjoint. This work presents a new architecture which tightly integrates floating-point and integer arithmetic in a single datapath. It is mainly intended for use in low-power embedded digital signal processors and therefore the following design constraints were important: limited use of pipelining for the convenience of the compiler; maintaining compatibility with existing technology; minimal area and power consumption for applicability in embedded systems. The architecture is tailored to digital signal processing by combining floating-point fused multiply-add and integer multiply-accumulate. It could be deployed in a multi-core system-on-chip designed to support applications with and without dominance of floating-point calculations. The VHDL structural description of this architecture is available for download under BSD license. Besides being configurable at design time, it has been thoroughly checked for IEEE-754 compliance by means of a floating-point test suite originating from the IBM Research Labs. A proof-of-concept has also been implemented using STMicroelectronics 65nm technology. This prototype supports 32-bit signed two's complement integers and 41-bit (8-bit exponent and 32-bit significand) floating-point numbers. Our evaluations show that over 67% energy and 19% area can be saved compared to a reference design in which floating-point and integer arithmetic are implemented separately. The area overhead caused by combining floating-point and integer is less than 5%.
Implemented in ST's general-purpose CMOS technology, the design can operate at a frequency of 1.35GHz, while 667MHz can be achieved in low-power CMOS. Considering that the entire datapath is partitioned in just three pipeline stages, and the fact that the design is intended for use in the low-power domain, these frequencies are adequate. They are in fact competitive with current technology low-power floating-point units. Post-layout estimates indicate that the required area of a low-power implementation can be as small as 0.04mm². Power consumption is on the order of several milliwatts. Strengthened by the fact that clock gating could reduce power consumption even further, we think that a shared floating-point and integer architecture is a good choice for signal processing in low-power embedded systems.
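
The abstract names two operations, floating-point fused multiply-add and integer multiply-accumulate. The following is a minimal Python sketch of what each computes; it is not the Sabrewing datapath or its VHDL. The function names (`round_to_precision`, `fma`, `mul_then_add`, `int_mac`) are hypothetical, and several details are assumptions not stated in the abstract: round-to-nearest-even, an unbounded exponent range, a 32-bit precision borrowed from the prototype's significand width, and wraparound on integer overflow.

```python
from fractions import Fraction

def round_to_precision(x: Fraction, p: int = 32) -> Fraction:
    """Round x to p significant bits, ties to even (exponent range unbounded: an assumption)."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    mag = abs(x)
    # Find e with 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    scale = Fraction(2) ** (e - (p - 1))  # weight of the last kept bit
    return sign * round(mag / scale) * scale  # round() on a Fraction is half-to-even

def fma(a: Fraction, b: Fraction, c: Fraction, p: int = 32) -> Fraction:
    """Fused multiply-add: the exact value a*b + c is rounded once."""
    return round_to_precision(a * b + c, p)

def mul_then_add(a: Fraction, b: Fraction, c: Fraction, p: int = 32) -> Fraction:
    """Separate multiply and add: the product is rounded before the addition."""
    return round_to_precision(round_to_precision(a * b, p) + c, p)

def int_mac(acc: int, a: int, b: int, width: int = 32) -> int:
    """Integer multiply-accumulate in two's complement (wraparound is an assumption)."""
    mask = (1 << width) - 1
    r = (acc + a * b) & mask
    return r - (1 << width) if r >> (width - 1) else r

# Operands chosen so that the product 1 - 2**-62 needs more than 32 bits.
a = 1 + Fraction(1, 2**31)
b = 1 - Fraction(1, 2**31)
c = Fraction(-1)

print(fma(a, b, c))           # the small nonzero residual survives the single rounding
print(mul_then_add(a, b, c))  # the product rounds to 1 first, so the residual is lost
print(int_mac(10, 3, 4))      # 10 + 3*4
```

With these operands the fused version returns the exact residual, while the two-step version returns zero. That single-rounding property is what distinguishes fused multiply-add from a multiply followed by an add.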

Year	DOI	Venue
2012	10.1145/2086696.2086720	TACO
Keywords	Field	DocType
floating-point arithmetic,power consumption,low-power floating-point unit,lightweight architecture,integer architecture,floating-point calculation,floating-point number,floating-point test suite,combined floating-point,integer arithmetic,shared floating-point,floating-point fused multiply-add,system on chip,integer,area,floating point arithmetic,pipeline,digital signal processor,proof of concept,embedded system,embedded systems,clock gating,signal processing,digital signal processing,floating point,floating point unit	Clock gating,Pipeline (computing),Integer overflow,Datapath,Computer science,Digital signal processor,Floating point,Parallel computing,Real-time computing,Binary scaling,Significand	Journal
Volume	Issue	ISSN
8	4	1544-3566
Citations	PageRank	References
2	0.36	16
Authors
5

Authors (5 rows)

Name	Order	Citations	PageRank
Tom Bruintjes	1	24	2.16
Karel H. G. Walters	2	2	0.70
Sabih Gerez	3	111	11.24
Bert Molenkamp	4	7	2.37
Gerard J. M. Smit	5	888	89.18
