Abstract | ||
---|---|---|
For parallelism to become tractable for mass programmers, shared-memory languages and environments must evolve to enforce disciplined practices that ban "wild shared-memory behaviors" such as unstructured parallelism, arbitrary data races, and ubiquitous non-determinism. This software evolution is a rare opportunity for hardware designers to rethink hardware from the ground up to exploit opportunities exposed by such disciplined software models. Such a co-designed effort is more likely to achieve many-core scalability than a software-oblivious hardware evolution. This paper presents DeNovo, a hardware architecture motivated by these observations. We show how a disciplined parallel programming model greatly simplifies cache coherence and consistency, while enabling a more efficient communication and cache architecture. The DeNovo coherence protocol is simple because it eliminates transient states: verification using model checking shows 15X fewer reachable states than a state-of-the-art implementation of the conventional MESI protocol. The DeNovo protocol is also more extensible. Adding two sophisticated optimizations, flexible communication granularity and direct cache-to-cache transfers, did not introduce additional protocol states (unlike MESI). Finally, DeNovo shows higher cache hit rates and lower network traffic, translating to better performance and energy. Overall, a disciplined shared-memory programming model allows DeNovo to seamlessly integrate message passing-like interactions within a global address space for reduced design complexity and improved performance and efficiency. |
Year | DOI | Venue |
---|---|---|
2011 | 10.1109/PACT.2011.21 | PACT |
Keywords | DocType | ISSN |
hardware architecture,protocols,parallel programming model,disciplined parallelism,mass programmer,network traffic,denovo coherence protocol,shared-memory language,additional protocol state,parallel programming,cache storage,software-oblivious hardware evolution,direct cache-to-cache transfer,conventional mesi protocol,denovo protocol,design complexity,flexible communication granularity,disciplined software model,hardware designer,disciplined shared-memory programming model,memory hierarchy,shared memory systems,ubiquitous computing,shared-memory programming model,cache architecture,software evolution,arbitrary data race,disciplined parallel programming model,message passing,cache hit rate,mesi protocol,parallel memories,many-core scalability,coherence,programming,model checking,shared memory,cache coherence,programming model,hardware | Conference | 1089-795X |
ISBN | Citations | PageRank |
978-1-4577-1794-9 | 78 | 1.94 |
References | Authors | |
47 | 9 |
Name | Order | Citations | PageRank |
---|---|---|---|
Byn Choi | 1 | 131 | 4.32 |
Rakesh Komuravelli | 2 | 156 | 6.24 |
Hyojin Sung | 3 | 144 | 4.99 |
Robert Smolinski | 4 | 78 | 1.94 |
Nima Honarmand | 5 | 147 | 7.30 |
Sarita V. Adve | 6 | 3773 | 257.16 |
Vikram S. Adve | 7 | 3347 | 183.25 |
Nicholas P. Carter | 8 | 349 | 33.84 |
Ching-Tsun Chou | 9 | 274 | 18.57 |