Title
An implementation of the codelet model
Abstract
Chip architectures are shifting from few, faster, functionally heavy cores to abundant, slower, simpler cores to address pressing physical limitations such as energy consumption and heat expenditure. As architectural trends continue to fluctuate, we propose a novel program execution model, the Codelet model, which is designed for new systems tasked with efficiently managing varying resources. The Codelet model is a fine-grained, dataflow-inspired model extended to address the cumbersome resources available in new architectures. In the following, we define the Codelet execution model and provide an implementation named DARTS. Using DARTS and two predominant kernels, matrix multiplication and the Graph 500's breadth-first search (BFS), we explore the validity of fine-grain execution as a promising and viable execution model for current and future architectures. We show that our runtime performs on par with or better than AMD's highly optimized parallel library for matrix multiplication, outperforming it on average by 1.40× with a speedup of up to 4×. Our implementation of parallel BFS outperforms the Graph 500 reference implementation (with or without dynamic scheduling) on average by 1.50×, with a speedup of up to 2.38×.
Year
2013
DOI
10.1007/978-3-642-40047-6_63
Venue
Euro-Par
Keywords
multicore
DocType
Conference
Volume
8097
ISSN
0302-9743
Citations
9
PageRank
0.62
References
12
Authors
3
Name | Order | Citations | PageRank
Joshua Suettlerlein | 1 | 9 | 0.62
Stéphane Zuckerman | 2 | 42 | 8.16
Guang R. Gao | 3 | 2661 | 265.87