Title
pLUTo: Enabling Massively Parallel Computation in DRAM via Lookup Tables
Abstract
Data movement between the main memory and the processor is a key contributor to execution time and energy consumption in memory-intensive applications. This data movement bottleneck can be alleviated using Processing-in-Memory (PiM). One category of PiM is Processing-using-Memory (PuM), in which computation takes place inside the memory array by exploiting intrinsic analog properties of the memory device. PuM yields high performance and energy efficiency, but existing PuM techniques support a limited range of operations. As a result, current PuM architectures cannot efficiently perform some complex operations (e.g., multiplication, division, exponentiation) without large increases in chip area and design complexity. To overcome these limitations of existing PuM architectures, we introduce pLUTo (processing-using-memory with lookup table (LUT) operations), a DRAM-based PuM architecture that leverages the high storage density of DRAM to enable the massively parallel storing and querying of lookup tables (LUTs). The key idea of pLUTo is to replace complex operations with low-cost, bulk memory reads (i.e., LUT queries) instead of relying on complex extra logic. We evaluate pLUTo across 11 real-world workloads that showcase the limitations of prior PuM approaches and show that our solution outperforms optimized CPU and GPU baselines by an average of $713 \times$ and $1.2 \times$, respectively, while simultaneously reducing energy consumption by an average of $1855 \times$ and $39.5 \times$. Across these workloads, pLUTo outperforms state-of-the-art PiM architectures by an average of $18.3 \times$. We also show that different versions of pLUTo provide different levels of flexibility and performance at different additional DRAM area overheads (between 10.2% and 23.1%). pLUTo’s source code and all scripts required to reproduce the results of this paper are openly and fully available at https://github.com/CMU-SAFARI/pLUTo.
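The key idea stated in the abstract, replacing a complex operation with a lookup table query, can be illustrated with a small software analogy. This is only a sketch and does not model pLUTo's in-DRAM mechanism. The function names `build_mul_lut` and `lut_multiply` and the 4-bit operand width are assumptions chosen for illustration and do not come from the paper or its repository.

```python
# Software analogy only: this does NOT model pLUTo's in-DRAM query mechanism.
# All names and the 4-bit operand width are hypothetical, for illustration.

BITS = 4  # assumed operand width; the table has 2**(2*BITS) entries


def build_mul_lut(bits):
    """Precompute a * b for every pair of `bits`-wide unsigned operands.

    The entry for (a, b) is stored at index (a << bits) | b.
    """
    size = 1 << bits
    return [a * b for a in range(size) for b in range(size)]


def lut_multiply(lut, pairs, bits):
    """Answer each multiplication with a single table read (a LUT query)."""
    return [lut[(a << bits) | b] for a, b in pairs]


mul_lut = build_mul_lut(BITS)
pairs = [(3, 5), (15, 15), (0, 9)]
products = lut_multiply(mul_lut, pairs, BITS)
print(products)
```

The table is built once and every later multiplication becomes a read, which is the trade the abstract describes: storage capacity in exchange for avoiding extra compute logic.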
Year: 2022
DOI: 10.1109/MICRO56248.2022.00067
Venue: 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO)
Keywords: enabling massively parallel computation, key contributor, execution time, energy consumption, memory-intensive applications, data movement bottleneck, Processing-in-Memory, memory array, intrinsic analog properties, memory device, PuM yields high performance, energy efficiency, existing PuM techniques, current PuM architectures, design complexity, existing PuM architectures, DRAM-based PuM architecture, high storage density, massively parallel storing, bulk memory, LUT queries, complex extra logic, state-of-the-art PiM architectures, flexibility, different additional DRAM area overheads, pLUTo's source code
DocType: Conference
ISBN: 978-1-6654-7428-3
Citations: 0
PageRank: 0.34
References: 84
Authors: 11
Name | Order | Citations | PageRank
João Dinis Ferreira | 1 | 0 | 0.68
Gabriel Falcão | 2 | 64 | 16.36
Juan Gómez-Luna | 3 | 209 | 23.34
Mohammed Alser | 4 | 17 | 3.19
Lois Orosa | 5 | 22 | 4.20
Mohammad Sadrosadati | 6 | 0 | 1.69
Jeremie Kim | 7 | 263 | 13.68
Geraldo F. Oliveira | 8 | 16 | 1.86
Taha Shahroodi | 9 | 1 | 1.36
Anant Nori | 10 | 19 | 3.01
Onur Mutlu | 11 | 9446 | 357.40