Abstract
---
Attention-based machine learning is used to model long-term dependencies in sequential data. Processing these models on long sequences can be prohibitively costly because of their large memory consumption. In this work, we propose MAT, a processing in-memory (PIM) framework, to accelerate long-sequence attention models. MAT adopts a memory-efficient processing flow for attention models to process su...
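The memory cost the abstract refers to stems from the attention score matrix, which grows quadratically with sequence length. Below is a minimal NumPy sketch of standard scaled dot-product attention and its dominant memory terms; this is illustrative only, not MAT's PIM dataflow, and `seq_len` and `d_model` are hypothetical parameters chosen for the example.

```python
import numpy as np

def attention_memory_footprint(seq_len, d_model, dtype_bytes=4):
    """Estimate the dominant memory terms of scaled dot-product attention.

    The O(n^2) score matrix is what makes long-sequence attention
    memory-bound, which is the bottleneck MAT targets with PIM.
    """
    qkv = 3 * seq_len * d_model * dtype_bytes  # Q, K, V activations: O(n*d)
    scores = seq_len * seq_len * dtype_bytes   # attention score matrix: O(n^2)
    return qkv, scores

def scaled_dot_product_attention(Q, K, V):
    """Reference (non-PIM) attention; materializes the full n x n matrix."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)              # (n, n) -- quadratic in n
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

# For a 16k-token sequence, the fp32 score matrix alone needs 1 GiB,
# dwarfing the Q/K/V activations.
qkv_bytes, score_bytes = attention_memory_footprint(seq_len=16384, d_model=512)
print(f"Q/K/V: {qkv_bytes / 2**20:.0f} MiB, scores: {score_bytes / 2**30:.1f} GiB")
```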
Year | DOI | Venue
---|---|---
2021 | 10.1109/DAC18074.2021.9586212 | 2021 58th ACM/IEEE Design Automation Conference (DAC)

Keywords | DocType | ISSN
---|---|---
Processor scheduling, Computational modeling, Pipelines, Memory management, Graphics processing units, Energy efficiency, Natural language processing | Conference | 0738-100X

ISBN | Citations | PageRank
---|---|---
978-1-6654-3274-0 | 1 | 0.35

References | Authors
---|---
0 | 6
Name | Order | Citations | PageRank |
---|---|---|---
Minxuan Zhou | 1 | 20 | 4.00 |
Yunhui Guo | 2 | 1 | 1.37
Weihong Xu | 3 | 1 | 0.69 |
Bin Li | 4 | 2 | 0.70 |
Kevin W. Eliceiri | 5 | 77 | 10.87 |
Tajana Simunic | 6 | 3198 | 266.23 |