Title
MAT: Processing In-Memory Acceleration for Long-Sequence Attention
Abstract
Attention-based machine learning models are used to capture long-term dependencies in sequential data. Processing these models on long sequences can be prohibitively costly because of their large memory consumption. In this work, we propose MAT, a processing in-memory (PIM) framework, to accelerate long-sequence attention models. MAT adopts a memory-efficient processing flow for attention models to process su...
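For context, the memory cost the abstract refers to stems from the n-by-n attention score matrix, which grows quadratically with sequence length. Below is a minimal NumPy sketch of generic blocked (tiled) attention that avoids materializing the full score matrix; the function name, block size, and overall structure are illustrative assumptions for exposition and do not represent MAT's actual PIM dataflow.

```python
import numpy as np

def blocked_attention(q, k, v, block_size=64):
    """Compute softmax(q @ k.T / sqrt(d)) @ v one query tile at a time,
    so the full n-by-n score matrix is never held in memory at once.
    Illustrative sketch only; not MAT's PIM dataflow."""
    n, d = q.shape
    out = np.empty_like(q)
    for start in range(0, n, block_size):
        qb = q[start:start + block_size]                 # (b, d) query tile
        scores = qb @ k.T / np.sqrt(d)                   # (b, n) score tile
        scores -= scores.max(axis=1, keepdims=True)      # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=1, keepdims=True)    # row-wise softmax
        out[start:start + block_size] = weights @ v      # (b, d) output tile
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 4096, 64
    q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
    # Peak score storage drops from n*n to block_size*n elements.
    y = blocked_attention(q, k, v, block_size=256)
    print(y.shape)  # (4096, 64)
```

Tiling caps the live score buffer at block_size-by-n entries instead of n-by-n; PIM designs target the same bottleneck in hardware by computing near where the data resides.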
Year
2021
DOI
10.1109/DAC18074.2021.9586212
Venue
2021 58th ACM/IEEE Design Automation Conference (DAC)
Keywords
Processor scheduling, Computational modeling, Pipelines, Memory management, Graphics processing units, Energy efficiency, Natural language processing
DocType
Conference
ISSN
0738-100X
ISBN
978-1-6654-3274-0
Citations
1
PageRank
0.35
References
0
Authors
6
Name               Order  Citations  PageRank
Minxuan Zhou       1      20         4.00
Guo, Yunhui        2      1          1.37
Weihong Xu         3      1          0.69
Bin Li             4      2          0.70
Kevin W. Eliceiri  5      77         10.87
Tajana Simunic     6      3198       266.23