Decoupled Vector Runahead for Prefetching Nested Memory-Access Chains
Accepted version
Peer-reviewed
Abstract
Decoupled vector runahead (DVR) exploits massive amounts of memory-level parallelism to improve the performance of applications that feature indirect memory accesses by dynamically inferring loop bounds at runtime, recognizing striding loads, and speculatively vectorizing the subsequent instructions that are part of an indirect chain. DVR runs as an on-demand, speculative, in-order, lightweight hardware subthread alongside the main thread within the core. DVR incurs minimal hardware overhead while delivering a substantial performance boost.
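As a rough software-level illustration of the access pattern the abstract describes, the sketch below walks a graph stored in compressed sparse row (CSR) form. It is not DVR itself, which is a hardware mechanism, and it is not taken from the paper: the `csr_graph` type, the `sum_neighbor_values` function, and all field names are hypothetical and chosen only for this example. The comments mark which loads stride through memory and which load is indirect, that is, which address depends on a previously loaded value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CSR graph layout; names are illustrative, not from the paper. */
typedef struct {
    size_t num_vertices;
    const uint32_t *offsets;    /* length num_vertices + 1 */
    const uint32_t *neighbors;  /* length offsets[num_vertices] */
} csr_graph;

/* Sums values[n] over every neighbor n of every vertex. */
uint64_t sum_neighbor_values(const csr_graph *g, const uint32_t *values)
{
    assert(g != NULL && values != NULL);

    uint64_t sum = 0;
    for (size_t v = 0; v < g->num_vertices; ++v) {
        /* Striding loads in the outer loop: consecutive elements of offsets[]. */
        uint32_t begin = g->offsets[v];
        uint32_t end = g->offsets[v + 1];  /* inner loop bound is only known at run time */

        for (uint32_t e = begin; e < end; ++e) {
            /* Striding load in the inner loop: consecutive elements of neighbors[]. */
            uint32_t n = g->neighbors[e];
            /* Indirect load: the address depends on the value just loaded. */
            sum += values[n];
        }
    }
    return sum;
}
```

In a loop nest of this shape, the inner trip count depends on loaded data and the final access cannot be predicted from a simple stride, which is the kind of nested memory-access chain named in the title.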
Journal Title
IEEE Micro
Journal ISSN
0272-1732
1937-4143
Volume Title
44
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Rights and licensing
Except where otherwise noted, this item's license is described as Attribution 4.0 International
Sponsorship
EPSRC (EP/W00576X/1)

