Repository logo
 

Graph prefetching using data structure knowledge


Type

Conference Object

Change log

Authors

Jones, TM 

Abstract

Searches on large graphs are heavily memory latency bound, as a result of many high latency DRAM accesses. Due to the highly irregular nature of the access patterns involved, caches and prefetchers, both hardware and software, perform poorly on graph workloads. This leads to CPU stalling for the majority of the time. However, in many cases the data access pattern is well defined and predictable in advance, many falling into a small set of simple patterns. Although existing implicit prefetchers cannot bring significant benefit, a prefetcher armed with knowledge of the data structures and access patterns could accurately anticipate applications' traversals to bring in the appropriate data. This paper presents a design of an explicitly configured prefetcher to improve performance for breadth-first searches and sequential iteration on the efficient and commonly-used compressed sparse row graph format. By snooping L1 cache accesses from the core and reacting to data returned from its own prefetches, the prefetcher can schedule timely loads of data in advance of the application needing it. For a range of applications and graph sizes, our prefetcher achieves average speedups of 2.3x, and up to 3.3x, with little impact on memory bandwidth requirements.

Description

Keywords

33 Built Environment and Design, 3301 Architecture

Journal Title

Proceedings of the International Conference on Supercomputing

Conference Name

ICS '16: 2016 International Conference on Supercomputing

Journal ISSN

Volume Title

Publisher

ACM
Sponsorship
Engineering and Physical Sciences Research Council (EP/K026399/1)
EPSRC (1510365)
Engineering and Physical Sciences Research Council (EP/M506485/1)
This work was supported by the Engineering and Physical Sciences Research Council (EPSRC), through grant references EP/K026399/1 and EP/M506485/1, and ARM Ltd.