Graph Prefetching Using Data Structure Knowledge
ICS '16 Proceedings of the 2016 International Conference on Supercomputing
Association for Computing Machinery
MetadataShow full item record
Ainsworth, S., & Jones, T. M. (2016). Graph Prefetching Using Data Structure Knowledge. ICS '16 Proceedings of the 2016 International Conference on Supercomputing, (39)https://doi.org/10.1145/2925426.2926254
Searches on large graphs are heavily memory latency bound, as a result of many high latency DRAM accesses. Due to the highly irregular nature of the access patterns involved, caches and prefetchers, both hardware and software, perform poorly on graph workloads. This leads to CPU stalling for the majority of the time. However, in many cases the data access pattern is well defined and predictable in advance, many falling into a small set of simple patterns. Although existing implicit prefetchers cannot bring significant benefit, a prefetcher armed with knowledge of the data structures and access patterns could accurately anticipate applications' traversals to bring in the appropriate data. This paper presents a design of an explicitly configured prefetcher to improve performance for breadth-first searches and sequential iteration on the efficient and commonly-used compressed sparse row graph format. By snooping L1 cache accesses from the core and reacting to data returned from its own prefetches, the prefetcher can schedule timely loads of data in advance of the application needing it. For a range of applications and graph sizes, our prefetcher achieves average speedups of 2.3x, and up to 3.3x, with little impact on memory bandwidth requirements.
This work was supported by the Engineering and Physical Sciences Research Council (EPSRC), through grant references EP/K026399/1 and EP/M506485/1, and ARM Ltd.
External DOI: https://doi.org/10.1145/2925426.2926254
This record's URL: https://www.repository.cam.ac.uk/handle/1810/260404