Software Prefetching for Unstructured Mesh Applications

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Hadade, I 
Jones, TM 
Wang, F 
Di Mare, L 

Abstract

Applications that exhibit regular memory access patterns usually benefit transparently from hardware prefetchers, which bring data into the fast on-chip cache just before it is required and thereby avoid expensive cache misses. Unfortunately, unstructured mesh applications contain irregular access patterns that are often more difficult to identify in hardware. An alternative for such workloads is software prefetching, where special non-blocking instructions load data into the cache hierarchy. However, there are currently few examples in the literature of how to incorporate such software prefetches into existing applications with positive results. This paper addresses these issues by demonstrating the utility and implementation of software prefetching in an unstructured finite volume CFD code, representative in size and complexity of an industrial application, across a number of processors. We present the benefits of auto-tuning for finding the optimal prefetch distance across different computational kernels and architectures, and demonstrate the importance of choosing the right prefetch destination among the available cache levels for best performance. We discuss the impact of the data layout on the number of prefetch instructions required in kernels with indirect-access patterns and show how to integrate these instructions on top of existing optimisations such as vectorisation. Through this we show significant full-application speed-ups on a range of processors: 15% on the Intel Xeon Skylake CPU, 1.99x on the in-order Intel Xeon Phi Knights Corner architecture, and 33% on the out-of-order Knights Landing many-core processor.
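To illustrate the kind of transformation the abstract describes, the following is a minimal sketch of software prefetching in a loop with indirect accesses. It is not taken from the paper: the function name `gather_sum`, its parameters, and the use of the GCC/Clang builtin `__builtin_prefetch` are assumptions made for illustration. The prefetch distance `dist` is left as a runtime parameter so that a tuning script could try several values, and the last argument of the builtin is a locality hint that loosely corresponds to the choice of destination cache level.

```c
#include <stddef.h>

/* Illustrative sketch only (hypothetical names, not the paper's code).
 * Sums node values gathered through an indirection array, as in an
 * edge- or face-based mesh loop where idx[i] names the node touched
 * by iteration i. */
double gather_sum(const double *node_data, const int *idx, size_t n,
                  size_t dist)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        /* Request the data needed `dist` iterations ahead. The bounds
         * check keeps the read of idx[i + dist] inside the array. */
        if (dist > 0 && i + dist < n) {
            /* Arguments: address, 0 = prefetch for read, 3 = high
             * temporal locality (keep in all cache levels); lower
             * values hint that the data need not stay close to the core. */
            __builtin_prefetch(&node_data[idx[i + dist]], 0, 3);
        }
        sum += node_data[idx[i]];
    }
    return sum;
}
```

A prefetch is only a hint, so the result is the same for any value of `dist`; only the timing changes. In this sketch one prefetch is issued per gathered value, which is why the data layout matters: if several variables for a node share a cache line, a single prefetch can cover all of them.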

Keywords

software prefetching, unstructured mesh, computational fluid dynamics, irregular memory access, memory parallelism, auto-tuning

Journal Title

2018 IEEE/ACM 8th Workshop on Irregular Applications: Architectures and Algorithms (IA3)

Conference Name

IA^3 2018: 8th Workshop on Irregular Applications: Architectures and Algorithms

Publisher

IEEE

Rights

All rights reserved

Sponsorship

Engineering and Physical Sciences Research Council (EP/K026399/1)
Parts of this work were supported by the Engineering and Physical Sciences Research Council (EPSRC) and Rolls-Royce plc through the industrial CASE award 13220161 and grant EP/K026399/1. This work used the ARCHER KNL Testing and Development Platform, part of the UK National Supercomputing Service.