Performance Implications of Transient Loop-Carried Data Dependences in Automatically Parallelized Loops

Murphy, Niall; Jones, Timothy; Mullins, Robert; Campanoni, Simone

Accepted version

Peer-reviewed

Repository URI

https://www.repository.cam.ac.uk/handle/1810/254463

Files

Accepted version (432.68 KB)

Type

Article

Authors

Murphy, Niall

Jones, Timothy M.

https://orcid.org/0000-0002-4114-7661

Mullins, Robert

https://orcid.org/0000-0002-8393-2748

Campanoni, Simone

Abstract

Recent approaches to automatic parallelization have taken advantage of the low-latency on-chip interconnect provided in modern multicore processors, demonstrating significant speedups, even for complex workloads. Although these techniques can already extract significant thread-level parallelism from application loops, we are interested in quantifying and exploiting any additional performance that remains on the table.

This paper confirms the existence of significant extra thread-level parallelism within loops parallelized by the HELIX compiler. However, improving static data dependence analysis is unable to reach the additional performance offered because the existing loop-carried dependences are true only on a small subset of loop iterations. We therefore develop three approaches to take advantage of the transient nature of these data dependences through speculation, via transactional memory support. Results show that coupling the state-of-the-art data dependence analysis with fine-grained speculation achieves most of the speedups and may help close the gap towards the limit of HELIX-style thread-level parallelism.

Keywords

Thread-level Speculation, Transactional Memory

Journal Title

Proceedings of the 25th International Conference on Compiler Construction (CC 2016)

Publisher

ACM

Publisher DOI

https://doi.org/10.1145/2892208.2892214

Rights

http://www.rioxx.net/licenses/all-rights-reserved

Sponsorship

Engineering and Physical Sciences Research Council (EP/G033110/1)
Engineering and Physical Sciences Research Council (EP/K026399/1)

This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) through grant references EP/G033110/1 and EP/K026399/1.

Collections

Scholarly Works - Computer Science and Technology
Symplectic mapped items for data match