In-core, hint-based, speculative multithreading
Abstract
State-of-the-art high-performance processors rely on instruction-level parallelism (ILP) during sequential regions. This results in excellent performance in regions that exhibit large amounts of ILP. However, gains are limited elsewhere, due to strict upper bounds, superlinear scaling of costs, and sublinear returns. As the execution time of optimised regions decreases, Amdahl’s law suggests that achieving good performance in the remaining regions is key for improving whole-program performance. On the other hand, maintaining good performance in high-ILP regions is just as important.
Thread-level speculation (TLS), also known as speculative multithreading (SpMT), has been identified as a possible solution in past academic work. However, a large disconnect between academia and industry in identifying design constraints and addressing practical adoption hurdles has led to an underwhelming reception by industry. In particular, challenges remain in maintaining performance in high-ILP regions, minimising impact on the operating system and the architecture, and maintaining compatibility with high-performance microarchitectures.
I propose an in-core, hint-based, task-level speculation scheme to solve these challenges and efficiently speed up underperforming low-ILP program regions. Since the scheme does not introduce compatibility-breaking architectural changes, adding it to a modern high-performance out-of-order superscalar processor pipeline is feasible. I devise a set of crucial optimisations to overcome fundamental practical challenges, and describe a system for data forwarding in order to extend coverage to regions with frequent cross-task dependences. Finally, I identify the limitations and challenges that need to be solved to facilitate wide-scale deployment of the technology.