Lynx: Using OS and hardware support for fast fine-grained inter-core communication

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Mitropoulou, K 
Porpodas, V 
Zhang, X 
Jones, TM 

Abstract

Designing high-performance software queues for fast inter-core communication is challenging, but critical for maximising software parallelism. State-of-the-art single-producer / single-consumer (SP/SC) queues for streaming applications are divided into multiple sections, so that the producer and consumer operate on different sections independently of each other. While these queues perform well for coarse-grained data transfers, they perform poorly in the fine-grained case. This paper proposes Lynx, a novel SP/SC queue specifically tuned for fine-grained communication. Lynx is built from the ground up, reducing the generated code on the critical path to just two operations per enqueue and dequeue. To achieve this, it relies on existing commodity processor hardware and operating system exception handling support to deal with infrequent queue maintenance operations. Lynx outperforms the state of the art by up to 1.57× in total 64-bit throughput, reaching a peak throughput of 15.7 GB/s on a common desktop system. Real applications using Lynx get a performance improvement of up to 1.4×.
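
The abstract does not spell out the two critical-path operations or the exception mechanism, so the following is not Lynx. It is a minimal sketch in C of a conventional array-based SP/SC queue, given only to show the kind of per-element bookkeeping (full/empty checks, index wrap-around, index publication) that sits on the critical path of a typical software queue. All names (`spsc_queue`, `spsc_enqueue`, `spsc_dequeue`, `QUEUE_SLOTS`) and the capacity are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; must be a power of two so masking can wrap indices. */
#define QUEUE_SLOTS 1024u

typedef struct {
    uint64_t buf[QUEUE_SLOTS];
    atomic_size_t head; /* next slot to read; written only by the consumer */
    atomic_size_t tail; /* next slot to write; written only by the producer */
} spsc_queue;

static void spsc_init(spsc_queue *q) {
    assert(q != NULL);
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Producer side: returns false if the queue is full. */
static bool spsc_enqueue(spsc_queue *q, uint64_t value) {
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == QUEUE_SLOTS)            /* full check on every element */
        return false;
    q->buf[tail & (QUEUE_SLOTS - 1)] = value;  /* store, with wrap-around mask */
    /* Publish the new element to the consumer. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the queue is empty. */
static bool spsc_dequeue(spsc_queue *q, uint64_t *out) {
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)                          /* empty check on every element */
        return false;
    *out = q->buf[head & (QUEUE_SLOTS - 1)];   /* load, with wrap-around mask */
    /* Release the slot back to the producer. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

In this baseline every enqueue and dequeue pays for a check, a wrap-around and a shared-index update in addition to the store or load itself. The abstract's claim is that Lynx moves such infrequent maintenance work off the critical path by relying on hardware and operating system exception handling.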

Keywords

single-producer / single-consumer software queue, fine-grained communication, hardware exceptions

Journal Title

Proceedings of the International Conference on Supercomputing

Conference Name

ICS '16: 2016 International Conference on Supercomputing

Publisher

ACM

Sponsorship

Engineering and Physical Sciences Research Council (EP/K026399/1)
Engineering and Physical Sciences Research Council (EP/J016284/1)
This work was supported by the Engineering and Physical Sciences Research Council (EPSRC), through grant reference EP/K026399/1.