Performance analysis of matrix-free conjugate gradient kernels using SYCL
Conference Name
10th International Workshop on OpenCL and SYCL
Type
Conference Object
This Version
AM (accepted manuscript)
Citation
Baratta, I., Richardson, C., & Wells, G. Performance analysis of matrix-free conjugate gradient kernels using SYCL. 10th International Workshop on OpenCL and SYCL. https://doi.org/10.17863/CAM.83190
Abstract
We examine the performance of matrix-free SYCL implementations of the
conjugate gradient method for solving sparse linear systems of
equations. Performance is tested on an NVIDIA A100-80GB device and a
dual-socket Intel Ice Lake CPU node using different SYCL
implementations, and compared to CUDA BLAS (cuBLAS) implementations on
the A100 GPU and MKL implementations on the CPU node. All considered
kernels in the matrix-free implementation are memory-bandwidth
limited, and a simple performance model is applied to estimate the
asymptotic memory bandwidth and the latency. Our experiments show that
in most cases the considered SYCL implementations match the asymptotic
performance of the reference implementations. However, for smaller but
practically relevant problem sizes, latency is observed to have a
significant impact on performance. For some cases the SYCL latency is
reasonably close to the reference (cuBLAS/MKL) implementation latency,
but in other cases it is more than one order of magnitude greater. In
particular, SYCL built-in reductions on the GPU and all operations for
one of the SYCL implementations on the CPU exhibit high latency, and
this latency limits performance at problem sizes that can, in some cases, be
representative of full application simulations, and can degrade
strong-scaling performance.
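
The abstract does not state the form of the performance model. As a minimal sketch only, assume a linear latency-bandwidth model, t(n) = latency + n / bandwidth, where n is the number of bytes a kernel moves. Under that assumption, both parameters can be estimated from timings at several problem sizes with an ordinary least-squares fit. The names `LatencyBandwidthModel`, `fit_model`, `predicted_time`, `effective_bandwidth`, and `half_bandwidth_bytes` are hypothetical and are not taken from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed model (not taken from the paper): t(n) = latency + n / bandwidth,
// with n the number of bytes moved by a memory-bandwidth-limited kernel.
struct LatencyBandwidthModel {
  double latency;    // seconds; intercept of the linear fit
  double bandwidth;  // bytes per second; inverse of the fitted slope
};

// Ordinary least-squares fit of t = a + b * n over measured (bytes, seconds)
// pairs. Returns {a, 1 / b}.
LatencyBandwidthModel fit_model(const std::vector<double>& bytes,
                                const std::vector<double>& seconds) {
  assert(bytes.size() == seconds.size() && bytes.size() >= 2);
  const double m = static_cast<double>(bytes.size());
  double sum_n = 0.0, sum_t = 0.0, sum_nn = 0.0, sum_nt = 0.0;
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    sum_n += bytes[i];
    sum_t += seconds[i];
    sum_nn += bytes[i] * bytes[i];
    sum_nt += bytes[i] * seconds[i];
  }
  const double slope = (m * sum_nt - sum_n * sum_t) / (m * sum_nn - sum_n * sum_n);
  const double intercept = (sum_t - slope * sum_n) / m;
  return {intercept, 1.0 / slope};
}

// Kernel time predicted by the model for n_bytes of memory traffic.
double predicted_time(const LatencyBandwidthModel& model, double n_bytes) {
  return model.latency + n_bytes / model.bandwidth;
}

// Bandwidth actually achieved at a given size: n / t(n).
double effective_bandwidth(const LatencyBandwidthModel& model, double n_bytes) {
  return n_bytes / predicted_time(model, n_bytes);
}

// Traffic volume at which half the asymptotic bandwidth is reached:
// n / (latency + n / bandwidth) = bandwidth / 2  =>  n = latency * bandwidth.
double half_bandwidth_bytes(const LatencyBandwidthModel& model) {
  return model.latency * model.bandwidth;
}
```

In this sketch, a kernel reaches half the asymptotic bandwidth only once it moves latency × bandwidth bytes. An order-of-magnitude increase in latency therefore raises the problem size needed for good efficiency by the same factor, which is consistent with the sensitivity at smaller problem sizes described in the abstract.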
Relationships
Is supplemented by: https://doi.org/10.1145/3529538.3529993
Embargo Lift Date
2023-04-04
Identifiers
External DOI: https://doi.org/10.17863/CAM.83190
This record's URL: https://www.repository.cam.ac.uk/handle/1810/335753