Accelerating X-Ray Tracing for Exascale Systems using Kokkos
Felix Wittwer, Nicholas K. Sauter, Derek Mendez, Billy K. Poon, Aaron S. Brewster, James M. Holton, Michael E. Wall, William E. Hart, Deborah J. Bard, Johannes P. Blaschke
The upcoming exascale computing systems Frontier and Aurora will draw much of
their computing power from GPU accelerators. The hardware for these systems
will be provided by AMD and Intel, respectively, each of which supports its own
GPU programming model. The challenge for applications that target one of these
exascale systems will be to avoid vendor lock-in and to preserve performance
portability.
We report here on the results of using Kokkos to accelerate a real-world
application on NERSC's Perlmutter Phase 1 (using NVIDIA A100 accelerators) and
the testbed system for OLCF's Frontier (using AMD MI250X accelerators). By
porting to Kokkos, we were able to run the same X-ray tracing code on both
systems and achieved speed-ups between 13% and 66% compared to the original
CUDA code. These results are a highly encouraging demonstration of using Kokkos
to accelerate production science code.