PPoPP 2023
Sat 25 February - Wed 1 March 2023 Montreal, Canada
Mon 27 Feb 2023 14:50 - 15:10 at Montreal 4 - Session 2: Programming, Chair(s): Michael Scott

While parallelism remains the main source of performance, architectural implementations and programming models change with each new hardware generation, often leading to costly application re-engineering. Most tools for performance portability require manual and costly application porting to yet another programming model.

We propose an alternative approach that automatically translates programs written in one programming model (CUDA) into another (CPU threads), based on Polygeist/MLIR. Our approach includes a representation of parallel constructs that allows conventional compiler transformations to apply transparently and without modification, and that enables parallelism-specific optimizations. We evaluate our framework by transpiling and optimizing the CUDA Rodinia benchmark suite for a multi-core CPU, achieving a 58% geomean speedup over handwritten OpenMP code. Further, we show how CUDA kernels from PyTorch can efficiently run and scale on the CPU-only supercomputer Fugaku without user intervention. Our PyTorch compatibility layer, which makes use of transpiled CUDA PyTorch kernels, outperforms the PyTorch CPU native backend by 2.7×.
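As a rough illustration of the general idea of mapping a GPU-style kernel onto CPU threads, the sketch below runs a per-thread kernel body by turning the grid of blocks into parallel tasks and the threads of each block into a sequential inner loop. This is not the authors' Polygeist/MLIR implementation: the names `saxpy_kernel`, `launch_on_cpu`, and `saxpy` are hypothetical, the sketch assumes the kernel contains no barriers, and it shows only the loop structure, not the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(block_idx, thread_idx, block_dim, a, x, y, out):
    """Body of a hypothetical CUDA-style kernel, written per logical thread."""
    i = block_idx * block_dim + thread_idx  # global index, as in CUDA
    if i < len(x):  # bounds guard for the last, partially filled block
        out[i] = a * x[i] + y[i]


def launch_on_cpu(kernel, grid_dim, block_dim, *args, workers=4):
    """Run every (block, thread) pair of a barrier-free kernel on CPU threads."""

    def run_block(block_idx):
        # The logical threads of one block become a plain sequential loop.
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)

    # Blocks are assumed independent, so they are spread over a thread pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so that any exception is re-raised here.
        list(pool.map(run_block, range(grid_dim)))


def saxpy(a, x, y, block_dim=4):
    """Compute a * x + y element-wise using the emulated kernel launch."""
    out = [0.0] * len(x)
    grid_dim = (len(x) + block_dim - 1) // block_dim  # ceiling division
    launch_on_cpu(saxpy_kernel, grid_dim, block_dim, a, x, y, out)
    return out
```

For example, `saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 10.0, 10.0])` launches one block of four logical threads, of which the last is masked out by the bounds guard. Kernels that synchronize threads within a block would need additional handling that this sketch does not attempt.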

Mon 27 Feb

Displayed time zone: Eastern Time (US & Canada)

13:50 - 15:10
Session 2: Programming, Main Conference at Montreal 4
Chair(s): Michael Scott, University of Rochester
13:50
20m
Talk
A Programming Model for GPU Load Balancing
Main Conference
Muhammad Osama (University of California, Davis), Serban D. Porumbescu (University of California, Davis), John D. Owens (University of California, Davis)
14:10
20m
Talk
Exploring the Use of WebAssembly in HPC
Main Conference
Mohak Chadha (Chair of Computer Architecture and Parallel Systems, Technical University of Munich), Nils Krueger (Chair of Computer Architecture and Parallel Systems, Technical University of Munich), Jophin John (Chair of Computer Architecture and Parallel Systems, Technical University of Munich), Anshul Jindal (Chair of Computer Architecture and Parallel Systems, Technical University of Munich), Michael Gerndt (TUM), Shajulin Benedict (Indian Institute of Information Technology Kottayam, Kerala, India)
14:30
20m
Talk
Fast and Scalable Channels in Kotlin Coroutines
Main Conference
Nikita Koval (JetBrains), Dan Alistarh (IST Austria), Roman Elizarov (JetBrains)
14:50
20m
Talk
High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs
Main Conference
William S. Moses (Massachusetts Institute of Technology), Ivan Radanov Ivanov (Tokyo Institute of Technology), Jens Domke (RIKEN Center for Computational Science), Toshio Endo (Tokyo Institute of Technology), Johannes Doerfert (Lawrence Livermore National Laboratory), Oleksandr Zinenko (Google)