Asynchronous programming has gained significant popularity over the last decade: support for this programming pattern is available in many popular languages via libraries and native language implementations, typically in the form of coroutines or the async/await
construct. Instead of programming via shared memory, this concept assumes implicit synchronization through message passing. The key data structure enabling such communication is the \emph{rendezvous channel}. Roughly, a rendezvous channel is a blocking queue of size zero, so both send(e)
and receive()
operations wait for each other, performing a rendezvous when they meet. To optimize the message passing pattern, channels are usually equipped with a fixed-size buffer, so send
-s do not suspend and put elements into the buffer until its capacity is exceeded. This primitive is known as a \emph{buffered channel}.
This paper presents a fast and scalable algorithm for both rendezvous and buffered channels. Similarly to modern queues, our solution is based on an infinite array with two positional counters for send(e)
and receive()
operations, leveraging the unconditional Fetch-And-Add
instruction to update them. Yet, the algorithm requires non-trivial modifications of this classic pattern, in order to support the full channel semantics, such as buffering and cancellation of waiting requests. We compare the performance of our solution to that of the Kotlin implementation, as well as against other academic proposals, showing up to 9.8$\times$ speedup. To showcase its expressiveness and performance, we also integrated the proposed algorithm into the standard Kotlin Coroutines library, replacing the previous channel implementations.
Mon 27 FebDisplayed time zone: Eastern Time (US & Canada) change
13:50 - 15:10 | |||
13:50 20mTalk | A Programming Model for GPU Load Balancing Main Conference Muhammad Osama University of California, Davis, Serban D. Porumbescu University of California, Davis, John D. Owens University of California, Davis | ||
14:10 20mTalk | Exploring the Use of WebAssembly in HPC Main Conference Mohak Chadha Chair of Computer Architecture and Parallel Systems, Technical University of Munich, Nils Krueger Chair of Computer Architecture and Parallel Systems, Technical University of Munich, Jophin John Chair of Computer Architecture and Parallel Systems, Technical University of Munich, Anshul Jindal Chair of Computer Architecture and Parallel Systems, Technical University of Munich, Michael Gerndt TUM, Shajulin Benedict Indian Institute of Information Technology Kottayam, Kerala, India | ||
14:30 20mTalk | Fast and Scalable Channels in Kotlin Coroutines Main Conference | ||
14:50 20mTalk | High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs Main Conference William S. Moses Massachusetts Institute of Technology, Ivan Radanov Ivanov Tokyo Institute of Technology, Jens Domke RIKEN Center for Computational Science, Toshio Endo Tokyo Institute of Technology, Johannes Doerfert Lawrence Livermore National Laboratory, Oleksandr Zinenko Google |