GPU Communication Requires Rethinking Abstractions (PPoPP 2023 - Keynotes)

Track

PPoPP 2023 Keynotes

Time Zone

The program is currently displayed in (GMT-05:00) Eastern Time (US & Canada).

When

Tue 28 Feb 2023 08:30 - 09:30 at Montreal 3-4-5 - PPoPP Keynote

Abstract

Machine learning models are growing at an unprecedented speed, and fitting such models requires a large distributed GPU system with costly collective communication primitives such as AllReduce or AllToAll. Recent trends in large language models suggest that 30-70% of both training and inference time is spent on communication. Despite major leaps in hardware innovation for GPU communication, there is still a lot of software performance improvement left on the table. This gap is largely due to the lack of a performant P2P communication abstraction on GPUs. HPC applications running on CPUs use performant implementations of the MPI abstraction to control communication at a fine-grained level; no comparably performant abstraction exists on GPUs.

In this talk, we will discuss how much room for performance improvement there is in collective communication on GPUs and the challenges in achieving it. We will also look at existing abstractions for communication primitives and how they limit the space of parallelization configurations for AI workloads. Lastly, we will look at the challenges of overlapping communication and computation on GPUs. We will conclude by proposing an abstraction for GPU communication and showing how effective it is in closing the performance gaps.
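
To make the abstract's terms concrete, the sketch below shows how a sum AllReduce can be composed entirely of point-to-point (P2P) sends between neighboring ranks, using the well-known ring algorithm. It is not the abstraction proposed in the talk. It is a single-process simulation in which each "rank" is a plain Python list, and the name `ring_allreduce` is a hypothetical helper chosen for illustration. It assumes the buffer length is divisible by the number of ranks.

```python
from typing import List


def ring_allreduce(buffers: List[List[float]]) -> List[List[float]]:
    """Simulate a sum AllReduce over n ranks using only neighbor-to-neighbor sends.

    buffers[r] is the local buffer of rank r. Returns one buffer per rank,
    each holding the element-wise sum of all inputs.
    """
    n = len(buffers)
    if n == 0:
        return []
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all ranks must hold buffers of the same length")
    if length % n != 0:
        raise ValueError("this sketch assumes length is divisible by the rank count")

    chunk = length // n
    data = [list(b) for b in buffers]  # copy so the inputs are not modified

    def span(c: int) -> slice:
        # Index range covered by chunk c.
        return slice(c * chunk, (c + 1) * chunk)

    # Phase 1, reduce-scatter: in step s, rank r sends chunk (r - s) % n to
    # rank (r + 1) % n, which adds it to its own copy of that chunk.
    for s in range(n - 1):
        # Collect all messages first to model simultaneous sends.
        msgs = [(r, (r - s) % n, data[r][span((r - s) % n)]) for r in range(n)]
        for r, c, payload in msgs:
            dst = (r + 1) % n
            data[dst][span(c)] = [a + b for a, b in zip(data[dst][span(c)], payload)]

    # Rank r now holds the fully reduced chunk (r + 1) % n.
    # Phase 2, all-gather: in step s, rank r forwards chunk (r + 1 - s) % n to
    # rank (r + 1) % n, which overwrites its copy with the reduced values.
    for s in range(n - 1):
        msgs = [(r, (r + 1 - s) % n, data[r][span((r + 1 - s) % n)]) for r in range(n)]
        for r, c, payload in msgs:
            data[(r + 1) % n][span(c)] = payload

    return data


if __name__ == "__main__":
    ranks = [
        [1, 2, 3, 4, 5, 6],
        [10, 20, 30, 40, 50, 60],
        [100, 200, 300, 400, 500, 600],
    ]
    for rank, buf in enumerate(ring_allreduce(ranks)):
        print(rank, buf)
```

Each rank sends 2(n - 1) messages of size length / n, which is why the choice of P2P primitive and schedule matters for performance. A real GPU implementation would replace the list copies with device-to-device transfers, which this simulation does not model.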

Session Program

Tue 28 Feb
Displayed time zone: Eastern Time (US & Canada)

08:30 - 09:30	PPoPP Keynote (Keynotes) at Montreal 3-4-5

08:30 60m Talk	GPU Communication Requires Rethinking Abstractions (Keynotes), Saeed Maleki, Microsoft Research

Saeed Maleki

Microsoft Research
