iQAN: Fast and Accurate Vector Search with Efficient Intra-Query Parallelism on Multi-Core Architectures (PPoPP 2023 - Main Conference)

Who

Zhen Peng, Minjia Zhang, Kai Li, Ruoming Jin, Bin Ren

Track

PPoPP 2023 Main Conference

Time Zone

The program is currently displayed in (GMT-05:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 28 Feb 2023 15:40 - 16:00 at Montreal 4 - Session 6: Kernels Chair(s): Martin Kong

Abstract

Vector search has drawn rapidly increasing interest in the research community due to its use in novel AI applications. Maximizing its performance is essential for many tasks but remains only preliminarily understood. In this work, we investigate the root causes of the scalability bottleneck of using intra-query parallelism to speed up state-of-the-art graph-based vector search systems on multi-core architectures. Our in-depth analysis reveals several scalability challenges from both system and algorithm perspectives. Based on these insights, we propose iQAN, a parallel search algorithm with a set of optimizations that boost convergence, avoid redundant computations, and mitigate synchronization overhead. Our evaluation results on a wide range of real-world datasets show that iQAN achieves up to $37.7\times$ and $76.6\times$ lower latency than state-of-the-art sequential baselines on datasets ranging from a million to a hundred million vectors. We also show that iQAN achieves outstanding scalability as the graph size or the accuracy target increases, allowing it to outperform the state-of-the-art baseline on two billion-scale datasets by up to $16.0\times$ with up to 64 cores.

Zhen Peng

William & Mary

United States

Minjia Zhang

Microsoft Research

United States

Kai Li

Kent State University

Ruoming Jin

Kent State University

Bin Ren

College of William & Mary

United States

Session Program

Tue 28 Feb
Displayed time zone: Eastern Time (US & Canada)

15:40 - 16:40	Session 6: Kernels (Main Conference) at Montreal 4. Chair(s): Martin Kong (The Ohio State University)

15:40 20m Talk		iQAN: Fast and Accurate Vector Search with Efficient Intra-Query Parallelism on Multi-Core Architectures. Main Conference. Zhen Peng (William & Mary), Minjia Zhang (Microsoft Research), Kai Li (Kent State University), Ruoming Jin (Kent State University), Bin Ren (College of William & Mary)
16:00 20m Talk		WISE: Predicting the Performance of Sparse Matrix Vector Multiplication with Machine Learning. Main Conference. Serif Yesil (University of Illinois Urbana-Champaign), Azin Heidarshenas (University of Illinois Urbana-Champaign), Adam Morrison (Tel Aviv University), Josep Torrellas (University of Illinois at Urbana-Champaign)
16:20 20m Talk		Efficient Direct Convolution Using Long SIMD Instructions. Main Conference. Alexandre Santana (Barcelona Supercomputing Center), Adrià Armejach Sanosa (Barcelona Supercomputing Center & Universitat Politècnica de Catalunya), Marc Casas (Barcelona Supercomputing Center)