POSTER: High-Throughput GPU Random Walk with Fine-tuned Concurrent Query Processing
Random walk is a powerful tool for handling large-scale graphs, reducing data size while preserving structural information. Unfortunately, existing frameworks all focus on executing a single random walk task at a time, in serial. In this work, we show that the conventional execution model cannot fully exploit the cores of today’s GPUs, and that simply performing coarse-grained space sharing among multiple tasks incurs unexpected performance interference. Based on these observations, we propose CoWalker, a high-throughput GPU random walk framework tailored for concurrent random walk tasks. CoWalker introduces a multi-level concurrent execution model and a multi-dimensional scheduler that allow concurrent random walk tasks to share GPU resources efficiently and with low overhead. It reduces the number of stalled GPU cores by reorganizing memory access patterns, which leads to higher throughput. Our system prototype shows that CoWalker outperforms the state of the art by up to 54% across a wide spectrum of scenarios.