Analysis of Thread Block Scheduling Algorithms for General Purpose GPU Systems

Soyeon Park, Kyungwoon Cho, Hyokyung Bahn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Modern GPGPUs (General-Purpose Graphics Processing Units) can execute thousands of threads simultaneously. However, the resource utilization of GPGPUs in real systems is limited because it is difficult to balance load across SMs (Streaming Multiprocessors) when scheduling thread blocks, which are the basic units of resource allocation in a GPGPU. The current hardware scheduler allocates thread blocks to SMs in Round-Robin order. Although this is simple and easy to implement, we show that Round-Robin is inefficient when thread blocks of heterogeneous workloads are mixed. In such environments, efficient resource sharing is challenging because workloads have different resource usage patterns, yet scheduling decisions must be made instantly. In this paper, we present a new thread block scheduling algorithm that analyzes the load of SMs and the characteristics of pending thread blocks. Specifically, we formulate thread block scheduling as a bin-packing problem and aim to minimize the internal fragmentation of SMs by planning, in advance, a size-aware placement of thread blocks across all SMs. To do so, we maintain multiple queues that hold incoming thread blocks according to their sizes, and we schedule with the load balance of SMs in mind. Our experimental results under a wide range of workload conditions show that the proposed algorithm improves GPGPU performance by 24.8% on average compared to the Round-Robin scheduler.
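The contrast the abstract draws between Round-Robin dispatch and size-aware, queue-based dispatch can be illustrated with a small host-side sketch. This is not the authors' algorithm, whose details the abstract does not give. The single scalar "size" per thread block, the per-SM `capacity`, the `size_classes` tuple, the largest-class-first order, and the least-loaded-SM rule are all assumptions made only for illustration.

```python
from collections import deque


def round_robin(blocks, num_sms, capacity):
    """Baseline: place blocks in arrival order, cycling through the SMs."""
    load = [0] * num_sms
    pending = []
    nxt = 0  # round-robin pointer
    for size in blocks:
        # Probe each SM once, starting from the round-robin pointer.
        for step in range(num_sms):
            sm = (nxt + step) % num_sms
            if load[sm] + size <= capacity:
                load[sm] += size
                nxt = (sm + 1) % num_sms
                break
        else:
            pending.append(size)  # no SM currently has room
    return load, pending


def size_aware(blocks, num_sms, capacity, size_classes=(1, 2, 4, 8)):
    """Hypothetical size-aware policy (an assumption, not the paper's method):
    one queue per size class, drained largest class first, with each block
    sent to the least-loaded SM that can still hold it."""
    queues = {c: deque() for c in size_classes}
    for size in blocks:
        # Smallest class that covers the block; oversized blocks fall into
        # the largest class.
        cls = next((c for c in size_classes if size <= c), size_classes[-1])
        queues[cls].append(size)

    load = [0] * num_sms
    pending = []
    for cls in sorted(size_classes, reverse=True):
        while queues[cls]:
            size = queues[cls].popleft()
            fits = [sm for sm in range(num_sms) if load[sm] + size <= capacity]
            if fits:
                sm = min(fits, key=lambda s: load[s])  # load balancing step
                load[sm] += size
            else:
                pending.append(size)
    return load, pending


if __name__ == "__main__":
    blocks = [2, 2, 2, 2, 6, 6]  # small blocks arrive before large ones
    print(round_robin(blocks, num_sms=2, capacity=8))
    print(size_aware(blocks, num_sms=2, capacity=8))
```

With two SMs of capacity 8 and the arrival order above, the Round-Robin baseline leaves each SM at load 4 with both size-6 blocks unplaced, while the size-aware sketch fills both SMs to 8 and defers only two size-2 blocks. The unused space in the first case is the internal fragmentation that the bin-packing formulation aims to reduce.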

Original language: English
Title of host publication: 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665495523
State: Published - 2021
Event: 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2021 - Brisbane, Australia
Duration: 8 Dec 2021 → 10 Dec 2021

Publication series

Name: 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2021

Conference

Conference: 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2021
Country/Territory: Australia
City: Brisbane
Period: 8/12/21 → 10/12/21

Bibliographical note

Funding Information:
This work was supported by the ICT R&D program of MSIP/IITP (2018-0-00549, extremely scalable order preserving OS for manycore and non-volatile memory) and (2019-0-00074, developing system software technologies for emerging new memory that adaptively learn workload characteristics). Hyokyung Bahn is the corresponding author of this paper.

Publisher Copyright:
© IEEE 2022.

Keywords

  • GPGPU
  • load balancing
  • multitasking
  • resource utilization
  • thread block scheduler
