Triple-A: Early Operand Collector Allocation for Maximizing GPU Register Bank Utilization

Research output: Contribution to journal › Article › peer-review

Abstract

Recent GPUs provisioned with large register files (RFs) cannot fully utilize the bandwidth between the RFs and execution pipelines, as the current policy for allocating operand (OP) collectors defers the RF accesses until all the source OPs become ready. To tackle this issue, this letter introduces a new OP collector allocation mechanism called Triple-A. Triple-A comprises four key operations. First, Triple-A proactively allocates an OP collector (OC) to a warp instruction even if one of its source OPs is not yet ready, taking advantage of GPUs' in-order execution. Second, a computation result can be directly forwarded to an early allocated OC along with a data dependence, reducing OP loading time from the RFs. Third, Triple-A bypasses RF write operations if the forwarded data is not consumed by any other instruction. Finally, the early allocation is further enhanced with latency-aware optimization, alleviating the potential performance degradation caused by allocating OCs aggressively. Together, these techniques synergistically improve the register bank utilization, demonstrating a 14.1% improvement in performance and an 11.8% reduction in RF energy consumption compared to the state-of-the-art GPUs.
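The first three operations in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' design: the names `Collector`, `try_allocate`, `complete`, and the `other_readers` count are hypothetical, registers are plain dictionary entries, timing and register banks are not modeled, and the latency-aware optimization is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Collector:
    """Toy operand collector (OC): holds source operand values for one instruction."""
    srcs: tuple
    operands: dict = field(default_factory=dict)  # register name -> value

    def ready(self):
        # The instruction can issue once every source operand has a value.
        return all(r in self.operands for r in self.srcs)


def try_allocate(srcs, rf, pending, early):
    """Return a Collector, or None if the allocation policy refuses.

    rf      -- dict of values currently held in the register file (RF)
    pending -- set of registers whose producer has not finished yet
    early   -- False: baseline policy, all sources must be ready
               True:  early policy (assumed here), one pending source is tolerated
    """
    unready = {r for r in srcs if r in pending}
    if len(unready) > (1 if early else 0):
        return None
    oc = Collector(srcs)
    for r in srcs:
        if r not in pending:
            oc.operands[r] = rf[r]  # ready operands are read from the RF immediately
    return oc


def complete(dst, value, rf, pending, collectors, other_readers):
    """A producer finishes: forward its result, then decide on the RF write.

    other_readers -- hypothetical count of consumers NOT served by forwarding
    Returns True if the RF write was performed, False if it was bypassed.
    """
    for oc in collectors:
        if dst in oc.srcs and dst not in oc.operands:
            oc.operands[dst] = value  # forwarded directly, no RF read needed
    pending.discard(dst)
    if other_readers == 0:
        return False  # nobody else reads the value: skip the RF write
    rf[dst] = value
    return True


# Small demonstration: r3 is still being computed when its consumer arrives.
rf = {"r1": 10, "r2": 20}
pending = {"r3"}
baseline = try_allocate(("r1", "r3"), rf, pending, early=False)  # refused
oc = try_allocate(("r1", "r3"), rf, pending, early=True)         # allocated early
wrote = complete("r3", 30, rf, pending, [oc], other_readers=0)   # forwarded, write bypassed
```

In this toy setting the baseline policy leaves the collector idle until `r3` is produced, whereas the early policy reads `r1` from the RF ahead of time and receives `r3` by forwarding. How the real mechanism determines that no other instruction consumes the forwarded value is not described in the abstract, so here it is simply passed in as a parameter.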

Original language: English
Pages (from-to): 206-209
Number of pages: 4
Journal: IEEE Embedded Systems Letters
Volume: 16
Issue number: 2
State: Published - 1 Jun 2024

Bibliographical note

Publisher Copyright:
© 2009-2012 IEEE.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Keywords

  • Data forwarding
  • graphics processing units (GPUs)
  • operand collector (OC)
  • register files (RFs)
