Avant-Garde: Empowering GPUs with Scaled Numeric Formats

Minseong Gil, Dongho Ha, Simla Burcu Harma, Myung Kuk Yoon, Babak Falsafi, Won Woo Ro, Yunho Oh

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The escalating computational and memory demands of deep neural networks have outpaced chip density improvements, making arithmetic density a key bottleneck for GPUs. Scaled numeric formats, such as FP8 and Microscaling (MX), improve arithmetic density by applying adaptive scaling factors across varying block sizes and multiple scaling hierarchies. Unfortunately, supporting diverse scaled numeric formats often requires GPUs to rely on software-based implementations, increasing instruction and register overhead and degrading performance. We propose Avant-Garde, a GPU microarchitecture that natively supports diverse scaled numeric formats by converting them into a consistent single-level internal representation. Avant-Garde integrates an Operand Transformer, a hardware module that dynamically flattens multi-level scaling formats into single-level internal representations, a novel Tensor Core, and an optimized data layout to eliminate instruction and register overhead. Our evaluations show that Avant-Garde achieves up to 74% higher throughput and 44% lower execution time, while maintaining accuracy within 0.2% compared to conventional GPUs.
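The flattening idea in the abstract can be illustrated with a small software sketch. This is not the paper's Operand Transformer, which is a hardware module whose details the abstract does not give. It only shows, under stated assumptions, why a two-level scaled block can be rewritten as a single-level one without changing the decoded values. The assumptions are power-of-two scales at both levels and integer elements. The names `TwoLevelBlock`, `FlatBlock`, `flatten`, `decode_two_level`, and `decode_flat` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TwoLevelBlock:
    """Hypothetical two-level scaled block (illustrative, not the paper's layout)."""
    outer_exp: int             # power-of-two exponent shared by the whole block
    inner_exps: List[int]      # one power-of-two exponent per sub-block
    elements: List[List[int]]  # low-precision integer elements, grouped by sub-block


@dataclass
class FlatBlock:
    """Hypothetical single-level form: one combined exponent per sub-block."""
    exps: List[int]
    elements: List[List[int]]


def flatten(block: TwoLevelBlock) -> FlatBlock:
    # Multiplying two power-of-two scales is the same as adding their exponents,
    # so the outer scale can be folded into each sub-block scale once, up front.
    combined = [block.outer_exp + e for e in block.inner_exps]
    return FlatBlock(exps=combined, elements=block.elements)


def decode_two_level(block: TwoLevelBlock) -> List[float]:
    # Reference decode: apply the outer and inner scales separately per element.
    outer = 2.0 ** block.outer_exp
    return [
        x * outer * (2.0 ** e)
        for e, sub in zip(block.inner_exps, block.elements)
        for x in sub
    ]


def decode_flat(block: FlatBlock) -> List[float]:
    # Single-level decode: only one scale is applied per element.
    return [
        x * (2.0 ** e)
        for e, sub in zip(block.exps, block.elements)
        for x in sub
    ]
```

In this sketch, `flatten` is applied once per block, after which every consumer needs only the single-level decode. That mirrors, loosely, the abstract's claim that a consistent internal representation avoids per-format handling in software.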

Original language: English
Title of host publication: ISCA 2025 - Proceedings of the 52nd Annual International Symposium on Computer Architecture
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 153-165
Number of pages: 13
ISBN (Electronic): 9798400712616
State: Published - 21 Jun 2025
Event: 52nd Annual International Symposium on Computer Architecture, ISCA 2025 - Tokyo, Japan
Duration: 21 Jun 2025 to 25 Jun 2025

Publication series

Name: Proceedings - International Symposium on Computer Architecture
ISSN (Print): 1063-6897
ISSN (Electronic): 2575-713X

Conference

Conference: 52nd Annual International Symposium on Computer Architecture, ISCA 2025
Country/Territory: Japan
City: Tokyo
Period: 21/06/25 to 25/06/25

Bibliographical note

Publisher Copyright:
© 2025 Copyright held by the owner/author(s).

Keywords

  • Deep Neural Network
  • GPU
  • Scaled Numeric Format
