Generalized Gumbel-Softmax gradient estimator for generic discrete random variables

Weonyoung Joo, Dongjun Kim, Seungjae Shin, Il Chul Moon

Research output: Contribution to journal › Article › peer-review

Abstract

Estimating the gradients of stochastic nodes in stochastic computational graphs is a crucial research question in the deep generative modeling community, as it enables gradient-descent optimization of neural network parameters. Stochastic gradient estimators for discrete random variables, such as the Gumbel-Softmax reparameterization trick for Bernoulli and categorical distributions, have been widely explored. Meanwhile, other discrete distributions, such as the Poisson, geometric, binomial, multinomial, and negative binomial, have not been explored. This paper proposes a generalized version of the Gumbel-Softmax stochastic gradient estimator. The proposed method can reparameterize generic discrete distributions, not restricted to the Bernoulli and the categorical, and it enables learning on large-scale stochastic computational graphs with discrete random nodes. Our experiments consist of (1) synthetic examples and applications to variational autoencoders, which show the efficacy of our method; and (2) topic models, which demonstrate the value of the proposed estimator in practice.
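For context, the standard Gumbel-Softmax trick that the paper generalizes can be sketched in a few lines. The sketch below (in PyTorch, an assumed dependency; it shows the standard categorical case, not the paper's generalized estimator) draws a relaxed one-hot sample from a categorical distribution so that gradients flow back to the logits:

    import torch
    import torch.nn.functional as F

    def gumbel_softmax_sample(logits, tau=1.0, eps=1e-20):
        # Draw i.i.d. Gumbel(0, 1) noise via inverse transform sampling:
        # g = -log(-log(u)), with u ~ Uniform(0, 1)
        u = torch.rand_like(logits)
        g = -torch.log(-torch.log(u + eps) + eps)
        # Perturb the logits and relax the argmax with a temperature-controlled
        # softmax; as tau -> 0 the sample approaches a one-hot vector
        return F.softmax((logits + g) / tau, dim=-1)

    logits = torch.randn(3, requires_grad=True)   # unnormalized log-probabilities
    y = gumbel_softmax_sample(logits, tau=0.5)    # differentiable relaxed sample
    loss = (y * torch.tensor([1.0, 2.0, 3.0])).sum()
    loss.backward()                               # gradients reach the logits

This relaxation is defined only for finite-support (Bernoulli/categorical) distributions; the paper's contribution, per the abstract, is extending such reparameterization to generic discrete distributions beyond these two cases.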

Original language: English
Pages (from-to): 148-155
Number of pages: 8
Journal: Pattern Recognition Letters
Volume: 196
DOIs
State: Published - Oct 2025

Bibliographical note

Publisher Copyright:
© 2025 Elsevier B.V.

Keywords

  • Deep generative model
  • Discrete random variable
  • Gumbel-softmax trick
  • Reparameterization trick
  • Stochastic gradient estimator
  • Variational autoencoder
