Abstract
This paper proposes the Dirichlet Variational Autoencoder (DirVAE), which uses a Dirichlet prior. To infer the parameters of DirVAE, we use the stochastic gradient method by approximating the inverse cumulative distribution function of the Gamma distribution, which is a component of the Dirichlet distribution. This approximation under the new prior led to an investigation of component collapsing, and DirVAE revealed that component collapsing originates from two problem sources: decoder weight collapsing and latent value collapsing. The experimental results show that 1) DirVAE generates results with the best log-likelihood compared to the baselines; 2) DirVAE produces more interpretable latent values without the collapsing issues that the baselines suffer from; 3) the latent representation from DirVAE achieves the best classification accuracy in (semi-)supervised classification tasks on MNIST, OMNIGLOT, COIL-20, SVHN, and CIFAR-10 compared to the baseline VAEs; and 4) topic models augmented with DirVAE perform better in most cases.
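The reparameterization described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the standard small-shape approximation of the Gamma inverse CDF, F⁻¹(u; α, β) ≈ (u·α·Γ(α))^(1/α)/β, and uses the fact that normalizing independent Gamma(αᵢ, 1) draws yields a Dirichlet(α) sample. The function names are illustrative only.

```python
import math
import random

def approx_gamma_icdf(u, alpha, beta=1.0):
    # Small-shape approximation of the inverse CDF of Gamma(alpha, beta):
    #   F^{-1}(u; alpha, beta) ~= (u * alpha * Gamma(alpha))**(1/alpha) / beta
    # Because this is a closed-form function of u, a sample drawn as
    # approx_gamma_icdf(u, alpha) with u ~ Uniform(0, 1) is differentiable
    # with respect to alpha, enabling stochastic-gradient training.
    return (u * alpha * math.gamma(alpha)) ** (1.0 / alpha) / beta

def sample_dirichlet(alphas, rng):
    # If g_i ~ Gamma(alpha_i, 1) independently, then g / sum(g) ~ Dirichlet(alpha).
    gammas = [approx_gamma_icdf(rng.random(), a) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

rng = random.Random(0)
sample = sample_dirichlet([0.5, 0.5, 0.5], rng)  # a point on the 2-simplex
```

In a VAE, the encoder would output the concentration parameters `alphas`, and gradients would flow through the inverse-CDF approximation rather than through a non-differentiable rejection sampler.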
| Original language | English |
|---|---|
| Article number | 107514 |
| Journal | Pattern Recognition |
| Volume | 107 |
| State | Published - Nov 2020 |
| Externally published | Yes |
Bibliographical note
Funding Information: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2018R1C1B6008652).
Publisher Copyright:
© 2020 Elsevier Ltd
Keywords
- Component collapse
- Deep generative model
- Multi-modal latent representation
- Representation learning
- Variational autoencoder