Deep Generative Positive-Unlabeled Learning under Selection Bias

Byeonghu Na, Hyemi Kim, Kyungwoo Song, Weonyoung Joo, Yoon Yeong Kim, Il Chul Moon

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

18 Scopus citations

Abstract

Learning in the positive-unlabeled (PU) setting is prevalent in real-world applications. Many previous works depend upon the Selected Completely At Random (SCAR) assumption to utilize unlabeled data, but the SCAR assumption often does not hold in the real world because of selection bias in label observations. This paper proposes the first generative PU learning model without the SCAR assumption. Specifically, we derive the PU risk function without the SCAR assumption, and we generate a set of virtual PU examples to train the classifier. Although our PU risk function is more general, it requires PU instances that do not exist in the observations. Therefore, we introduce the VAE-PU, a variant of variational autoencoders that separates two latent variables, which generate either features or observation indicators. The separated latent information enables the model to generate virtual PU instances. We test the VAE-PU on benchmark datasets with and without the SCAR assumption. The results indicate that the VAE-PU is superior when selection bias exists, and that it is also competitive under the SCAR assumption. The results also show that the VAE-PU is effective when there are few positive-labeled instances, because it models selection bias.

Original language: English
Title of host publication: CIKM 2020 - Proceedings of the 29th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 1155-1164
Number of pages: 10
ISBN (Electronic): 9781450368599
State: Published - 19 Oct 2020
Externally published: Yes
Event: 29th ACM International Conference on Information and Knowledge Management, CIKM 2020 - Virtual, Online, Ireland
Duration: 19 Oct 2020 to 23 Oct 2020

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: 29th ACM International Conference on Information and Knowledge Management, CIKM 2020
Country/Territory: Ireland
City: Virtual, Online
Period: 19/10/20 to 23/10/20

Bibliographical note

Funding Information:
The authors would like to thank the anonymous reviewers for their valuable comments and helpful suggestions on our manuscript. This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2019M3F2A1072239).

Publisher Copyright:
© 2020 ACM.

Keywords

  • positive-unlabeled learning
  • selection bias
  • variational autoencoders
