Semi-supervised support vector regression based on self-training with label uncertainty: An application to virtual metrology in semiconductor manufacturing

Pilsung Kang, Dongil Kim, Sungzoon Cho

Research output: Contribution to journal › Article › peer-review

80 Scopus citations

Abstract

Datasets continue to grow as data are collected from numerous applications. Because collecting labeled data is expensive and time consuming, the amount of unlabeled data is increasing. Semi-supervised learning (SSL) has been proposed to improve conventional supervised learning methods by training on both unlabeled and labeled data. In contrast to classification problems, estimating labels for unlabeled data introduces additional uncertainty in regression problems. In this paper, a semi-supervised support vector regression (SS-SVR) method based on self-training is proposed. The proposed method addresses the uncertainty of the estimated labels for unlabeled data. To measure labeling uncertainty, the label distribution of the unlabeled data is estimated with two probabilistic local reconstruction (PLR) models. Then, the training data are generated by oversampling from the unlabeled data and their estimated label distribution, with a sampling rate that varies according to the uncertainty. Finally, expected margin-based pattern selection (EMPS) is employed to reduce training complexity. We verify the proposed method with 30 regression datasets and a real-world problem: virtual metrology (VM) in semiconductor manufacturing. The experimental results show that the proposed method improves accuracy by 8% compared with conventional supervised SVR, and that its training time is 20% shorter than that of the benchmark methods.
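
The data-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the PLR models or the sampling rule, so a k-nearest-neighbor estimate of the label mean and variance stands in for PLR, and the rule that low-uncertainty points receive more generated samples is an assumption made only for illustration. The names `knn_label_distribution`, `generate_training_data`, `k`, and `max_copies` are hypothetical.

```python
import math
import random


def knn_label_distribution(x, labeled, k=3):
    """Estimate (mean, variance) of the label at x from its k nearest labeled points.

    Stand-in for the PLR models named in the abstract, which are not specified there.
    `labeled` is a list of (feature_tuple, label) pairs.
    """
    neighbors = sorted(labeled, key=lambda pair: math.dist(pair[0], x))[:k]
    ys = [y for _, y in neighbors]
    mean = sum(ys) / len(ys)
    var = sum((y - mean) ** 2 for y in ys) / len(ys)
    return mean, var


def generate_training_data(unlabeled, labeled, k=3, max_copies=5, seed=0):
    """Oversample unlabeled points with labels drawn from their estimated distribution.

    Assumption (not stated in the abstract): points with lower label variance
    receive more generated samples, from max_copies down to 1.
    """
    if not unlabeled:
        return []
    rng = random.Random(seed)
    dists = [knn_label_distribution(x, labeled, k) for x in unlabeled]
    # Fall back to 1.0 when every estimated variance is zero, to avoid dividing by zero.
    max_var = max(var for _, var in dists) or 1.0
    generated = []
    for x, (mean, var) in zip(unlabeled, dists):
        n_copies = 1 + round((max_copies - 1) * (1.0 - var / max_var))
        for _ in range(n_copies):
            # Draw a label from a Gaussian with the estimated mean and variance.
            generated.append((x, rng.gauss(mean, math.sqrt(var))))
    return generated


# Toy one-dimensional data; the last labeled point makes estimates near x = 9 uncertain.
labeled = [((0.0,), 0.0), ((1.0,), 1.0), ((2.0,), 2.0), ((10.0,), 50.0)]
unlabeled = [(1.0,), (9.0,)]
augmented = labeled + generate_training_data(unlabeled, labeled)
```

In a full self-training pipeline, `augmented` would then be passed to an SVR learner, and a pattern-selection step such as EMPS would reduce the training set first. Neither step is shown here.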

Original language: English
Pages (from-to): 85-106
Number of pages: 22
Journal: Expert Systems with Applications
Volume: 51
State: Published - 1 Jun 2016

Bibliographical note

Funding Information:
This work was supported by Basic Science Research Program through the National Research Foundation of Korea, South Korea (NRF) funded by the Ministry of Science, ICT, & Future Planning (NRF-2014R1A1A1004648).

Publisher Copyright:
© 2015 Elsevier Ltd. All rights reserved.

Keywords

  • Data generation
  • Probabilistic local reconstruction
  • Semi-supervised learning
  • Semiconductor manufacturing
  • Support vector regression
  • Virtual metrology
