TY - JOUR
T1 - Revisiting Domain-Adaptive Semantic Segmentation via Knowledge Distillation
AU - Jeong, Seongwon
AU - Kim, Jiyeong
AU - Kim, Sungheui
AU - Min, Dongbo
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Numerous methods for unsupervised domain adaptation (UDA) have been proposed for semantic segmentation, achieving remarkable improvements. These methods fall into two categories: adversarial learning-based approaches, which utilize an additional discriminator and an image translation model, and self-supervised approaches, which use a teacher model to generate pseudo labels. Among them, self-supervised UDA approaches based on self-training show excellent adaptability in semantic segmentation. However, erroneous estimates of the pseudo ground truths (PGTs) used in self-training often lead to inaccurate updates of the teacher model. Although several attempts have been made to address this issue, a teacher model updated through an exponential moving average (EMA) still risks propagating inaccuracies from the PGTs. Inspired by the fact that UDA shares similar principles with knowledge distillation (KD), we revisit the self-training-based UDA approach from the perspective of KD and propose a novel UDA approach that employs two different teacher models. Specifically, we utilize both an EMA-updated teacher model to generate PGTs and a frozen teacher model, pretrained on source data, to transfer knowledge in the feature space. Since the frozen teacher model, unlike the EMA-updated teacher model, is not constrained to share the student's architecture, we can effectively leverage the stronger representation power of a larger frozen teacher. Extensive experiments on various backbones (DeepLab-V2 and DAFormer) and scenarios (GTA5 → Cityscapes and SYNTHIA → Cityscapes) show that the proposed method improves segmentation performance in the target domain and scales well. In particular, our method achieves comparable or better performance than state-of-the-art methods even with a lightweight backbone.
KW - Domain adaptation
KW - knowledge distillation
KW - semantic segmentation
UR - http://www.scopus.com/inward/record.url?scp=85210300328&partnerID=8YFLogxK
DO - 10.1109/TIP.2024.3501076
M3 - Article
AN - SCOPUS:85210300328
SN - 1057-7149
VL - 33
SP - 6761
EP - 6773
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -