Self-training transformer for source-free domain adaptation / Yang, G.; Zhong, Z.; Ding, M.; Sebe, N.; Ricci, E. - In: APPLIED INTELLIGENCE. - ISSN 0924-669X. - 53:13 (2023), pp. 16560-16574. [10.1007/s10489-022-04364-9]
Self-training transformer for source-free domain adaptation
Yang, G.; Zhong, Z.; Ding, M.; Sebe, N.; Ricci, E.
2023-01-01
Abstract
In this paper, we study the task of source-free domain adaptation (SFDA), where the source data are not available during target adaptation. Previous works on SFDA mainly focus on aligning the cross-domain distributions, but they ignore the generalization ability of the pretrained source model, which largely influences the initial target outputs that are vital to the target adaptation stage. To address this, we make the observation that model accuracy is highly correlated with whether attention is focused on the objects in an image. To this end, we propose a generic and effective Transformer-based framework, named TransDA, for learning a generalized model for SFDA. First, we apply Transformer blocks as the attention module and inject them into a convolutional network. By doing so, the model is encouraged to turn its attention towards the object regions, which can effectively improve its generalization ability on unseen target domains. Second, a novel self-supervised knowledge distillation approach is proposed to adapt the Transformer with target pseudo-labels, further encouraging the network to focus on the object regions. Extensive experiments conducted on three domain adaptation tasks, including closed-set, partial-set, and open-set adaptation, demonstrate that TransDA can significantly improve accuracy over the source model and produces state-of-the-art results in all settings. The source code and pretrained models are publicly available at: https://github.com/ygjwd12345/TransDA.
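The abstract describes two components: Transformer blocks injected into a convolutional backbone as an attention module, and self-supervised knowledge distillation driven by target pseudo-labels. Below is a minimal PyTorch sketch of both ideas, assuming a ResNet-50 backbone and a hard-label EMA-teacher distillation step; all class, function, and parameter names here are illustrative assumptions, not the authors' exact implementation (see the linked repository for the official code, where pseudo-labeling and distillation details differ).

```python
# Illustrative sketch only; names and hyperparameters are assumptions.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class TransDANet(nn.Module):
    """CNN backbone with a Transformer block injected as an attention module."""
    def __init__(self, num_classes, embed_dim=2048, num_heads=8):
        super().__init__()
        backbone = resnet50(weights=None)  # in practice, source-pretrained weights
        # Keep everything up to (but not including) global pooling and the fc head.
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])
        # One Transformer encoder block over the spatial feature-map tokens.
        self.attn = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        f = self.cnn(x)                        # (B, C, H, W) feature map
        tokens = f.flatten(2).transpose(1, 2)  # (B, H*W, C) spatial tokens
        tokens = self.attn(tokens)             # self-attention over object regions
        feat = tokens.mean(dim=1)              # average-pool the attended tokens
        return self.head(feat)

def adapt_step(student, teacher, x_target, optimizer, momentum=0.999):
    """One self-distillation step: an EMA teacher supplies target pseudo-labels."""
    with torch.no_grad():
        pseudo = teacher(x_target).argmax(dim=1)   # hard pseudo-labels
    loss = F.cross_entropy(student(x_target), pseudo)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Update the teacher as an exponential moving average of the student.
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(momentum).add_(ps, alpha=1 - momentum)
    return loss.item()

# Usage (illustrative):
# student = TransDANet(num_classes=65)        # e.g. Office-Home has 65 classes
# teacher = copy.deepcopy(student).eval()     # frozen EMA copy of the student
```

Note that this sketch reduces distillation to hard-label cross-entropy against an EMA teacher for brevity; the paper's pseudo-label generation and distillation objective are more elaborate, so treat the above purely as a schematic of the hybrid architecture and the self-training loop.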
| File | Description | Access | Type | License | Size | Format |
|---|---|---|---|---|---|---|
| Self-Training-AppliedIntelligence22.pdf | First online | Open access | Publisher's version (publisher's layout) | All rights reserved | 2.25 MB | Adobe PDF |
| s10489-022-04364-9.pdf | | Restricted (archive managers only) | Publisher's version (publisher's layout) | All rights reserved | 2.3 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.