Learning to Adapt Neural Networks Across Visual Domains

Roy, Subhankar

doi:10.15168/11572_354343

In machine learning (ML), a commonly encountered problem is that learnt classification functions generalize poorly to new samples that are not representative of the training distribution. The discrepancy between the training (a.k.a. source) and test (a.k.a. target) distributions, popularly known as domain shift, is caused by several latent factors such as changes in appearance, illumination and viewpoint. To make a classifier cope with such domain shifts, a sub-field of machine learning called domain adaptation (DA) has emerged that jointly uses the annotated data from the source domain together with the unlabelled data from the target domain of interest. Adapting a classifier to an unlabelled target data set is of tremendous practical significance because it incurs no labelling cost and allows for more accurate predictions in the environment of interest. A majority of DA methods address the single-source, single-target scenario and are not easily extendable to many practical DA scenarios. The increasing focus on making ML models deployable calls for improved methods that can handle the inherently complex DA scenarios found in the real world. In this work we build towards this goal of addressing more practical DA settings and help realize novel methods for more real-world applications: (i) We begin by analyzing and addressing the single-source, single-target setting, proposing whitening-based embedded normalization layers to align the marginal feature distributions between the two domains. To better utilize the unlabelled target data we propose an unsupervised regularization loss that encourages both confident and consistent predictions. (ii) Next, we build on top of the proposed normalization layers and use them in a generative framework to address multi-source DA by posing it as an image translation problem. This proposed framework, TriGAN, allows a single generator to be learned from all the source domain data, leading to better generation of target-like source data. (iii) We address multi-target DA by learning a single classifier for all of the target domains. Our proposed framework exploits feature aggregation with a graph convolutional network to align feature representations of similar samples across domains. Moreover, to counteract the noisy pseudo-labels we propose to use a co-teaching strategy with a dual classifier head. To enable smoother adaptation, we propose a domain curriculum learning strategy, applicable when the domain labels are available, that adapts to one target domain at a time in order of increasing domain gap. (iv) Finally, we address the challenging source-free DA setting, where the only source of supervision is a source-trained model. We propose to use the Laplace approximation to build a probabilistic source model that can quantify the uncertainty in the source model predictions on the target data. The uncertainty is then used as importance weights during the target adaptation process, down-weighting target data that do not lie on the source manifold.
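
As a rough illustration of the idea behind point (i), the sketch below whitens the features of each domain with that domain's own statistics, so that both end up with zero mean and (approximately) identity covariance. It is a minimal NumPy sketch under stated assumptions, not the normalization layers proposed in the thesis: the function name `whiten`, the ZCA form of the whitening matrix, the `eps` regularizer and the synthetic 2-D "source" and "target" features are all hypothetical choices made for this example.

```python
import numpy as np


def whiten(features, eps=1e-5):
    """ZCA-whiten a batch of features (rows = samples) with its own statistics.

    Hypothetical helper for illustration; `eps` regularizes the covariance.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(features) - 1)
    cov += eps * np.eye(cov.shape[0])
    # Symmetric inverse square root of the covariance via eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(cov)
    whitening_matrix = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return centered @ whitening_matrix


# Synthetic features standing in for two domains with different
# means, scales and correlations (assumed data, not from the thesis).
rng = np.random.default_rng(0)
mix = np.array([[1.0, 0.8], [0.0, 0.6]])
source = rng.normal(size=(500, 2)) @ mix + np.array([2.0, -1.0])
target = rng.normal(size=(500, 2)) @ mix.T * 3.0 + np.array([-4.0, 5.0])

# Each domain is whitened separately, using only its own statistics.
source_w = whiten(source)
target_w = whiten(target)

print(np.cov(source_w, rowvar=False).round(3))
print(np.cov(target_w, rowvar=False).round(3))
```

Because each domain is normalized with its own mean and covariance, no target labels are needed, and the first- and second-order statistics of the two feature sets match after the transform. This only aligns marginal statistics; it does not by itself guarantee that classes are aligned across domains.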

Learning to Adapt Neural Networks Across Visual Domains / Roy, Subhankar. - (2022 Sep 29), pp. 1-110. [10.15168/11572_354343]

Defended on: 29 Sep 2022
Cycle: XXXIV
Academic year: 2018-2019
Department: Ingegneria e scienza dell'Informaz (29/10/12-)
Doctoral programme: Informatica e telecomunicazioni (until academic year 2020-21, 36th cycle)
Unitn internal thesis supervisors: Ricci, Elisa; Sebe, Niculae
Bi-nationally supervised doctoral thesis: no
DOI: https://dx.doi.org/10.15168/11572_354343
Language: English
Appears in categories: 08.1 Tesi di dottorato (Doctoral Thesis)

Files in this item:

File	Size	Format
PhD_thesis_subhankar_roy.pdf (open access; type: Doctoral Thesis; licence: Creative Commons)	11.44 MB	Adobe PDF