Self-Supervised Remote Sensing Image Change Detection and Data Fusion

Chen, Yuxing

doi:10.15168/11572_397832

Self-supervised learning models, often called foundation models, have achieved great success in computer vision. Meanwhile, limited access to labeled data has driven the development of self-supervised methods for remote sensing tasks. In remote sensing image change detection, generative models are widely used for unsupervised binary change detection, but they focus too heavily on pixels rather than on abstract feature representations. In addition, state-of-the-art approaches to satellite image time series change detection fail to effectively leverage the spatial-temporal information of the time series or to generalize well to unseen scenarios. Similarly, in multimodal remote sensing data fusion, the recent successes of deep learning techniques mainly concern specific tasks and complete-data fusion paradigms. These task-specific models lack generalizability to other remote sensing tasks and overfit to the dominant modalities. Moreover, they cannot handle inputs with missing modalities and suffer severe degradation in downstream tasks. To address these challenges of individual supervised learning models, this thesis presents two novel contributions to self-supervised learning for remote sensing image change detection and multimodal remote sensing data fusion. The first contribution is a bi-temporal and multi-temporal contrastive change detection framework, which applies a contrastive loss to image patches or superpixels to obtain fine-grained change maps and incorporates an uncertainty method to enhance temporal robustness. For satellite image time series change detection, the proposed approach improves the consistency of pseudo labels through feature tracking and tackles the seasonal changes present in long-term remote sensing image time series by using a supervised contrastive loss and a random walk loss in a ConvLSTM.
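As a rough illustration of a patch-level contrastive loss on a bi-temporal image pair, the sketch below computes an InfoNCE-style loss in which each patch feature at time 1 is pulled toward the co-located patch feature at time 2 and pushed away from the other patches. The abstract does not give the exact loss, feature extractor, or change-map rule, so the function names (`info_nce`, `change_scores`), the temperature value, and the use of one minus cosine similarity as a change score are all assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(feats_t1, feats_t2, temperature=0.1):
    """Mean InfoNCE loss over patches (illustrative, not the thesis code).

    Patch i at time 1 is the anchor, patch i at time 2 is its positive,
    and all other time-2 patches act as negatives.
    """
    losses = []
    for i, anchor in enumerate(feats_t1):
        logits = [cosine(anchor, other) / temperature for other in feats_t2]
        # Numerically stable log-sum-exp of the logits.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


def change_scores(feats_t1, feats_t2):
    """Assumed change score per patch: 1 - cosine similarity of co-located features."""
    return [1.0 - cosine(u, v) for u, v in zip(feats_t1, feats_t2)]
```

With toy features, co-located patches that agree across time give a loss near zero and a change score of zero, while mismatched patches give a high loss and a high change score.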
The second contribution is a self-supervised multimodal remote sensing (RS) data fusion framework, with a specific focus on incomplete multimodal RS data fusion in downstream tasks. Within this framework, multimodal RS data are fused by applying a multi-view contrastive loss at the pixel level and by reconstructing each modality from the others in a generative way based on MultiMAE. In downstream tasks, the proposed approach uses a random modality combination training strategy and an attention block to enable fusion across modality-incomplete inputs. The thesis assesses the proposed self-supervised change detection approach on single-sensor and cross-sensor datasets of synthetic aperture radar (SAR) and multispectral images, and evaluates the proposed self-supervised multimodal RS data fusion approach on a multimodal RS dataset comprising SAR images, multispectral images, a digital elevation model (DEM), and land use/land cover (LULC) maps. The self-supervised change detection approach improves on state-of-the-art unsupervised change detection methods in challenging scenarios involving multi-temporal and multi-sensor RS images. Similarly, the self-supervised multimodal RS data fusion approach achieves its best performance with an intermediate fusion strategy on SAR and optical image pairs, outperforming existing unsupervised data fusion approaches. Notably, in incomplete multimodal fusion tasks, the proposed method performs well on all modality-incomplete and single-modality inputs, surpassing vanilla MultiViT, which tends to overfit to dominant-modality inputs and fails on tasks with single-modality inputs.
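One way to picture a random modality combination training strategy is to draw a non-empty subset of the available modalities at each training step and feed the model only those inputs, so that it also sees modality-incomplete cases during training. The sketch below is a minimal illustration under that assumption: the abstract does not specify how combinations are sampled, so the uniform sampling over subsets and the helper names (`modality_combinations`, `sample_combination`, `mask_batch`) are hypothetical. The modality names follow the dataset described above.

```python
import itertools
import random

# Modalities named in the abstract; the string keys themselves are an assumption.
MODALITIES = ("SAR", "multispectral", "DEM", "LULC")


def modality_combinations(modalities):
    """All non-empty subsets of the given modalities, as tuples."""
    combos = []
    for size in range(1, len(modalities) + 1):
        combos.extend(itertools.combinations(modalities, size))
    return combos


def sample_combination(modalities, rng):
    """Pick one non-empty subset uniformly at random (assumed sampling rule)."""
    return rng.choice(modality_combinations(modalities))


def mask_batch(batch, kept):
    """Keep only the entries of a {modality: data} batch that were sampled."""
    return {name: data for name, data in batch.items() if name in kept}


if __name__ == "__main__":
    rng = random.Random(0)
    batch = {name: f"<{name} tensor>" for name in MODALITIES}
    for step in range(3):
        kept = sample_combination(MODALITIES, rng)
        print(step, sorted(mask_batch(batch, kept)))
```

With four modalities there are 15 non-empty combinations, including the four single-modality cases that the abstract highlights as a failure mode of vanilla MultiViT.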

Self-Supervised Remote Sensing Image Change Detection and Data Fusion / Chen, Yuxing. - (2023 Nov 27), pp. 1-154. [10.15168/11572_397832]

2023-11-27


Defended on: 27 Nov 2023
Cycle: XXXV
Academic year: 2022-2023
Department: Ingegneria e scienza dell'Informaz (29/10/12-)
Doctoral programme: Information and Communication Technology
Unitn internal supervisor: Bruzzone, Lorenzo
Unitn co-supervisor: Vitale, Stefano
Bi-nationally supervised doctoral thesis: no
DOI: https://dx.doi.org/10.15168/11572_397832
Language: English
Appears in types: 08.1 Doctoral Thesis

Files in this item:

File	Size	Format
phd_unitn_yuxing_chen.pdf (open access; type: Doctoral Thesis; license: All rights reserved)	3.16 MB	Adobe PDF