
CrossDiff: Exploring Self-Supervised Representation of Pansharpening via Cross-Predictive Diffusion Model

Xing, Yinghui; Qu, Litao; Zhang, Shizhou; Zhang, Kai; Zhang, Yanning; Bruzzone, Lorenzo

2024

Abstract

The fusion of a panchromatic (PAN) image with the corresponding multispectral (MS) image, known as pansharpening, aims to combine the abundant spatial details of the PAN image with the spectral information of the MS image. Due to the absence of high-resolution MS images, existing deep-learning-based methods usually follow the paradigm of training at reduced resolution and testing at both reduced and full resolution. When original MS and PAN images are taken as inputs, these methods often obtain sub-optimal results because of the scale variation. In this paper, we propose to explore a self-supervised representation for pansharpening by designing a cross-predictive diffusion model, named CrossDiff, which is trained in two stages. In the first stage, we introduce a cross-predictive pretext task to pre-train a UNet structure based on the conditional Denoising Diffusion Probabilistic Model (DDPM). In the second stage, the encoders of the UNets are frozen to directly extract spatial and spectral features from the PAN and MS images, and only the fusion head is trained to adapt to the pansharpening task. Extensive experiments show the effectiveness and superiority of the proposed model compared with state-of-the-art supervised and unsupervised methods. Moreover, cross-sensor experiments verify the generalization ability of the proposed self-supervised representation learners on other satellite datasets. Code is available at https://github.com/codgodtao/CrossDiff.
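The abstract describes a concrete two-stage training procedure. The PyTorch sketch below is only a rough illustration of how such a scheme could be wired together; it is not the authors' implementation (available at the GitHub link above). All module definitions, channel sizes, the noise schedule, the way clean images are fed to the frozen encoders, and the stage-2 loss are placeholder assumptions.

# Minimal PyTorch sketch of the two-stage scheme described above -- NOT the
# authors' implementation (see the GitHub link in the abstract). The tiny
# modules, channel sizes, noise schedule, stage-2 feature-extraction inputs,
# and stage-2 loss are all illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyEncoder(nn.Module):
    """Stand-in for the encoder of a conditional-DDPM UNet."""
    def __init__(self, in_ch, feat_ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.SiLU(),
        )

    def forward(self, x):
        return self.net(x)


class CondDenoiser(nn.Module):
    """Conditional denoiser: (noisy target, conditioning image) -> noise estimate."""
    def __init__(self, target_ch, cond_ch, feat_ch=32):
        super().__init__()
        self.encoder = TinyEncoder(target_ch + cond_ch, feat_ch)
        self.decoder = nn.Conv2d(feat_ch, target_ch, 3, padding=1)

    def forward(self, x_noisy, cond):
        return self.decoder(self.encoder(torch.cat([x_noisy, cond], dim=1)))


def add_ddpm_noise(x0, t, betas):
    """Forward diffusion q(x_t | x_0) under a linear beta schedule."""
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(x0)
    return alpha_bar.sqrt() * x0 + (1.0 - alpha_bar).sqrt() * eps, eps


# --- Stage 1: cross-predictive pretext task (PAN <-> MS), no fused ground truth needed ---
ms_ch, pan_ch, T = 4, 1, 1000
betas = torch.linspace(1e-4, 2e-2, T)
pan_from_ms = CondDenoiser(target_ch=pan_ch, cond_ch=ms_ch)  # denoise PAN given MS
ms_from_pan = CondDenoiser(target_ch=ms_ch, cond_ch=pan_ch)  # denoise MS given PAN
opt1 = torch.optim.Adam(
    list(pan_from_ms.parameters()) + list(ms_from_pan.parameters()), lr=1e-4)

ms = torch.rand(2, ms_ch, 64, 64)    # toy upsampled MS batch
pan = torch.rand(2, pan_ch, 64, 64)  # toy PAN batch at the same size
t = torch.randint(0, T, (2,))
pan_noisy, eps_p = add_ddpm_noise(pan, t, betas)
ms_noisy, eps_m = add_ddpm_noise(ms, t, betas)
loss_stage1 = (F.mse_loss(pan_from_ms(pan_noisy, ms), eps_p)
               + F.mse_loss(ms_from_pan(ms_noisy, pan), eps_m))
opt1.zero_grad(); loss_stage1.backward(); opt1.step()

# --- Stage 2: freeze both encoders, train only the fusion head ---
for p in list(pan_from_ms.encoder.parameters()) + list(ms_from_pan.encoder.parameters()):
    p.requires_grad_(False)

fusion_head = nn.Conv2d(2 * 32, ms_ch, 3, padding=1)  # hypothetical fusion head
opt2 = torch.optim.Adam(fusion_head.parameters(), lr=1e-4)

# Clean PAN/MS are fed in place of the noisy target so the pretrained encoders
# can be reused directly (an assumption; the released code may condition differently).
feat_spatial = pan_from_ms.encoder(torch.cat([pan, ms], dim=1))   # PAN branch
feat_spectral = ms_from_pan.encoder(torch.cat([ms, pan], dim=1))  # MS branch
sharpened = fusion_head(torch.cat([feat_spatial, feat_spectral], dim=1))
loss_stage2 = F.l1_loss(sharpened, ms)  # placeholder objective for illustration
opt2.zero_grad(); loss_stage2.backward(); opt2.step()

The point the sketch mirrors is that stage 1 needs no pansharpened ground truth (each modality supervises the other through the denoising objective), while stage 2 updates only a lightweight fusion head on top of the frozen encoders.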
CrossDiff: Exploring Self-Supervised Representation of Pansharpening via Cross-Predictive Diffusion Model / Xing, Yinghui; Qu, Litao; Zhang, Shizhou; Zhang, Kai; Zhang, Yanning; Bruzzone, Lorenzo. - In: IEEE TRANSACTIONS ON IMAGE PROCESSING. - ISSN 1057-7149. - 33 (2024), pp. 5496-5509. [10.1109/TIP.2024.3461476]
Files in this record:

CrossDiff_Exploring_Self-Supervised_Representation_of_Pansharpening_via_Cross-Predictive_Diffusion_Model.pdf
  Under embargo until 20/09/2026
  Description: Accepted Manuscript
  Type: Refereed author's manuscript (post-print)
  License: All rights reserved
  Size: 88.35 MB
  Format: Adobe PDF

CrossDiff_Exploring_Self-SupervisedRepresentation_of_Pansharpening_via_Cross-Predictive_Diffusion_Model.pdf
  Access restricted to archive administrators
  Description: > 10 MB
  Type: Publisher's layout (editorial version)
  License: All rights reserved
  Size: 12.34 MB
  Format: Adobe PDF

CrossDiff_Exploring_Self-SupervisedRepresentation_of_Pansharpening_via_Cross-Predictive_Diffusion_Model_compressed.pdf
  Access restricted to archive administrators
  Type: Publisher's layout (editorial version)
  License: All rights reserved
  Size: 2.77 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/444073
Citations
  • PMC: n/a
  • Scopus: 14
  • Web of Science (ISI): 13
  • OpenAlex: n/a