Adapting Segment Anything Model for Change Detection in VHR Remote Sensing Images

Lei Ding; Hao Tang; Lorenzo Bruzzone
2024-01-01

Abstract

Vision foundation models (VFMs), such as the Segment Anything Model (SAM), allow zero-shot or interactive segmentation of visual content and are therefore being rapidly adopted across a variety of visual scenes. However, their direct use in many remote sensing (RS) applications is often unsatisfactory due to the special imaging properties of RS images (RSIs). In this work, we aim to utilize the strong visual recognition capabilities of VFMs to improve change detection (CD) in very high-resolution (VHR) RSIs. We employ the visual encoder of FastSAM, a variant of SAM, to extract visual representations in RS scenes. To adapt FastSAM to focus on specific ground objects in RS scenes, we propose a convolutional adaptor that aggregates task-oriented change information. Moreover, to utilize the semantic representations inherent in SAM features, we introduce a task-agnostic semantic learning branch that models the latent semantics in bitemporal RSIs. The resulting method, SAM-based CD (SAM-CD), obtains superior accuracy compared with state-of-the-art (SOTA) fully supervised CD methods and exhibits a sample-efficient learning ability comparable to that of semisupervised CD methods. To the best of our knowledge, this is the first work that adapts VFMs to CD in VHR RSIs.
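The abstract outlines the SAM-CD architecture at a high level: a frozen FastSAM visual encoder, a trainable convolutional adaptor, a change head operating on bitemporal features, and a task-agnostic semantic branch. The PyTorch sketch below illustrates that structure only; it is not the authors' implementation. The placeholder encoder, the module names (ConvAdaptor, change_head, semantic_head), and all channel sizes are assumptions made for illustration.

import torch
import torch.nn as nn

class ConvAdaptor(nn.Module):
    """Trainable convolutional block adapting frozen VFM features (illustrative)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class SAMCDSketch(nn.Module):
    def __init__(self, feat_ch: int = 256, num_classes: int = 1):
        super().__init__()
        # Placeholder standing in for the frozen FastSAM visual encoder (assumption).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_ch, kernel_size=7, stride=4, padding=3),
            nn.ReLU(inplace=True),
        )
        for p in self.encoder.parameters():
            p.requires_grad = False  # keep the foundation-model weights frozen

        # Trainable adaptor aggregating task-oriented change information.
        self.adaptor = ConvAdaptor(feat_ch, feat_ch)
        # Change head comparing the adapted bitemporal features.
        self.change_head = nn.Conv2d(feat_ch * 2, num_classes, kernel_size=1)
        # Task-agnostic semantic branch modeling latent semantics per image date.
        self.semantic_head = nn.Conv2d(feat_ch, feat_ch, kernel_size=1)

    def forward(self, img_t1, img_t2):
        f1 = self.adaptor(self.encoder(img_t1))
        f2 = self.adaptor(self.encoder(img_t2))
        change = self.change_head(torch.cat([f1, f2], dim=1))
        return change, (self.semantic_head(f1), self.semantic_head(f2))

if __name__ == "__main__":
    model = SAMCDSketch()
    t1, t2 = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)
    change_map, semantics = model(t1, t2)
    print(change_map.shape)  # torch.Size([1, 1, 64, 64])

In this sketch, only the adaptor and the two heads receive gradients, which mirrors the parameter-efficient adaptation idea described in the abstract.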
Year: 2024
Article number: 5611711
Authors: Ding, Lei; Zhu, Kun; Peng, Daifeng; Tang, Hao; Yang, Kuiwu; Bruzzone, Lorenzo
Adapting Segment Anything Model for Change Detection in VHR Remote Sensing Images / Ding, Lei; Zhu, Kun; Peng, Daifeng; Tang, Hao; Yang, Kuiwu; Bruzzone, Lorenzo. - In: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING. - ISSN 0196-2892. - 62:5611711(2024), pp. 1-11. [10.1109/TGRS.2024.3368168]
Files in this record:

TGRS3368168.pdf
Access: open access
Description: Accepted Manuscript
Type: Refereed author's manuscript (post-print)
License: All rights reserved
Size: 10.09 MB
Format: Adobe PDF

Adapting_Segment_Anything_Model_for_Change_Detection_in_VHR_Remote_Sensing_Images.pdf
Access: repository administrators only
Type: Publisher's version (publisher's layout)
License: All rights reserved
Size: 3.3 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/443995
Citations
  • PMC: ND (not available)
  • Scopus: 127
  • Web of Science (ISI): 107
  • OpenAlex: 117