Convolutional neural networks (CNNs) have performed well in change detection (CD) tasks owing to their strong learning and automatic feature extraction capabilities. However, they suffer from a limited receptive field and weak modeling of long-range dependencies. Vision transformers (ViTs) excel at modeling long-range context and have recently been introduced in CD. Some works have combined CNNs and transformers to obtain local–global information. However, these works do not fully consider the guidance and interactions between local features (LFs) and global features (GFs). More importantly, most of them involve a very large number of parameters and high computational costs. To address these issues, in this article, we propose a lightweight CD network that mixes features across CNN and transformer (MixCDNet). We use EfficientNet as the backbone and design a novel mixing features block (MFB). First, we employ hierarchical feature extraction blocks, where local feature blocks (LFBs) and global feature blocks (GFBs) are used to extract information at different spatial resolutions. Second, we propose to exploit bidirectional interactions across the LFB and GFB branches to provide complementary clues while capturing LFs and GFs. Moreover, a skip-connection and fusion separable self-attention layer (SFSSL) is designed to obtain GFs with low complexity. Comprehensive experiments are conducted on three high-resolution remote sensing (HRRS) image CD datasets: LEVIR-CD, WHU-CD, and CDD. The results show that the proposed MixCDNet improves on the performance of existing CD methods with fewer parameters (0.32 M) and lower computational cost (1.59 GFLOPs).
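The abstract does not describe how the SFSSL works internally, so the following is only a minimal sketch of one well-known linear-complexity formulation, the separable self-attention used in MobileViTv2, written in plain Python. It is not the authors' layer: the skip-connection and fusion parts are omitted, and all names (`separable_self_attention`, `w_i`, `w_k`, `w_v`, `w_o`) and the random weights are hypothetical. It illustrates why such a layer is cheap: each token gets one scalar score, and a single shared context vector replaces the usual token-by-token attention matrix.

```python
import math
import random


def linear(x, w):
    """Multiply a (k x d_in) token matrix by a (d_in x d_out) weight matrix."""
    return [[sum(a * w[i][j] for i, a in enumerate(row)) for j in range(len(w[0]))]
            for row in x]


def softmax(v):
    """Numerically stable softmax over a list of floats."""
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def separable_self_attention(x, w_i, w_k, w_v, w_o):
    """Illustrative separable self-attention; cost grows linearly with the token count."""
    # One scalar score per token, normalised over all tokens.
    scores = softmax([row[0] for row in linear(x, w_i)])
    keys = linear(x, w_k)
    d = len(keys[0])
    # Global context vector: score-weighted sum of the key projections.
    context = [sum(s * key[j] for s, key in zip(scores, keys)) for j in range(d)]
    # ReLU-gated values, modulated element-wise by the shared context vector.
    values = [[max(0.0, a) for a in row] for row in linear(x, w_v)]
    mixed = [[a * c for a, c in zip(row, context)] for row in values]
    return linear(mixed, w_o)


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
k, d = 6, 4  # 6 tokens of dimension 4 (arbitrary illustrative sizes)
x = random_matrix(k, d, rng)
w_i = random_matrix(d, 1, rng)
w_k, w_v, w_o = (random_matrix(d, d, rng) for _ in range(3))
out = separable_self_attention(x, w_i, w_k, w_v, w_o)
print(len(out), len(out[0]))  # same shape as the input: 6 4
```

Because the context vector is computed once and shared by every token, doubling the number of tokens roughly doubles the work, whereas standard self-attention would quadruple it.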

MixCDNet: A Lightweight Change Detection Network Mixing Features Across CNN and Transformer / Wang, Linlin; Zhang, Junping; Bruzzone, Lorenzo. - In: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING. - ISSN 1558-0644. - 62:4411915(2024), pp. 1-15. [10.1109/TGRS.2024.3438228]

MixCDNet: A Lightweight Change Detection Network Mixing Features Across CNN and Transformer

Lorenzo Bruzzone
Last author
2024-01-01

Year: 2024
Article number: 4411915
Authors: Wang, Linlin; Zhang, Junping; Bruzzone, Lorenzo
Files in this item:
File Size Format
TGRS3438228.pdf

embargoed until 05/08/2026

Type: Refereed post-print (Refereed author’s manuscript)
License: All rights reserved
Size 4.35 MB
Format Adobe PDF
MixCDNet_A_Lightweight_Change_Detection_Network_Mixing_Features_Across_CNN_and_Transformer.pdf

Archive managers only

Type: Publisher's version (Publisher’s layout)
License: All rights reserved
Size 2.37 MB
Format Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/443992
Citations
  • PubMed Central N/A
  • Scopus 8
  • Web of Science 7
  • OpenAlex N/A