With the development of high-resolution satellites, remote sensing (RS) scene classification has attracted increasing attention. Convolutional neural networks (CNNs), which replace traditional handcrafted features with a learning-based feature extraction mechanism, are widely used in scene classification. However, CNNs are less effective at capturing long-range contextual relations, which limits further improvement. The visual transformer (VT), an emerging image processing method, provides a new perspective for RS scene classification by directly acquiring long-range features. Although a limited number of works combine CNN and VT through simple concatenation, the collaboration between the two remains insufficient. To address these issues, we propose a local and long-range collaborative framework (L2RCF). First, we design a dual-stream structure to extract local and long-range features. Second, we design a cross-feature calibration (CFC) module that improves the representation of the fused features. Third, combining deep supervision (DS) and deep mutual learning (DML), we propose a novel joint loss to enhance the dual-stream feature extractor and further improve the fused features. Finally, we design a two-stage semi-supervised training strategy to improve performance with unlabeled samples. To demonstrate the effectiveness of L2RCF, we conducted experiments on three widely used RS scene classification datasets: RSSCN7, AID, and NWPU. The results show that L2RCF performs significantly better than some state-of-the-art scene classification methods.
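
The abstract names a joint loss that combines deep supervision (DS) and deep mutual learning (DML) but does not give its form. As a rough illustration only, the sketch below shows one common way such a loss could be assembled: a cross-entropy term on each stream and on the fused head (the DS part), plus a symmetric KL divergence between the two streams' predictions (the DML part). The function names, the equal weighting of the cross-entropy terms, and the `dml_weight` parameter are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def joint_loss(local_logits, long_range_logits, fused_logits, label,
               dml_weight=1.0):
    """Hypothetical DS + DML loss for one sample (not the paper's formula).

    local_logits      -- scores from the local (CNN-like) stream
    long_range_logits -- scores from the long-range (transformer-like) stream
    fused_logits      -- scores from the head on the fused features
    dml_weight        -- assumed weight on the mutual-learning term
    """
    p_local = softmax(local_logits)
    p_long = softmax(long_range_logits)
    p_fused = softmax(fused_logits)

    # Deep supervision: every head is trained against the label directly.
    supervised = (cross_entropy(p_fused, label)
                  + cross_entropy(p_local, label)
                  + cross_entropy(p_long, label))

    # Deep mutual learning: each stream is pulled toward the other's output.
    mutual = kl_divergence(p_long, p_local) + kl_divergence(p_local, p_long)

    return supervised + dml_weight * mutual
```

In this sketch the mutual term vanishes when the two streams agree exactly, so it only penalizes disagreement between them; how the actual L2RCF loss weights or schedules its terms is described in the paper itself, not here.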

Local and Long-Range Collaborative Learning for Remote Sensing Scene Classification / Zhao, Maofan; Meng, Qingyan; Zhang, Linlin; Hu, Xinli; Bruzzone, Lorenzo. - In: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING. - ISSN 0196-2892. - 61:(2023), pp. 1-15. [10.1109/TGRS.2023.3265346]

Local and Long-Range Collaborative Learning for Remote Sensing Scene Classification

Bruzzone, Lorenzo
2023-01-01

Files in this item:

TGRS3265346.pdf
Access: open access
Type: Refereed post-print (refereed author's manuscript)
License: All rights reserved
Size: 9.69 MB
Format: Adobe PDF

Local_and_Long-Range_Collaborative_Learning_for_Remote_Sensing_Scene_Classification.pdf
Access: archive managers only
Type: Publisher's version (publisher's layout)
License: All rights reserved
Size: 3.76 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/400184
Citations
  • PubMed Central: not available
  • Scopus: not available
  • Web of Science: 13