An FFT-based CNN-Transformer Encoder for Semantic Segmentation of Radar Sounder Signal

Ghosh, R;Bovolo, F
2022-01-01

Abstract

Radar sounders (RSs) are nadir-looking sensors operating in the HF or VHF bands that transmit modulated electromagnetic (EM) pulses and receive the backscattered response from different subsurface targets. Recently, convolutional neural network (CNN) architectures have been established for characterizing RS signals under the semantic segmentation framework. In this paper, we design a Fast Fourier Transform (FFT) based CNN-Transformer encoder to effectively capture the long-range contexts in the radargram. In our hybrid architecture, the CNN models the high-dimensional local spatial contexts, and the Transformer establishes the global spatial contexts among the local ones. To reduce the number of learnable parameters introduced by the Transformer's complex self-attention layers, we replace the self-attention mechanism with unparameterized FFT modules, as in the FNet architecture for Natural Language Processing (NLP). Experimental results on the MCoRDS dataset indicate that the CNN-Transformer encoder with unparameterized FFT modules can characterize the radargram at a limited cost in accuracy while reducing time consumption. A comparative analysis is carried out with the state-of-the-art Transformer-based architecture.
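
The parameter-free token mixing mentioned in the abstract can be illustrated with a minimal NumPy sketch of an FNet-style encoder block. This is not the authors' implementation: the feature-map shape, the ReLU feed-forward layer (FNet itself uses GELU), the weight names, and the function names are all assumptions chosen for illustration. It only shows how a 2D FFT over the token and hidden dimensions can stand in for self-attention without adding learnable weights.

```python
import numpy as np

def fourier_mixing(x):
    """Parameter-free token mixing: real part of the 2D DFT over
    the (sequence, hidden) axes of a real-valued token matrix."""
    return np.real(np.fft.fft2(x))

def layer_norm(x, eps=1e-6):
    """Normalize each token over its hidden dimension (no learned scale/shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def fnet_style_block(x, w1, b1, w2, b2):
    """One encoder block: Fourier mixing + residual, then a feed-forward
    sublayer + residual. Only the feed-forward weights are learnable."""
    x = layer_norm(x + fourier_mixing(x))
    hidden = np.maximum(0.0, x @ w1 + b1)  # ReLU used here for simplicity
    return layer_norm(x + hidden @ w2 + b2)

# Hypothetical CNN feature map of a radargram patch: (height, width, channels).
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 16, 32))

# Flatten spatial positions into a token sequence: (height * width, channels).
tokens = feature_map.reshape(-1, feature_map.shape[-1])

# Illustrative feed-forward weights (randomly initialized, not trained).
w1 = rng.standard_normal((32, 64)) * 0.1
b1 = np.zeros(64)
w2 = rng.standard_normal((64, 32)) * 0.1
b2 = np.zeros(32)

out = fnet_style_block(tokens, w1, b1, w2, b2)
print(out.shape)  # (128, 32)
```

In this sketch the mixing step has no weights at all, so the only trainable parameters in the block sit in the feed-forward sublayer. That is the source of the parameter reduction relative to a self-attention layer with query, key, value, and output projections.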
2022
Proceedings of SPIE Conference on Image and Signal Processing for Remote Sensing XXVIII
Various authors (AA.VV.)
1000 20TH ST, PO BOX 10, BELLINGHAM, WA 98227-0010 USA
SPIE-INT SOC OPTICAL ENGINEERING
9781510655379
9781510655386
Ghosh, R; Bovolo, F
An FFT-based CNN-Transformer Encoder for Semantic Segmentation of Radar Sounder Signal / Ghosh, R; Bovolo, F. - Print. - 12267:(2022), p. 27. (Paper presented at the SPIE Remote Sensing conference held in Berlin, Germany, 5-8 September 2022) [10.1117/12.2636693].

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/364906