A Joint-Training Two-Stage Method For Remote Sensing Image Captioning / Ye, Xiutiao; Wang, Shuang; Gu, Yu; Wang, Jihui; Wang, Ruixuan; Hou, Biao; Giunchiglia, Fausto; Jiao, Licheng. - In: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING. - ISSN 1558-0644. - 60:(2022). [10.1109/TGRS.2022.3224244]

A Joint-Training Two-Stage Method For Remote Sensing Image Captioning

Yu Gu; Fausto Giunchiglia
2022-01-01

Abstract

Compared with remote sensing image (RSI) captioning methods based on the traditional encoder-decoder model, two-stage RSI captioning methods include an auxiliary remote sensing task to provide prior information, which enables them to generate more accurate descriptions. In previous two-stage RSI captioning methods, however, the image captioning and the auxiliary remote sensing tasks are handled separately, which is time-consuming and ignores mutual interference between tasks. To solve this problem, we propose a novel joint-training two-stage (JTTS) RSI captioning method. We use multilabel classification to provide prior information, and we design a differentiable sampling operator to replace the traditional nondifferentiable sampling operation to index the multilabel classification result. In contrast to previous two-stage RSI captioning methods, our method can implement joint training, and the joint loss allows the error of the generated description to flow into the optimization of th...
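The abstract does not say how the differentiable sampling operator is built. As a rough illustration of why differentiability matters for joint training, the sketch below contrasts a hard, threshold-based selection of label embeddings with a generic soft relaxation that weights each embedding by its predicted probability. This relaxation is a stand-in chosen for illustration, not the operator from the paper, and every name in it (`hard_select`, `soft_select`, `finite_diff`, the toy logits and embeddings) is hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hard_select(logits, embeddings, threshold=0.5):
    """Nondifferentiable: sum the embeddings of labels whose probability exceeds the threshold."""
    out = [0.0] * len(embeddings[0])
    for z, e in zip(logits, embeddings):
        if sigmoid(z) > threshold:
            out = [o + v for o, v in zip(out, e)]
    return out

def soft_select(logits, embeddings):
    """Differentiable relaxation (an assumption, not the paper's operator):
    weight each label embedding by its predicted probability."""
    out = [0.0] * len(embeddings[0])
    for z, e in zip(logits, embeddings):
        p = sigmoid(z)
        out = [o + p * v for o, v in zip(out, e)]
    return out

def finite_diff(fn, logits, embeddings, i, eps=1e-4):
    """Central finite-difference derivative of sum(fn(...)) with respect to logits[i]."""
    up = list(logits)
    up[i] += eps
    dn = list(logits)
    dn[i] -= eps
    return (sum(fn(up, embeddings)) - sum(fn(dn, embeddings))) / (2 * eps)

# Toy multilabel classifier output (3 labels) and 2-dimensional label embeddings.
logits = [2.0, -1.0, 0.3]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

hard_grad = finite_diff(hard_select, logits, embeddings, 0)
soft_grad = finite_diff(soft_select, logits, embeddings, 0)
print("hard selection gradient:", hard_grad)
print("soft selection gradient:", soft_grad)
```

Away from the threshold, the hard selection does not change under a small perturbation of the logit, so no captioning error can flow back to the classifier through it. The soft version yields a nonzero derivative, which is the property a joint loss needs.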
2022
Ye, Xiutiao; Wang, Shuang; Gu, Yu; Wang, Jihui; Wang, Ruixuan; Hou, Biao; Giunchiglia, Fausto; Jiao, Licheng
Use this identifier to cite or link to this document: https://hdl.handle.net/11572/441008