The emerging field of vision–language models, which combines computer vision and natural language processing (NLP), has gained significant interest and exploration. This integration has opened up new research opportunities, particularly in remote sensing (RS), where it has the potential to enhance RS systems’ capabilities. In this context, this article presents a comprehensive review of more than 100 articles focusing on the integration of NLP techniques into RS understanding research. The review covers various vision–language modeling tasks, including but not limited to RS image captioning, RS text-to-image retrieval, RS visual question answering (VQA), and RS image generation. For each task, the review provides a summary of the state-of-the-art developments, including methods, evaluation metrics, datasets, and experimental results on benchmark datasets. The review is concluded by discussing the key challenges and highlighting potential research directions for future development, with the aim of inspiring further research in this important field.

Language Integration in Remote Sensing: Tasks, Datasets, and Future Directions / Bashmal, L; Bazi, Y; Melgani, F; Al Rahhal, M. M; Al Zuair, M. A. - In: IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE. - ISSN 2168-6831. - 11:4(2023), pp. 63-93. [10.1109/MGRS.2023.3316438]

Language Integration in Remote Sensing: Tasks, Datasets, and Future Directions

Bazi, Y; Melgani, F
2023-01-01

Files in this item:
File: 2023_GRSM_compressed.pdf
Access: Archive managers only
Type: Publisher's version (Publisher’s layout)
License: All rights reserved
Size: 598.97 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/400654