Joint Spatio-Temporal Modeling for Semantic Change Detection in Remote Sensing Images / Ding, L.; Zhang, J.; Guo, H.; Zhang, K.; Liu, B.; Bruzzone, L. - In: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING. - ISSN 0196-2892. - 62:(2024), pp. 1-14. [10.1109/TGRS.2024.3362795]
Joint Spatio-Temporal Modeling for Semantic Change Detection in Remote Sensing Images
Ding L.;Zhang J.;Guo H.;Bruzzone L.
2024-01-01
Abstract
Semantic change detection (SCD) refers to the task of simultaneously extracting changed areas and their semantic categories (before and after the changes) in remote sensing images (RSIs). This is more meaningful than binary change detection (BCD) because it enables detailed change analysis of the observed areas. Previous works established triple-branch convolutional neural network (CNN) architectures as the paradigm for SCD. However, it remains challenging to exploit semantic information when only a limited number of change samples is available. In this work, we investigate jointly modeling the spatio-temporal dependencies to improve the accuracy of SCD. First, we propose a semantic change transformer (SCanFormer) to explicitly model the 'from-to' semantic transitions between the bitemporal RSIs. Then, we introduce a semantic learning scheme that leverages spatio-temporal constraints consistent with the SCD task to guide the learning of semantic changes. The resulting network, the semantic change network (SCanNet), significantly outperforms the baseline method in both the detection of critical semantic changes and the semantic consistency of the obtained bitemporal results. It achieves state-of-the-art (SOTA) accuracy on two benchmark datasets for SCD.
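To make the task definition concrete, the following minimal sketch shows how an SCD result (one semantic label map per acquisition date) relates to a BCD result (a single binary change map) and to 'from-to' transitions. It illustrates the output format only and is not the SCanNet or SCanFormer architecture; the class names, the toy label maps, and the helper functions `change_mask` and `from_to_transitions` are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical class names, chosen only for illustration.
CLASSES = ["water", "vegetation", "building"]


def change_mask(sem_t1, sem_t2):
    """Binary change map: 1 where the class label differs between the two dates."""
    return [[int(a != b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(sem_t1, sem_t2)]


def from_to_transitions(sem_t1, sem_t2):
    """Count 'from-to' class transitions over changed pixels only."""
    counts = Counter()
    for r1, r2 in zip(sem_t1, sem_t2):
        for a, b in zip(r1, r2):
            if a != b:
                counts[(CLASSES[a], CLASSES[b])] += 1
    return counts


# Toy 3x3 label maps for the two dates (values are indices into CLASSES).
sem_t1 = [[0, 0, 1],
          [1, 1, 1],
          [1, 2, 2]]
sem_t2 = [[0, 0, 1],
          [1, 2, 2],
          [2, 2, 2]]

mask = change_mask(sem_t1, sem_t2)                 # what BCD alone would provide
transitions = from_to_transitions(sem_t1, sem_t2)  # extra information provided by SCD

print(mask)
print(dict(transitions))
```

In this toy case the binary map only says that three pixels changed, while the transition counts additionally say that all three went from vegetation to building, which is the kind of detailed change analysis the abstract refers to.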