Bilevel progressive homography estimation via correlative region-focused transformer


We propose a novel correlative region-focused transformer for accurate homography estimation, built on a bilevel progressive architecture. Existing methods typically use features from the entire image to establish correlations between a pair of input images, but irrelevant regions often introduce mismatches and outliers. In contrast, our network mitigates the negative impact of irrelevant regions through a bilevel progressive homography estimation architecture. Specifically, in the outer iteration, we progressively estimate the homography matrix at different feature scales; in the inner iteration, we dynamically extract correlative regions and progressively focus on their corresponding features from both inputs. Moreover, we develop a transformer-based quadtree attention mechanism that explicitly captures the correspondence between the input images, localizing and cropping the correlative regions for the next iteration. This progressive training strategy enhances feature consistency and enables precise alignment at a comparable inference speed. Extensive qualitative and quantitative experiments show that the proposed method achieves competitive alignment results and reduces the mean average corner error (MACE) on the MS-COCO dataset compared to previous methods, without additional parameter cost.
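The abstract reports results as mean average corner error (MACE) but does not define it. The sketch below assumes the commonly used definition: warp the four image corners with the predicted and the ground-truth homography, average the Euclidean distances between them, then average that value over all test pairs. The function names (`warp_point`, `average_corner_error`, `mean_average_corner_error`) and the use of pixel corners `(0, 0)` to `(width - 1, height - 1)` are illustrative assumptions, not taken from the paper.

```python
from math import hypot


def warp_point(H, x, y):
    """Apply a 3x3 homography H (nested lists, row-major) to the point (x, y)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w  # divide by the homogeneous coordinate


def average_corner_error(H_pred, H_true, width, height):
    """Mean distance between the four image corners warped by H_pred and by H_true."""
    # Assumed corner convention: pixel coordinates of the image boundary.
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    total = 0.0
    for x, y in corners:
        px, py = warp_point(H_pred, x, y)
        tx, ty = warp_point(H_true, x, y)
        total += hypot(px - tx, py - ty)
    return total / len(corners)


def mean_average_corner_error(pairs, width, height):
    """Average the per-pair corner error over (H_pred, H_true) pairs."""
    errors = [average_corner_error(hp, ht, width, height) for hp, ht in pairs]
    return sum(errors) / len(errors)


# Toy check: the prediction is off by a pure translation of (3, 4) pixels,
# so every corner is displaced by exactly 5 pixels.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
shifted = [[1, 0, 3], [0, 1, 4], [0, 0, 1]]
ace = average_corner_error(shifted, identity, 128, 128)
```

In the toy check, `ace` evaluates to 5.0 pixels. A lower MACE means the estimated homography places the image corners closer to their ground-truth positions.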

Bilevel progressive homography estimation via correlative region-focused transformer / Jia, Q.; Feng, X.; Zhang, W.; Liu, Y.; Pu, N.; Sebe, N.. - In: COMPUTER VISION AND IMAGE UNDERSTANDING. - ISSN 1077-3142. - 250:(2025). [10.1016/j.cviu.2024.104209]


Jia Q.; Feng X.; Zhang W.; Liu Y.; Pu N.; Sebe N.

2025-01-01



Date of publication: 2025
Journal title: COMPUTER VISION AND IMAGE UNDERSTANDING
DOI: https://dx.doi.org/10.1016/j.cviu.2024.104209
Scopus identifier: 2-s2.0-85208658354
WOS identifier: WOS:001356384700001
All authors: Jia, Q.; Feng, X.; Zhang, W.; Liu, Y.; Pu, N.; Sebe, N.



Use this identifier to cite or link to this document: https://hdl.handle.net/11572/439555

Note: the data shown have not been validated by the university.
