Disentangling Monocular 3D Object Detection


In this paper we propose an approach for monocular 3D object detection from a single RGB image, which leverages a novel disentangling transformation for 2D and 3D detection losses and a novel, self-supervised confidence score for 3D bounding boxes. Our proposed loss disentanglement has the twofold advantage of simplifying the training dynamics in the presence of losses with complex interactions of parameters, and sidestepping the issue of balancing independent regression terms. Our solution overcomes these issues by isolating the contribution made by groups of parameters to a given loss, without changing its nature. We further apply loss disentanglement to another novel, signed Intersection-over-Union criterion-driven loss for improving 2D detection results. Besides our methodological innovations, we critically review the AP metric used in KITTI3D, which has emerged as the most important dataset for comparing 3D detection results. We identify and resolve a flaw in the 11-point interpolated AP metric that affects all previously published detection results and particularly biases the results of monocular 3D detection. We provide extensive experimental evaluations and ablation studies and set a new state of the art on the KITTI3D Car class.
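For readers unfamiliar with the 11-point interpolated AP metric mentioned in the abstract, the following is a minimal sketch of its commonly used definition: the mean, over the recall thresholds 0, 0.1, ..., 1.0, of the highest precision reached at any recall at or above each threshold. The function name `interpolated_ap_11` and the example values are hypothetical and chosen for illustration only; this is not the authors' evaluation code.

```python
def interpolated_ap_11(recalls, precisions):
    """11-point interpolated AP (illustrative sketch).

    recalls, precisions: parallel lists describing the operating points
    of a detector's precision/recall curve.
    """
    # The eleven sampled recall thresholds: 0.0, 0.1, ..., 1.0.
    thresholds = [i / 10 for i in range(11)]
    total = 0.0
    for r in thresholds:
        # Interpolated precision: best precision at any recall >= r,
        # or 0 if the detector never reaches that recall.
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
        total += max(candidates) if candidates else 0.0
    return total / 11


# Hypothetical case: a single, correct detection out of 100 ground-truth
# objects gives one operating point with recall 0.01 and precision 1.0.
single_hit_ap = interpolated_ap_11([0.01], [1.0])

# Hypothetical perfect detector: precision 1.0 all the way to recall 1.0.
perfect_ap = interpolated_ap_11([0.5, 1.0], [1.0, 1.0])
```

One property visible in this sketch is that the threshold at recall 0 is satisfied by any detector with at least one correct detection, so the single-hit case above already scores 1/11. This record does not state which flaw the authors identify, so the sketch should not be read as a description of their fix.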

Disentangling Monocular 3D Object Detection / Simonelli, A.; Bulo, S. R.; Porzi, L.; Lopez-Antequera, M.; Kontschieder, P. - Electronic. - (2019), pp. 1991-1999. (17th IEEE/CVF International Conference on Computer Vision, ICCV 2019, Korea, 2019) [10.1109/ICCV.2019.00208].

Simonelli A.;Bulo S. R.;Porzi L.;Lopez-Antequera M.;Kontschieder P.

2019-01-01

Date of publication: 2019
Proceedings title: Proceedings of the IEEE International Conference on Computer Vision
Place of publication: New York
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN: 978-1-7281-4803-8
Scopus identifier: 2-s2.0-85081063144
WOS identifier: WOS:000531438102013
Appears in categories: 04.1 Paper in Proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/259387
