Disentangling Monocular 3D Object Detection: From Single to Multi-Class Recognition

In this paper, we introduce a method for multi-class, monocular 3D object detection from a single RGB image, which exploits a novel disentangling transformation and a novel, self-supervised confidence estimation method for predicted 3D bounding boxes. The proposed disentangling transformation isolates the contribution made by different groups of parameters to a given loss, without changing its nature. This brings two advantages: i) it simplifies the training dynamics in the presence of losses with complex interactions of parameters, and ii) it allows us to avoid the issue of balancing independent regression terms. We further apply this disentangling transformation to another novel, signed Intersection-over-Union criterion-driven loss for improving 2D detection results. We also critically review the AP metric used in KITTI3D and resolve a flaw which affected and biased all previously published results on monocular 3D detection. Our improved metric is now used as the official KITTI3D metric. We provide extensive experimental evaluations and ablation studies on the KITTI3D and nuScenes datasets, setting new state-of-the-art results. We provide additional results on all classes of the KITTI3D and nuScenes datasets to further validate the robustness of our method, demonstrating its ability to generalize to different types of objects.
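
The abstract states the idea of the disentangling transformation without giving its form. One way to read it is sketched below in Python: for each parameter group, a loss is evaluated with only that group taken from the prediction and all other groups taken from the ground truth, and the per-group losses are summed. The group names (`center`, `dims`, `yaw`), the box parameterization, and the L1 corner loss are assumptions chosen for illustration; they are not taken from this record and should not be read as the authors' implementation.

```python
import numpy as np


def box_corners(center, dims, yaw):
    """Map the parameter groups (center, dims, yaw) to the 8 corners of a 3D box.

    Assumed convention: dims = (width, height, length), yaw rotates about
    the vertical (y) axis. This parameterization is illustrative only.
    """
    w, h, l = dims
    # Sign pattern of the 8 corners in the box frame, shape (8, 3).
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
        dtype=float,
    )
    local = 0.5 * signs * np.array([l, h, w])
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return local @ rot.T + np.asarray(center, dtype=float)


def corner_loss(pred, gt):
    """Entangled loss: mean absolute difference between the two sets of corners."""
    return float(np.abs(box_corners(**pred) - box_corners(**gt)).mean())


def disentangled_loss(pred, gt):
    """Sum of per-group losses, each isolating one parameter group.

    For every group, the box is rebuilt with that group from the prediction
    and the remaining groups from the ground truth, so each term depends on
    a single group while keeping the form of the original corner loss.
    """
    total = 0.0
    for group in pred:
        mixed = dict(gt)             # start from ground-truth parameters
        mixed[group] = pred[group]   # swap in only the predicted group
        total += corner_loss(mixed, gt)
    return total


# Hypothetical example values.
gt = {"center": (0.0, 1.0, 10.0), "dims": (1.6, 1.5, 4.0), "yaw": 0.3}
pred = {"center": (0.2, 1.1, 10.5), "dims": (1.7, 1.4, 4.2), "yaw": 0.1}
print(disentangled_loss(pred, gt))
```

Because every term uses the same corner loss, all groups are measured in the same units, which is consistent with the abstract's claim that no balancing of independent regression terms is needed.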

Disentangling Monocular 3D Object Detection: From Single to Multi-Class Recognition / Simonelli, Andrea; Rota Bulo, Samuel; Porzi, Lorenzo; Lopez Antequera, Manuel; Kontschieder, Peter. - In: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE. - ISSN 0162-8828. - 2020:(2020), pp. 1-1. [10.1109/TPAMI.2020.3025077]

Simonelli, Andrea; Rota Bulo, Samuel; Porzi, Lorenzo; Lopez Antequera, Manuel; Kontschieder, Peter

2020-01-01

Date of publication: 2020
Journal title: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
DOI: https://dx.doi.org/10.1109/TPAMI.2020.3025077
PubMed identifier: 32946384
Scopus identifier: 2-s2.0-85121955666
WOS identifier: WOS:000752018000012
Publication type: 03.1 Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/296081
