Towards Generalization Across Depth for Monocular 3D Object Detection

IRIS

While expensive LiDAR and stereo camera rigs have enabled the development of successful 3D object detection methods, monocular RGB-only approaches lag far behind. This work advances the state of the art by introducing MoVi-3D, a novel, single-stage deep architecture for monocular 3D object detection. MoVi-3D builds upon a novel approach which leverages geometrical information to generate, both at training and test time, virtual views where the object appearance is normalized with respect to distance. These virtually generated views facilitate the detection task as they significantly reduce the visual appearance variability associated with objects placed at different distances from the camera. As a consequence, the deep model is relieved from learning depth-specific representations and its complexity can be significantly reduced. In particular, in this work we show that, thanks to our virtual view generation process, a lightweight, single-stage architecture suffices to set new state-of-the-art results on the popular KITTI3D benchmark.
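The idea of normalizing appearance with respect to distance can be illustrated with basic pinhole-camera geometry. The sketch below is not the MoVi-3D implementation and makes no claim about how the paper builds its virtual views; it only shows why an object's size in pixels shrinks with depth and how rescaling by a depth ratio cancels that effect. The function names, focal length, object height, and reference depth are all hypothetical values chosen for illustration.

```python
# Illustrative sketch only: pinhole-camera geometry, NOT the MoVi-3D code.
# All names and numeric values below are assumptions made for this example.


def apparent_height_px(focal_px: float, object_height_m: float, depth_m: float) -> float:
    """Image height in pixels of an object of a given metric height at a given depth."""
    return focal_px * object_height_m / depth_m


def normalizing_scale(depth_m: float, reference_depth_m: float) -> float:
    """Resize factor that makes an object at depth_m appear as large
    as it would at reference_depth_m."""
    return depth_m / reference_depth_m


focal_px = 700.0          # assumed focal length in pixels
object_height_m = 1.5     # assumed object height in metres
reference_depth_m = 10.0  # assumed reference depth in metres

for depth_m in (5.0, 10.0, 20.0, 40.0):
    raw = apparent_height_px(focal_px, object_height_m, depth_m)
    normalized = raw * normalizing_scale(depth_m, reference_depth_m)
    print(f"depth {depth_m:5.1f} m: raw {raw:6.1f} px, normalized {normalized:6.1f} px")
```

Under these assumptions the raw height varies by a factor of eight between 5 m and 40 m, while the rescaled height is the same at every depth, which is the kind of appearance variability the abstract says the virtual views are meant to reduce.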

Towards Generalization Across Depth for Monocular 3D Object Detection / Simonelli, A.; Bulo, S. R.; Porzi, L.; Ricci, E.; Kontschieder, P. - 12367:(2020), pp. 767-782. (Paper presented at the 16th European Conference on Computer Vision, ECCV 2020, held in Glasgow, UK, 23–28 August 2020) [10.1007/978-3-030-58542-6_46].


Simonelli A.; Bulo S. R.; Porzi L.; Ricci E.; Kontschieder P.

2020-01-01



Date of publication: 2020
Proceedings title: Computer Vision – ECCV 2020
Place of publication: Cham, Switzerland
Publisher: Springer Science and Business Media Deutschland GmbH
ISBN: 978-3-030-58541-9; 978-3-030-58542-6
Scopus identifier: 2-s2.0-85097276967
All authors: Simonelli, A.; Bulo, S. R.; Porzi, L.; Ricci, E.; Kontschieder, P.
Citation: Towards Generalization Across Depth for Monocular 3D Object Detection / Simonelli, A.; Bulo, S. R.; Porzi, L.; Ricci, E.; Kontschieder, P. - 12367:(2020), pp. 767-782. (Paper presented at the 16th European Conference on Computer Vision, ECCV 2020, held in Glasgow, UK, 23–28 August 2020) [10.1007/978-3-030-58542-6_46].
Appears in types: 04.1 Paper in Proceedings

Files in this item:

File	Size	Format
andrea_compressed (1).pdf (access: archive managers only; type: publisher's layout; license: all rights reserved)	424.43 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/285517
