Design of Viewpoint-Equivariant Networks to Improve Human Pose Estimation

IRIS

Design of Viewpoint-Equivariant Networks to Improve Human Pose Estimation / Garau, Nicola. - (2022 May 31), pp. 1-193. [10.15168/11572_345132]

Garau, Nicola

2022-05-31

Abstract

Human pose estimation (HPE) is a fast-growing research field, with an increasing number of publications in computer vision and deep learning, and it covers a multitude of practical scenarios, from sports and entertainment to surveillance and medical applications. Despite the impressive results that can be obtained with HPE, many problems still need to be tackled in real-world applications. Most of these issues stem from poor or completely wrong pose detections, which arise from the network's inability to model the viewpoint. This thesis shows how designing viewpoint-equivariant neural networks can lead to substantial improvements in human pose estimation, both in terms of state-of-the-art results and in terms of better real-world applications. By jointly learning to build hierarchical human body poses and to model the observer viewpoint, a network can learn to generalise its predictions to previously unseen viewpoints. As a result, the amount of training data needed can be drastically reduced, leading both to faster and more efficient training and to more robust and interpretable real-world applications.

Defended on: 31 May 2022
Cycle: XXXIV
Academic year: 2020-2021
Department: Ingegneria e scienza dell'Informaz (29/10/12-)
Doctoral programme: Information and Communication Technology
Unitn internal supervisor: Conci, Nicola
Bi-nationally supervised doctoral thesis: no
DOI: https://dx.doi.org/10.15168/11572_345132
Language: English
Appears in categories: 08.1 Tesi di dottorato (Doctoral Thesis)

Files in this item:

File	Size	Format
PhDThesis_Garau_Nicola.pdf (open access; description: PhD Thesis; type: Doctoral Thesis; licence: All rights reserved)	74.99 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/345132
