CapsulePose: A variational CapsNet for real-time end-to-end 3D human pose estimation

Garau, N.; Conci, N.

doi:10.1016/j.neucom.2022.11.097

Estimating 3D human poses from images is an ill-posed regression problem, which is usually tackled by viewpoint-invariant convolutional neural networks (CNNs). Recently, capsule networks (CapsNets) have been introduced as a viable alternative to CNNs, ensuring viewpoint equivariance and drastically reducing both the dataset size and the network complexity while retaining high output accuracy. We propose a real-time end-to-end human pose estimation (HPE) network that employs state-of-the-art matrix capsules [1] and a fast variational Bayesian capsule routing, without relying on pre-training, complex data augmentation or multiple datasets. We achieve results comparable to the HPE state of the art, and the lowest error among methods using CapsNets, while also achieving other desirable properties: greater generalization capabilities, stronger viewpoint equivariance and greatly reduced data dependency. As a result, our network can be trained with only a fraction of the available datasets and without any data augmentation.

CapsulePose: A variational CapsNet for real-time end-to-end 3D human pose estimation / Garau, N.; Conci, N. - In: NEUROCOMPUTING. - ISSN 0925-2312. - Electronic. - 523:(2023), pp. 81-91. [10.1016/j.neucom.2022.11.097]

2023-01-01

Date of publication: 2023
Journal title: NEUROCOMPUTING
DOI: https://dx.doi.org/10.1016/j.neucom.2022.11.097
Scopus identifier: 2-s2.0-85144371350
WOS identifier: WOS:000904782300008
Appears in types: 03.1 Journal article

Files in this product:

File	Access	Type	License	Size	Format
_Journal__Neurocomputing__CapsulePose-2.pdf	Open access	Non-refereed preprint	All rights reserved	2.08 MB	Adobe PDF
1-s2.0-S0925231222015351-main.pdf	Archive managers only	Publisher’s layout	All rights reserved	2.03 MB	Adobe PDF