CapsulePose: A variational CapsNet for real-time end-to-end 3D human pose estimation

Garau N.; Conci N.
2023-01-01

Abstract

Estimating 3D human poses from images is an ill-posed regression problem that is usually tackled with viewpoint-invariant convolutional neural networks (CNNs). Recently, capsule networks (CapsNets) have been introduced as a viable alternative to CNNs: they ensure viewpoint equivariance and drastically reduce both the required dataset size and the network complexity while retaining high output accuracy. We propose a real-time, end-to-end human pose estimation (HPE) network that employs state-of-the-art matrix capsules [1] and a fast variational Bayesian capsule routing, without relying on pre-training, complex data augmentation, or multiple datasets. We achieve results comparable to the HPE state of the art and the lowest error among methods using CapsNets, while also achieving other desirable properties, namely greater generalization capabilities, stronger viewpoint equivariance, and greatly reduced data dependency. As a result, our network can be trained with only a fraction of the available datasets and without any data augmentation.
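
To make the notion of viewpoint equivariance concrete, the following is a minimal sketch and not the paper's implementation. It assumes the commonly used matrix-capsule formulation, in which a lower-level capsule with a 4x4 pose matrix M casts a vote V = M W for a higher-level capsule through a learned matrix W. All numeric values, the helper names (`matmul`, `rotation_z`, `vote`), and the choice of a z-axis rotation as the viewpoint change are hypothetical and chosen only for illustration.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rotation_z(theta):
    """4x4 homogeneous rotation about the z axis (stand-in viewpoint change)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def vote(pose, weight):
    """Vote of a lower-level capsule for a higher-level capsule: V = M W."""
    return matmul(pose, weight)


# Hypothetical pose of a part (identity rotation plus a translation).
pose = [[1.0, 0.0, 0.0, 0.5],
        [0.0, 1.0, 0.0, -0.2],
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 0.0, 1.0]]

# Hypothetical learned part-to-whole matrix; it does not depend on viewpoint.
weight = [[0.9, 0.1, 0.0, 0.3],
          [-0.1, 0.9, 0.0, 0.0],
          [0.0, 0.0, 1.0, -0.4],
          [0.0, 0.0, 0.0, 1.0]]

viewpoint = rotation_z(math.pi / 6)

# Path 1: compute the vote, then apply the viewpoint change to it.
vote_then_move = matmul(viewpoint, vote(pose, weight))
# Path 2: apply the viewpoint change to the input pose, then compute the vote.
move_then_vote = vote(matmul(viewpoint, pose), weight)

# Equivariance: both paths agree up to floating-point rounding.
max_gap = max(abs(x - y)
              for row_a, row_b in zip(vote_then_move, move_then_vote)
              for x, y in zip(row_a, row_b))
print(f"largest difference between the two votes: {max_gap:.2e}")
```

Under these assumptions the vote changes together with the viewpoint instead of staying constant, which is what separates equivariance from the invariance typically sought by CNN-based approaches, while the matrix W itself stays fixed across viewpoints.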
CapsulePose: A variational CapsNet for real-time end-to-end 3D human pose estimation / Garau, N.; Conci, N. - In: NEUROCOMPUTING. - ISSN 0925-2312. - Electronic. - 523:(2023), pp. 81-91. [10.1016/j.neucom.2022.11.097]
Files in this item:

_Journal__Neurocomputing__CapsulePose-2.pdf
Access: open access
Type: Non-refereed preprint
License: All rights reserved
Size: 2.08 MB
Format: Adobe PDF

1-s2.0-S0925231222015351-main.pdf
Access: archive managers only
Type: Publisher's layout
License: All rights reserved
Size: 2.03 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/379230