CapsulePose: A variational CapsNet for real-time end-to-end 3D human pose estimation / Garau, N.; Conci, N. - In: NEUROCOMPUTING. - ISSN 0925-2312. - ELECTRONIC. - 523:(2023), pp. 81-91. [10.1016/j.neucom.2022.11.097]
CapsulePose: A variational CapsNet for real-time end-to-end 3D human pose estimation
Garau N.; Conci N.
2023-01-01
Abstract
Estimating 3D human poses from images is an ill-posed regression problem, usually tackled with viewpoint-invariant convolutional neural networks (CNNs). Recently, capsule networks (CapsNets) have been introduced as a viable alternative to CNNs, ensuring viewpoint equivariance and drastically reducing both the dataset size and the network complexity while retaining high output accuracy. We propose a real-time, end-to-end human pose estimation (HPE) network that employs state-of-the-art matrix capsules [1] and fast variational Bayesian capsule routing, without relying on pre-training, complex data augmentation, or multiple datasets. We achieve results comparable to the HPE state of the art and the lowest error among methods using CapsNets, while also achieving other desirable properties, namely greater generalization capabilities, stronger viewpoint equivariance, and greatly reduced data dependency. This allows our network to be trained with only a fraction of the available datasets and without any data augmentation.
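The abstract contrasts viewpoint invariance with viewpoint equivariance. The following toy sketch illustrates only that distinction, assuming a 2D rotation as the "viewpoint change". It is not the CapsulePose network or its routing procedure; the joint coordinates and the function names `rotate`, `invariant_feature`, and `equivariant_feature` are hypothetical and chosen for illustration.

```python
import math


def rotate(points, theta):
    """Rotate 2D points counter-clockwise by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def invariant_feature(points):
    """Mean distance from the origin: unchanged by rotation (invariant)."""
    return sum(math.hypot(x, y) for x, y in points) / len(points)


def equivariant_feature(points):
    """Centroid: rotates together with the input (equivariant)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Hypothetical toy "skeleton" of three 2D joints (illustration only).
joints = [(1.0, 0.0), (2.0, 1.0), (0.5, 2.0)]
theta = math.pi / 3  # simulated viewpoint change
rotated = rotate(joints, theta)

# Invariance: f(T x) equals f(x), so the viewpoint information is discarded.
inv_before = invariant_feature(joints)
inv_after = invariant_feature(rotated)

# Equivariance: f(T x) equals T f(x), so the viewpoint change is preserved.
equi_of_rotated = equivariant_feature(rotated)
rotated_equi = rotate([equivariant_feature(joints)], theta)[0]

print("invariant:", inv_before, inv_after)
print("equivariant:", equi_of_rotated, rotated_equi)
```

In this sketch the invariant feature gives the same value before and after the rotation, whereas the equivariant feature changes in step with the input, which is the property the abstract attributes to CapsNets.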
File | Access | Type | License | Size | Format
---|---|---|---|---|---
_Journal__Neurocomputing__CapsulePose-2.pdf | Open access | Non-refereed preprint | All rights reserved | 2.08 MB | Adobe PDF
1-s2.0-S0925231222015351-main.pdf | Archive managers only | Publisher's layout | All rights reserved | 2.03 MB | Adobe PDF