In this paper, we present a method to synthetically generate the training material needed by machine learning algorithms to perform human action recognition from 2D videos. As a baseline pipeline, we consider a 2D video stream passing through a skeleton extractor (OpenPose), whose 2D joint coordinates are analyzed by a random forest. Such a pipeline is trained and tested using real live videos. As an alternative approach, we propose to train the random forest using automatically generated 3D synthetic videos. For each action, given a single reference live video, we edit a 3D animation (in Blender) using the rotoscoping technique. This prior animation is then used to produce a full training set of synthetic videos via perturbation of the original animation curves. Our tests, performed on live videos, show that our alternative pipeline leads to comparable accuracy, with the advantage of drastically reducing both the human effort and the computing power needed to produce the live training material.
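The augmentation step, producing many synthetic animations by perturbing the curves of one rotoscoped reference, can be sketched as follows. This is a minimal illustration only: the abstract does not specify the perturbation scheme, so the amplitude scaling, tempo scaling, and Gaussian noise used here, along with the names `perturb_curve`, `make_variants`, and the sample curve names, are assumptions made for the example. In Blender the curves would be the animation's F-Curves; here they are plain lists of (time, value) keyframes.

```python
import random


def perturb_curve(keyframes, rng, amp_jitter=0.1, time_jitter=0.05, noise=0.02):
    """Return a perturbed copy of one animation curve.

    keyframes: list of (time, value) pairs sorted by time.
    The three perturbations below are illustrative assumptions,
    not the scheme used in the paper.
    """
    scale = 1.0 + rng.uniform(-amp_jitter, amp_jitter)    # global amplitude change
    speed = 1.0 + rng.uniform(-time_jitter, time_jitter)  # global tempo change (stays positive)
    return [(t * speed, v * scale + rng.gauss(0.0, noise)) for t, v in keyframes]


def make_variants(animation, n_variants, seed=0):
    """Build n_variants perturbed copies of an animation.

    animation: dict mapping a curve name to its keyframe list.
    A seeded generator keeps the synthetic set reproducible.
    """
    rng = random.Random(seed)
    return [
        {name: perturb_curve(curve, rng) for name, curve in animation.items()}
        for _ in range(n_variants)
    ]


# Hypothetical reference animation with two joint-rotation curves.
reference = {
    "elbow_r.rot_x": [(0.0, 0.0), (0.5, 1.2), (1.0, 0.0)],
    "shoulder_r.rot_z": [(0.0, 0.1), (0.5, 0.8), (1.0, 0.1)],
}
variants = make_variants(reference, n_variants=100)
```

In the pipeline the abstract describes, each variant would then be rendered to a synthetic video and passed through the skeleton extractor, and the resulting 2D joint coordinates would be used to train the random forest.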

Generation of Action Recognition Training Data Through Rotoscoping and Augmentation of Synthetic Animations / Covre, N.; Nunnari, F.; Fornaser, A.; De Cecco, M. - 11614:(2019), pp. 23-42. (Paper presented at the 6th International Conference on Augmented Reality, Virtual Reality and Computer Graphics, SALENTO AVR 2019, held in Italy in 2019) [10.1007/978-3-030-25999-0_3].

Generation of Action Recognition Training Data Through Rotoscoping and Augmentation of Synthetic Animations

Covre N.; Fornaser A.; De Cecco M.
2019

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Italy
Springer Verlag
978-3-030-25998-3
978-3-030-25999-0
Covre, N.; Nunnari, F.; Fornaser, A.; De Cecco, M.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11572/288854
Warning: the data shown have not been validated by the university.
