Exploring Deep generative models for Structured Object Generation and Complex Scenes Manipulation / Ardino, Pierfrancesco. - (2023 Apr 28), pp. 1-120. [10.15168/11572_375949]
Exploring Deep generative models for Structured Object Generation and Complex Scenes Manipulation
Ardino, Pierfrancesco
2023-04-28
Abstract
The availability of powerful GPUs and the consequent development of deep neural networks have brought remarkable results in video game level generation, image-to-image translation, video-to-video translation, image inpainting, and video generation. Nonetheless, in conditional or constrained settings, unconditioned generative models still struggle because they offer little to no control over the generated output. This causes problems in scenarios such as structured object generation and multimedia manipulation. In the former, unconstrained GANs fail to generate objects that must satisfy hard constraints (e.g., molecules must be chemically valid, or game levels must be playable). In the latter, the manipulation of complex scenes is a challenging and unsolved task, since these scenes are composed of objects and backgrounds of different classes. In this thesis, we focus on these two scenarios and propose different techniques to improve deep generative models. First, we introduce Constrained Adversarial Networks (CANs), an extension of GANs in which the constraints are embedded into the model during training. Then we focus on developing novel deep learning models to alter complex urban scenes. In particular, we aim to alter the scene by: i) studying how to better leverage semantic and instance segmentation to model its content and structure; ii) modifying, inserting, and/or removing specific object instances coherently with the scene semantics; iii) generating coherent and realistic videos in which users can alter an object's position.
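The abstract does not detail how CANs embed constraints during training. Purely as an illustration of the general idea, and not the formulation used in the thesis, one common pattern is to augment the generator objective with a weighted penalty that measures how badly a sample violates the constraint. All names below (the toy "playability" measure, the penalty weight) are hypothetical:

```python
# Hypothetical sketch of constraint-aware generator training
# (NOT the exact method of the thesis): the adversarial loss is
# augmented with a penalty proportional to constraint violation.

def constraint_violation(level, passable=("-", ".")):
    """Toy 'playability' measure for a tile-based game level:
    fraction of rows containing no passable tile at all."""
    blocked = sum(1 for row in level if not any(t in passable for t in row))
    return blocked / len(level)

def generator_loss(adv_loss, level, weight=10.0):
    """Adversarial loss plus a weighted penalty that pushes the
    generator toward constraint-satisfying outputs."""
    return adv_loss + weight * constraint_violation(level)

# A level where every row has a passable tile incurs no penalty...
valid = ["X--X", "-XX-", "--X-"]
# ...while a level with a fully blocked row is penalized.
invalid = ["X--X", "XXXX", "--X-"]

print(generator_loss(0.5, valid))    # 0.5
print(generator_loss(0.5, invalid))  # 0.5 + 10 * (1/3)
```

In practice the violation measure must be differentiable (or relaxed into a differentiable surrogate) for gradients to flow through it; the discrete check above only conveys the shape of the objective.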
| File | Description | Size | Format |
|---|---|---|---|
| phd_unitn_Ardino_Pierfrancesco.pdf | Doctoral thesis, open access (other type of license) | 14.88 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.