Cycle in cycle generative adversarial networks for keypoint-guided image generation / Tang, H.; Xu, D.; Liu, G.; Wang, W.; Sebe, N.; Yan, Y. - (2019), pp. 2052-2060. (Paper presented at the ACM International Conference on Multimedia (ACM Multimedia '19), held in Nice, 21-25 October 2019) [10.1145/3343031.3350980].
Cycle in cycle generative adversarial networks for keypoint-guided image generation
H. Tang; D. Xu; G. Liu; W. Wang; N. Sebe; Y. Yan
2019-01-01
Abstract
In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C2GAN) for keypoint-guided image generation. C2GAN is a cross-modal framework that jointly exploits keypoint and image data in an interactive manner. It contains two different types of generators, i.e., a keypoint-oriented generator and an image-oriented generator. The generators are mutually connected in an end-to-end learnable fashion and explicitly form three cycled sub-networks, i.e., one image generation cycle and two keypoint generation cycles. Each cycle not only aims at reconstructing the input domain but also produces useful outputs that feed into the generation of another cycle. In this way, the cycles constrain each other implicitly, which provides complementary information from the two modalities and brings extra supervision across cycles, thus facilitating more robust optimization of the whole network. Extensive experimental results on two publicly available datasets, i.e., Radboud Faces [19] and Market-1501 [58], demonstrate that our approach generates more photo-realistic images than state-of-the-art models.
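The cycle wiring described in the abstract can be sketched in a few lines. The following is a minimal illustrative sketch, not the authors' implementation: the two generators are stand-in linear maps rather than CNNs, all shapes and variable names are assumptions, and the adversarial losses are omitted so that only the three cycle-consistency terms (one image cycle, two keypoint cycles) are on display.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "generators": simple linear maps so only the cycle wiring shows.
# Dimensions are arbitrary illustrative choices.
D_IMG, D_KP = 8, 4
W_img = rng.normal(size=(D_IMG + D_KP, D_IMG)) * 0.1  # image-oriented generator
W_kp = rng.normal(size=(D_IMG, D_KP)) * 0.1           # keypoint-oriented generator

def G_image(image, keypoint):
    """Image-oriented generator: (image, target keypoints) -> image."""
    return np.concatenate([image, keypoint]) @ W_img

def G_keypoint(image):
    """Keypoint-oriented generator: image -> keypoints."""
    return image @ W_kp

def l1(a, b):
    """L1 reconstruction loss between two arrays."""
    return float(np.abs(a - b).mean())

x = rng.normal(size=D_IMG)    # source image
k_x = rng.normal(size=D_KP)   # source keypoints
k_y = rng.normal(size=D_KP)   # target keypoints

# Image cycle: x --(k_y)--> y_fake --(k_x)--> x_rec reconstructs the input domain.
y_fake = G_image(x, k_y)
x_rec = G_image(y_fake, k_x)
loss_image_cycle = l1(x, x_rec)

# Keypoint cycle 1: keypoints predicted from the generated image should
# recover the target keypoints, coupling this cycle to the image cycle.
loss_kp_cycle_1 = l1(k_y, G_keypoint(y_fake))

# Keypoint cycle 2: the symmetric cycle on the reconstructed image and
# the source keypoints.
loss_kp_cycle_2 = l1(k_x, G_keypoint(x_rec))

# The three cycles are optimized jointly, so each constrains the others.
total_cycle_loss = loss_image_cycle + loss_kp_cycle_1 + loss_kp_cycle_2
print(total_cycle_loss)
```

In the paper the three cycle losses are combined with adversarial and identity terms and both generators are trained end-to-end; the sketch above only shows how outputs of one cycle become inputs of another.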
File | Type | License | Size | Format | Access
---|---|---|---|---|---
3343031.3350980.pdf | Publisher's version (Versione editoriale) | All rights reserved | 9.59 MB | Adobe PDF | Restricted (archive managers only)
1908.00999.pdf | Refereed author's manuscript (post-print) | All rights reserved | 9.16 MB | Adobe PDF | Open access
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.