A Structure-Guided Diffusion Model for Large-Hole Image Completion


Image completion techniques have made significant progress in filling missing regions (i.e., holes) in images. However, large-hole completion remains challenging due to limited structural information. In this paper, we address this problem by integrating explicit structural guidance into diffusion-based image completion, forming our structure-guided diffusion model (SGDM). It consists of two cascaded diffusion probabilistic models: structure and texture generators. The structure generator produces an edge image representing plausible structures within the holes, which is then used to guide the texture generation process. To train both generators jointly, we devise a novel strategy that leverages optimal Bayesian denoising, which denoises the output of the structure generator in a single step and thus allows backpropagation. Our diffusion-based approach enables a diversity of plausible completions, while the editable edges allow for editing parts of an image. Our experiments on natural scene (Places) and face (CelebA-HQ) datasets demonstrate that our method achieves visual quality superior or comparable to that of state-of-the-art approaches. The code is available for research purposes at https://github.com/UdonDa/Structure_Guided_Diffusion_Model.
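The single-step denoising mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard DDPM forward process and the usual closed-form estimate of the clean sample from a predicted noise. The function name, the toy "edge image", and the noise level are hypothetical and are not taken from the paper or its repository, so the authors' actual implementation may differ.

```python
import numpy as np

def one_step_denoise(x_t, eps_hat, alpha_bar_t):
    """Estimate the clean sample x_0 from a noisy x_t in a single step.

    Assumption: the standard DDPM forward process
        x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,
    inverted with a predicted noise eps_hat.
    """
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)

# Toy check with hypothetical data: given the true noise,
# the clean edge map is recovered up to floating-point error.
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # stand-in for an edge image
eps = rng.standard_normal(x0.shape)        # Gaussian noise
alpha_bar_t = 0.3                          # assumed cumulative noise level
x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
x0_hat = one_step_denoise(x_t, eps, alpha_bar_t)
```

Because the estimate is a closed-form expression of the network's noise prediction, it is differentiable, which is consistent with the abstract's remark that single-step denoising allows backpropagation through the structure generator's output.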

A Structure-Guided Diffusion Model for Large-Hole Image Completion / Horita, Daichi; Yang, Jiaolong; Chen, Dong; Koyama, Yuki; Aizawa, Kiyoharu; Sebe, Nicu. - (2023), pp. 1-15. (Paper presented at the BMVC conference, held in Aberdeen, UK, 20-24 November 2023).


Horita, Daichi; Yang, Jiaolong; Chen, Dong; Koyama, Yuki; Aizawa, Kiyoharu; Sebe, Nicu

2023-01-01



Year of publication: 2023
Proceedings title: British Machine Vision Conference (BMVC’23)
Place of publication: UK
Publisher: BMVA

Files in this record:

File	Access	Type	License	Size	Format
2211.10437-compressed.pdf	Open access	Refereed author’s manuscript (post-print)	All rights reserved	1.01 MB	Adobe PDF
0258.pdf	Open access	Publisher’s layout (published version)	All rights reserved	6.47 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/401001

Warning: the data shown have not been validated by the university.
