3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance

Xu, Xiaoxu; Yuan, Yitian; Li, Jinlong; Zhang, Qiudan; Jie, Zequn; Ma, Lin; Tang, Hao; Sebe, Nicu; Wang, Xu

doi:10.1007/978-3-031-73464-9_6

In this paper, we propose 3DSS-VLG, a weakly supervised approach for 3D Semantic Segmentation with 2D Vision-Language Guidance, in which a 3D model predicts a dense embedding for each point that is co-embedded with the aligned image and text spaces of a 2D vision-language model. Specifically, our method exploits the strong generalization ability of 2D vision-language models and proposes the Embeddings Soft-Guidance Stage, which uses it to implicitly align 3D embeddings with text embeddings. Moreover, we introduce the Embeddings Specialization Stage, which purifies the feature representation with the help of a given scene-level label, yielding a better feature supervised by the corresponding text embedding. Thus, the 3D model gains informative supervision from both the image embedding and the text embedding, leading to competitive segmentation performance. To the best of our knowledge, this is the first work to investigate 3D weakly supervised semantic...
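To make the idea of point embeddings sharing a space with text embeddings more concrete, here is a minimal sketch of how a label could be read off such embeddings. It is not the authors' implementation: the function names, the toy vectors, and the use of the scene-level label to restrict candidate classes at prediction time are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def label_points(point_embeddings, text_embeddings, scene_classes):
    """Give each point the candidate class whose text embedding is most similar.

    point_embeddings: list of per-point vectors (hypothetical 3D model output)
    text_embeddings:  dict of class name -> vector (hypothetical text encoder output)
    scene_classes:    class names known to occur in the scene (the weak label);
                      restricting candidates this way is an assumption of this sketch
    """
    candidates = [c for c in text_embeddings if c in scene_classes]
    if not candidates:
        raise ValueError("scene_classes shares no class with text_embeddings")
    return [max(candidates, key=lambda c: cosine(p, text_embeddings[c]))
            for p in point_embeddings]

# Toy 3-dimensional embeddings, invented for illustration only.
text = {"chair": [1.0, 0.0, 0.0], "table": [0.0, 1.0, 0.0], "sofa": [0.9, 0.1, 0.0]}
points = [[0.95, 0.05, 0.0], [0.1, 0.9, 0.0]]
labels = label_points(points, text, scene_classes={"chair", "table"})
```

In this toy setup the scene-level label simply removes "sofa" from the candidate set before the similarity comparison; the paper's two stages operate during training and are not reproduced here.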

3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance / Xu, X., Yuan, Y., Li, J., Zhang, Q., Jie, Z., Ma, L., Tang, H., Sebe, N., Wang, X. - 15131:(2025), pp. 87-104. (18th European Conference on Computer Vision, ECCV 2024, Milano, Sept. 2024) [10.1007/978-3-031-73464-9_6].

2025-01-01

Date of publication: 2025
Proceedings title: Computer Vision – ECCV 2024. Lecture Notes in Computer Science, vol. 15131. Springer
Place of publication: Gewerbestrasse 11, Cham, CH-6330, Switzerland
Publisher: Springer Science and Business Media Deutschland GmbH
ISBN: 9783031734632; 9783031734649
Scopus identifier: 2-s2.0-85212277377
WOS identifier: WOS:001416935000006
Use this identifier to cite or link to this document: https://hdl.handle.net/11572/439690
