T2TD: Text-3D Generation Model Based on Prior Knowledge Guidance

Nie, Weizhi; Chen, Ruidong; Wang, Weijie; Lepri, Bruno; Sebe, Nicu

doi:10.1109/TPAMI.2024.3463753

In recent years, 3D models have been used in many applications, such as autonomous driving, 3D reconstruction, VR, and AR. However, the available 3D model data is too scarce to meet practical demand. Generating high-quality 3D models efficiently from textual descriptions is therefore a promising but challenging way to address this problem. In this paper, inspired by the creative mechanisms of human imagination, which draw on experiential knowledge to fill in a concrete target model from an ambiguous description, we propose a novel text-3D generation model (T2TD). T2TD aims to generate the target model from the textual description with the aid of experiential knowledge, so that its creation process simulates the imaginative mechanisms of human beings. First, we introduce a text-3D knowledge graph to preserve the relationship between 3D models and textual semantic information; it supplies related shapes that play the role of human experiential information. Second, we propose an effective causal inference model to select useful feature information from these related shapes, removing unrelated structure information and retaining only the feature information strongly related to the textual description. Third, we adopt a novel multi-layer transformer structure to progressively fuse this strongly related structure information with the textual information, compensating for the lack of structural information and enhancing the final performance of the 3D generation model. Experimental results demonstrate that our approach significantly improves 3D model generation quality and outperforms state-of-the-art (SOTA) methods on the Text2Shape datasets.
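The three stages named in the abstract (retrieve related shapes, keep only text-relevant features, fuse progressively with the text) can be illustrated with a toy sketch. This is not the authors' implementation: every function name, the dictionary layout of the `graph`, the cosine-similarity retrieval, the sign-agreement gate standing in for the causal inference model, and the softmax-weighted residual update standing in for the multi-layer transformer are assumptions chosen only to make the data flow concrete.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_related_shapes(text_vec, graph, k=2):
    """Stage 1 (hypothetical stand-in for the text-3D knowledge graph):
    return the k shape features whose stored text embedding is closest to the query."""
    ranked = sorted(graph, key=lambda e: cosine(text_vec, e["text_vec"]), reverse=True)
    return [e["shape_vec"] for e in ranked[:k]]


def select_relevant_features(text_vec, shape_vec, threshold=0.0):
    """Stage 2 (a simple gate, NOT the paper's causal inference model):
    zero out shape dimensions whose product with the text feature is not above threshold."""
    return [s if s * t > threshold else 0.0 for s, t in zip(shape_vec, text_vec)]


def fuse(text_vec, shape_vecs, layers=2):
    """Stage 3 (stand-in for the multi-layer transformer): at each layer, attend over
    the shape features with softmax weights and add the weighted context residually."""
    fused = list(text_vec)
    if not shape_vecs:
        return fused
    for _ in range(layers):
        scores = [sum(f * s for f, s in zip(fused, sv)) for sv in shape_vecs]
        top = max(scores)
        exps = [math.exp(x - top) for x in scores]  # numerically stable softmax
        total = sum(exps)
        weights = [e / total for e in exps]
        context = [
            sum(w * sv[i] for w, sv in zip(weights, shape_vecs))
            for i in range(len(fused))
        ]
        fused = [f + c for f, c in zip(fused, context)]
    return fused


def prior_guided_feature(text_vec, graph, k=2):
    """Chain the three stages: retrieve, select, fuse."""
    related = retrieve_related_shapes(text_vec, graph, k)
    selected = [select_relevant_features(text_vec, sv) for sv in related]
    return fuse(text_vec, selected)


# Toy data: three (text embedding, shape feature) pairs; all values are made up.
graph = [
    {"text_vec": [1.0, 0.0, 0.0], "shape_vec": [0.5, -0.2, 0.0]},
    {"text_vec": [0.0, 1.0, 0.0], "shape_vec": [0.1, 0.9, 0.0]},
    {"text_vec": [-1.0, 0.0, 0.0], "shape_vec": [-0.7, 0.0, 0.3]},
]
query = [1.0, 0.2, 0.0]
fused_feature = prior_guided_feature(query, graph, k=2)
```

In a real system the fused feature would condition a 3D shape decoder; the sketch stops before that step because the abstract does not describe the decoder.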

T2TD: Text-3D Generation Model Based on Prior Knowledge Guidance / Nie, Weizhi; Chen, Ruidong; Wang, Weijie; Lepri, Bruno; Sebe, Nicu. - In: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE. - ISSN 0162-8828. - 47:1(2025), pp. 172-189. [10.1109/TPAMI.2024.3463753]

2025-01-01

Date of publication: 2025
Journal title: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Issue number and part: 1
DOI: https://dx.doi.org/10.1109/TPAMI.2024.3463753
Scopus identifier: 2-s2.0-85204439035
WOS identifier: WOS:001370789100030
All authors: Nie, Weizhi; Chen, Ruidong; Wang, Weijie; Lepri, Bruno; Sebe, Nicu
Citation: T2TD: Text-3D Generation Model Based on Prior Knowledge Guidance / Nie, Weizhi; Chen, Ruidong; Wang, Weijie; Lepri, Bruno; Sebe, Nicu. - In: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE. - ISSN 0162-8828. - 47:1(2025), pp. 172-189. [10.1109/TPAMI.2024.3463753]
Appears in types: 03.1 Journal article

Files in this record:

File	Size	Format
T2TD_Text-3D_Generation_Model_Based_on_Prior_Knowledge_Guidance.pdf (Access: archive managers only; Type: Publisher's layout; License: All rights reserved)	4.34 MB	Adobe PDF