Jnini, A.; Vella, F. (2025). Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks. Transactions on Machine Learning Research. ISSN 2835-8856.
Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks
Jnini, A. (first author); Vella, F. (last author)
2025-01-01
Abstract
Natural-gradient methods markedly accelerate the training of Physics-Informed Neural Networks (PINNs), yet their Gauss-Newton update must be solved in parameter space, incurring a prohibitive $O(n^3)$ time complexity, where $n$ is the number of trainable network weights. We show that exactly the same step can instead be formulated in a generally smaller residual space of size $m = \sum_\gamma N_\gamma d_\gamma$, where each residual class $\gamma$ (e.g. PDE interior, boundary, initial data) contributes $N_\gamma$ collocation points of output dimension $d_\gamma$. Building on this insight, we introduce Dual Natural Gradient Descent (D-NGD). D-NGD computes the Gauss-Newton step in residual space, augments it with a geodesic-acceleration correction at negligible extra cost, and provides both a dense direct solver for modest $m$ and a Nyström-preconditioned conjugate-gradient solver for larger $m$. Experimentally, D-NGD scales second-order PINN optimization to networks with up to 12.8 million parameters, delivers one- to three-order-of-magnitude lower final $L^2$ error than first-order (Adam, SGD) and quasi-Newton methods, and, crucially, enables full natural-gradient training of PINNs at this scale on a single GPU.
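The dual formulation described above rests on the matrix push-through identity $(J^\top J + \lambda I_n)^{-1} J^\top = J^\top (J J^\top + \lambda I_m)^{-1}$: the damped Gauss-Newton step can be obtained from an $m \times m$ solve in residual space and then lifted back to parameters via $J^\top$, rather than from an $n \times n$ solve in parameter space. The following is a minimal NumPy sketch of the two equivalent computations (our own illustration, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, lam = 2000, 50, 1e-3           # many parameters, few residuals
J = rng.standard_normal((m, n))      # residual Jacobian, shape (m, n)
r = rng.standard_normal(m)           # stacked residual vector

# Primal (parameter-space) step: one O(n^3) dense solve.
primal = np.linalg.solve(J.T @ J + lam * np.eye(n), J.T @ r)

# Dual (residual-space) step: one O(m^3) dense solve, lifted via J^T.
dual = J.T @ np.linalg.solve(J @ J.T + lam * np.eye(m), r)

print(np.allclose(primal, dual))     # True up to round-off
```

For larger $m$, the abstract replaces the dense dual solve with Nyström-preconditioned conjugate gradients. The sketch below is a simplified, hypothetical rendering of that idea using the standard randomized-Nyström preconditioner and a textbook PCG loop; the helper names, rank, shift, and tolerance are our illustrative choices, not the paper's:

```python
import numpy as np

def nystrom_precond(K_mv, m, rank, lam, rng):
    """Return x -> P^{-1} x for a randomized-Nystrom preconditioner of K + lam*I."""
    Omega = np.linalg.qr(rng.standard_normal((m, rank)))[0]  # test matrix
    Y = K_mv(Omega)                                # K @ Omega, shape (m, rank)
    nu = 1e-10 * np.linalg.norm(Y)                 # small stability shift
    C = np.linalg.cholesky(Omega.T @ Y + nu * np.eye(rank))
    B = np.linalg.solve(C, Y.T).T                  # B = Y C^{-T}, so K ~ B B^T
    U, sigma, _ = np.linalg.svd(B, full_matrices=False)
    s = np.maximum(sigma**2 - nu, 0.0)             # approximate eigenvalues of K

    def apply(x):
        Ux = U.T @ x
        # Invert U diag(s + lam) U^T on the captured subspace,
        # scale the orthogonal complement by 1/lam.
        return U @ (Ux / (s + lam)) + (x - U @ Ux) / lam

    return apply

def pcg(A_mv, b, M_apply, tol=1e-8, maxiter=500):
    """Textbook preconditioned conjugate gradient for SPD systems."""
    x = np.zeros_like(b)
    r = b.copy()
    z = M_apply(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxiter):
        Ap = A_mv(p)
        step = rz / (p @ Ap)
        x += step * p
        r -= step * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = M_apply(r)
        rz_next = r @ z
        p = z + (rz_next / rz) * p
        rz = rz_next
    return x

# Solve the dual system (J J^T + lam*I) alpha = r matrix-free, then lift.
rng = np.random.default_rng(1)
m, n, lam = 4000, 8000, 1e-3
J = rng.standard_normal((m, n)) / np.sqrt(n)
K_mv = lambda X: J @ (J.T @ X)                     # K = J J^T via two matvecs
r = rng.standard_normal(m)

M_inv = nystrom_precond(K_mv, m, rank=200, lam=lam, rng=rng)
alpha = pcg(lambda v: K_mv(v) + lam * v, r, M_inv)
step = J.T @ alpha                                 # parameter-space update
print(np.linalg.norm(K_mv(alpha) + lam * alpha - r) / np.linalg.norm(r))
```

Because $K = J J^\top$ is only touched through matrix-vector products, neither $K$ nor the $n \times n$ Gauss-Newton matrix is ever formed, which is what makes the residual-space approach tractable at the parameter counts quoted in the abstract.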
| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| 4972_Dual_Natural_Gradient_Des.pdf | Open access | Publisher's version (publisher's layout) | Creative Commons | 1.2 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.