First published on Wednesday, Jun 3, 2026 and last modified on Wednesday, Jun 3, 2026 by François Chaplais.

Mitigating Gradient Pathology in PINNs through Aligned Constraint
Yichen Luo (Department of Information Science and Engineering, KTH Royal Institute of Technology, Stockholm, Sweden)
Peiyu Zhu (School of Advanced Manufacturing and Robotics, Peking University, Beijing, China)
Dongxiao Hu (School of Advanced Technology, Xi’an Jiaotong-Liverpool University, Suzhou, China)
Jia Wang (School of Advanced Technology, Xi’an Jiaotong-Liverpool University, Suzhou, China)
Tailin Wu (Department of AI, School of Engineering, Westlake University, Hangzhou, China)
Dapeng Lan (Techforgood AS, Oslo, Norway)
Yu Liu (Techforgood AS, Oslo, Norway)
Zhibo Pang (School of Advanced Manufacturing and Robotics, Peking University, Beijing, China)

Keywords: Machine Learning, ICML

Abstract

1 Main result

Figure 1. Visualization of the PDE residual loss landscape in both the function space and the parameter space, projected onto a 2D subspace. The red curve indicates the loss valley. While the landscape is relatively simple in the function space, it becomes highly distorted and non-convex when mapped to the parameter space by the neural network. More details are in Appendix C.1.
Figure 4. Schematic illustration of the intersection between the PDE residual loss landscape and the boundary-condition constraint landscape in function space. The zero-residual solutions of the governing PDE form a manifold, while the boundary conditions restrict the admissible solutions to a subset. Their intersections correspond to valid solutions, which can be (a) unique or (b) non-unique.
Figure . (a) Phase I: dominated by \( \|\nabla_\theta \mathcal{L}_{\mathrm{res}}\| \gg \|\nabla_\theta \mathcal{L}_{\mathrm{bc}}\|\), reaching \( \mathcal{L}_{\mathrm{res}} \leq \varepsilon\).
Figure 7. Conceptual illustration of the proposed optimization mechanism. (a) Aligned constraints enlarge the admissible solution range, while (b) delaying the residual loss guides the optimizer into the PDE loss valley from a favorable region, enabling smoother convergence to the global optimum.
Figure 11. Relative \( L_2\) error and cumulative parameter update distance \( \sum\|\theta_{t+1}-\theta_t\|_2\) versus training iterations on the Heat benchmark (MLP backbone).
Figure 15. Variation of the \( c\) curve across iterations on all benchmarks (MLP backbone).
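The two quantities plotted in Figure 11 can be computed as in the following minimal sketch. It assumes that predictions and ground truth are available as arrays on a common set of evaluation points, and that the parameter vector is flattened and stored once per iteration. The function names are illustrative and not taken from the paper.

```python
import numpy as np


def relative_l2_error(pred, truth):
    """Relative L2 error ||pred - truth||_2 / ||truth||_2 over all evaluation points."""
    pred = np.asarray(pred, dtype=float).ravel()
    truth = np.asarray(truth, dtype=float).ravel()
    return np.linalg.norm(pred - truth) / np.linalg.norm(truth)


def cumulative_update_distance(thetas):
    """Running sum of ||theta_{t+1} - theta_t||_2.

    `thetas` has shape (T, P): one flattened parameter vector per iteration.
    Returns an array of length T - 1.
    """
    thetas = np.asarray(thetas, dtype=float)
    step_lengths = np.linalg.norm(np.diff(thetas, axis=0), axis=1)
    return np.cumsum(step_lengths)
```

A long cumulative distance paired with a slowly decreasing error suggests that the optimizer is travelling far without making much progress, which is the kind of behavior this plot is meant to expose.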
Appendix

A Theoretical Proof

B Methodological Supplement


Algorithm 1 CAML: Constraint-Aligned Loss with Manifold Lifting
1. Input: PDE operator \( \mathcal{N}\), boundary operator \( \mathcal{B}\), network \( u_\theta\)
2. Parameters: weights \( w_\mathrm{res}, w_\mathrm{bc}\), delay schedule \( \lambda(t)\)
3. Initialize network parameters \( \theta\)
4. for training step \( t=1\) to \( T\) do
5. Sample interior points \( \{x_i\}\) and boundary points \( \{x_b\}\)
6. Compute residuals: \[ r_i = \mathcal{N}[u_\theta](x_i) - f(x_i), \quad s_b = \mathcal{B}[u_\theta](x_b) - g(x_b) \]
7. Store all derivative terms \( \mathcal{D}[u_\theta](x_i)\) and \( \beta_b\, \nabla u_\theta(x_b)\cdot\mathbf{n}_b\) temporarily
8. Solve offset \( c\):
9. if zeroth-order terms are linear then
10. Compute \( c\) by closed-form weighted least squares
11. else
12. \( K \leftarrow K_{\mathrm{few}} \cdot \mathbb{I}(t<t_c) + K_{\mathrm{init}} \cdot \mathbb{I}(t=1)\)
13. Update \( c\) using \( K\) Newton steps on \( \mathcal{L}(c)\)
14. end if
15. Apply aligned residuals \( \bar r_i = r_i(u_\theta+c)\), \( \bar s_b = s_b(u_\theta+c)\), where all derivative terms are loaded directly from the cache
16. \( \mathcal{L} = w_\mathrm{res}\lambda(t)\mathcal{L}_\mathrm{res}^\mathrm{alg} + w_\mathrm{bc}\mathcal{L}_\mathrm{bc}^\mathrm{alg}\)
17. Update \( \theta \leftarrow \theta - \eta \nabla_\theta \mathcal{L}\)
18. end for
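The closed-form branch of Algorithm 1 (steps 9 and 10) can be illustrated with a small sketch. It is not the authors' implementation and rests on assumptions that the listing does not state: the zeroth-order terms are taken to be \( a(x)\,u\) in the PDE and \( \alpha_b\,u\) in a Robin-type boundary operator, and the losses are mean squared residuals. Because a constant shift \( c\) leaves every derivative term unchanged, the shifted residuals are then \( r_i + a_i c\) and \( s_b + \alpha_b c\), and minimizing the weighted sum of squares over \( c\) is a one-dimensional least-squares problem. The symbols `a` and `alpha` and both function names are hypothetical.

```python
import numpy as np


def closed_form_offset(r, a, s, alpha, w_res=1.0, w_bc=1.0):
    """Offset c minimizing w_res*mean((r + a*c)^2) + w_bc*mean((s + alpha*c)^2).

    r, s     : interior and boundary residuals at the current u_theta
    a, alpha : ASSUMED coefficients of the zeroth-order term u in the PDE
               and in the boundary operator (not named in the source)
    """
    r, a, s, alpha = (np.asarray(v, dtype=float) for v in (r, a, s, alpha))
    num = w_res * np.mean(a * r) + w_bc * np.mean(alpha * s)
    den = w_res * np.mean(a * a) + w_bc * np.mean(alpha * alpha)
    if den == 0.0:
        # No zeroth-order term anywhere: a constant shift changes nothing.
        return 0.0
    return -num / den


def aligned_loss(r, a, s, alpha, c, w_res=1.0, w_bc=1.0):
    """Weighted loss on the shifted residuals, reusing the cached r and s."""
    r, a, s, alpha = (np.asarray(v, dtype=float) for v in (r, a, s, alpha))
    r_bar = r + a * c          # aligned interior residual
    s_bar = s + alpha * c      # aligned boundary residual
    return w_res * np.mean(r_bar ** 2) + w_bc * np.mean(s_bar ** 2)
```

For a Poisson-type problem with Dirichlet boundary data (`a = 0`, `alpha = 1`), this reduces to `c = -mean(s)`: the offset cancels the mean boundary mismatch. Whether the delay factor \( \lambda(t)\) also enters the offset solve is not specified in the listing; under this sketch a caller could pass `w_res * lam` as `w_res` if that is intended.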

C Experimental Supplement

Figure 20. Visualization of the loss valleys in the parameter space under four groups of different random seeds.
Figure 25. Evolution of loss components and gradient statistics during PINN training, including the PDE residual loss, boundary condition loss, cosine similarity between their gradients, and the parameter optimization trajectory.
Figure 30. Visualization of parameter trajectories and the corresponding PDE residual loss landscape, from both (a) global and (b) local perspectives. While the global loss curve appears nearly linear in the loss valley, the local optimization path displays substantial oscillatory behavior, indicating persistent gradient conflicts within the residual-induced valley.
Figure 33. Visualization results on the Heat benchmark, showing the ground truth, prediction and error distributions during training with CAML integrated into the MLP backbone.
Figure 37. Visualization results on the Poisson benchmark, showing the ground truth, prediction and error distributions during training with CAML integrated into the MLP backbone.
Figure 41. Visualization results on the NS benchmark, showing the ground truth, prediction and error distributions during training with CAML integrated into the MLP backbone.
Figure 46. Visualization results on the Helmholtz benchmark, showing the ground truth, prediction and error distributions during training with CAML integrated into the MLP backbone.
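The gradient statistic tracked in Figure 25, the cosine similarity between the gradients of the residual loss and the boundary-condition loss, can be computed as in the sketch below. It assumes both gradients have already been flattened into vectors of equal length, for example by concatenating the per-parameter gradients. The function name and the `eps` guard are illustrative choices, not taken from the paper.

```python
import numpy as np


def gradient_cosine_similarity(g_res, g_bc, eps=1e-12):
    """Cosine of the angle between two flattened gradient vectors.

    Values near -1 indicate conflicting update directions,
    values near +1 indicate aligned ones.
    """
    g_res = np.asarray(g_res, dtype=float).ravel()
    g_bc = np.asarray(g_bc, dtype=float).ravel()
    denom = np.linalg.norm(g_res) * np.linalg.norm(g_bc)
    # eps guards against division by zero when either gradient vanishes.
    return float(np.dot(g_res, g_bc) / max(denom, eps))
```

Logging this value once per iteration alongside the two loss components is enough to reproduce the kind of curve the caption describes.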
