Operator (biology)
Gradient descent
Encoding (set theory)
Computer science
Algorithm
Artificial neural network
Kernel (algebra)
Tangent
Artificial intelligence
Mathematics
Deep learning
Applied mathematics
Machine learning
Discrete mathematics
Geometry
Gene
Transcription factor
Biochemistry
Repressor
Set (abstract data type)
Chemistry
Programming language
Authors
Sifan Wang, Hanwen Wang, Paris Perdikaris
Identifier
DOI:10.1007/s10915-022-01881-0
Abstract
Operator learning techniques have recently emerged as a powerful tool for learning maps between infinite-dimensional Banach spaces. Trained under appropriate constraints, they can also be effective in learning the solution operator of partial differential equations (PDEs) in an entirely self-supervised manner. In this work we analyze the training dynamics of deep operator networks (DeepONets) through the lens of Neural Tangent Kernel theory, and reveal a bias that favors the approximation of functions with larger magnitudes. To correct this bias we propose to adaptively re-weight the importance of each training example, and demonstrate how this procedure can effectively balance the magnitude of back-propagated gradients during training via gradient descent. We also propose a novel network architecture that is more resilient to vanishing gradient pathologies. Taken together, our developments provide new insights into the training of DeepONets and consistently improve their predictive accuracy by a factor of 10-50x, demonstrated in the challenging setting of learning PDE solution operators in the absence of paired input-output observations. All code and data accompanying this manuscript will be made publicly available at https://github.com/PredictiveIntelligenceLab/ImprovedDeepONets.
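To make the re-weighting idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation: it re-weights per-example losses inversely to the magnitude of each example's back-propagated gradient, so that examples with large-magnitude targets do not dominate a gradient-descent update. The toy MLP, the inverse gradient-norm weighting rule, and all names here (`init_params`, `per_example_loss`, `gradient_norms`, `train_step`) are illustrative assumptions; the paper's actual scheme is derived from Neural Tangent Kernel theory and applied to DeepONets, for which the linked repository is the reference.

```python
# Hedged sketch: balance back-propagated gradient magnitudes by adaptively
# re-weighting training examples (illustrative, not the paper's exact method).
import jax
import jax.numpy as jnp

def init_params(key, in_dim=2, hidden=64, out_dim=1):
    """Toy MLP standing in for an operator network."""
    k1, k2, k3 = jax.random.split(key, 3)
    return {
        "W1": jax.random.normal(k1, (in_dim, hidden)) / jnp.sqrt(in_dim),
        "W2": jax.random.normal(k2, (hidden, hidden)) / jnp.sqrt(hidden),
        "W3": jax.random.normal(k3, (hidden, out_dim)) / jnp.sqrt(hidden),
    }

def forward(params, x):
    h = jnp.tanh(x @ params["W1"])
    h = jnp.tanh(h @ params["W2"])
    return h @ params["W3"]

def per_example_loss(params, x, y):
    """Squared error for a single (input, target) pair."""
    return jnp.sum((forward(params, x) - y) ** 2)

def gradient_norms(params, xs, ys):
    """Norm of the back-propagated gradient contributed by each example."""
    grads = jax.vmap(jax.grad(per_example_loss), in_axes=(None, 0, 0))(params, xs, ys)
    leaves = jax.tree_util.tree_leaves(grads)
    # Sum squared entries over all parameter leaves, per example.
    sq = sum(jnp.sum(g ** 2, axis=tuple(range(1, g.ndim))) for g in leaves)
    return jnp.sqrt(sq)

def weighted_loss(params, xs, ys, weights):
    losses = jax.vmap(per_example_loss, in_axes=(None, 0, 0))(params, xs, ys)
    return jnp.mean(weights * losses)

@jax.jit
def train_step(params, xs, ys, lr=1e-3, eps=1e-8):
    # Assumed weighting rule: weight each example inversely to its current
    # gradient magnitude, so all examples contribute comparably to the update.
    norms = gradient_norms(params, xs, ys)
    weights = jax.lax.stop_gradient(jnp.mean(norms) / (norms + eps))
    grads = jax.grad(weighted_loss)(params, xs, ys, weights)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Toy usage: targets whose magnitudes span two orders of magnitude.
key, kx, ky = jax.random.split(jax.random.PRNGKey(0), 3)
params = init_params(key)
xs = jax.random.normal(kx, (32, 2))
ys = 10.0 ** jax.random.uniform(ky, (32, 1), minval=-1.0, maxval=1.0)
for _ in range(10):
    params = train_step(params, xs, ys)
```

Without the re-weighting, examples with the largest targets produce the largest gradients and dominate training; normalizing by the per-example gradient norm is one simple way to equalize their influence, which is the effect the abstract attributes to its NTK-informed weights.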