Keywords
Backpropagation, Mechanism (biology), Computer science, Feature (linguistics), Artificial neural network, Artificial intelligence, Types of artificial neural networks, Recurrent neural network, Deep learning, Machine learning, Convolutional neural network, Multilayer perceptron, Multi-task learning, Task (project management), Engineering, Philosophy, Systems engineering, Epistemology, Linguistics
Authors
Adityanarayanan Radhakrishnan, Daniel Beaglehole, Parthe Pandit, Mikhail Belkin
Source
Journal: Science (American Association for the Advancement of Science, AAAS)
Date: 2024-03-07
Volume/Issue: 383 (6690): 1461-1467
Citations: 6
Identifier
DOI: 10.1126/science.adi5639
Abstract
Understanding how neural networks learn features, or relevant patterns in data, for prediction is necessary for their reliable use in technological and scientific applications. In this work, we presented a unifying mathematical mechanism, known as average gradient outer product (AGOP), that characterized feature learning in neural networks. We provided empirical evidence that AGOP captured features learned by various neural network architectures, including transformer-based language models, convolutional networks, multilayer perceptrons, and recurrent neural networks. Moreover, we demonstrated that AGOP, which is backpropagation-free, enabled feature learning in machine learning models, such as kernel machines, that a priori could not identify task-specific features. Overall, we established a fundamental mechanism that captured feature learning in neural networks and enabled feature learning in general machine learning models.
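The AGOP referenced in the abstract has a simple closed form: for a trained predictor f and inputs x_1, ..., x_n, it is the matrix (1/n) * sum_i grad f(x_i) grad f(x_i)^T, whose dominant eigenvectors point along the input directions the predictor is most sensitive to. The following is a minimal sketch of that computation, assuming a scalar-valued predictor and using finite-difference gradients; the function names and the toy predictor are illustrative assumptions, not the authors' released code.

import numpy as np

def agop(f, X, eps=1e-5):
    """Estimate the average gradient outer product of scalar f over rows of X."""
    n, d = X.shape
    G = np.zeros((d, d))
    for x in X:
        grad = np.zeros(d)
        for j in range(d):  # central finite difference along each coordinate
            e = np.zeros(d)
            e[j] = eps
            grad[j] = (f(x + e) - f(x - e)) / (2 * eps)
        G += np.outer(grad, grad)  # accumulate grad f(x) grad f(x)^T
    return G / n

# Toy usage: a function that depends only on the first two coordinates;
# the AGOP's mass should concentrate in the upper-left 2x2 block.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
f = lambda x: np.tanh(x[0]) * x[1]
M = agop(f, X)
print(np.round(M, 2))

In practice the gradients would come from automatic differentiation of the trained network rather than finite differences; the finite-difference version is used here only to keep the sketch self-contained.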