SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation

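Computer science, Spiking neural network, Scalability, Benchmark (surveying), Language model, Convergence (economics), Neuromorphic engineering, Artificial intelligence, Artificial neural network, Geodesy, Economic growth, Database, Economics, Geography
Authors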
Malyaban Bal,Abhronil Sengupta
Source
Journal: Cornell University - arXiv; Citations: 4
Identifier
DOI:10.48550/arxiv.2308.10873
Abstract

Large language models (LLMs), though growing exceedingly powerful, comprise orders of magnitude fewer neurons and synapses than the human brain, yet they require significantly more power and energy to operate. In this work, we propose a novel bio-inspired spiking language model (LM) that aims to reduce the computational cost of conventional LMs by drawing motivation from the synaptic information flow in the brain. We demonstrate a framework that leverages the average spiking rate of neurons at equilibrium to train a neuromorphic spiking LM using implicit differentiation, thereby overcoming the non-differentiability problem of spiking neural network (SNN) based algorithms without using any type of surrogate gradient. The steady-state convergence of the spiking neurons also allows us to design a spiking attention mechanism, which is critical in developing a scalable spiking LM. Moreover, the convergence of the average spiking rate of neurons at equilibrium is utilized to develop a novel ANN-SNN knowledge-distillation technique in which we use a pre-trained BERT model as the "teacher" to train our "student" spiking architecture. While the primary architecture proposed in this paper is motivated by BERT, the technique can potentially be extended to different kinds of LLMs. Our work is the first to demonstrate the performance of an operational spiking LM architecture on multiple tasks in the GLUE benchmark.
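
The two ideas the abstract relies on, a spiking rate that settles to a steady value and a gradient taken through an equilibrium by implicit differentiation, can be illustrated with a toy sketch. This is not the authors' architecture. The integrate-and-fire neuron with soft reset, the scalar fixed-point map a = tanh(w*a + x), and all function names and parameter values below are assumptions chosen only to keep the example small.

```python
import math


def average_spiking_rate(x, theta=1.0, steps=1000):
    """Average firing rate of a toy integrate-and-fire neuron (assumed model).

    The neuron receives a constant input x each step, fires when its
    membrane potential reaches theta, and is reset by subtracting theta.
    For 0 <= x <= theta the rate approaches x / theta as steps grows.
    """
    u = 0.0
    spikes = 0
    for _ in range(steps):
        u += x
        if u >= theta:
            spikes += 1
            u -= theta  # soft reset keeps the residual potential
    return spikes / steps


def fixed_point(w, x, iters=200):
    """Iterate a = tanh(w * a + x) to its equilibrium (assumed stand-in map)."""
    a = 0.0
    for _ in range(iters):
        a = math.tanh(w * a + x)
    return a


def implicit_grad_w(w, x):
    """d a*/d w at the equilibrium, via the implicit function theorem.

    Differentiating a* = tanh(w a* + x) with respect to w gives
    da/dw = f'(z) * (a + w * da/dw), hence da/dw = f'(z) * a / (1 - w * f'(z)).
    No backpropagation through the individual iterations is needed.
    """
    a = fixed_point(w, x)
    fprime = 1.0 - a * a  # tanh'(z) = 1 - tanh(z)^2, and a = tanh(z) at equilibrium
    return fprime * a / (1.0 - w * fprime)


if __name__ == "__main__":
    print("average rate:", average_spiking_rate(0.3))
    print("implicit gradient:", implicit_grad_w(0.5, 0.3))
```

The gradient is computed from the equilibrium value alone, which is the property that lets an equilibrium-based method avoid differentiating through the discrete spike events; how the paper applies this to a full spiking LM is not described in the abstract.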
