A Reconfigurable Floating-Point Division and Square Root Architecture for High-Precision Softmax

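Softmax function; Pipeline (software); Division (mathematics); Computer science; Reduction (mathematics); Floating point; Subtractor; Square root; CMOS; Embedded system; Computer hardware; Electronic engineering; Engineering; Adder; Algorithm; Mathematics; Arithmetic; Deep learning; Artificial intelligence; Geometry; Programming language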
Authors
Xiwei Fang, Yuhan Wang, Lei Chen, Fengwei An
Source
Journal: IEEE Transactions on Circuits and Systems I: Regular Papers [Institute of Electrical and Electronics Engineers]
Volume and issue: : 1-14
Identifier
DOI:10.1109/tcsi.2024.3524307
Abstract

With the advancement of deep learning models, the Softmax function with self-attention has become pervasive in everyday applications. Division and square root operations appear in the Softmax function and in the computation of its inputs, so both affect its accuracy. However, these two non-linear operations incur significant area and power costs in hardware. To address these challenges, this paper proposes a reconfigurable floating-point division and square root (FDSR) architecture that achieves low resource consumption and high accuracy for general-purpose computation. The FDSR enhances the traditional non-restoring algorithm by using shift registers and optimizing the leading-one detection and shift operations, reducing hardware resource usage while maintaining high accuracy (0.5 ULP). In the mantissa calculation, division can be converted to a square root operation simply by switching the input to the subtractor through multiplexers. Additionally, a triple-mode reconfigurable iteration unit is introduced, featuring a multi-layer variable pipeline architecture to improve adaptability to different applications. By redesigning the pipeline depth and reusing logic units, the FDSR addresses the lengthy iteration cycles of the non-restoring method. Implementation results in 40 nm CMOS technology show that, compared to Synopsys DesignWare, the proposed design achieves a 76.49% power reduction and a 14.69% area reduction for floating-point division, and an 88.05% power reduction and a 90.57% area reduction for floating-point square root. In 28 nm CMOS technology, the FDSR reduces power consumption by 91.55% and area by 64.39% for floating-point division compared to Synopsys DesignWare. On the FPGA platform, the FDSR reduces hardware resource consumption by 85.23% for floating-point division and by 87.81% for floating-point square root, outperforming state-of-the-art designs.
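For readers unfamiliar with the non-restoring method mentioned in the abstract, the following is a minimal software sketch of the textbook integer algorithms for division and square root. It is not the FDSR datapath: the function names, the integer (rather than floating-point mantissa) operands, and the `n_bits` parameter are assumptions made for illustration. What it does show is that the two loops differ only in how many bits are shifted in and in the operand fed to the add/subtract step, which is the kind of similarity a multiplexed subtractor input could exploit.

```python
def nonrestoring_divide(dividend: int, divisor: int, n_bits: int) -> tuple[int, int]:
    """Unsigned non-restoring division (textbook form, illustrative only).

    Assumes 0 <= dividend < 2**n_bits and divisor > 0.
    Returns (quotient, remainder).
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    r, q = 0, 0
    for i in range(n_bits - 1, -1, -1):
        # Shift the partial remainder left and bring in the next dividend bit.
        shifted = 2 * r + ((dividend >> i) & 1)
        # Subtract if the previous remainder was non-negative, otherwise add.
        r = shifted - divisor if r >= 0 else shifted + divisor
        q = (q << 1) | (1 if r >= 0 else 0)
    if r < 0:
        # One final correction replaces the per-step "restore".
        r += divisor
    return q, r


def nonrestoring_isqrt(radicand: int, n_bits: int) -> tuple[int, int]:
    """Unsigned non-restoring integer square root (textbook form, illustrative only).

    Assumes 0 <= radicand < 4**n_bits.
    Returns (root, remainder) with root**2 + remainder == radicand.
    """
    r, q = 0, 0
    for i in range(n_bits - 1, -1, -1):
        # Shift left by two and bring in the next pair of radicand bits.
        shifted = 4 * r + ((radicand >> (2 * i)) & 3)
        # Same add/subtract structure as division; only the operand changes.
        r = shifted - (4 * q + 1) if r >= 0 else shifted + (4 * q + 3)
        q = (q << 1) | (1 if r >= 0 else 0)
    if r < 0:
        r += 2 * q + 1
    return q, r


print(nonrestoring_divide(7, 2, 3))   # (3, 1)
print(nonrestoring_isqrt(10, 2))      # (3, 1)
```

In both functions the partial remainder is allowed to go negative and is corrected once at the end, so each iteration needs a single add or subtract rather than a subtract followed by a conditional restore.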
