Keywords
Quantization (signal processing), Memory footprint, Computer science, Keyword spotting, Learning vector quantization, Vector quantization, Artificial neural network, Footprint, Positioning, Speech recognition, Algorithm, Artificial intelligence, Paleontology, Biology, Operating system
Authors
Yuriy Mishchenko, Yusuf Gören, Ming Sun, Chris Beauchene, Spyros Matsoukas, Oleg Rybakov, Shiv Vitaladevuni
Identifier
DOI:10.1109/icmla.2019.00127
Abstract
In this paper, we investigate novel quantization approaches to reduce the memory and computational footprint of deep neural network (DNN) based keyword spotters (KWS). We propose a new method for KWS offline and online quantization, which we call dynamic quantization: we quantize DNN weight matrices column-wise, using each column's exact individual min-max range, and the DNN layers' inputs and outputs are quantized for every input audio frame individually, using the exact min-max range of each input and output vector. We further apply a new quantization-aware training approach that allows us to incorporate quantization errors into the KWS model during training. Together, these approaches allow us to significantly improve the performance of KWS at 4-bit and 8-bit quantized precision, achieving end-to-end accuracy close to that of full-precision models while reducing the models' on-device memory footprint by up to 80%.
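The abstract describes two ingredients of the dynamic scheme: column-wise affine quantization of weight matrices using each column's exact min-max range, and per-frame quantization of layer inputs/outputs using each vector's own min-max range. A minimal numpy sketch of that idea is below; the function names and API are illustrative assumptions, not taken from the paper, and details such as rounding mode or zero-point handling may differ from the authors' implementation.

```python
import numpy as np

def quantize_columns(W, num_bits=8):
    # Column-wise affine quantization: each column gets its own
    # scale/offset from its exact min-max range (a sketch of the
    # paper's "dynamic quantization"; names here are illustrative).
    qmax = 2 ** num_bits - 1
    w_min = W.min(axis=0, keepdims=True)
    w_max = W.max(axis=0, keepdims=True)
    scale = (w_max - w_min) / qmax
    scale = np.where(scale == 0.0, 1.0, scale)  # guard constant columns
    Q = np.round((W - w_min) / scale).astype(np.uint8)
    return Q, scale, w_min

def dequantize_columns(Q, scale, w_min):
    # Reconstruct an approximate float matrix from the quantized codes.
    return Q.astype(np.float32) * scale + w_min

def quantize_frame(x, num_bits=8):
    # Per-frame quantization of one layer input/output vector, using
    # the exact min-max range of this particular vector (recomputed
    # for every audio frame, as the abstract describes).
    qmax = 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / qmax
    if scale == 0.0:
        scale = 1.0
    q = np.round((x - x_min) / scale).astype(np.uint8)
    return q, scale, x_min
```

Because each column and each frame uses its exact range, the round-trip error is bounded by half a quantization step per element, which is what lets 4-bit and 8-bit models stay close to full-precision accuracy.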