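Field-programmable gate array
Computer science
Direction of arrival
Microphone array
Sound source localization
Noise (video)
Speech recognition
Signal processing
Audio signal processing
Speech processing
Beamforming
Acoustics
Audio signal
Computer hardware
Artificial intelligence
Digital signal processing
Sound (geography)
Sound pressure
Telecommunications
Speech coding
Microphone
Image (mathematics)
Physics
Antenna (radio)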
Authors
Weiming Xiang,Yu Wu,Y. Zhou,Yu Wu
Source
Journal: IEEE Transactions on Instrumentation and Measurement
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/issue: 73: 1-12
Identifier
DOI:10.1109/tim.2023.3334344
Abstract
The front-end speech enhancement system is regarded as an essential component for maximizing the performance of smart voice-interaction technology in complicated live acoustic scenes. Existing research has the following limitations: short detection distance, poor suppression of nonstationary interferers, and inaccurate estimation of the direction of arrival. To tackle these issues, this article proposes a 48-channel acoustic array system for directional sound capture (DSC). The system implements field-programmable gate array (FPGA)-based acquisition and a signal processing algorithm: an audio-visual (A-V)-based broadband acoustic beamformer. To the authors’ knowledge, this is the first time that a DSC system that uses A-V for terminal voice interaction has been implemented on an FPGA. Experiments were set up in diverse acoustic scenes to evaluate the system’s performance. The results suggest that the proposed system can be widely applied to smart scenes in complicated acoustic environments contaminated with intense background noise and competing nonstationary interferers, and that it can provide real-time speech recognition and classification.