A Target-Speech-Feature-Aware Module for U-Net Based Speech Enhancement

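Computer science, Speech recognition, Speech enhancement, Feature (linguistics), Voice activity detection, Speech coding, Speech processing, Linear predictive coding, Speech synthesis, Artificial intelligence, Philosophy, Linguistics, Noise reduction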
Authors
Kaikun Pei, Lijun Zhang, Dejian Meng, Yikang He
Source
Journal: SAE Technical Paper Series
Identifier
DOI: 10.4271/2024-01-2021
Abstract

Speech enhancement extracts clean speech from noisy input, improving its perceptual quality and intelligibility. The technology has significant applications in in-car intelligent voice interaction. However, the noise environment inside a vehicle is complex, and interfering human voices are especially prominent, which poses a major challenge for in-vehicle speech interaction systems. In this paper, we propose a speech enhancement method based on target speech features that better extracts clean speech and improves the perceptual quality and intelligibility of the enhanced speech under human voice interference. To this end, we propose a design for the middle layer of the U-Net architecture based on Long Short-Term Memory (LSTM). It automatically extracts target speech features that are highly distinguishable from the noise and interfering-voice features in noisy speech, enabling targeted extraction of clean speech. Then, to achieve deep fusion between the target speech features and the model, we design a multi-scale deep fusion skip connection, so that as information flows from the encoder to the decoder, features strongly correlated with the target speech are selected through attention weight coefficients. Finally, to verify the effectiveness of the proposed module, experiments were carried out on the Voicebank+Demand speech dataset. The results show that the proposed method is robust in environments with human voice interference. It outperforms other algorithms on metrics such as PESQ, STOI, CSIG, CBAK, and COVL, yielding cleaner speech with higher perceptual quality and intelligibility. This makes it particularly promising for scenarios with significant human voice interference, such as in-car environments.
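The abstract does not give the exact form of the attention-weighted skip connection, so the following is only a minimal illustrative sketch of the general idea: each encoder frame is scaled by a weight in (0, 1) that reflects how strongly it correlates with a target speech feature. The function name `attention_gated_skip`, the scaled dot-product score, and the sigmoid gate are all assumptions made for illustration, not the authors' design.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_gated_skip(encoder_frames, target_feature):
    """Scale each encoder frame by an attention weight in (0, 1).

    encoder_frames: list of frames, each a list of floats (one per channel).
    target_feature: list of floats with the same length as a frame. This is a
        hypothetical target-speech embedding, e.g. from an LSTM bottleneck.
    Returns (gated_frames, weights).
    """
    dim = len(target_feature)
    gated, weights = [], []
    for frame in encoder_frames:
        # Assumed scoring rule: scaled dot product between frame and target.
        score = sum(f * t for f, t in zip(frame, target_feature)) / math.sqrt(dim)
        w = sigmoid(score)
        weights.append(w)
        # Frames that correlate with the target pass through mostly intact;
        # anti-correlated frames are attenuated before reaching the decoder.
        gated.append([w * f for f in frame])
    return gated, weights


# Toy input: one frame aligned with the target, one opposed, one silent.
frames = [[1.0, 1.0], [-1.0, -1.0], [0.0, 0.0]]
target = [1.0, 1.0]
gated, weights = attention_gated_skip(frames, target)
```

In this toy input the aligned frame receives a weight above 0.5 and the opposed frame a weight below 0.5. A trained model would learn the scoring function rather than use a fixed dot product.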