Aolong Zhou, Wen Zhang, Guojun Xu, Xiaoyong Li, Kefeng Deng, Junqiang Song
Source
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing [Institute of Electrical and Electronics Engineers] Date: 2023-01-01 Volume: 31, pp. 1851-1865 Citations: 5
Identifier
DOI:10.1109/taslp.2023.3275030
Abstract
Underwater acoustic signal (UWAS) denoising is a challenging task due to the complexity of the underwater environment. Most existing methods cannot cope effectively with UWAS denoising at low signal-to-noise ratios (SNRs). Based on the characteristics of UWAS, we propose to model latent features simultaneously along the time and frequency dimensions of the complex-valued spectrum in a dual-branch self-attention network, DBSA-Net. In this model, the two branches enhance both the magnitude and the phase information of the complex spectrum along different dimensions. Specifically, DBSA-Net is an encoder-decoder network with several global-local self-attention (GL-SA) blocks distributed on dual branches between the encoder and the decoder. Each GL-SA block combines global self-attention and local self-attention to capture distant context and fine-grained local dependencies along the temporal and frequency dimensions. We also design an information interaction module that exchanges complementary information between the two branches. This interaction module, together with a merge block, fuses the features extracted along the different dimensions, enhancing the model's capability to learn the features of the target signal. Extensive experiments are conducted to evaluate the model on a publicly available dataset. The ablation results show that each module of DBSA-Net contributes to the denoising performance. In both the seen-ships and unseen-ships scenarios, the proposed DBSA-Net outperforms existing approaches by a large margin on various evaluation metrics.
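The idea of combining global and local self-attention along the time and frequency axes of a complex spectrum can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`self_attention`, `gl_sa`, `dual_branch`), the window size, the absence of learned query/key/value projections, the residual sum inside `gl_sa`, and the simple averaging used in place of the interaction module and merge block are all assumptions made for illustration. The encoder and decoder are omitted entirely.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; masked entries (-inf) become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, window=None):
    """Scaled dot-product self-attention over the first axis of x (steps x channels).

    Assumption: queries, keys and values are all x itself (no learned weights).
    window=None attends to every position (global); an integer restricts each
    position to neighbours within +/- window steps (local).
    """
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    if window is not None:
        idx = np.arange(n)
        too_far = np.abs(idx[:, None] - idx[None, :]) > window
        scores = np.where(too_far, -np.inf, scores)
    return softmax(scores, axis=-1) @ x

def gl_sa(x, window=2):
    # Hypothetical GL-SA stand-in: residual sum of a global and a local pass.
    return x + self_attention(x) + self_attention(x, window)

def dual_branch(spec, window=2):
    """spec has shape (T, F, 2): real and imaginary parts of a spectrogram.

    The time branch attends along T for each frequency bin; the frequency
    branch attends along F for each frame. Averaging the two outputs is a
    placeholder for the paper's interaction module and merge block.
    """
    T, F, _ = spec.shape
    time_out = np.stack([gl_sa(spec[:, f, :], window) for f in range(F)], axis=1)
    freq_out = np.stack([gl_sa(spec[t, :, :], window) for t in range(T)], axis=0)
    return 0.5 * (time_out + freq_out)
```

Calling `dual_branch` on an array of shape `(T, F, 2)` returns an array of the same shape. With `window=0` the local pass reduces to the identity, which is a quick way to check the masking logic.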