DNN-based HRTF individualization for accurate spectral cues using a compact PRTF
Computer science
Speech recognition
Artificial intelligence
Authors
Byeong-Yun Ko, Hyeonuk Nam, Deokki Min, Y.K. Park
Identifier
DOI:10.3397/in_2024_3634
Abstract
The head-related transfer function (HRTF) plays a critical role in how the auditory system perceives spatial information. The spectral cues embedded in the HRTF are vital for accurately determining the elevation of sound sources. Existing approaches use deep neural networks (DNNs) to predict the magnitude spectra of the HRTF from images of the pinna, typically with the HRTF log-magnitude as the output during training. However, the HRTF encompasses the acoustic characteristics of both the head and torso and exhibits direction-dependent patterns, which makes its spectral cues difficult to reconstruct. To address this complexity, we propose a method for HRTF individualization. Our model uses the pinna-related transfer function (PRTF) as the output during training, which helps alleviate the impact of sound reflections from the head and torso in the head-related impulse response (HRIR). Experiments on an HRTF dataset show that the proposed model excels at reconstructing the first and second spectral cues. Furthermore, it outperforms previous deep learning models in terms of log spectral distortion (LSD).
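The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The LSD function uses the common definition (root-mean-square difference, in dB, between two magnitude spectra over frequency bins). The PRTF function assumes one widely used approximation that the abstract does not confirm: keep roughly the first millisecond of the HRIR after its main peak and taper it with a half-Hann window, so that later head and torso reflections are suppressed. The function names, the 1 ms window, the 44.1 kHz sampling rate, and the FFT size are all hypothetical choices for illustration.

```python
import numpy as np


def log_spectral_distortion(h_true, h_pred, eps=1e-12):
    """RMS difference in dB between two magnitude spectra over frequency bins."""
    mag_true = np.abs(np.asarray(h_true)) + eps  # eps avoids log of zero
    mag_pred = np.abs(np.asarray(h_pred)) + eps
    diff_db = 20.0 * np.log10(mag_true / mag_pred)
    return float(np.sqrt(np.mean(diff_db ** 2)))


def prtf_from_hrir(hrir, fs=44100, window_ms=1.0, n_fft=256):
    """Assumed PRTF approximation: short, tapered HRIR segment after the onset.

    Assumption (not from the paper): the main peak marks the direct sound,
    and reflections arriving later than window_ms come from head and torso.
    """
    hrir = np.asarray(hrir, dtype=float)
    onset = int(np.argmax(np.abs(hrir)))               # index of the main peak
    n_win = max(2, int(round(window_ms * 1e-3 * fs)))  # window length in samples
    segment = hrir[onset:onset + n_win]
    # Decaying half of a Hann window, matched to the segment length.
    taper = np.hanning(2 * len(segment))[len(segment):]
    return np.abs(np.fft.rfft(segment * taper, n=n_fft))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a measured HRIR: direct sound plus a late reflection.
    hrir = np.zeros(256)
    hrir[20] = 1.0
    hrir[120] = 0.4
    hrir += 0.01 * rng.standard_normal(256)

    prtf = prtf_from_hrir(hrir)
    hrtf = np.abs(np.fft.rfft(hrir, n=256))
    print("LSD between full HRTF and windowed PRTF (dB):",
          log_spectral_distortion(hrtf, prtf))
```

In this toy signal the reflection at sample 120 falls outside the window, so it shapes the full HRTF magnitude but not the windowed PRTF. A model trained on the PRTF would therefore not need to fit the ripples that such a reflection adds to the spectrum.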