Computer science
Latency (audio)
Inference
Beat (acoustics)
Artificial intelligence
Deep learning
Speech recognition
Real-time computing
Telecommunications
Physics
Acoustics
Authors
Xinlu Liu,Jia Qian,Qiqi He,Yi Yu,Wei Li
Identifier
DOI:10.1109/icme55011.2023.00192
Abstract
Beat and downbeat tracking predicts beat and downbeat time steps in a given music piece. Deep learning models with a dilated structure, such as the Temporal Convolutional Network (TCN) and the Dilated Self-Attention Network (DSAN), have achieved promising performance on this task. However, most of them must see the whole music context during inference, which hinders their deployment in online systems. In this paper, we propose LC-Beating, a novel latency-controlled (LC) mechanism for online beat and downbeat tracking in which the model only looks ahead a few frames. By appending this limited future information, the model can better capture the activity of relevant musical beats, which significantly boosts the performance of online algorithms under limited latency. Moreover, LC-Beating applies a novel real-time implementation of the LC mechanism to TCN and DSAN. Experimental results show that the proposed method outperforms previous online models by a large margin and approaches the results of offline models.
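As a rough illustration of the latency-controlled idea (not the authors' code), the sketch below builds a self-attention mask in which each frame may attend to all past frames but only a fixed number of future frames; the function name `lc_attention_mask` and the `lookahead` parameter are assumptions made for this example, and the exact masking scheme in LC-Beating may differ.

```python
# Minimal sketch of a latency-controlled attention mask (illustrative only):
# frame i may attend to every past frame and to at most `lookahead` future
# frames, so inference can proceed once those few frames have arrived.
import torch

def lc_attention_mask(num_frames: int, lookahead: int) -> torch.Tensor:
    """Boolean mask where True marks positions a frame is allowed to attend to."""
    idx = torch.arange(num_frames)
    # frame i attends to frame j iff j <= i + lookahead
    return idx.unsqueeze(1) + lookahead >= idx.unsqueeze(0)

# Example: 6 frames with a 2-frame lookahead
mask = lc_attention_mask(6, lookahead=2)
print(mask.int())
# Such a boolean mask (True = keep) can be passed as attn_mask to
# torch.nn.functional.scaled_dot_product_attention to constrain a
# self-attention layer to the limited-lookahead setting.
```

With `lookahead=0` the mask is purely causal (a fully online model); larger values trade a few frames of latency for access to the near-future context that the abstract credits with the performance gain.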