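Computer science
Convolutional neural network
Speech recognition
Loudness
Speech production
Artificial intelligence
Deep learning
Pattern recognition (psychology)
Computer vision
Authors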
Agustinus Bimo Gumelar,Eko Mulyanto Yuniarno,Wiwik Anggraeni,Indar Sugiarto,Vincentius Raki Mahindara,Mauridhi Hery Purnomo
Identifier
DOI:10.1109/ibiomed50285.2020.9487589
Abstract
The human voice production system is a sophisticated biological mechanism that can modulate pitch and loudness. Internal and external factors often damage the vocal folds and alter the voice as a result, with consequences for bodily function and emotional state. It is therefore important to identify voice changes at an early stage, which provides an opportunity to address their consequences and improve the patient's quality of life. Machine Learning (ML) techniques play a critical role here because they can detect voice disorders automatically. In this experiment, we employ a Convolutional Neural Network (CNN) and a robust CNN model, VGG-16. To investigate the performance of CNNs in detecting disordered speech, we used a Pathological Voice Disorder (PVD) dataset, the Respiratory Sound Database, which comprises hundreds of sampled PVD sound files. The experiment showed that the accuracy of voice pathology detection reached 92.03%.