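Computer science
Graph
Modal verb
Artificial intelligence
Natural language processing
Speech recognition
Pattern recognition (psychology)
Theoretical computer science
Chemistry
Polymer chemistry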
Authors
Nasser Ghadiri,Raj Samani,Fahime Shahrokh
Source
Journal: Lecture Notes in Networks and Systems
Date: 2023-01-01
Volume/issue (pages): 332-341
Citations: 1
Identifier
DOI:10.1007/978-3-031-27440-4_32
Abstract
With the availability of voice-enabled devices such as smartphones, mental health disorders such as depression could be detected and treated earlier, particularly post-pandemic. Current methods extract features directly from audio signals. In this paper, two methods are used to enrich voice analysis for depression detection: transforming voice signals into a visibility graph, and applying natural language processing to the transcript text based on representation learning. The results of processing text and voice with different features are fused to produce the final class labels. Experimental evaluation on the DAIC-WOZ dataset suggests that integrating text-based voice classification with learning from low-level and graph-based voice signal features can improve the detection of mental disorders such as depression. Our text-based method achieves an F1-score of 72.7%, which is higher than other single-modal scores. The fusion of all voice- and text-based prediction models yields an F1-score of 82.4%, outperforming the other models.
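To illustrate the visibility-graph idea mentioned in the abstract, here is a minimal sketch of the natural visibility graph construction for a short series of samples. The abstract does not say which visibility-graph variant or which graph features the authors use, so the natural-visibility criterion, the function names `natural_visibility_graph` and `degree_sequence`, and the choice of node degree as a feature are all assumptions made for illustration only.

```python
from typing import List, Sequence, Set, Tuple


def natural_visibility_graph(samples: Sequence[float]) -> Set[Tuple[int, int]]:
    """Return the undirected edges (i, j), i < j, of a natural visibility graph.

    Assumption: samples are evenly spaced, so the sample index serves as time.
    Two samples are linked when every sample between them lies strictly below
    the straight line joining them.
    """
    edges: Set[Tuple[int, int]] = set()
    n = len(samples)
    for i in range(n - 1):
        for j in range(i + 1, n):
            visible = True
            for k in range(i + 1, j):
                # Height at position k of the line from sample i to sample j.
                line_height = samples[j] + (samples[i] - samples[j]) * (j - k) / (j - i)
                if samples[k] >= line_height:
                    visible = False
                    break
            if visible:
                edges.add((i, j))
    return edges


def degree_sequence(edges: Set[Tuple[int, int]], n: int) -> List[int]:
    """Count the edges touching each node: one simple graph-based feature."""
    degrees = [0] * n
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees


# Hypothetical toy signal, not data from the paper.
toy_signal = [1.0, 3.0, 2.0, 4.0]
toy_edges = natural_visibility_graph(toy_signal)
toy_degrees = degree_sequence(toy_edges, len(toy_signal))
```

For the toy signal, the second sample blocks the first from seeing anything further along, so the graph has the edges (0, 1), (1, 2), (1, 3), and (2, 3). This brute-force version is cubic in the number of samples in the worst case, so a real audio pipeline would presumably work on short frames or use a faster construction.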