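Computer science
Overfitting
Speech recognition
Artificial intelligence
Spectrogram
Pattern recognition (psychology)
Feature selection
Feature (linguistics)
Mel-frequency cepstrum
Sentence
Artificial neural network
Feature extraction
Linguistics
Philosophy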
Authors
Nuha Qais Abdulmajeed, Belal Al‐Khateeb, Mazin Abed Mohammed
Abstract
Voice pathology diagnosis requires extracting significant features from voice signals, and classical machine learning models can overfit the training data, which poses challenges for reliable diagnosis. The study aimed to develop a reliable and efficient system for identifying voice pathologies using the long short-term memory (LSTM) method. It combined mel frequency cepstral coefficients (MFCCs), zero crossing rate (ZCR), and mel spectrograms, feature sets that had not been used together in previous work. On samples from the Saarbruecken voice database (SVD), the LSTM approach improved the accuracy of voice pathology identification. The best results of the proposed system were accuracy rates of 99.3% for /u/ vowel samples at neutral pitch, 99.2% for /a/ vowel samples at high pitch, 99% for /i/ vowel samples at neutral pitch, and 99.2% for sentence samples. The experimental results were evaluated using accuracy, precision, specificity, sensitivity, and F1 measures. Additionally, the study compared the performance of LSTM with that of artificial neural networks (ANNs) and found that LSTM achieved better outcomes.
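The five evaluation measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: it assumes label 1 means "pathological" and label 0 means "healthy", and the names `BinaryMetrics` and `binary_metrics` are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    sensitivity: float
    specificity: float
    f1: float


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Compute the five measures from true and predicted binary labels.

    Assumption: 1 = pathological (positive class), 0 = healthy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)   # also called recall
    specificity = _ratio(tn, tn + fp)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        precision=precision,
        sensitivity=sensitivity,
        specificity=specificity,
        f1=_ratio(2 * precision * sensitivity, precision + sensitivity),
    )


# Toy labels for illustration only; these are not data from the study.
example = binary_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
```

Reporting sensitivity and specificity alongside accuracy matters when the healthy and pathological classes are imbalanced, since accuracy alone can hide poor performance on the smaller class.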