Computer science
Signal (programming language)
Artificial intelligence
Programming language
Authors
B.M. Mala, Smita Chormunge
Identifier
DOI:10.1016/j.csl.2024.101621
Abstract
Recognizing specific emotions or needs from an infant's cry is a difficult pattern recognition problem because the cry carries no verbal information. This article introduces an automated model for effective recognition of infant cries. First, infant cry signals are collected from the Baby Chillanto (BC) dataset and the Donate a Cry Corpus (DCC) dataset. The acquired signals are converted into feature vectors using nine techniques, namely Zero Crossing Rate (ZCR), acoustic features, audio features, amplitude, energy, Root Mean Square (RMS), statistical moments, autocorrelation, and Mel-Frequency Cepstral Coefficients (MFCCs). Because the resulting feature vectors are multi-dimensional, a Simulated Annealing Algorithm (SAA) is employed to select the informative ones. The selected feature vectors are passed to a leaky Bi-directional Long Short Term Memory (Bi-LSTM) model that classifies the types of infant cries. In the leaky Bi-LSTM model, the conventional activation functions (hyperbolic tangent (tanh) and sigmoid) are replaced with the leaky Rectified Linear Unit (leaky ReLU) activation function. This replacement significantly mitigates the vanishing gradient problem and improves convergence during training, which is vital for signal classification tasks. Furthermore, an Improved Artificial Rabbit's Optimization (IARO) algorithm is proposed to choose optimal hyper-parameters for the leaky Bi-LSTM model, which reduces the complexity and training time of the classification model. In the IARO algorithm, selective opposition and Lévy flight strategies are integrated with the conventional ARO algorithm to enhance the dynamics and diversity of the population, along with the model's tracking efficiency.
The empirical investigation shows that the proposed IARO based leaky Bi-LSTM model achieves classification accuracies of 99.66% and 95.92% on the BC and DCC datasets, respectively. The proposed model achieves the best classification results among the conventional recognition models it is compared with.
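Three of the nine features named in the abstract (ZCR, energy, and RMS) have simple textbook definitions, and the following is a minimal sketch of them on short frames of a signal. The function names, the frame length, and the hop size are assumptions chosen for illustration; the abstract does not state how the authors frame the signal or compute these features.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames (no padding)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def energy(frame):
    """Sum of squared sample values in the frame."""
    return sum(x * x for x in frame)

def rms(frame):
    """Root mean square amplitude of the frame."""
    return math.sqrt(energy(frame) / len(frame))

def frame_features(samples, frame_len=4, hop=2):
    """Return one (ZCR, energy, RMS) tuple per frame.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    return [(zero_crossing_rate(f), energy(f), rms(f))
            for f in frame_signal(samples, frame_len, hop)]

# A toy alternating signal: every adjacent pair changes sign.
features = frame_features([1, -1, 1, -1, 1, -1])
```

In practice the per-frame values would be concatenated with the other feature types before selection.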
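The feature selection step can be illustrated with a generic simulated annealing search over binary feature masks. This is a sketch under stated assumptions, not the authors' SAA: the single-bit-flip neighbourhood, the geometric cooling schedule, and the `toy_score` objective are all hypothetical, and in a real pipeline the score would come from a validation metric of the classifier.

```python
import math
import random

def anneal_select(n_features, score, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Search for a feature mask that maximizes score(mask).

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_score = score(current)
    best, best_score = current[:], current_score
    temp = t0
    for _ in range(steps):
        candidate = current[:]
        j = rng.randrange(n_features)
        candidate[j] = not candidate[j]  # flip one feature in or out
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept a worse mask with
        # probability exp(delta / temp), which shrinks as temp cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        temp *= cooling
    return best, best_score

def toy_score(mask):
    """Hypothetical objective: features 0 and 2 are useful,
    and every selected feature costs 0.1."""
    useful = {0, 2}
    hits = sum(1 for i, on in enumerate(mask) if on and i in useful)
    return hits - 0.1 * sum(mask)

best_mask, best_value = anneal_select(5, toy_score)
```

Tracking the best mask separately from the current one means a late uphill move cannot discard a good solution found earlier.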
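The claim that leaky ReLU mitigates vanishing gradients can be checked numerically by comparing derivatives. The sketch below uses the standard definitions of the three activations; the slope `alpha=0.01` is a common default and an assumption here, since the abstract does not give the value used or say at which gates of the Bi-LSTM cell the substitution is made.

```python
import math

def leaky_relu(x, alpha=0.01):
    """Identity for positive inputs, small linear slope otherwise."""
    return x if x > 0 else alpha * x

def leaky_relu_grad(x, alpha=0.01):
    """Derivative of leaky ReLU: never smaller than alpha."""
    return 1.0 if x > 0 else alpha

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2, which decays toward 0 as |x| grows."""
    return 1.0 - math.tanh(x) ** 2

def sigmoid_grad(x):
    """Derivative of the logistic sigmoid: s * (1 - s), at most 0.25."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

# Compare gradients at a saturating input.
x = -10.0
grads = {
    "leaky_relu": leaky_relu_grad(x),
    "tanh": tanh_grad(x),
    "sigmoid": sigmoid_grad(x),
}
```

At a saturating input such as `x = -10.0`, the tanh and sigmoid derivatives are close to zero while the leaky ReLU derivative stays at `alpha`, so repeated multiplication through time steps shrinks the gradient far less.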