Abstract
Sarcasm detection is one of the most challenging tasks in natural language processing. Although sentiment semantics are necessary to improve sarcasm detection performance, existing deep learning (DL) based sarcasm detection models do not fully incorporate them. This research proposes the Hybrid RNN and Optimized LSTM for Multimodal Sarcasm Detection (HROMSD) model. The model operates in four stages: preprocessing, feature extraction, feature-level fusion, and classification. In the preprocessing stage, the multimodal input data, comprising text, video, and audio, are preprocessed: the text by tokenization and stemming, the video by face detection, and the audio by a filtering technique. In the feature extraction stage, features are extracted from the preprocessed text, video, and audio: n-grams, TF-IDF, improved Bag of Visual Words, and emojis as the text features; CLM and improved SLBT based features as the video features; and chroma, MFCC, jitter, and special features as the audio features. The resulting feature sets are then passed to the feature-level fusion stage, which uses an improved multilevel CCA fusion technique. Classification is carried out using a hybrid of an RNN and an optimized LSTM, where an Improved BES (IBES) method is used to increase the detection system’s performance. Compared with earlier research, the proposed work achieves higher accuracy.
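As a rough illustration of the text branch described above (tokenization followed by n-gram and TF-IDF features), a minimal sketch might look like the following. The function names, the regex tokenizer, the omission of stemming, and the smoothed IDF formula are all assumptions made for illustration; they are not taken from the HROMSD implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters.

    Assumed tokenizer for illustration; stemming is omitted here.
    """
    return [t for t in re.split(r"[^a-z0-9']+", text.lower()) if t]


def ngrams(tokens, n):
    """Return contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Compute TF-IDF weights per document over n-gram terms.

    tf  = term count / number of terms in the document
    idf = ln((1 + N) / (1 + df)) + 1   (a common smoothed variant, assumed here)
    """
    term_lists = [ngrams(tokenize(d), n) for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # guard against empty documents
        vectors.append({
            term: (c / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, c in counts.items()
        })
    return vectors
```

Calling `tfidf(["Oh great, another Monday", "What a great day"])` returns one sparse weight dictionary per utterance; a term shared by both utterances, such as "great", receives a lower weight than a term unique to one of them.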