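Closed captioning
Computer science
Convolutional neural network
Artificial intelligence
Image (mathematics)
Feature extraction
Feature (linguistics)
Deep learning
Process (computing)
Short-term memory
Pattern recognition (psychology)
Artificial neural network
Architecture
Natural language processing
Machine learning
Recurrent neural network
Art
Philosophy
Linguistics
Visual arts
Operating system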
Authors
C. S. Kanimozhiselvi, V Karthika, S P Kalaivani, S Krithika
Identifier
DOI:10.1109/iccci54379.2022.9740788
Abstract
Image captioning is the process of generating a textual description for an image. It is an active and growing research problem, and new solutions are introduced regularly. Although many solutions are already available, further work is still needed to obtain better and more precise results. We therefore developed an image captioning model using different combinations of Convolutional Neural Network (CNN) architectures with Long Short-Term Memory (LSTM). We used three combinations of CNN and LSTM to develop the model. The proposed model is trained with three CNN architectures, Inception-v3, Xception, and ResNet50, for extracting features from the image, and with LSTM for generating the relevant captions. Among the three CNN and LSTM combinations, the best is selected based on the accuracy of the model. The model is trained using the Flickr8k dataset.
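The abstract does not describe how captions are prepared for the LSTM. As a minimal sketch of one common approach (an assumption, not necessarily the authors' procedure), each caption can be expanded into pairs of a padded word prefix and the next word to predict; during training, each pair would be combined with the CNN feature vector of the corresponding image. The boundary tokens `<start>` and `<end>`, the helper names, and the toy captions are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical boundary tokens marking where a caption begins and ends.
START, END = "<start>", "<end>"


def build_vocab(captions: List[str]) -> Dict[str, int]:
    """Map each token to an integer id; id 0 is reserved for padding."""
    tokens = {START, END}
    for caption in captions:
        tokens.update(caption.lower().split())
    return {tok: i + 1 for i, tok in enumerate(sorted(tokens))}


def make_training_pairs(
    caption: str, vocab: Dict[str, int], max_len: int
) -> List[Tuple[List[int], int]]:
    """Expand one caption into (left-padded prefix, next-word id) pairs."""
    ids = [vocab[t] for t in [START, *caption.lower().split(), END]]
    pairs = []
    for i in range(1, len(ids)):
        prefix = ids[:i][-max_len:]  # keep at most max_len most recent words
        padded = [0] * (max_len - len(prefix)) + prefix
        pairs.append((padded, ids[i]))
    return pairs


# Toy captions for illustration only (not taken from Flickr8k).
captions = ["a dog runs", "a cat sleeps"]
vocab = build_vocab(captions)
pairs = make_training_pairs("a dog runs", vocab, max_len=5)

for prefix, target in pairs:
    print(prefix, "->", target)
```

A three-word caption yields four pairs here, one for each word plus one for the end token, so the decoder also learns when to stop generating.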