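Computer science
Generator (circuit theory)
Encoder
Artificial intelligence
BLEU
Image (mathematics)
Set (abstract data type)
Field (mathematics)
Natural language processing
Speech recognition
Computer vision
Pattern recognition (psychology)
Machine translation
Power (physics)
Programming language
Physics
Quantum mechanics
Mathematics
Pure mathematics
Operating system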
Authors
Duc-Hieu Hoang,Tran Anh Khoa,Duc Ngoc Minh Dang,Phuong-Nam Tran,Hanh Dang-Ngoc,Cuong Tuan Nguyen
Identifier
DOI:10.1109/ictc58733.2023.10392496
Abstract
In recent years, image caption generators have gained significant attention, and several successful projects have shown notable advances in the field. An image caption generator automatically produces a descriptive caption for an image using an encoder-decoder architecture: the encoder is a computer vision model, and the decoder is a natural language processing model. In this study, we assess seven distinct methods: six existing methods from prior research and one newly proposed. All methods are trained on the Flickr8K dataset and evaluated with the bilingual evaluation understudy (BLEU) metric. In our experiments, the proposed ResNet50 – BERT – Bahdanau Attention model outperforms the other models, achieving a BLEU-1 score of 0.532143 and a BLEU-4 score of 0.126316.
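The BLEU-1 and BLEU-4 scores reported above can be illustrated with a minimal sketch of sentence-level BLEU: clipped n-gram precision, a geometric mean over n-gram orders, and a brevity penalty. This is not the authors' evaluation code. The abstract does not say whether scores are sentence-level or corpus-level, or whether smoothing is applied, so this sketch assumes unsmoothed sentence-level BLEU with uniform weights. The function names and the example captions are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against several references."""
    cand_counts = ngrams(candidate, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, references, max_n):
    """Unsmoothed sentence-level BLEU with uniform weights up to order max_n."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter).
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical captions, for illustration only (not taken from Flickr8K).
candidate = "a dog runs on the grass".split()
references = [
    "a dog runs across the grass".split(),
    "a dog is running on grass".split(),
]
bleu_1 = bleu(candidate, references, max_n=1)
bleu_4 = bleu(candidate, references, max_n=4)
```

In this example every candidate word appears in some reference, so BLEU-1 is high, but no 4-gram matches, so unsmoothed BLEU-4 is zero. The same effect, averaged over a test set, is one reason BLEU-4 scores for captioning models are typically much lower than BLEU-1 scores.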