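Computer science
Transformer
Pest analysis
Artificial intelligence
Encoder
Closed captioning
Computer vision
Pattern recognition (psychology)
Natural language processing
Image (mathematics)
Engineering
Voltage
Electrical engineering
Marketing
Business
Operating system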
Authors
Shansong Wang,Qingtian Zeng,Weijian Ni,Cheng Cheng,Yanxue Wang
Identifier
DOI:10.1016/j.compag.2023.107863
Abstract
Pest image classification systems are key tools for identifying pests in time. However, existing image classification systems can only predict the labels of pest images and do not interpret the image content. In this paper, image caption generation techniques are introduced to interpret the results of pest image classification. Specifically, we propose the ODP-Transformer, which imitates the three basic actions in the diagnostic process of agricultural experts: Observation, Description, and Prediction. ODP-Transformer is a two-stage model. The first stage is a pest part detector based on the Faster R-CNN framework. The second stage contains three modules, the Parts Sequence Encoder, the Description Decoder, and the Classification Decoder, which together handle the image caption generation and classification tasks. In addition, a prior knowledge matrix is introduced to guide the optimization direction of the attention mechanism in the Description Decoder, which learns the concept correspondences between images and texts. An agricultural pest textual and visual dataset (APTV-99) is also collected, which contains not only semantic annotations of the images but also textual descriptions of the corresponding parts. Extensive experiments are conducted on APTV-99 to evaluate the performance of ODP-Transformer. In the pest image classification task, ODP-Transformer achieves 12.91% higher accuracy than 8 commonly used CNN models. In the image captioning task, compared with 6 other methods, ODP-Transformer improves the BLEU-1, CIDEr-D, and ROUGE scores by 1.62, 8.08, and 1.08, respectively.
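The abstract does not say how the prior knowledge matrix enters the training objective. As a rough illustration only, the sketch below assumes one common approach: an auxiliary loss that penalizes attention weights that disagree with a word-to-part prior. The function names, the toy features, and the loss form are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention of one word query over part features."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


def prior_guidance_loss(attn, prior, eps=1e-9):
    """Cross-entropy between a normalized prior row and the attention weights.

    Assumption (not from the paper): prior[j] > 0 means the word is known
    to describe detected part j.
    """
    total = sum(prior)
    if total == 0:
        return 0.0  # word has no known part correspondence, so no penalty
    return -sum((p / total) * math.log(a + eps) for p, a in zip(prior, attn))


# Toy example: three detected parts (e.g. wing, head, leg) with 2-d features,
# and a decoder query for a word that the prior links to the first part.
part_features = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
word_query = [1.0, 0.0]
attn = attention_weights(word_query, part_features)
loss = prior_guidance_loss(attn, [1, 0, 0])
```

In a setup like this, the auxiliary term would be added to the captioning loss with some weight, so that words with a known part correspondence are pushed to attend to that part while other words are left unconstrained.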