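Computer science
Transformer
Ripeness
Artificial intelligence
Encoder
Convolutional neural network
Pattern recognition (psychology)
Decoding methods
Computer vision
Voltage
Algorithm
Electrical engineering
Ripening
Chemistry
Food science
Engineering
Operating system
Authors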
Bingjie Xiao,Minh Nguyen,Wei Qi Yan
Identifier
DOI:10.1007/s10489-023-04799-8
Abstract
Pattern classification has always been essential in computer vision. The Transformer paradigm, whose attention mechanism provides a global receptive field, improves the efficiency and effectiveness of visual object detection and recognition. The primary purpose of this article is to classify the ripeness of various types of fruit accurately. We create fruit datasets to train, test, and evaluate multiple Transformer models. Transformers are fundamentally composed of encoding and decoding procedures. The encoder is built by stacking blocks, much as convolutional neural networks (CNN or ConvNet) are. Vision Transformer (ViT), Swin Transformer, and multilayer perceptron (MLP) models are considered in this paper. We examine the advantages of these three models for accurately analyzing fruit ripeness. We find that Swin Transformer achieves better results than ViT for both pears and apples from our dataset.
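The "global receptive field" mentioned in the abstract can be illustrated with a minimal sketch of scaled dot-product self-attention over image patch embeddings. This is not the authors' implementation: the function names, the toy patch values, and the use of identity query/key/value projections (a real ViT or Swin block learns these projections) are all assumptions made for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of n patch embeddings, each a list of d floats.
    Returns (outputs, weights); weights[i][j] is how strongly
    patch i attends to patch j.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    weights = []
    for q in tokens:
        # Every query is scored against every key, so each patch can
        # draw on all other patches in a single layer.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[c] for w, v in zip(row, tokens)) for c in range(d)]
        for row in weights
    ]
    return outputs, weights

# Four hypothetical 2-D patch embeddings (toy values, not from the paper).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
outputs, weights = self_attention(patches)
```

Each row of `weights` is strictly positive and sums to 1, so every output patch is a mixture of all input patches. A single convolution layer, by contrast, only sees a local neighbourhood.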