Computer science
Shot (pellet)
Artificial intelligence
Sentence
Set (abstract data type)
Natural language processing
One-shot
Machine learning
Modal verb
Pattern recognition (psychology)
Zero (linguistics)
Image (mathematics)
Speech recognition
Engineering
Organic chemistry
Linguistics
Programming language
Polymer chemistry
Chemistry
Philosophy
Mechanical engineering
Authors
Gunwoo Yong,Kahyun Jeon,Daeyoung Gil,Ghang Lee
Abstract
Zero-shot learning with vision-language pretrained (VLP) models is expected to be an alternative to existing deep learning models for defect detection when datasets are insufficient. However, VLP models, including contrastive language-image pretraining (CLIP), show fluctuating performance across prompts (inputs), which has motivated research on prompt engineering, that is, the optimization of prompts to improve performance. This study therefore aims to identify the features of a prompt that yield the best performance in classifying and detecting building defects using the zero-shot and few-shot capabilities of CLIP. The results reveal the following: (1) domain-specific definitions outperform general definitions and images; (2) a complete sentence outperforms a set of core terms; and (3) multimodal information outperforms single-modal information. The detection performance obtained with the proposed prompting method exceeded that of existing supervised models.
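The abstract's findings can be illustrated with a minimal sketch of a CLIP-style zero-shot pipeline: a prompt builder that follows findings (1) and (2) (domain-specific definition, complete sentence), and a classifier that picks the class whose text embedding is most similar to the image embedding. All names, prompt wording, and embedding values below are illustrative assumptions, not the authors' implementation; a real pipeline would obtain the embeddings from CLIP's image and text encoders.

```python
import math

def build_prompt(defect: str, definition: str) -> str:
    # Findings (1) and (2): phrase the class as a complete sentence that
    # embeds a domain-specific definition, not a bare list of core terms.
    # (Template wording is hypothetical.)
    return f"A photo of {defect}, which is {definition}."

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def zero_shot_classify(image_emb, prompt_embs):
    # CLIP-style zero-shot classification: no training on defect images;
    # the predicted class is the prompt embedding closest to the image.
    scores = {label: cosine(image_emb, emb) for label, emb in prompt_embs.items()}
    return max(scores, key=scores.get)

# Toy vectors standing in for encoder outputs (hypothetical values).
image_emb = [0.9, 0.1, 0.2]
prompt_embs = {
    "crack": [0.8, 0.2, 0.1],
    "spalling": [0.1, 0.9, 0.3],
}
print(build_prompt("a crack", "a linear fracture on a concrete surface"))
print(zero_shot_classify(image_emb, prompt_embs))  # → crack
```

In practice the dictionary values would come from encoding one `build_prompt(...)` sentence per defect class with CLIP's text encoder, so that improving the prompt wording alone (no retraining) changes the classifier's accuracy, which is the effect the study measures.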