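Computer science
Artificial intelligence
Embedding
Semantics (computer science)
Class (philosophy)
Feature (linguistics)
Encoder
Domain (mathematical analysis)
Focus (optics)
Feature vector
Pattern recognition (psychology)
Machine learning
Feature learning
Feature extraction
Mathematics
Mathematical analysis
Philosophy
Linguistics
Physics
Optics
Programming language
Operating system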
Authors
Yongli Hu, Lincong Feng, Huajie Jiang, Mengting Liu, Baocai Yin
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology
[Institute of Electrical and Electronics Engineers]
Date: 2023-09-11
Volume (Issue): 34 (5): 3180-3191
Citations: 2
Identifier
DOI: 10.1109/tcsvt.2023.3313727
Abstract
Generalized zero-shot learning (GZSL) aims to recognize images from both seen and unseen classes using side information, such as manually annotated attribute vectors. Traditional methods map images and semantics into a common latent space to achieve visual-semantic alignment. Because unseen classes are unavailable during training, these methods suffer from a serious recognition bias: unseen-class samples tend to be misclassified as seen classes. To solve this problem, we propose a Domain-aware Prototype Network (DPN), which splits the GZSL problem into seen-class recognition and unseen-class recognition. For the seen classes, we design a domain-aware prototype learning branch with a dual attention feature encoder to capture the essential visual information; this branch aims to recognize the seen classes and to discriminate the novel categories. To further recognize the fine-grained unseen classes, we design a visual-semantic embedding branch that aligns visual and semantic information for unseen-class recognition. Through multi-task learning of the prototype learning branch and the visual-semantic embedding branch, our model achieves excellent performance on three popular GZSL datasets.
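The two-stage decision described in the abstract (first decide whether a sample belongs to a seen class, then classify novel samples through visual-semantic alignment) can be illustrated with a toy sketch. This is not the authors' DPN implementation: the cosine-similarity threshold, the hypothetical `project` function standing in for a learned visual-to-semantic mapping, the function names, and all prototype and attribute values are assumptions made purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(feature, seen_prototypes, unseen_attributes, project, threshold=0.8):
    """Route a visual feature to a seen class or to an unseen class.

    All arguments are hypothetical stand-ins:
    - seen_prototypes: class name -> prototype in visual feature space
    - unseen_attributes: class name -> attribute vector in semantic space
    - project: maps a visual feature into the semantic space
    - threshold: assumed similarity cutoff for "belongs to a seen class"
    """
    # Stage 1: compare against seen-class prototypes (domain detection).
    best_seen = max(seen_prototypes,
                    key=lambda name: cosine(feature, seen_prototypes[name]))
    if cosine(feature, seen_prototypes[best_seen]) >= threshold:
        return best_seen

    # Stage 2: treat the sample as novel and match it to unseen-class
    # attribute vectors in the semantic space.
    embedded = project(feature)
    return max(unseen_attributes,
               key=lambda name: cosine(embedded, unseen_attributes[name]))


# Toy data (assumed): 3-D visual features, 2-D attribute vectors.
SEEN = {"horse": [1.0, 0.0, 0.0], "tiger": [0.0, 1.0, 0.0]}
UNSEEN = {"zebra": [1.0, 0.0], "whale": [0.0, 1.0]}


def toy_project(feature):
    # Assumed fixed projection; a real model would learn this mapping.
    return [feature[1], feature[2]]


close_to_seen = predict([0.9, 0.1, 0.0], SEEN, UNSEEN, toy_project)
far_from_seen = predict([0.4, 0.2, 0.8], SEEN, UNSEEN, toy_project)
```

In this sketch, a feature close to a seen prototype is labeled directly, while a feature far from every prototype is only ever assigned an unseen label, which is one simple way to avoid the bias toward seen classes that the abstract describes.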