Computer science
Feature (linguistics)
Artificial intelligence
Object (grammar)
Optics (focusing)
Aggregate (composite)
Pattern recognition (psychology)
Projectile
Feature learning
Cognitive neuroscience of visual object recognition
Machine learning
Computer vision
Optics
Physics
Philosophy
Composite material
Organic chemistry
Chemistry
Materials science
Linguistics
Authors
Xin Chen, Xiaoling Deng, Yubin Lan, Yongbing Long, Jian Weng, Zhiquan Liu, Qi Tian
Identifier
DOI:10.1109/tpami.2023.3325533
Abstract
Zero-shot learning (ZSL) aims to recognize objects from unseen classes based only on labeled images from seen classes. Most existing ZSL methods focus on optimizing feature spaces or generating visual features of unseen classes, both in conventional ZSL and in generalized zero-shot learning (GZSL). However, since the learned feature spaces are suboptimal, there exist many virtual connections in which visual features and semantic attributes do not correspond to each other. To reduce virtual connections, in this paper we propose to discover comprehensive and fine-grained object parts by building explanatory graphs based on convolutional feature maps, and then aggregate the object parts to train a part-net that produces prediction results. Since the aggregated object parts contain comprehensive visual features for activating semantic attributes, virtual connections can be reduced to a large extent. Because the part-net aims to extract local fine-grained visual features, some attributes related to global structures are ignored. To take advantage of both local and global visual features, we design a feature distiller that distills local features into a master-net, which aims to extract global features. Experimental results on the AWA2, CUB, FLO, and SUN datasets demonstrate that our proposed method clearly outperforms state-of-the-art methods in both conventional ZSL and GZSL tasks.
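The attribute-matching principle underlying ZSL, as described in the abstract, can be sketched as follows. This is a minimal illustration only: the paper's actual compatibility function, explanatory-graph part aggregation, part-net, and master-net are not specified here, so plain cosine similarity between a visual embedding and per-class semantic attribute vectors stands in for the learned compatibility score, and the class names and attribute vectors are invented toy data.

```python
# Minimal sketch of zero-shot classification by semantic-attribute matching.
# Assumption: a backbone (in the paper, the part-net/master-net) maps an image
# to an embedding in attribute space; here cosine similarity is a stand-in
# for the learned visual-semantic compatibility score.

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    num = sum(a * b for a, b in zip(u, v))
    du = sum(a * a for a in u) ** 0.5
    dv = sum(b * b for b in v) ** 0.5
    return num / (du * dv) if du and dv else 0.0

def zsl_predict(visual_embedding, class_attributes):
    """Assign the unseen class whose attribute vector best matches the embedding."""
    return max(class_attributes,
               key=lambda c: cosine(visual_embedding, class_attributes[c]))

# Toy example: three unseen classes described by four binary attributes
# (striped, four-legged, aquatic, large) -- hypothetical values.
class_attrs = {
    "zebra": [1, 1, 0, 0],
    "whale": [0, 0, 1, 1],
    "horse": [0, 1, 0, 0],
}
pred = zsl_predict([0.9, 0.8, 0.1, 0.0], class_attrs)  # -> "zebra"
```

A "virtual connection" in the abstract's sense would occur when the extracted visual features activate attributes that the object does not actually possess; aggregating fine-grained object parts is the paper's way of making the embedding correspond more faithfully to the true attributes before this matching step.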