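Computer science
Segmentation
Pascal (unit)
Artificial intelligence
Boosting (machine learning)
Feature (linguistics)
Pattern recognition (psychology)
Transformer
Focus (optics)
Information retrieval
Optics
Voltage
Quantum mechanics
Programming language
Linguistics
Philosophy
Physics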
Authors
Jun Ding, Zhen Zhang, Q. Wang, Huibin Wang
Identifier
DOI:10.1016/j.imavis.2023.104893
Abstract
Few-shot semantic segmentation (FSS) aims to train a segmentation model that generalizes to novel categories from only a few labeled images. One challenge of FSS is the spatial inconsistency between support and query images, e.g., in appearance and texture. Most existing methods rely solely on semantic-level prototypes of the support images to guide mask prediction. These methods, however, focus only on the most discriminative regions of the object rather than on holistic feature representations. A further problem is the lack of interaction between paired support and query images. In this paper, we propose a self-align and cross-align transformer (SCTrans) to remedy these limitations. Specifically, we design a feature fusion module (FFM) that incorporates low-level information from the query branch into mid-level semantic features, boosting the semantic representations of query images. In addition, a feature alignment module is designed to bidirectionally propagate semantic information between support and query images, conditioned on more representative support and query features, increasing both intra-class similarity and inter-class difference. Extensive experiments on PASCAL-5i and COCO-20i show that SCTrans significantly improves on state-of-the-art methods.
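The two ideas contrasted in the abstract can be illustrated with a toy sketch. This is not the SCTrans implementation: it assumes flattened feature maps given as plain lists, uses masked average pooling with cosine matching as a stand-in for the prototype guidance of existing methods, and uses generic scaled dot-product attention as a stand-in for bidirectional support-query alignment. All function names, the toy features, and the 0.7 threshold are hypothetical.

```python
import math

def masked_average_prototype(features, mask):
    """Average the feature vectors at positions where the mask is 1."""
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no positions")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def cross_attend(targets, sources):
    """Generic scaled dot-product attention: each target vector is replaced
    by a softmax-weighted mix of the source vectors."""
    dim = len(sources[0])
    scale = math.sqrt(dim)
    aligned = []
    for t in targets:
        scores = [sum(x * y for x, y in zip(t, s)) / scale for s in sources]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(sc - peak) for sc in scores]
        total = sum(weights)
        aligned.append(
            [sum(w * s[d] for w, s in zip(weights, sources)) / total
             for d in range(dim)]
        )
    return aligned

# Hypothetical toy data: 4 spatial positions, 2 channels each.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
support_mask = [1, 1, 0, 0]  # first two positions belong to the object
query = [[0.8, 0.2], [0.0, 1.0], [1.0, 0.1], [0.2, 0.7]]

# Prototype guidance: one vector summarizes the whole support object.
prototype = masked_average_prototype(support, support_mask)
prototype_scores = [cosine(q, prototype) for q in query]
prototype_mask = [1 if s > 0.7 else 0 for s in prototype_scores]  # assumed threshold

# Bidirectional alignment: information flows in both directions.
query_aligned = cross_attend(query, support)
support_aligned = cross_attend(support, query)
```

In the prototype path every query position is compared against a single pooled vector, which is why fine spatial detail can be lost; in the attention path each position interacts with every position of the other image, in both directions.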