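Computer science
Artificial intelligence
Segmentation
Feature (linguistics)
Ensemble learning
Projectile
Pattern recognition (psychology)
Semantic feature
Natural language processing
Machine learning
Chemistry
Philosophy
Linguistics
Organic chemistry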
Authors
Amin Karimi,Charalambos Poullis
Identifier
DOI:10.1038/s41598-024-54640-6
Abstract
This paper addresses few-shot semantic segmentation and proposes a novel transductive end-to-end method that overcomes three key problems affecting performance. First, we present a novel ensemble of visual features learned from pretrained classification and semantic segmentation networks with the same architecture. Our approach leverages the varying discriminative power of these networks, resulting in rich and diverse visual features that are more informative than those of the pretrained classification backbone used in most state-of-the-art methods, which is not optimized for dense pixel-wise classification tasks. Secondly, the pretrained semantic segmentation network serves as a base class extractor, which effectively mitigates false positives that occur during inference time and are caused by base objects other than the object of interest. Thirdly, a two-step segmentation approach using transductive meta-learning is presented to address the episodes with poor similarity between the support and query images. The proposed transductive meta-learning method addresses the prediction by first learning the relationship between labeled and unlabeled data points with matching support foreground to query features (intra-class similarity) and then applying this knowledge to predict on the unlabeled query image (intra-object similarity), which simultaneously learns propagation and false positive suppression. To evaluate our method, we performed experiments on benchmark datasets, and the results demonstrate significant improvement with only 2.98 M trainable parameters. Specifically, using ResNet-101, we achieve state-of-the-art performance for both 1-shot and 5-shot Pascal-$5^{i}$, as well as for 1-shot and 5-shot COCO-$20^{i}$.