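Artificial intelligence
Computer science
Pairwise comparison
Noun
Salient
Natural language processing
Visual reasoning
Machine learning
Pattern recognition (psychology)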
Authors
Weijiang Yu,Haofan Wang,Guohao Li,Nong Xiao,Bernard Ghanem
Identifier
DOI:10.1109/tpami.2023.3238699
Abstract
Situation recognition is a visual reasoning task: given an image, predict the activity taking place (the salient action) and the nouns that fill all semantic roles associated with that activity. The task is challenging because of long-tailed data distributions and local class ambiguities. Prior works only propagate local noun-level features within a single image and do not use global information. We propose a Knowledge-aware Global Reasoning (KGR) framework that endows neural networks with the capability of adaptive global reasoning over nouns by exploiting diverse statistical knowledge. KGR is a local-global architecture consisting of a local encoder, which generates noun features using local relations, and a global encoder, which enhances the noun features via global reasoning supervised by an external global knowledge pool. The global knowledge pool is created by counting the pairwise relationships of nouns in the dataset. In this paper, we design action-guided pairwise knowledge as the global knowledge pool, based on the characteristics of the situation recognition task. Extensive experiments show that KGR not only achieves state-of-the-art results on a large-scale situation recognition benchmark, but also effectively addresses the long-tailed problem of noun classification through its global knowledge.
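The abstract's description of the knowledge pool (pairwise noun counts, guided by the action) can be illustrated with a small sketch. This is not the authors' implementation: the annotation format, the toy samples, the function names `build_knowledge_pool` and `pair_weight`, and the normalization step are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy annotations: each sample is an action plus a role -> noun map.
samples = [
    ("riding", {"agent": "man", "vehicle": "horse", "place": "field"}),
    ("riding", {"agent": "woman", "vehicle": "horse", "place": "beach"}),
    ("riding", {"agent": "man", "vehicle": "bicycle", "place": "street"}),
    ("feeding", {"agent": "man", "food": "hay", "eater": "horse"}),
]


def build_knowledge_pool(samples):
    """Count unordered noun pairs that co-occur in one annotation, per action."""
    pool = defaultdict(Counter)
    for action, roles in samples:
        # Sort so that each unordered pair always has the same tuple key.
        nouns = sorted(set(roles.values()))
        for a, b in combinations(nouns, 2):
            pool[action][(a, b)] += 1
    return pool


def pair_weight(pool, action, given, other):
    """Share of `given`'s pairings under `action` that involve `other`.

    The normalization is an assumption; the abstract only mentions counting.
    """
    counts = pool.get(action, Counter())
    total = sum(c for pair, c in counts.items() if given in pair)
    if total == 0:
        return 0.0
    return counts[tuple(sorted((given, other)))] / total


pool = build_knowledge_pool(samples)
print(pair_weight(pool, "riding", "horse", "man"))   # 0.25
print(pair_weight(pool, "feeding", "horse", "hay"))  # 0.5
```

Keeping a separate counter per action is what makes the statistics action-guided in this sketch: "horse" pairs with "hay" under "feeding" but never under "riding", so the same noun carries different priors depending on the predicted activity.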