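Computer science
Interpretability
Artificial intelligence
Graph
Segmentation
Field (mathematics)
Machine learning
Semantics (computer science)
Computer vision
Pattern recognition (psychology)
Theoretical computer science
Mathematics
Programming language
Pure mathematics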
Authors
Pingping Cao, Zeqi Zhu, Ziyuan Wang, Yanping Zhu, Qiang Niu
Identifier
DOI:10.1007/s00521-022-07368-1
Abstract
Graph Convolutional Networks (GCNs), which model the potential relationships in non-Euclidean spatial data, have attracted researchers’ attention in deep learning in recent years. They have been widely used in computer vision tasks to model the latent space, topology, semantics, and other information in Euclidean spatial data, and have achieved significant success. To better understand the working principles of GCNs and their future applications in computer vision, this study reviews the basic principles of GCNs, summarizes the difficulties of using GCNs in different visual tasks along with their solutions, and describes in detail the methods for constructing graphs from Euclidean spatial data in those tasks. The review divides the application of GCNs in basic visual tasks into image recognition, object detection, semantic segmentation, instance segmentation, and object tracking, and summarizes and compares the role and performance of GCNs in each. It emphasizes that applying GCNs in computer vision faces three challenges: computational complexity, the paradigm for constructing graphs from Euclidean spatial data, and model interpretability. Finally, the review proposes two future trends for GCNs in the vision field: lightweight models, and fusing GCNs with other models to improve the performance of visual models and meet the higher requirements of vision tasks.