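Gaze
Computer science
Artificial intelligence
Computer vision
Task (project management)
Quantization (signal processing)
Inference
Feature (linguistics)
Regression
Mathematics
Statistics
Linguistics
Philosophy
Economics
Management
Authors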
Tianlei Jin,Zheyuan Lin,Shiqiang Zhu,Wen Wang,Shunda Hu
Identifier
DOI:10.1109/fg52635.2021.9666980
Abstract
Gaze following is a complex task that requires combining a person's gaze with the scene. Previous work performs well at single-person gaze following, but its computational cost makes it impractical for real-world projects. Moreover, when multiple people appear at the same time, previous methods repeat the scene feature extraction. In addition, obtaining the gaze target point by taking the argmax of a heatmap has become a convention in gaze following, while the quantization error of the heatmap is ignored. In this paper, a simple but efficient network structure is proposed to provide shared scene features for multi-person gaze following, and numerical coordinate regression is introduced for the first time to calculate the gaze target point and the regression loss. Our experiments show that the method achieves state-of-the-art (SOTA) accuracy on both the GazeFollow dataset and the VideoAttentionTarget dataset. At the same time, by using GhostNet, the FLOPs of our method are only about 1/18 of those of other methods with the same accuracy. Furthermore, sharing scene features saves nearly 40% of inference time in the multi-person gaze-following task when more than 6 people are in the frame.
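The quantization error mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify how the numerical coordinate regression is formulated, so the example below uses a soft-argmax (the expected coordinate under a softmax over the heatmap) as one common stand-in, not as the authors' method. The grid size, the Gaussian width, the true point, and all function names are assumptions chosen for illustration.

```python
import math

def cell_center(index, size):
    """Normalized [0, 1] coordinate of the center of a grid cell."""
    return (index + 0.5) / size

def argmax_point(heatmap):
    """Hard argmax: the (x, y) center of the highest-scoring cell."""
    h, w = len(heatmap), len(heatmap[0])
    cells = [(r, c) for r in range(h) for c in range(w)]
    r, c = max(cells, key=lambda rc: heatmap[rc[0]][rc[1]])
    return cell_center(c, w), cell_center(r, h)

def soft_argmax_point(heatmap, temperature=1.0):
    """Soft-argmax: softmax-weighted mean of cell centers (assumed stand-in)."""
    h, w = len(heatmap), len(heatmap[0])
    peak = max(max(row) for row in heatmap)  # subtract the peak for numerical stability
    weights = [[math.exp((v - peak) / temperature) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    x = sum(weights[r][c] * cell_center(c, w) for r in range(h) for c in range(w)) / total
    y = sum(weights[r][c] * cell_center(r, h) for r in range(h) for c in range(w)) / total
    return x, y

def gaussian_logits(point, size=8, sigma=0.1):
    """Hypothetical heatmap: log of an unnormalized Gaussian centered on `point`."""
    px, py = point
    return [[-((cell_center(c, size) - px) ** 2 + (cell_center(r, size) - py) ** 2)
             / (2 * sigma ** 2)
             for c in range(size)]
            for r in range(size)]

def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# The true gaze target deliberately falls between cell centers of the 8x8 grid.
true_point = (0.40, 0.62)
heatmap = gaussian_logits(true_point)
hard = argmax_point(heatmap)
soft = soft_argmax_point(heatmap)
print("argmax error:     ", distance(hard, true_point))
print("soft-argmax error:", distance(soft, true_point))
```

The hard argmax can only return a cell center, so its error is bounded below by the distance from the true point to the nearest center. The soft-argmax output is continuous and differentiable, which is what allows a regression loss to be applied directly to coordinates.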