Computer science
Pedestrian
Optical flow
Artificial intelligence
Merge (version control)
Spatial contextual awareness
Feature (linguistics)
Pedestrian detection
Context (archaeology)
Convolutional neural network
Pattern recognition (psychology)
Computer vision
Data mining
Information retrieval
Image (mathematics)
Engineering
Geography
Linguistics
Philosophy
Archaeology
Transportation engineering
Authors
Xiaofei Zhang,Xiaolan Wang,Weiwei Zhang,Yansong Wang,Xintian Liu,Dan Wei
Identifier
DOI:10.1177/09544070231190522
Abstract
An essential prerequisite for autonomous vehicles deployed in urban scenarios is the ability to accurately recognize the behavioral intentions of pedestrians and other vulnerable road users and to take measures to ensure their safety. In this paper, a spatial-temporal feature fusion-based multi-attention network (STFF-MANet) is designed to predict pedestrian crossing intention. Pedestrian information, vehicle information, scene context, and optical flow are extracted from continuous image sequences as feature sources. A lightweight 3D convolutional network is designed to extract temporal features from the optical flow, and a spatial encoding module is constructed to extract spatial features from the scene context. Pedestrian motion information is re-encoded using a collection of gated recurrent units. The final network structure is determined through ablation studies, which introduce attention mechanisms into the network to merge pedestrian motion features with the spatio-temporal features. The effectiveness of the proposed method is demonstrated by comparison experiments on the JAAD and PIE datasets. On the JAAD dataset, the intention recognition accuracy is 9% higher than that of existing techniques.
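The abstract describes a multi-branch architecture: a lightweight 3D convolutional branch for optical flow, a spatial encoder for scene context, gated recurrent units for pedestrian motion, and attention-based fusion of the resulting features. Below is a minimal, hypothetical PyTorch sketch of such a multi-branch design. All module names, feature dimensions, the choice of inputs, and the single-layer multi-head attention fusion are illustrative assumptions made here; this is not the authors' STFF-MANet implementation.

```python
# Hypothetical sketch of a spatial-temporal feature fusion network with
# attention-based fusion, loosely following the structure described in the
# abstract. Dimensions and module layouts are assumptions, not the paper's.
import torch
import torch.nn as nn


class OpticalFlowEncoder3D(nn.Module):
    """Lightweight 3D convolutional branch: temporal features from optical flow."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(2, 32, kernel_size=3, stride=(1, 2, 2), padding=1),  # flow has 2 channels (dx, dy)
            nn.BatchNorm3d(32), nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm3d(64), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        self.proj = nn.Linear(64, out_dim)

    def forward(self, flow):            # flow: (B, 2, T, H, W)
        return self.proj(self.net(flow).flatten(1))


class ContextEncoder2D(nn.Module):
    """Spatial encoding branch for a scene-context crop."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, out_dim)

    def forward(self, ctx):             # ctx: (B, 3, H, W)
        return self.proj(self.net(ctx).flatten(1))


class IntentionNet(nn.Module):
    """Fuses GRU-encoded pedestrian/vehicle motion with spatio-temporal
    features via multi-head attention, then predicts crossing intention."""
    def __init__(self, motion_dim=8, hidden=128):
        super().__init__()
        self.flow_enc = OpticalFlowEncoder3D(hidden)
        self.ctx_enc = ContextEncoder2D(hidden)
        self.motion_gru = nn.GRU(motion_dim, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # crossing vs. not crossing

    def forward(self, flow, ctx, motion):
        # motion: (B, T, motion_dim), e.g. per-frame bounding box + ego-vehicle speed
        _, h = self.motion_gru(motion)
        query = h[-1].unsqueeze(1)                        # (B, 1, hidden)
        keys = torch.stack([self.flow_enc(flow), self.ctx_enc(ctx)], dim=1)
        fused, _ = self.attn(query, keys, keys)           # attend over spatio-temporal features
        return self.head(fused.squeeze(1))


if __name__ == "__main__":
    model = IntentionNet()
    flow = torch.randn(2, 2, 16, 64, 64)      # 16-frame optical flow clip
    ctx = torch.randn(2, 3, 128, 128)          # scene-context crop
    motion = torch.randn(2, 16, 8)             # per-frame motion vector
    print(model(flow, ctx, motion).shape)      # torch.Size([2, 2])
```

The fusion step here uses the motion encoding as the attention query over the flow and context features; the paper instead determines its fusion structure through ablation studies, so this arrangement should be read only as one plausible configuration.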