Keywords: computer science; interpretability; artificial intelligence; convolutional neural network; visualization; deep learning; feature; RGB color model; semantic feature; ground truth; context; machine learning; pattern recognition
DOI: 10.1016/j.knosys.2022.109006
Abstract
Driver focus of attention (DFoA) is a fundamental research problem in human-like autonomous driving systems. However, most existing methods require large amounts of ground-truth DFoA data for training, which are difficult to collect. Inspired by the visual interpretability of neural networks, this study developed a DFoA prediction method based on feature visualization of a deep autonomous driving model, which therefore requires no ground-truth DFoA data for training. Specifically, we propose a multimodal spatiotemporal convolutional network with an attention mechanism for DFoA prediction. First, semantic and depth images are generated from RGB video frames so that a multipath convolutional network can learn spatiotemporal information from successive images. A parameter-free attention mechanism with 3-D weights, derived from an energy function, computes the importance of each neuron, and a graph attention network learns the semantic context features most relevant to driving behavior. The learned features are fused, and a convolutional long short-term memory network (ConvLSTM) models the evolution of the fused features across successive frames while accounting for historical scene variation. Finally, a novel feature visualization method predicts DFoA by visualizing the driving-behavior-relevant features. Experimental results demonstrate that the proposed method predicts DFoA accurately.
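The "parameter-free attention mechanism with 3-D weights" computed from an energy function reads like the SimAM formulation (Yang et al., ICML 2021); the abstract gives no code, so the PyTorch sketch below assumes that formulation, with lam as the regularization coefficient of the energy function:

import torch

def simam_attention(x: torch.Tensor, lam: float = 1e-4) -> torch.Tensor:
    # x: spatiotemporal feature map of shape (B, C, T, H, W); every neuron
    # receives its own weight, so the weights are 3-D (T, H, W) per channel.
    n = x.shape[2] * x.shape[3] * x.shape[4] - 1
    # squared deviation of each neuron from its channel mean
    d = (x - x.mean(dim=(2, 3, 4), keepdim=True)).pow(2)
    # channel-wise variance estimate over the remaining neurons
    v = d.sum(dim=(2, 3, 4), keepdim=True) / n
    # inverse of the minimal energy: distinctive neurons have low energy,
    # hence high importance
    e_inv = d / (4 * (v + lam)) + 0.5
    # rescale the features by the sigmoid of the per-neuron importance
    return x * torch.sigmoid(e_inv)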
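The ConvLSTM used to evolve the fused features over successive frames is a standard component (Shi et al., 2015); a minimal single-cell sketch follows, with the channel counts as hypothetical parameters:

import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    # One recurrence step: all four gates come from a single convolution over
    # the concatenated input and hidden state, so the spatial layout of the
    # fused feature maps is preserved across time.
    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state  # hidden and cell state, each of shape (B, hid_ch, H, W)
        i, f, o, g = torch.chunk(self.conv(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

Running the cell frame by frame over the fused features carries historical scene variation forward through the recurrent state (h, c).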
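The abstract does not specify the novel feature visualization method itself; as a generic stand-in only, not the paper's method, an activation-based saliency map (mean feature magnitude, upsampled to frame resolution and min-max normalized) illustrates how a DFoA map can be read off learned driving-behavior-relevant features:

import torch
import torch.nn.functional as F

def activation_saliency(feat: torch.Tensor, frame_hw: tuple) -> torch.Tensor:
    # feat: (B, C, h, w) feature maps; returns a (B, 1, H, W) map in [0, 1].
    sal = feat.abs().mean(dim=1, keepdim=True)  # mean magnitude per location
    sal = F.interpolate(sal, size=frame_hw, mode="bilinear", align_corners=False)
    flat = sal.flatten(1)
    lo = flat.min(dim=1).values.view(-1, 1, 1, 1)
    hi = flat.max(dim=1).values.view(-1, 1, 1, 1)
    return (sal - lo) / (hi - lo + 1e-8)  # min-max normalize per sample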