Keywords
Trajectory, Computer science, Mixture model, Multimodality, Artificial intelligence, Gaussian distribution, Latent variable, Pattern recognition (psychology), Machine learning, Astronomy, Quantum mechanics, Physics, World Wide Web
Authors
Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Wei Tang, Nanning Zheng, Gang Hua
Identifier
DOI: 10.1109/tpami.2023.3268110
Abstract
Representing multimodal behaviors is a critical challenge for pedestrian trajectory prediction. Previous methods commonly represent this multimodality with multiple latent variables repeatedly sampled from a latent space, which makes interpretable trajectory prediction difficult. Moreover, the latent space is usually built by encoding global interaction into the future trajectory, which inevitably introduces superfluous interactions and thus degrades performance. To tackle these issues, we propose a novel Interpretable Multimodality Predictor (IMP) for pedestrian trajectory prediction, whose core idea is to represent a specific mode by its mean location. We model the distribution of mean locations as a Gaussian Mixture Model (GMM) conditioned on sparse spatio-temporal features, and sample multiple mean locations from the decoupled components of the GMM to encourage multimodality. Our IMP brings four-fold benefits: 1) interpretable prediction that provides semantics about the motion behavior of a specific mode; 2) friendly visualization of multimodal behaviors; 3) sound theoretical feasibility for estimating the distribution of mean locations, supported by the central limit theorem; 4) effective sparse spatio-temporal features that reduce superfluous interactions and model the temporal continuity of interaction. Extensive experiments validate that our IMP not only outperforms state-of-the-art methods but also achieves controllable prediction by customizing the corresponding mean location.
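The abstract's core sampling idea, drawing one mean location from each decoupled GMM component so that every mode is represented, rather than repeatedly sampling a shared latent space, can be sketched as follows. This is a minimal NumPy illustration: the component weights, means, and covariances below are fixed stand-ins, whereas in IMP they would be predicted from sparse spatio-temporal features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical GMM over 2-D mean locations with K = 3 components.
# These parameters are illustrative stand-ins, not the paper's
# learned, feature-conditioned values.
weights = np.array([0.5, 0.3, 0.2])      # mixture weights (sum to 1)
means = np.array([[2.0, 1.0],            # per-component means (x, y)
                  [0.0, 3.0],
                  [-1.5, -0.5]])
covs = np.stack([0.1 * np.eye(2)] * 3)   # per-component covariances

# Decoupled sampling: draw one mean location from each component,
# so every mode contributes exactly one interpretable anchor point,
# instead of repeatedly sampling the full mixture (which can miss
# low-weight modes).
mode_locations = np.array([
    rng.multivariate_normal(means[k], covs[k])
    for k in range(len(weights))
])

print(mode_locations.shape)  # (3, 2): one 2-D mean location per mode
```

Each row of `mode_locations` then anchors one predicted mode, which is what makes the prediction controllable: customizing a row steers the corresponding trajectory mode.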