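Computer science
Benchmark (surveying)
Artificial intelligence
Sampling (signal processing)
Sequence (biology)
Skeleton (computer programming)
Class (philosophy)
Joint (building)
Machine learning
Sample (material)
Computer vision
Structural engineering
Chemistry
Geodesy
Filter (signal processing)
Chromatography
Biology
Engineering
Genetics
Programming language
Geography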
Authors
Murchana Baruah, Bonny Banerjee, Atulya K. Nagar
Source
Journal: IEEE Transactions on Human-Machine Systems
[Institute of Electrical and Electronics Engineers]
Date: 2023-02-06
Volume (Issue): 53 (2): 458-463
Citations: 2
Identifier
DOI: 10.1109/thms.2023.3239648
Abstract
The human ability to infer others' intent is innate and crucial to development. Machines ought to acquire this ability for seamless interaction with humans. In this article, we propose an agent model for predicting the intent of actors in human–human interactions. This requires simultaneous generation and recognition of an interaction at any time, for which end-to-end models are scarce. The proposed agent actively samples its environment via a sequence of glimpses. At each sampling instant, the model infers the observation class and completes the partially observed body motion. It learns the sequence of body locations to sample by jointly minimizing the classification and generation errors. The model is evaluated on videos of two-skeleton interactions under two settings: (first person) one skeleton is the modeled agent and the other skeleton's joint movements constitute its visual observation, and (third person) an audience is the modeled agent and the two interacting skeletons' joint movements constitute its visual observation. Three methods for implementing the attention mechanism are analyzed using benchmark datasets. One of them, where attention is driven by sensory prediction error, achieves the highest classification accuracy in both settings by sampling less than 50% of the skeleton joints, while also being the most efficient in terms of model size. This is the first known attention-based agent to learn end-to-end from two-person interactions for intent prediction, with high accuracy and efficiency.
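The idea of attention driven by sensory prediction error can be illustrated with a small sketch. This is not the authors' implementation: the function names `select_glimpse` and `complete_frame`, the squared-error measure, and the fixed joint budget are all assumptions made for illustration. The sketch picks the joints whose predicted positions deviate most from the observation, then fills in the unsampled joints with the model's own prediction, so that fewer than half of the joints are actually observed.

```python
def select_glimpse(predicted, observed, budget):
    """Pick the `budget` joints with the largest squared prediction error.

    `predicted` and `observed` are lists of per-joint coordinate lists.
    Hypothetical helper: the error measure and the fixed budget are
    illustrative assumptions, not taken from the paper.
    """
    errors = [
        sum((p - o) ** 2 for p, o in zip(pred_joint, obs_joint))
        for pred_joint, obs_joint in zip(predicted, observed)
    ]
    # Rank joint indices from largest to smallest error.
    ranked = sorted(range(len(errors)), key=lambda j: errors[j], reverse=True)
    return sorted(ranked[:budget])


def complete_frame(predicted, observed, selected):
    """Use observed values at sampled joints and predictions elsewhere."""
    chosen = set(selected)
    return [
        list(observed[j]) if j in chosen else list(predicted[j])
        for j in range(len(predicted))
    ]


# Toy frame with four joints in 3-D; all values are made up.
predicted = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]
observed = [[0.0, 0.0, 0.0], [1.0, 1.0, 2.0], [2.0, 2.0, 2.5], [5.0, 3.0, 3.0]]

selected = select_glimpse(predicted, observed, budget=2)  # 2 of 4 joints
completed = complete_frame(predicted, observed, selected)
```

In a full model the prediction would come from a learned generative component and the error would be computed only from what the agent has already sensed; the sketch uses the full frame purely to keep the selection rule visible.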