A Local–Global Estimator Based on Large Kernel CNN and Transformer for Human Pose Estimation and Running Pose Measurement

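Keywords: Pose; Artificial intelligence; Computer science; Convolutional neural network; Locality; Estimator; Encoder; Transformer; Pattern recognition (psychology); Computer vision; Machine learning; Engineering; Mathematics; Statistics; Operating system; Electrical engineering; Philosophy; Voltage; Linguistics
Authors
Qingtian Wu, Yongfei Wu, Yu Zhang, Liming Zhang
Source
Journal: IEEE Transactions on Instrumentation and Measurement [Institute of Electrical and Electronics Engineers]
Volume: 71, pp. 1-12; Cited by: 18
Identifier
DOI:10.1109/tim.2022.3200438
Abstract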

Running poses in a crowd can serve as an early warning of most abnormal events (e.g., chasing, fleeing, and robbing), and they can be detected through human behavior analysis based on human pose measurement. Although deep convolutional neural networks (CNNs) have achieved impressive progress in human pose estimation, how to further improve the trade-off between estimation accuracy and speed remains an open issue. In this work, we first propose an efficient local-global estimator for human pose estimation, called LGPose. Then, based on the keypoints estimated by LGPose, we define a simple regression model that uses the geometry of the joints to achieve fast and accurate running pose measurement. To model the relationships between the human keypoints, a vision transformer (ViT) encoder is adopted to learn the long-range interdependencies between them at the pixel level. However, the transformer encoder operates on sequences: it linearly projects 2D image patches to 1D tokens, which discards important local information. Locality is nevertheless crucial, since it relates to lines, edges, and shapes. To learn locality, we replace the original fully connected network in the feedforward module of the ViT with effective CNN modules. Experiments on the MPII and COCO Keypoint val2017 datasets show that the proposed LGPose achieves the best trade-off among the compared state-of-the-art methods. Moreover, we build a lightweight running movement dataset to verify the effectiveness of LGPose. Based on the human pose estimated by LGPose, the proposed regression model measures running pose with an accuracy of 86.4% without training any other classifier. Our source code and running dataset will be made publicly available.
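The tokenization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the LGPose code: the function name `image_to_patch_tokens`, the toy 4x4 image, and the patch size of 2 are all assumptions made for illustration, and the learned linear projection that a ViT applies to each flattened patch is left out. The sketch only shows how non-overlapping 2D patches become 1D tokens.

```python
def image_to_patch_tokens(image, patch):
    """Split a 2D grid (list of rows) into non-overlapping patch x patch
    blocks and flatten each block row-major into one 1D token.

    Hypothetical helper for illustration; a real ViT would follow this
    with a learned linear projection of every token.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Row-major flattening of one patch into a 1D list.
            token = [image[top + r][left + c]
                     for r in range(patch) for c in range(patch)]
            tokens.append(token)
    return tokens


# Toy 4x4 image whose pixel value encodes its position (row * 4 + col).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = image_to_patch_tokens(image, patch=2)
for token in tokens:
    print(token)
```

In this toy case, pixels 1 and 2 are horizontal neighbours in the image but end up in different tokens, while pixels 0 and 4, which are vertical neighbours, are no longer adjacent inside their token. This is one way to picture the loss of locality that the abstract attributes to sequence-based processing.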