Computer science
RGB color model
Artificial intelligence
Fusion
Computer vision
Pattern
Sensor fusion
Modality (human-computer interaction)
Object (grammar)
Complementarity (molecular biology)
Human-computer interaction
Social science
Philosophy
Linguistics
Sociology
Biology
Genetics
Authors
Xingyu Liu,Pengfei Ren,Yuchen Chen,Cong Liu,Jing Wang,Haifeng Sun,Qi Qi,Jingyu Wang
Identifier
DOI:10.1145/3543507.3587429
Abstract
Web-based AR technology has broadened human-computer interaction scenes from traditional mechanical devices and flat screens to the real world, resulting in unconstrained environmental challenges such as complex backgrounds, extreme illumination, depth range differences, and hand-object interaction. Previous hand detection and 3D hand pose estimation methods are usually based on a single modality, such as RGB or depth data, and are therefore not applicable in some scenarios of unconstrained environments due to the differences between the two modalities. To address this problem, we propose a multimodal fusion approach, named Scene-Adapt Fusion (SA-Fusion), which can fully utilize the complementarity of the RGB and depth modalities in web-based HCI tasks. SA-Fusion can be applied to existing hand detection and 3D hand pose estimation frameworks to boost their performance, and can be further integrated into a prototype AR system to construct a web-based interactive AR application for unconstrained environments. To evaluate the proposed multimodal fusion method, we conduct two user studies on the CUG Hand and DexYCB datasets, demonstrating its effectiveness in accurately detecting hands and estimating 3D hand poses in unconstrained environments and under hand-object interaction.
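The abstract describes exploiting the complementarity of RGB and depth features but does not spell out a fusion mechanism. The sketch below is a minimal, hypothetical illustration of one common way to fuse two modalities with a learned per-channel gate; the class name GatedRGBDFusion and its structure are assumptions for illustration only and are not the paper's actual SA-Fusion design.

```python
# Illustrative sketch of gated RGB-depth feature fusion (hypothetical,
# NOT the paper's SA-Fusion): a gate predicted from both modalities lets
# the network lean on depth when RGB is unreliable (e.g. extreme
# illumination) and on RGB when depth is noisy or out of range.
import torch
import torch.nn as nn


class GatedRGBDFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Per-sample, per-channel weights in [0, 1] computed from the
        # concatenated RGB and depth feature maps.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        # w close to 1 favors the RGB branch, close to 0 favors depth.
        w = self.gate(torch.cat([rgb_feat, depth_feat], dim=1))
        return w * rgb_feat + (1.0 - w) * depth_feat


if __name__ == "__main__":
    fusion = GatedRGBDFusion(channels=64)
    rgb = torch.randn(2, 64, 32, 32)    # feature map from an RGB backbone
    depth = torch.randn(2, 64, 32, 32)  # feature map from a depth backbone
    print(fusion(rgb, depth).shape)     # torch.Size([2, 64, 32, 32])
```

Such a fused feature map could then feed an off-the-shelf hand detection or 3D hand pose estimation head, which matches the abstract's claim that the fusion module plugs into existing frameworks.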