Computer science
Leverage (statistics)
Artificial intelligence
Labeled data
Machine learning
Inference
Complement (music)
Task (project management)
Semi-supervised learning
Set (abstract data type)
Training set
Supervised learning
Pattern recognition (psychology)
Artificial neural network
Economics
Phenotype
Chemistry
Management
Complementarity
Programming language
Gene
Biochemistry
Authors
Chi Ian Tang, Ignacio Perez-Pozuelo, Dimitris Spathis, Søren Brage, Nicholas J. Wareham, Cecilia Mascolo
Source
Journal: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Publisher: Association for Computing Machinery
Date: 2021-03-19
Volume/Issue: 5 (1): 1-30
Citations: 76
Abstract
Machine learning and deep learning have shown great promise in mobile sensing applications, including Human Activity Recognition. However, the performance of such models in real-world settings largely depends on the availability of large datasets that capture diverse behaviors. Recently, studies in computer vision and natural language processing have shown that leveraging massive amounts of unlabeled data enables performance on par with state-of-the-art supervised models. In this work, we present SelfHAR, a semi-supervised model that effectively learns to leverage unlabeled mobile sensing datasets to complement small labeled datasets. Our approach combines teacher-student self-training, which distills the knowledge of unlabeled and labeled datasets while allowing for data augmentation, and multi-task self-supervision, which learns robust signal-level representations by predicting distorted versions of the input. We evaluated SelfHAR on various HAR datasets and showed state-of-the-art performance over supervised and previous semi-supervised approaches, with up to a 12% increase in F1 score using the same number of model parameters at inference. Furthermore, SelfHAR is data-efficient, reaching similar performance using up to 10 times less labeled data compared to supervised approaches. Our work not only achieves state-of-the-art performance on a diverse set of HAR datasets, but also sheds light on how pre-training tasks may affect downstream performance.
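The multi-task self-supervision component described in the abstract can be illustrated with a short sketch: each unlabeled sensor window is randomly distorted by a set of signal transformations, and a shared encoder with one binary head per transformation is trained to predict which distortions were applied. The transformation set, parameters, and function names below are illustrative assumptions for a minimal NumPy sketch, not the paper's exact implementation.

```python
import numpy as np

# Hypothetical signal transformations for the self-supervision task
# (the paper's exact transformation set and parameters may differ).
def add_noise(x, sigma=0.05):
    return x + np.random.normal(0.0, sigma, size=x.shape)

def scale(x, sigma=0.1):
    # Rescale each channel by a random factor.
    return x * np.random.normal(1.0, sigma, size=(1, x.shape[1]))

def negate(x):
    return -x

def time_flip(x):
    return x[::-1, :]

def permute(x, n_segments=4):
    # Split the window into time segments and shuffle their order.
    segments = np.array_split(x, n_segments, axis=0)
    np.random.shuffle(segments)
    return np.concatenate(segments, axis=0)

TRANSFORMS = [add_noise, scale, negate, time_flip, permute]

def make_self_supervised_batch(windows, p=0.5):
    """Randomly apply each transformation with probability p and return
    (distorted windows, multi-label targets indicating which were applied)."""
    xs, ys = [], []
    for x in windows:
        y = np.zeros(len(TRANSFORMS), dtype=np.float32)
        for i, transform in enumerate(TRANSFORMS):
            if np.random.rand() < p:
                x = transform(x)
                y[i] = 1.0
        xs.append(x)
        ys.append(y)
    return np.stack(xs), np.stack(ys)

# Example: 8 unlabeled accelerometer windows of 400 samples x 3 axes.
unlabeled = np.random.randn(8, 400, 3)
x_batch, y_batch = make_self_supervised_batch(unlabeled)
# A shared encoder with one sigmoid head per transformation would then be
# trained with binary cross-entropy on (x_batch, y_batch).
```

In the full pipeline as summarized above, a teacher model trained on the small labeled set would additionally pseudo-label the unlabeled data, and the student would be trained jointly on activity labels and transformation-discrimination targets before fine-tuning; the sketch here only covers the transformation-labeling step.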