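Computer science
Feature selection
Relevance (law)
Outlier
Feature (linguistics)
Machine learning
Artificial intelligence
Data mining
Cluster analysis
Curse of dimensionality
Independent and identically distributed random variables
Dimension (graph theory)
Internet of Things
Linguistics
Philosophy
Statistics
Mathematics
Random variable
Political science
Pure mathematics
Law
Embedded system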
Authors
Xunzheng Zhang,Alex Mavromatis,Antonis Vafeas,Reza Nejabati,Dimitra Simeonidou
Identifier
DOI:10.1109/jiot.2023.3237032
Abstract
In horizontal federated learning (HFL) for Internet of Things (IoT) scenarios, the data sets of different users share largely similar feature spaces, and the goal is to build a high-performance global model. However, not all features contribute usefully when training the global HFL model, and some even impair it. Moreover, the curse of dimensionality lengthens training time and increases energy consumption (EC). It is therefore critical to remove irrelevant features locally and to select the useful overlapping features from a federated, global perspective. In addition, the uncertainty about whether data are labeled and the nonindependent and identically distributed (non-IID) nature of client data should also be considered. This article introduces an unsupervised federated feature selection approach (named FSHFL) for HFL in IoT networks. First, a feature relevance outlier detection method, combined with an improved one-class support vector machine, is applied to the HFL participants to remove useless features. Second, a feature relevance hierarchical clustering (FRHC) algorithm is proposed for HFL overlapping feature selection. Experimental results on four IoT data sets show that the proposed methods select better federated feature sets among HFL participants, thus improving the performance of the HFL system. Specifically, global model accuracy improves by up to 1.68% because fewer irrelevant features are used. Moreover, FSHFL lowers the average training time by up to 6.9%. Finally, for the same global model test accuracy, FSHFL decreases the average EC of model training by approximately 2.85% compared with federated averaging and by roughly 68.39% compared with Fed-SGD.
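The two-stage idea in the abstract (drop low-relevance features on each client, then keep the features that clients agree on) can be illustrated with a small sketch. This is not the authors' FSHFL implementation: the relevance score (mean absolute Pearson correlation with the other features), the largest-gap split that stands in for the one-class support vector machine, and the majority vote that stands in for FRHC are all assumptions chosen to keep the example dependency-free. All function names and the sample data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def relevance_scores(columns):
    """Assumed unsupervised relevance: mean |r| between a feature and all others."""
    scores = {}
    for name, col in columns.items():
        others = [abs(pearson(col, c)) for n, c in columns.items() if n != name]
        scores[name] = sum(others) / len(others)
    return scores


def keep_high_cluster(scores):
    """Split 1-D scores at the largest gap and keep the higher group.

    For 1-D data this matches single-linkage clustering cut at two clusters;
    it is a stand-in for the outlier detector described in the abstract.
    """
    if len(scores) < 2:
        return set(scores)
    ranked = sorted(scores, key=scores.get)
    gaps = [scores[ranked[i + 1]] - scores[ranked[i]] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps))
    return set(ranked[cut + 1:])


def federated_select(client_selections, min_share=0.5):
    """Keep features selected by at least min_share of the clients (assumed rule)."""
    votes = {}
    for selection in client_selections:
        for feature in selection:
            votes[feature] = votes.get(feature, 0) + 1
    need = min_share * len(client_selections)
    return sorted(f for f, v in votes.items() if v >= need)


# Hypothetical non-IID client data: same feature names, different value ranges.
client_a = {
    "temp": [1, 2, 3, 4, 5, 6],
    "humidity": [2, 4, 6, 8, 10, 12],
    "noise": [1, -1, -1, 1, -1, 1],
}
client_b = {
    "temp": [10, 12, 11, 15, 14, 18],
    "humidity": [30, 33, 32, 40, 37, 45],
    "noise": [0, 1, 0, 1, 1, 0],
}

local = [keep_high_cluster(relevance_scores(c)) for c in (client_a, client_b)]
selected = federated_select(local)
```

Only the per-client sets of feature names are shared in this sketch, not the raw readings, which is in keeping with the federated setting. A production system would replace each stand-in with the corresponding method from the paper.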