Computer science
Differential privacy
Artificial intelligence
Machine learning
Retraining
Information privacy
Federated learning
Big data
Data mining
Computer security
International trade
Business
Authors
Lefeng Zhang,Tianqing Zhu,Haibin Zhang,Ping Xiong,Wanlei Zhou
Identifier
DOI:10.1109/tifs.2023.3297905
Abstract
Over the past decades, the abundance of personal data has led to the rapid development of machine learning models and important advances in artificial intelligence (AI). However, alongside these achievements, there are increasing privacy threats and security risks that may cause significant losses for data providers. Recent legislation requires that private information about a user be removed from databases as well as from machine learning models upon certain deletion requests. While erasing data records from memory storage is straightforward, it is often challenging to remove the influence of particular data samples from a model that has already been trained. Machine unlearning is an emerging paradigm that aims to make machine learning models “forget” what they have learned about particular data. Nevertheless, the unlearning problem for federated learning has not been completely addressed due to its special working mode. First, existing solutions rely crucially on retraining-based model calibration, which is likely unavailable and can pose new privacy risks in federated learning frameworks. Second, today’s efficient unlearning strategies are mainly designed for convex problems and are incapable of handling more complicated learning tasks such as neural networks. To overcome these limitations, we took advantage of differential privacy and developed an efficient machine unlearning algorithm named FedRecovery. FedRecovery erases the impact of a client by removing a weighted sum of gradient residuals from the global model, and tailors the Gaussian noise so that the unlearned model and the retrained model are statistically indistinguishable. Furthermore, the algorithm neither requires retraining-based fine-tuning nor needs the assumption of convexity. Theoretical analyses show a rigorous indistinguishability guarantee. Additionally, experimental results on real-world datasets demonstrate that FedRecovery is efficient and is able to produce a model that performs similarly to the retrained one.
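The abstract only informally outlines the core update: subtract a weighted sum of the target client's gradient residuals from the global model, then add calibrated Gaussian noise. The Python sketch below illustrates that idea under stated assumptions; it is not the paper's exact algorithm. The residual definition (target update minus the average of the other clients' updates), the per-round weights, and the fixed noise scale sigma are all illustrative choices, and the names unlearn_client, client_updates_per_round, and target_id are hypothetical.

import numpy as np

def unlearn_client(global_model, client_updates_per_round, target_id,
                   weights=None, sigma=0.05, rng=None):
    """Minimal sketch of a FedRecovery-style unlearning step (illustrative only).

    global_model             : final aggregated parameter vector, shape (d,)
    client_updates_per_round : list over rounds; each entry maps client_id -> update vector
    target_id                : client whose influence should be erased
    weights                  : per-round weights for the residuals (assumed, not from the paper)
    sigma                    : std of the Gaussian noise blurring the gap to full retraining
    """
    rng = np.random.default_rng() if rng is None else rng
    num_rounds = len(client_updates_per_round)
    if weights is None:
        # Assumed weighting: later rounds count more, normalized to sum to 1.
        weights = np.arange(1, num_rounds + 1, dtype=float)
        weights /= weights.sum()

    correction = np.zeros_like(global_model)
    for w, round_updates in zip(weights, client_updates_per_round):
        others = [u for cid, u in round_updates.items() if cid != target_id]
        avg_others = np.mean(others, axis=0)
        # "Gradient residual" here: how far the target client pulled this round's
        # aggregate away from what the remaining clients alone would have produced.
        residual = round_updates[target_id] - avg_others
        # Under plain FedAvg the target held 1/n of the aggregation weight.
        correction += w * residual / len(round_updates)

    unlearned = global_model - correction
    # Gaussian noise (fixed sigma in this sketch) so the unlearned model is hard to
    # distinguish statistically from one retrained without the target client.
    unlearned += rng.normal(0.0, sigma, size=global_model.shape)
    return unlearned

# Toy usage: 3 clients, 4 rounds, a 10-dimensional model, erase client 0.
rng = np.random.default_rng(0)
updates = [{cid: rng.normal(size=10) for cid in range(3)} for _ in range(4)]
final_model = np.mean([np.mean(list(r.values()), axis=0) for r in updates], axis=0)
recovered = unlearn_client(final_model, updates, target_id=0, rng=rng)

The appeal of this style of update, as the abstract notes, is that it touches only quantities already recorded during training (per-round client updates), so no retraining-based fine-tuning and no convexity assumption are needed.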