Authors
Bahrul Ilmi Nasution, Yudhistira Nugraha, Irfan Dwiki Bhaswara, Muhammad Erza Aminanto
Identifier
DOI:10.1145/3605772.3624003
Abstract
With the rise of personal data laws in various countries, data privacy has become an essential issue. Differential privacy is one of the best-known techniques for protecting privacy during analysis. However, many studies have shown that differential privacy degrades machine learning model performance. This is problematic for organizations such as governments, which must draw policy from accurate insights into citizen statistics while maintaining citizen privacy. This study reviews differential privacy in machine learning algorithms and evaluates its performance on real COVID-19 patient data, using Jakarta, Indonesia as a case study. We also validate our findings on two additional datasets: the public Adult dataset from the University of California, Irvine, and an Indonesian socioeconomic dataset. We find that differential privacy tends to reduce accuracy and may lead to model failure on imbalanced data, particularly in more complex models such as random forests. These findings indicate that differential privacy is practical for trustworthy government, but comes with distinct challenges. We discuss limitations and recommendations for organizations that work with personal data and wish to leverage differential privacy in the future.
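To make the core idea concrete, here is a minimal sketch of the classic Laplace mechanism for releasing a differentially private count, the simplest building block behind the kinds of statistics the abstract discusses. This is an illustrative example, not the paper's implementation; the dataset, predicate, and epsilon value below are hypothetical.

```python
import math
import random


def laplace_noise(scale):
    # Draw one sample from Laplace(0, scale) via inverse-CDF sampling.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


def dp_count(values, predicate, epsilon):
    """Release a count with epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one record
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon suffices for the epsilon-DP guarantee.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)


# Hypothetical example: count patients over 60 with epsilon = 1.0.
ages = [34, 61, 72, 45, 68, 59, 80, 23]
noisy = dp_count(ages, lambda a: a > 60, epsilon=1.0)
```

Smaller epsilon means stronger privacy but more noise, which is one source of the accuracy loss the study measures: for rare classes in imbalanced data, the noise can be comparable to the true counts themselves.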