Ke Cheng, Ning Xi, Ximeng Liu, Xinghui Zhu, Haichang Gao, Zhiwei Zhang, Yulong Shen
Source
Journal: IEEE Transactions on Computers [Institute of Electrical and Electronics Engineers]; Date: 2023-12-01; Volume/Issue: 72 (12): 3519-3531
Identifier
DOI: 10.1109/tc.2023.3305754
Abstract
The advances in deep neural networks (DNNs) have driven many companies to offer their carefully trained DNNs as inference services for clients' private data. Privacy concerns have increasingly motivated the need for private inference (PI), where DNN inferences are performed directly on encrypted data without revealing the client's private inputs to the server or the server's proprietary DNN weights to the client. However, existing cryptographic protocols for PI suffer from impractically high latency, stemming mostly from non-linear operators such as ReLU activations. In this paper, we propose PAPI, a Practical and Adaptive Private Inference framework. First, we develop an accuracy-adaptive neural architecture search (NAS) approach to generate DNN models tailored for high-efficiency ciphertext computation. Specifically, our NAS automatically generates DNNs with fewer ReLUs while keeping the accuracy above a user-defined target. Second, we propose secure online/offline protocols for ReLU activation and its approximation variants (i.e., polynomial activations), which rely purely on lightweight secret-sharing techniques during online execution and integrate well with our optimized DNNs in the ciphertext domain. Experimental results show that PAPI reduces online inference latency on the CIFAR-10/100 and ImageNet datasets by 2.7×–7.8× over the state-of-the-art.
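The abstract does not spell out the online/offline secret-sharing protocols themselves. Below is a minimal illustrative sketch of the standard Beaver-triple pattern that such online/offline protocols typically build on, applied here to a degree-2 polynomial activation (x -> x^2) on additively shared inputs. The modulus P, the dealer-style offline phase, and all names are assumptions chosen for illustration; this is not PAPI's actual protocol.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus; all arithmetic is over Z_P

def share(x):
    """Split x into two additive shares: x = x0 + x1 (mod P)."""
    x0 = secrets.randbelow(P)
    return x0, (x - x0) % P

def reconstruct(x0, x1):
    """Open a shared value by summing the two shares."""
    return (x0 + x1) % P

def offline_triple():
    """Offline phase: prepare a shared Beaver triple (a, b, c) with c = a*b.
    In a real deployment this comes from a dealer or an OT/HE subprotocol."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    c = (a * b) % P
    return share(a), share(b), share(c)

def online_mul(x_sh, y_sh, triple):
    """Online phase: multiply shared x and y using only additions,
    subtractions, and two share openings (no cryptographic operations)."""
    (a0, a1), (b0, b1), (c0, c1) = triple
    # Parties locally mask their shares, then jointly open e = x-a and f = y-b.
    e = reconstruct((x_sh[0] - a0) % P, (x_sh[1] - a1) % P)
    f = reconstruct((y_sh[0] - b0) % P, (y_sh[1] - b1) % P)
    # x*y = e*f + e*b + f*a + c, computed share-wise (e*f added by one party).
    z0 = (c0 + e * b0 + f * a0 + e * f) % P
    z1 = (c1 + e * b1 + f * a1) % P
    return z0, z1

# Degree-2 polynomial activation on a shared input:
x_sh = share(12345)
sq_sh = online_mul(x_sh, x_sh, offline_triple())
assert reconstruct(*sq_sh) == (12345 * 12345) % P
```

Note how the online phase uses only modular additions, subtractions, and two openings, while all secret-dependent multiplications are pushed into offline triple generation; this is the latency split that online/offline protocols of the kind the abstract describes rely on.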