

Poster

Encrypted Data Pruning for Confidential Training of Deep Neural Networks

Yancheng Zhang · Mengxin Zheng · Yuzhang Shang · Xun Chen · Qian Lou

Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract: Due to a lack of expertise and computational resources, data owners often delegate deep learning model training to cloud servers. In many scenarios, e.g., healthcare and finance, the data required for model training is highly confidential, and business, legal, and ethical constraints prevent sharing it directly. Fully Homomorphic Encryption (FHE), a non-interactive cryptographic computing technique, offers a promising solution for confidential training on encrypted data. One challenge of FHE-based confidential training is its large computational overhead, especially the multiple rounds of forward and backward passes over each encrypted data sample. Since many data samples are largely redundant, pruning them can significantly speed up training, as has been demonstrated for plaintext (non-FHE) training. However, data pruning requires knowledge computed during the training phase, which rules out client-side data pruning, and performing data pruning on the server side is non-trivial because computing the pruning scores involves complex and expensive operations on encrypted data. To date, there is no FHE-based data-pruning protocol for efficient confidential training. In this paper, we first construct a basic FHE data-pruning protocol and then design FHE-friendly data-pruning algorithms for the client-aided and non-client-aided settings, respectively. We observe that pruning data samples does not always remove ciphertexts: pruned samples can leave large numbers of empty slots while the ciphertext count, and hence the computation, stays the same. We therefore propose ciphertext-wise pruning, which reduces the number of ciphertext computations without hurting accuracy. Experimental results show that our work achieves a $16\times$ speedup with only a $0.6\%$ accuracy drop over prior work. The code is publicly available at https://anonymous.4open.science/r/PrivateDataPrune-23AC.
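The paper itself specifies the actual protocol; the following is only a minimal plaintext simulation of why ciphertext-wise pruning helps, under assumed conditions: samples packed `SLOTS` per ciphertext (CKKS-style batching) and hypothetical per-sample importance scores (e.g., loss-based). It contrasts sample-wise pruning, where a ciphertext must still be processed if any of its slots survives, with ciphertext-wise pruning, where whole ciphertexts are dropped. All names and parameters here are illustrative, not the authors' implementation.

```python
import numpy as np

SLOTS = 8  # samples packed per ciphertext (illustrative; real slot counts are much larger)

def sample_wise_prune(scores, keep_ratio):
    """Keep the highest-scoring samples, slot by slot.

    Pruned samples merely leave empty slots; a ciphertext still costs a full
    encrypted forward/backward pass if any of its slots is kept.
    """
    n_keep = int(len(scores) * keep_ratio)
    keep = np.zeros(len(scores), dtype=bool)
    keep[np.argsort(scores)[-n_keep:]] = True
    return keep

def ciphertext_wise_prune(scores, keep_ratio):
    """Rank whole ciphertexts by aggregate slot importance and drop entire
    ciphertexts, so every pruned ciphertext removes real encrypted computation."""
    per_ct = scores.reshape(-1, SLOTS).sum(axis=1)   # one aggregate score per ciphertext
    n_ct_keep = int(len(per_ct) * keep_ratio)
    ct_keep = np.zeros(len(per_ct), dtype=bool)
    ct_keep[np.argsort(per_ct)[-n_ct_keep:]] = True
    return np.repeat(ct_keep, SLOTS)                 # expand back to a per-sample mask

def ciphertexts_touched(keep_mask):
    """Number of ciphertexts that still require homomorphic processing."""
    return int(keep_mask.reshape(-1, SLOTS).any(axis=1).sum())

rng = np.random.default_rng(0)
scores = rng.random(64)  # stand-in importance scores for 64 samples (8 ciphertexts)

for name, mask in [("sample-wise", sample_wise_prune(scores, 0.5)),
                   ("ciphertext-wise", ciphertext_wise_prune(scores, 0.5))]:
    print(f"{name:15s}: keep {mask.sum():2d} samples, "
          f"process {ciphertexts_touched(mask)} / {len(scores) // SLOTS} ciphertexts")
```

With random scores, sample-wise pruning typically scatters the kept samples across nearly all ciphertexts, so almost every ciphertext must still be processed; ciphertext-wise pruning halves the ciphertext count directly, which is the source of the speedup the abstract describes.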
