The wide adoption of deep neural networks (DNNs) in mission-critical applications has spurred the need for interpretable models that explain their decisions. Unfortunately, previous studies have demonstrated that model explanations leak information, rendering DNN models vulnerable to model inversion attacks. These attacks enable an adversary to reconstruct original images from model explanations, thus leaking privacy-sensitive features. To this end, we present Generative Noise Injector for Model Explanations (GNIME), a novel defense framework that perturbs model explanations to minimize the risk of model inversion attacks while preserving the interpretability of the generated explanations. Specifically, we formulate the defense training as a two-player minimax game between an inversion attack network, which aims to invert model explanations, and a noise generator network, which aims to inject perturbations that thwart model inversion attacks. We demonstrate that GNIME significantly decreases the information leakage in model explanations, decreasing transferable classification accuracy in facial recognition models by up to 84.8% while preserving the original functionality of model explanations.
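To make the two-player minimax idea concrete, here is a minimal toy sketch. It is not the GNIME architecture or its training objective: both networks are collapsed to single scalars, the "explanation" is assumed to leak the private input directly, and the names `VAR_X`, `LAMBDA`, `attack_loss`, and `train` are hypothetical. The attacker descends on its expected reconstruction error, while the defender ascends on that same error minus an assumed penalty on noise power that stands in for preserving explanation utility.

```python
# Toy scalar minimax game (illustrative assumption, not the GNIME networks).
# Private input x has variance VAR_X; the "explanation" leaks it directly: e = x.
# The defender releases e + s*z with z ~ N(0, 1) independent of x.
# The attacker reconstructs x_hat = w * (e + s*z).

VAR_X = 1.0    # variance of the private input (assumed)
LAMBDA = 0.25  # weight of the defender's penalty on noise power (assumed)


def attack_loss(w, s):
    """Expected squared reconstruction error E[(w*(x + s*z) - x)^2]."""
    return (w - 1.0) ** 2 * VAR_X + (w * s) ** 2


def train(steps=5000, lr=0.01):
    """Alternate one attacker step and one defender step per iteration."""
    w, s = 1.0, 0.1  # attacker weight, defender noise scale
    for _ in range(steps):
        # Attacker: gradient descent on the reconstruction error.
        grad_w = 2.0 * (w - 1.0) * VAR_X + 2.0 * w * s ** 2
        w -= lr * grad_w
        # Defender: gradient ascent on (attack_loss - LAMBDA * s^2).
        grad_s = 2.0 * w ** 2 * s - 2.0 * LAMBDA * s
        s += lr * grad_s
    return w, s


if __name__ == "__main__":
    w, s = train()
    print(f"attacker weight w={w:.3f}, noise scale s={s:.3f}")
    print(f"attack loss without noise: {attack_loss(1.0, 0.0):.3f}")
    print(f"attack loss after training: {attack_loss(w, s):.3f}")
```

In this toy setting the penalty weight `LAMBDA` controls the trade-off: a larger value keeps the released explanation closer to the original at the cost of a lower reconstruction error for the attacker.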
Author Information
Hoyong Jeong (KAIST)
Suyoung Lee (KAIST)
Sung Ju Hwang (KAIST, AITRICS)
Sooel Son (Korea Advanced Institute of Science and Technology)
More from the Same Authors
- 2021 Spotlight: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning
  Hayeon Lee · Sewoong Lee · Song Chong · Sung Ju Hwang
- 2021 Spotlight: Task-Adaptive Neural Network Search with Meta-Contrastive Learning
  Wonyong Jeong · Hayeon Lee · Geon Park · Eunyoung Hyung · Jinheon Baek · Sung Ju Hwang
- 2021: Skill-based Meta-Reinforcement Learning
  Taewook Nam · Shao-Hua Sun · Karl Pertsch · Sung Ju Hwang · Joseph Lim
- 2022: SPRINT: Scalable Semantic Policy Pre-training via Language Instruction Relabeling
  Jesse Zhang · Karl Pertsch · Jiahui Zhang · Taewook Nam · Sung Ju Hwang · Xiang Ren · Joseph Lim
- 2022: Targeted Adversarial Self-Supervised Learning
  Minseon Kim · Hyeonjeong Ha · Sooel Son · Sung Ju Hwang
- 2023 Poster: Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
  Minki Kang · Seanie Lee · Jinheon Baek · Kenji Kawaguchi · Sung Ju Hwang
- 2023 Poster: Generalizable Lightweight Proxy for Robust NAS against Diverse Perturbations
  Hyeonjeong Ha · Minseon Kim · Sung Ju Hwang
- 2023 Poster: STXD: Structural and Temporal Cross-Modal Distillation for Multi-View 3D Object Detection
  Sujin Jang · Dae Ung Jo · Sung Ju Hwang · Dongwook Lee · Daehyun Ji
- 2023 Poster: Effective Targeted Attacks for Adversarial Self-Supervised Learning
  Minseon Kim · Hyeonjeong Ha · Sooel Son · Sung Ju Hwang
- 2022 Poster: Factorized-FL: Personalized Federated Learning with Parameter Factorization & Similarity Matching
  Wonyong Jeong · Sung Ju Hwang
- 2022 Poster: Graph Self-supervised Learning with Accurate Discrepancy Learning
  Dongki Kim · Jinheon Baek · Sung Ju Hwang
- 2022 Poster: Set-based Meta-Interpolation for Few-Task Meta-Learning
  Seanie Lee · Bruno Andreis · Kenji Kawaguchi · Juho Lee · Sung Ju Hwang
- 2021 Poster: Edge Representation Learning with Hypergraphs
  Jaehyeong Jo · Jinheon Baek · Seul Lee · Dongki Kim · Minki Kang · Sung Ju Hwang
- 2021 Poster: Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation
  Soojung Yang · Doyeong Hwang · Seul Lee · Seongok Ryu · Sung Ju Hwang
- 2021 Poster: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning
  Hayeon Lee · Sewoong Lee · Song Chong · Sung Ju Hwang
- 2021 Poster: Task-Adaptive Neural Network Search with Meta-Contrastive Learning
  Wonyong Jeong · Hayeon Lee · Geon Park · Eunyoung Hyung · Jinheon Baek · Sung Ju Hwang
- 2021 Poster: Mini-Batch Consistent Slot Set Encoder for Scalable Set Encoding
  Bruno Andreis · Jeffrey Willette · Juho Lee · Sung Ju Hwang