Federated learning is used for decentralized training of machine learning models on a large number (millions) of edge mobile devices. It is challenging because mobile devices often have limited communication bandwidth and local computation resources. Improving the efficiency of federated learning is therefore critical for scalability and usability. In this paper, we propose to leverage partially trainable neural networks, which freeze a portion of the model parameters during the entire training process, to reduce the communication cost with little impact on model performance. Through extensive experiments, we empirically show that Federated learning of Partially Trainable neural networks (FedPT) can result in good communication-accuracy trade-offs, with up to a 46x reduction in communication cost at a small accuracy cost. Our approach also enables faster training, with a smaller memory footprint and greater robustness under strong privacy guarantees. The proposed FedPT can be particularly interesting for pushing the limits of overparameterization in on-device learning.
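The core mechanism described above can be illustrated with a minimal sketch: in each round, clients only compute and upload updates for the trainable subset of parameters, while the frozen subset never changes and never needs to be communicated. This is a hypothetical toy simulation (the `fedpt_round` function, the mask layout, and the client updates are all illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def fedpt_round(global_params, trainable_mask, client_updates):
    """One simulated server round: average client deltas and apply them
    only to the trainable coordinates; frozen coordinates stay fixed."""
    avg_delta = np.mean(client_updates, axis=0)
    new_params = global_params.copy()
    new_params[trainable_mask] += avg_delta[trainable_mask]
    return new_params

# Toy model: 10 parameters, of which only 4 are trainable (60% frozen).
rng = np.random.default_rng(0)
params = rng.standard_normal(10)
mask = np.zeros(10, dtype=bool)
mask[:4] = True

# Two simulated clients send local update deltas; the server applies
# the masked average. Only mask.sum() values per client need uploading.
updates = [rng.standard_normal(10) for _ in range(2)]
new_params = fedpt_round(params, mask, updates)

# Frozen parameters are untouched, so per-round upload shrinks from
# 10 values to 4 in this toy example.
assert np.allclose(new_params[~mask], params[~mask])
```

In a real system the mask would be structured (e.g., freezing entire layers at initialization), and the frozen values would be derivable from a shared random seed so they never need to be transmitted at all.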
Author Information
Hakim Sidahmed (Google)
Zheng Xu (Google Research)
Yuan Cao (Google Brain)
More from the Same Authors
- 2022 : Motley: Benchmarking Heterogeneity and Personalization in Federated Learning
  Shanshan Wu · Tian Li · Zachary Charles · Yu Xiao · Ken Liu · Zheng Xu · Virginia Smith
- 2022 : Adaptive Sparse Federated Learning in Large Output Spaces via Hashing
  Zhaozhuo Xu · Luyang Liu · Zheng Xu · Anshumali Shrivastava
- 2022 : REACT: Synergizing Reasoning and Acting in Language Models
  Shunyu Yao · Jeffrey Zhao · Dian Yu · Izhak Shafran · Karthik Narasimhan · Yuan Cao
- 2021 : Contributed Talk 5: Efficient and Private Federated Learning with Partially Trainable Networks
  Hakim Sidahmed · Zheng Xu · Yuan Cao
- 2021 Poster: Federated Reconstruction: Partially Local Federated Learning
  Karan Singhal · Hakim Sidahmed · Zachary Garrett · Shanshan Wu · John Rush · Sushant Prakash
- 2021 Poster: Understanding How Encoder-Decoder Architectures Attend
  Kyle Aitken · Vinay Ramasesh · Yuan Cao · Niru Maheswaranathan
- 2020 Poster: Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling
  Tong Che · Ruixiang ZHANG · Jascha Sohl-Dickstein · Hugo Larochelle · Liam Paull · Yuan Cao · Yoshua Bengio