FedGMA: Federated Learning with Gradient Masked Averaging
Irene Tenison · Sai Aravind Sreeramadas · Vaikkunth Mugunthan · Irina Rish

In cross-device federated optimization, two important constraints are the non-IID distribution of data across clients and the communication bottleneck. In this work, we draw connections between environments in an out-of-distribution (OOD) generalization setting and non-IID clients in a federated setting. We adapt to the federated setting the OOD generalization hypothesis, which states that learning only the invariant mechanisms, while ignoring the spurious mechanisms in the training environments, improves generalization to OOD test data. This paper proposes gradient masked averaging, which can be applied as a drop-in alternative to naive averaging of updates in federated algorithms such as FedAVG, FedProx, and SCAFFOLD, as well as adaptive federated optimizers like FedADAM and FedYogi. The masking improves the convergence of each algorithm under both IID and non-IID data distributions across clients while reducing the number of communication rounds needed to converge. We also introduce OOD generalization testing in federated learning and show that the proposed masking improves the OOD generalization performance of the corresponding federated algorithms.
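A minimal sketch of what a gradient masked averaging step could look like, assuming the mask is derived from element-wise sign agreement of client updates (the threshold `tau` and the hard zero-one mask are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def gradient_masked_average(client_updates, tau=0.8):
    """Aggregate client updates with a sign-agreement mask (hypothetical sketch).

    Parameters whose update signs agree with the mean update across at
    least a fraction `tau` of clients are treated as invariant and kept;
    the remaining (potentially spurious) coordinates are zeroed out.
    """
    updates = np.stack(client_updates)           # (num_clients, num_params)
    avg = updates.mean(axis=0)                   # naive FedAVG-style average
    # fraction of clients whose update sign matches the averaged update
    agreement = (np.sign(updates) == np.sign(avg)).mean(axis=0)
    mask = (agreement >= tau).astype(avg.dtype)  # 1 where signs mostly agree
    return mask * avg                            # masked server update
```

For example, if all three clients push the first parameter in the same direction but disagree on the second, only the first coordinate survives the mask and is applied at the server.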

Author Information

Irene Tenison (Mila/UdeM)
Sai Aravind Sreeramadas (MILA)
Vaikkunth Mugunthan (Massachusetts Institute of Technology)
Irina Rish (Mila/UdeM)
