Jointly Learning from Decentralized (Federated) and Centralized Data to Mitigate Distribution Shift
Sean Augenstein · Andrew S Hard · Rajiv Mathews
Event URL: https://openreview.net/forum?id=s73HWNEtOcE

With privacy as a motivation, Federated Learning (FL) is an increasingly used paradigm where learning takes place collectively on edge devices, each with a cache of user-generated training examples that remain resident on the local device. These on-device training examples are gathered in situ during the course of users’ interactions with their devices, and thus are highly reflective of at least part of the inference data distribution. Yet a distribution shift may still exist, because on-device training examples can be lacking for some data inputs expected to be encountered at inference time. This paper proposes a way to mitigate this shift: selective usage of datacenter data, mixed in with FL. By mixing decentralized (federated) and centralized (datacenter) data, we can form an effective training data distribution that better matches the inference data distribution, resulting in more useful models.
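The abstract does not spell out how the decentralized and centralized data are combined, so the following is only a rough sketch of one simple instantiation: each round, interpolate a FedAvg-style model (averaged over client updates) with a model trained on datacenter data. The toy linear model, the mixing weight `alpha`, and the update rule are all illustrative assumptions, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])  # toy linear model y = w.x (assumed for illustration)

def make_data(n, input_shift=0.0):
    """Generate inputs centered at `input_shift` to mimic a distribution shift."""
    X = rng.normal(loc=input_shift, size=(n, 2))
    return X, X @ true_w

def sgd_update(w, X, y, lr=0.02, steps=10):
    """A few gradient steps on mean squared error, starting from w."""
    w = w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(X)
        w -= lr * grad
    return w

# Decentralized (on-device) caches cover one input region...
clients = [make_data(20, input_shift=0.0) for _ in range(5)]
# ...while datacenter data covers inputs lacking on-device.
dc_X, dc_y = make_data(100, input_shift=3.0)

w = np.zeros(2)
alpha = 0.5  # mixing weight between federated and datacenter updates (hypothetical)
for _ in range(20):
    # FedAvg step: average client models trained from the current server model.
    fed_w = np.mean([sgd_update(w, X, y) for X, y in clients], axis=0)
    # Datacenter step from the same starting point.
    dc_w = sgd_update(w, dc_X, dc_y)
    # Mix the two to shape the effective training distribution.
    w = alpha * fed_w + (1 - alpha) * dc_w
```

Because both data sources here share the same underlying labeling function, the mixed model converges toward weights that fit both input regions, whereas training on either source alone would leave the other region underrepresented.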

Author Information

Sean Augenstein (Google)
Andrew S Hard (Google)

I'm a Senior Software Engineer at Google, where I currently work on applications of federated learning. I hold a PhD in high-energy physics from the University of Wisconsin, and spent 5 years searching for the Higgs boson at CERN.

Rajiv Mathews (Google)
