Timezone: »
Current deep neural networks can achieve remarkable performance on a single task. However, when the deep neural network is continually trained on a sequence of tasks, it seems to gradually forget the previous learned knowledge. This phenomenon is referred to as catastrophic forgetting and motivates the field called lifelong learning. Recently, episodic memory based approaches such as GEM and A-GEM have shown remarkable performance. In this paper, we provide the first unified view of episodic memory based approaches from an optimization's perspective. This view leads to two improved schemes for episodic memory based lifelong learning, called MEGA-\rom{1} and MEGA-\rom{2}. MEGA-\rom{1} and MEGA-\rom{2} modulate the balance between old tasks and the new task by integrating the current gradient with the gradient computed on the episodic memory. Notably, we show that GEM and A-GEM are degenerate cases of MEGA-\rom{1} and MEGA-\rom{2} which consistently put the same emphasis on the current task, regardless of how the loss changes over time. Our proposed schemes address this issue by using novel loss-balancing updating rules, which drastically improve the performance over GEM and A-GEM. Extensive experimental results show that the proposed schemes significantly advance the state-of-the-art on four commonly used lifelong learning benchmarks, reducing the error by up to 18%.
Author Information
Yunhui Guo (University of California, San Diego)
Mingrui Liu (Boston University)
Tianbao Yang (The University of Iowa)
Tajana Rosing (University of California, San Diego)
Related Events (a corresponding poster, oral, or spotlight)
-
2020 Spotlight: Improved Schemes for Episodic Memory-based Lifelong Learning »
Thu Dec 10th 03:00 -- 03:10 AM Room Orals & Spotlights: Graph/Meta Learning/Software
More from the Same Authors
-
2020 Poster: A Decentralized Parallel Algorithm for Training Generative Adversarial Nets »
Mingrui Liu · Wei Zhang · Youssef Mroueh · Xiaodong Cui · Jarret Ross · Tianbao Yang · Payel Das -
2020 Poster: Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization »
Yan Yan · Yi Xu · Qihang Lin · Wei Liu · Tianbao Yang -
2019 Poster: Non-asymptotic Analysis of Stochastic Methods for Non-Smooth Non-Convex Regularized Problems »
Yi Xu · Rong Jin · Tianbao Yang -
2019 Poster: Stagewise Training Accelerates Convergence of Testing Error Over SGD »
Zhuoning Yuan · Yan Yan · Rong Jin · Tianbao Yang -
2018 Poster: First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time »
Yi Xu · Rong Jin · Tianbao Yang -
2018 Poster: Adaptive Negative Curvature Descent with Applications in Non-convex Optimization »
Mingrui Liu · Zhe Li · Xiaoyu Wang · Jinfeng Yi · Tianbao Yang -
2018 Poster: Faster Online Learning of Optimal Threshold for Consistent F-measure Optimization »
Xiaoxuan Zhang · Mingrui Liu · Xun Zhou · Tianbao Yang -
2018 Poster: Fast Rates of ERM and Stochastic Approximation: Adaptive to Error Bound Conditions »
Mingrui Liu · Xiaoxuan Zhang · Lijun Zhang · Rong Jin · Tianbao Yang -
2017 Poster: ADMM without a Fixed Penalty Parameter: Faster Convergence with New Adaptive Penalization »
Yi Xu · Mingrui Liu · Qihang Lin · Tianbao Yang -
2017 Poster: Improved Dynamic Regret for Non-degenerate Functions »
Lijun Zhang · Tianbao Yang · Jinfeng Yi · Rong Jin · Zhi-Hua Zhou -
2017 Poster: Adaptive Accelerated Gradient Converging Method under H\"{o}lderian Error Bound Condition »
Mingrui Liu · Tianbao Yang -
2017 Poster: Adaptive SVRG Methods under Error Bound Conditions with Unknown Growth Parameter »
Yi Xu · Qihang Lin · Tianbao Yang -
2016 Poster: Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than $O(1/\epsilon)$ »
Yi Xu · Yan Yan · Qihang Lin · Tianbao Yang -
2016 Poster: Improved Dropout for Shallow and Deep Learning »
Zhe Li · Boqing Gong · Tianbao Yang