The conflicting gradients problem is one of the major bottlenecks for the effective training of machine learning models that deal with multiple objectives. To resolve this problem, various gradient manipulation techniques, such as PCGrad, MGDA, and CAGrad, have been developed, which directly alter the conflicting gradients into refined ones with alleviated or even no conflicts. However, the existing design and analysis of these techniques are mainly conducted under the full-batch gradient setting, ignoring the fact that they are primarily applied with stochastic mini-batch gradients. In this paper, we illustrate that stochastic gradient manipulation algorithms may fail to converge to Pareto optimal solutions. First, we show that these different algorithms can be summarized into a unified algorithmic framework, where the descent direction is given by a composition of the gradients of the multiple objectives. Then we provide an explicit two-objective convex optimization instance to demonstrate the non-convergence issue under the unified framework, which suggests that non-convergence arises because the composite weights are determined solely by the instantaneous stochastic gradients. To fix the non-convergence issue, we propose a novel composite weights determination scheme that exponentially averages the past calculated weights. Finally, we show that the resulting new variant of stochastic gradient manipulation converges to Pareto optimal or critical solutions and yields comparable or improved empirical performance.
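The abstract gives no formulas, so the following is only a minimal sketch of the idea for two objectives, not the authors' algorithm. It assumes the instantaneous weight is the standard two-objective min-norm (MGDA-style) weight and that the smoothing is a plain exponential moving average with a fixed factor `beta`; the function names `min_norm_weight` and `smoothed_direction` and the parameter `beta` are hypothetical, and the paper's actual weighting rule and schedule may differ.

```python
def min_norm_weight(g1, g2):
    """Return alpha in [0, 1] minimizing ||alpha*g1 + (1-alpha)*g2||^2.

    This is the closed-form two-objective min-norm weight; it is used
    here only as one example of an instantaneous weighting rule.
    """
    diff_sq = sum((a - b) ** 2 for a, b in zip(g1, g2))
    if diff_sq == 0.0:
        # Identical gradients: every weight gives the same direction.
        return 0.5
    alpha = sum((b - a) * b for a, b in zip(g1, g2)) / diff_sq
    return min(1.0, max(0.0, alpha))


def smoothed_direction(g1, g2, prev_alpha, beta=0.9):
    """Combine two stochastic gradients with an exponentially averaged weight.

    prev_alpha: the averaged weight from the previous step.
    beta: assumed fixed smoothing factor (hypothetical choice).
    Returns the composite direction and the updated averaged weight.
    """
    instantaneous = min_norm_weight(g1, g2)
    # Average the weights over time instead of trusting one noisy sample.
    alpha = beta * prev_alpha + (1.0 - beta) * instantaneous
    direction = [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]
    return direction, alpha
```

In a training loop, one would call `smoothed_direction` once per mini-batch, carry the returned weight into the next step, and update the parameters along the negative of the returned direction. Because the averaged weight is a convex combination of values in [0, 1], it stays in [0, 1] whenever the initial weight does.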
Author Information
Shiji Zhou (Tsinghua-Berkeley Shenzhen Institute, Tsinghua University)
Wenpeng Zhang (Ant Group)
Jiyan Jiang (Tsinghua University)
Wenliang Zhong (Ant Group)
Jinjie GU (Ant Group)
Wenwu Zhu (Tsinghua University)
More from the Same Authors
- 2022 Poster: Generalizing Consistent Multi-Class Classification with Rejection to be Compatible with Arbitrary Losses
  Yuzhou Cao · Tianchi Cai · Lei Feng · Lihong Gu · Jinjie GU · Bo An · Gang Niu · Masashi Sugiyama
- 2022 Poster: Module-Aware Optimization for Auxiliary Learning
  Hong Chen · Xin Wang · Yue Liu · Yuwei Zhou · Chaoyu Guan · Wenwu Zhu
- 2022 Poster: Learning Invariant Graph Representations for Out-of-Distribution Generalization
  Haoyang Li · Ziwei Zhang · Xin Wang · Wenwu Zhu
- 2022 Poster: Dynamic Graph Neural Networks Under Spatio-Temporal Distribution Shift
  Zeyang Zhang · Xin Wang · Ziwei Zhang · Haoyang Li · Zhou Qin · Wenwu Zhu
- 2022 Poster: NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search
  Yijian Qin · Ziwei Zhang · Xin Wang · Zeyang Zhang · Wenwu Zhu
- 2023 Poster: Fused Gromov-Wasserstein Graph Mixup for Graph-level Classifications
  Xinyu Ma · Xu Chu · Yasha Wang · Yang Lin · Junfeng Zhao · Liantao Ma · Wenwu Zhu
- 2023 Poster: Unsupervised Graph Neural Architecture Search with Disentangled Self-Supervision
  Zeyang Zhang · Xin Wang · Ziwei Zhang · Guangyao Shen · Shiqi Shen · Wenwu Zhu
- 2023 Poster: Multi-task Graph Neural Architecture Search with Task-aware Collaboration and Curriculum
  Yijian Qin · Xin Wang · Ziwei Zhang · Hong Chen · Wenwu Zhu
- 2023 Poster: Spectral Invariant Learning for Dynamic Graphs under Distribution Shifts
  Zeyang Zhang · Xin Wang · Ziwei Zhang · Zhou Qin · Weigao Wen · Hui Xue' · Haoyang Li · Wenwu Zhu
- 2023 Poster: Joint Data-Task Generation for Auxiliary Learning
  Hong Chen · Xin Wang · Yuwei Zhou · Yijian Qin · Chaoyu Guan · Wenwu Zhu
- 2022 Spotlight: NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search
  Yijian Qin · Ziwei Zhang · Xin Wang · Zeyang Zhang · Wenwu Zhu
- 2021 Poster: Asynchronous Decentralized Online Learning
  Jiyan Jiang · Wenpeng Zhang · Jinjie GU · Wenwu Zhu
- 2021 Poster: Curriculum Disentangled Recommendation with Noisy Multi-feedback
  Hong Chen · Yudong Chen · Xin Wang · Ruobing Xie · Rui Wang · Feng Xia · Wenwu Zhu
- 2021 Poster: Disentangled Contrastive Learning on Graphs
  Haoyang Li · Xin Wang · Ziwei Zhang · Zehuan Yuan · Hang Li · Wenwu Zhu
- 2021 Poster: Graph Differentiable Architecture Search with Structure Learning
  Yijian Qin · Xin Wang · Zeyang Zhang · Wenwu Zhu
- 2021 Poster: Not All Low-Pass Filters are Robust in Graph Convolutional Networks
  Heng Chang · Yu Rong · Tingyang Xu · Yatao Bian · Shiji Zhou · Xin Wang · Junzhou Huang · Wenwu Zhu
- 2020 Poster: Implicit Graph Neural Networks
  Fangda Gu · Heng Chang · Wenwu Zhu · Somayeh Sojoudi · Laurent El Ghaoui
- 2019: The AutoDL Challenge
  Sébastien Treguer · Ildoo Kim · Ruirui Guo · Zhipeng Luo · Minghui Zhao · Yazhou Li · Xiawei Guo · Wenpeng Zhang · Noriaki Ota
- 2019 Poster: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
  Yitian Yuan · Lin Ma · Jingwen Wang · Wei Liu · Wenwu Zhu
- 2019 Poster: Learning Disentangled Representations for Recommendation
  Jianxin Ma · Chang Zhou · Peng Cui · Hongxia Yang · Wenwu Zhu
- 2018: AutoML3 - LifeLong ML with concept drift Challenge. Second place winner. A Boosting Tree Based AutoML System for High Cardinality Streaming Data Classification with Concept Drift
  Zheng Xiong · Jiyan Jiang · Wenpeng Zhang
- 2018 Poster: Weakly Supervised Dense Event Captioning in Videos
  Xin Wang · Wenbing Huang · Chuang Gan · Jingdong Wang · Wenwu Zhu · Junzhou Huang
- 2017: Poster session (and Coffee Break)
  Jacob Andreas · Kun Li · Conner Vercellino · Thomas Miconi · Wenpeng Zhang · Luca Franceschi · Zheng Xiong · Karim Ahmed · Laurent Itti · Tim Klinger · Mostafa Rohaninejad