Transformers have shown great power in time series forecasting due to their global-range modeling ability. However, their performance can degenerate terribly on non-stationary real-world data in which the joint distribution changes over time. Previous studies primarily adopt stationarization to attenuate the non-stationarity of original series for better predictability. But the stationarized series deprived of inherent non-stationarity can be less instructive for real-world bursty events forecasting. This problem, termed over-stationarization in this paper, leads Transformers to generate indistinguishable temporal attentions for different series and impedes the predictive capability of deep models. To tackle the dilemma between series predictability and model capability, we propose Non-stationary Transformers as a generic framework with two interdependent modules: Series Stationarization and De-stationary Attention. Concretely, Series Stationarization unifies the statistics of each input and converts the output with restored statistics for better predictability. To address the over-stationarization problem, De-stationary Attention is devised to recover the intrinsic non-stationary information into temporal dependencies by approximating distinguishable attentions learned from raw series. Our Non-stationary Transformers framework consistently boosts mainstream Transformers by a large margin, which reduces MSE by 49.43% on Transformer, 47.34% on Informer, and 46.89% on Reformer, making them the state-of-the-art in time series forecasting. Code is available at this repository: https://github.com/thuml/Nonstationary_Transformers.
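To make the two modules described above more concrete, below is a minimal, hypothetical PyTorch sketch of Series Stationarization (per-series normalization before the model and de-normalization after it) and of a de-stationary attention step that rescales the attention logits with a learned scale tau and shift delta. The module names, tensor layouts, and the exact way tau and delta enter the score matrix are illustrative assumptions, not the implementation in the linked repository.

```python
import math

import torch
import torch.nn as nn


class SeriesStationarization(nn.Module):
    """Sketch: normalize each input series by its own statistics,
    then restore those statistics on the model output.
    Assumes inputs shaped [batch, length, variables]."""

    def __init__(self, eps: float = 1e-5):
        super().__init__()
        self.eps = eps

    def normalize(self, x: torch.Tensor) -> torch.Tensor:
        # Remove per-series mean and standard deviation for better predictability.
        self.mean = x.mean(dim=1, keepdim=True)
        self.std = torch.sqrt(x.var(dim=1, keepdim=True, unbiased=False) + self.eps)
        return (x - self.mean) / self.std

    def denormalize(self, y: torch.Tensor) -> torch.Tensor:
        # Put the removed statistics back onto the forecast.
        return y * self.std + self.mean


def destationary_attention(q, k, v, tau, delta):
    """Sketch of attention re-scaled by a learned scale (tau) and shift (delta),
    which are assumed to be estimated elsewhere (e.g. by small MLPs fed with
    the statistics of the raw, un-stationarized series). Shapes of tau/delta
    must broadcast against the [batch, heads, length, length] score matrix."""
    d_k = q.shape[-1]
    scores = (tau * torch.matmul(q, k.transpose(-2, -1)) + delta) / math.sqrt(d_k)
    return torch.matmul(torch.softmax(scores, dim=-1), v)
```

In this reading, `normalize` wraps the encoder input, `denormalize` wraps the forecast, and `destationary_attention` stands in for the standard attention inside the backbone Transformer so that the non-stationary information removed by normalization can still shape the temporal attentions.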
Author Information
Yong Liu (Tsinghua University)

I'm currently a PhD student (since Fall 2021) at the School of Software of Tsinghua University and a member of THUML, advised by Prof. Mingsheng Long. My research interests cover Deep Learning and Transfer Learning. I am currently working on deep model applications for Time Series Forecasting. Previously, I have done research on the transferability measurement of pre-trained models (PTMs). The pursuit of my research is to apply deep learning methodology to valuable real-world applications. For more information, you may take a look at my publications.
Haixu Wu (Tsinghua University)
Jianmin Wang (Tsinghua University)
Mingsheng Long (Tsinghua University)
More from the Same Authors
- 2022 Poster: Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
  Yang Shu · Zhangjie Cao · Ziyang Zhang · Jianmin Wang · Mingsheng Long
- 2022 Poster: Supported Policy Optimization for Offline Reinforcement Learning
  Jialong Wu · Haixu Wu · Zihan Qiu · Jianmin Wang · Mingsheng Long
- 2023 Poster: ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning
  Junguang Jiang · Baixu Chen · Junwei Pan · Ximei Wang · Dapeng Liu · Jie Jiang · Mingsheng Long
- 2023 Poster: SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling
  Jiaxiang Dong · Haixu Wu · Haoran Zhang · Li Zhang · Jianmin Wang · Mingsheng Long
- 2023 Poster: Koopa: Learning Non-stationary Time Series Dynamics with Koopman Predictors
  Yong Liu · Chenyu Li · Jianmin Wang · Mingsheng Long
- 2023 Poster: Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning
  Jialong Wu · Haoyu Ma · Chaoyi Deng · Mingsheng Long
- 2022: Domain Adaptation: Theory, Algorithms, and Open Library
  Mingsheng Long
- 2022 Poster: Debiased Self-Training for Semi-Supervised Learning
  Baixu Chen · Junguang Jiang · Ximei Wang · Pengfei Wan · Jianmin Wang · Mingsheng Long
- 2021 Poster: Cycle Self-Training for Domain Adaptation
  Hong Liu · Jianmin Wang · Mingsheng Long
- 2021 Poster: Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting
  Haixu Wu · Jiehui Xu · Jianmin Wang · Mingsheng Long
- 2020 Poster: Co-Tuning for Transfer Learning
  Kaichao You · Zhi Kou · Mingsheng Long · Jianmin Wang
- 2020 Poster: Transferable Calibration with Lower Bias and Variance in Domain Adaptation
  Ximei Wang · Mingsheng Long · Jianmin Wang · Michael Jordan
- 2020 Poster: Stochastic Normalization
  Zhi Kou · Kaichao You · Mingsheng Long · Jianmin Wang
- 2020 Poster: Learning to Adapt to Evolving Domains
  Hong Liu · Mingsheng Long · Jianmin Wang · Yu Wang
- 2019 Poster: Catastrophic Forgetting Meets Negative Transfer: Batch Spectral Shrinkage for Safe Transfer Learning
  Xinyang Chen · Sinan Wang · Bo Fu · Mingsheng Long · Jianmin Wang
- 2019 Poster: Transferable Normalization: Towards Improving Transferability of Deep Neural Networks
  Ximei Wang · Ying Jin · Mingsheng Long · Jianmin Wang · Michael Jordan
- 2018 Poster: Conditional Adversarial Domain Adaptation
  Mingsheng Long · Zhangjie Cao · Jianmin Wang · Michael Jordan
- 2018 Poster: Generalized Zero-Shot Learning with Deep Calibration Network
  Shichen Liu · Mingsheng Long · Jianmin Wang · Michael Jordan
- 2017 Poster: PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs
  Yunbo Wang · Mingsheng Long · Jianmin Wang · Zhifeng Gao · Philip S Yu
- 2017 Poster: Learning Multiple Tasks with Multilinear Relationship Networks
  Mingsheng Long · Zhangjie Cao · Jianmin Wang · Philip S Yu
- 2016 Poster: Unsupervised Domain Adaptation with Residual Transfer Networks
  Mingsheng Long · Han Zhu · Jianmin Wang · Michael Jordan
- 2015 Workshop: Transfer and Multi-Task Learning: Trends and New Perspectives
  Anastasia Pentina · Christoph Lampert · Sinno Jialin Pan · Mingsheng Long · Judy Hoffman · Baochen Sun · Kate Saenko