Affinity Workshop: Women in Machine Learning

Synthetic Data Augmentation for Time Series Forecasting

Kasumi Ohno · Kohei Makino · Makoto Miwa · Yutaka Sasaki


In this study, we propose a method for automatically generating synthetic training data that can be applied to a wide range of Time Series Forecasting (TSF) tasks. TSF is the task of predicting a future sequence from a given input sequence. We generate multiple data patterns from different functions, with the aim of modeling general waveform characteristics that appear in time series data, such as periodicity and continuity. Specifically, we automatically generate many periodic waveforms, such as sine waves and square waves, as well as waveforms with irregular peaks, by randomly varying parameters such as the period. We compared sequence prediction performance with and without the synthetic data on ETT (Electricity Transformer Temperature), one of the standard benchmarks for TSF tasks with constant sampling periods, using the Mean Squared Error (MSE) as the evaluation metric. We compared three training settings: the entire training data, 50% (208 samples) of the entire training data, and the same 50% of the training data plus 208 samples of synthetic data. The MSE on the test data was 0.2779, 0.2784, and 0.2724, respectively. The setting with the added synthetic data achieved the lowest MSE, a slight improvement in prediction performance.
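The generation step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the parameter ranges, the series length of 96, and the use of Gaussian-shaped bumps for the irregular peaks are all assumptions made for illustration. Only the waveform families (sine, square, irregular peaks), the random variation of parameters such as the period, the count of 208 synthetic samples, and the MSE metric come from the abstract.

```python
import math
import random


def sine_wave(n, period, amplitude=1.0, phase=0.0):
    """Sample n points of a sine wave whose period is given in time steps."""
    return [amplitude * math.sin(2 * math.pi * t / period + phase) for t in range(n)]


def square_wave(n, period, amplitude=1.0):
    """+amplitude for the first half of each period, -amplitude for the second half."""
    return [amplitude if (t % period) < period / 2 else -amplitude for t in range(n)]


def irregular_peaks(n, num_peaks, width, rng):
    """Flat baseline plus bumps at random positions and heights.

    The Gaussian bump shape is an assumption; the abstract only says
    'waveforms with irregular peaks'.
    """
    series = [0.0] * n
    for _ in range(num_peaks):
        center = rng.randrange(n)
        height = rng.uniform(0.5, 2.0)
        for t in range(n):
            series[t] += height * math.exp(-0.5 * ((t - center) / width) ** 2)
    return series


def generate_synthetic(num_series, n, seed=0):
    """Draw num_series waveforms of length n with randomly chosen type and parameters.

    All parameter ranges below are hypothetical choices for this sketch.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(num_series):
        kind = rng.choice(["sine", "square", "peaks"])
        if kind == "sine":
            series = sine_wave(
                n,
                period=rng.randint(4, n // 2),
                amplitude=rng.uniform(0.5, 2.0),
                phase=rng.uniform(0.0, 2 * math.pi),
            )
        elif kind == "square":
            series = square_wave(
                n,
                period=rng.randint(4, n // 2),
                amplitude=rng.uniform(0.5, 2.0),
            )
        else:
            series = irregular_peaks(
                n,
                num_peaks=rng.randint(1, 5),
                width=rng.uniform(1.0, 4.0),
                rng=rng,
            )
        data.append(series)
    return data


def mse(pred, target):
    """Mean Squared Error between two equal-length sequences."""
    return sum((p - y) ** 2 for p, y in zip(pred, target)) / len(target)


# 208 synthetic series, matching the count in the abstract; length 96 is assumed.
synthetic = generate_synthetic(num_series=208, n=96)
```

In a setup like the one described, such series would be windowed into input and target segments and mixed into the reduced real training set before training the forecasting model. How the authors actually mix and window the data is not stated in the abstract.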
