Better Accuracy in Demand Forecasting with LSTM Neural Networks

CHALLENGE

Time series forecasting for business applications rests on the assumption that predictive models trained on recent historical data can be used to predict future observations. The future can only be estimated by mathematically extrapolating what has already occurred, on the assumption that behaviors from the recent past will continue. The efficacy of a forecasting procedure depends on many factors, including the choice of a model type that is well matched to the statistical behavior of the time series. Making that choice requires an understanding of the dynamics of the series, as well as knowledge of how the model will interact mathematically with those dynamics. One such use case is the prediction of electricity consumption 24 hours before the utility experiences the actual customer demand load. Because of the costs involved for the energy provider, highly accurate demand forecasts are essential, but they can be difficult to achieve with more traditional predictive methods.
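To make the 24-hour-ahead setup concrete, the sketch below shows one common way to turn an hourly series into training pairs: a window of past values as input, and the value 24 hours after the end of that window as the target. The function name make_supervised, the 48-hour window length, and the placeholder series are assumptions made for illustration; they are not taken from the study.

```python
def make_supervised(series, n_lags=48, horizon=24):
    """Pair each window of past values with the value `horizon` steps ahead.

    n_lags and horizon are illustrative defaults, not values from the study.
    """
    inputs, targets = [], []
    # t is the index of the most recent observation available at forecast time.
    for t in range(n_lags - 1, len(series) - horizon):
        inputs.append(series[t - n_lags + 1 : t + 1])  # the last n_lags values
        targets.append(series[t + horizon])            # the value 24 hours later
    return inputs, targets


# Placeholder hourly series (a simple ramp), used only to show the indexing.
hourly_load = [float(h) for h in range(100)]
X, y = make_supervised(hourly_load, n_lags=48, horizon=24)
```

Each target lies strictly after its input window, so a model fitted on these pairs never sees the value it is asked to predict.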

 

SOLUTION

In this study, AlgoTactica investigated the performance of Long Short-Term Memory (LSTM) neural networks as an alternative to more commonly used time series regression strategies. Unlike many other methods, the LSTM network has a recurrent architecture that enables it to retain memory of influential data values over historical time intervals of arbitrary length. Given the observable dynamics of an energy demand time series, it is therefore logical to expect that the LSTM would offer an advantage in forecasting accuracy. As a baseline for comparison, predictive models were also designed using feed-forward neural network architectures (denoted as FFNN in the figures at right), boosted regression tree ensembles (TREES), and an ensemble of Bayesian-optimized Gaussian process regression models (GPR). All four model types were trained on a common 65-feature dataset composed of engineered features and raw historical time series values.
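The memory behavior described above can be illustrated with a single-unit LSTM cell written out from the standard gate equations. This is a minimal sketch and not the network used in the study: the weights in the params dictionary are hand-picked assumptions (not trained values), chosen so that the forget gate stays close to 1 and the input gate opens only for a large input.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-unit LSTM cell (standard gate equations)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # cell state: keep part of the old, add some new
    h = o * math.tanh(c)     # hidden state exposed to the next layer
    return h, c


def run_cell(inputs, p):
    """Feed a sequence through the cell and record the cell state per step."""
    h, c, states = 0.0, 0.0, []
    for x in inputs:
        h, c = lstm_step(x, h, c, p)
        states.append(c)
    return states


# Hand-picked, untrained weights (assumptions for illustration only).
params = {
    "wf": 0.0, "uf": 0.0, "bf": 6.0,    # forget gate close to 1: retain memory
    "wi": 10.0, "ui": 0.0, "bi": -5.0,  # input gate opens only when x is large
    "wo": 0.0, "uo": 0.0, "bo": 0.0,
    "wg": 5.0, "ug": 0.0, "bg": 0.0,
}

# One influential value followed by 23 hours of zeros.
sequence = [1.0] + [0.0] * 23
cell_states = run_cell(sequence, params)
```

With these weights the cell state written at the first step decays only slowly over the following 23 steps, which is the mechanism that lets a trained LSTM carry an influential value across a long interval.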

 

RESULTS

Bootstrapped RMS error distributions for all four model types are shown in the Prediction Error by Model Type graph at the upper right, for forecasts on hold-out data not used during model training. Compared with the other three model types, the LSTM achieved overall RMS error values that were at least 50% lower for the 24-hour forecast horizon. Furthermore, LSTM error values were essentially the same for the first 12 hours of a day and for the last 12 hours, as shown by comparing the HOURS 1-12 and HOURS 13-24 graphs. The other three models, by contrast, exhibited significant intra-day differences, with errors for the second half of a day noticeably higher than for the first half.
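One way to produce a bootstrapped RMS error distribution of the kind described above is to resample the hold-out forecast errors with replacement and recompute the RMS error for each resample. The sketch below assumes that procedure; the study does not state its exact resampling scheme. The error values are synthetic and were generated with a larger spread in hours 13-24 purely to illustrate the intra-day split.

```python
import math
import random


def rmse(errors):
    """Root-mean-square of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def bootstrap_rmse(errors, n_resamples=500, seed=0):
    """Resample the errors with replacement; return one RMSE per resample."""
    rng = random.Random(seed)
    n = len(errors)
    return [
        rmse([errors[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    ]


# Synthetic hold-out errors for 30 days of hourly forecasts (illustration only).
rng = random.Random(42)
hours = [(k % 24) + 1 for k in range(30 * 24)]  # hour of day, 1..24
errors = [rng.gauss(0.0, 1.0 if h <= 12 else 2.0) for h in hours]

# Split the errors by half of the day, as in the HOURS 1-12 / 13-24 graphs.
first_half = [e for e, h in zip(errors, hours) if h <= 12]
second_half = [e for e, h in zip(errors, hours) if h > 12]

dist_all = bootstrap_rmse(errors)
dist_first = bootstrap_rmse(first_half)
dist_second = bootstrap_rmse(second_half)
```

Plotting dist_first and dist_second side by side for each model would show whether that model's accuracy degrades later in the day.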

The Scattergrams: Predicted vs. Observed graphs, shown at middle right, depict the relationship between individual observed values and their associated predicted values, also for the hold-out data. Here, the LSTM results form a very compact, linear scatter cloud, while the other three models show much broader scatter, indicating a much higher degree of forecast error. Typical forecast examples are shown in the Waveform Examples: Predicted vs. Observed graphs, at lower right. For the same 60-hour period, these graphs show the observed demand values in grey, overlaid with the forecast values in red. Compared with the other models, the LSTM result clearly demonstrates superior performance.

While the LSTM neural network is clearly the superior method for this particular use case, it might not always be the best approach for time series forecasting in general. Depending on the characteristics of the problem domain, an LSTM model may yield results very similar to those of other methods that are simpler to implement. Consequently, final best-model selection must be driven by a prior discovery process that compares performance across a range of candidate forecasting strategies.
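A discovery process of this kind can be sketched as a loop that scores every candidate on the same hold-out span and keeps the one with the lowest RMS error. The three candidates below are deliberately simple stand-ins (they are not the four model types from the study), and the synthetic series, the function names, and the hold-out boundary are all assumptions made for illustration.

```python
import math


def rmse(observed, predicted):
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def evaluate(series, forecaster, start, horizon=24):
    """Forecast series[t + horizon] from series[: t + 1] for each hold-out t."""
    observed, predicted = [], []
    for t in range(start, len(series) - horizon):
        predicted.append(forecaster(series[: t + 1], horizon))
        observed.append(series[t + horizon])
    return rmse(observed, predicted)


# Simple stand-in candidates; each sees only the history up to forecast time.
candidates = {
    "same_hour_yesterday": lambda hist, h: hist[len(hist) - 1 + h - 24],
    "same_hour_last_week": lambda hist, h: hist[len(hist) - 1 + h - 168],
    "mean_of_last_24h": lambda hist, h: sum(hist[-24:]) / 24,
}


def synthetic_load(hour):
    """Made-up demand: a daily cycle plus a lower level on two days per week."""
    daily = 20.0 * math.sin(2.0 * math.pi * (hour % 24) / 24.0)
    weekend = -15.0 if (hour // 24) % 7 >= 5 else 0.0
    return 100.0 + daily + weekend


series = [synthetic_load(h) for h in range(4 * 7 * 24)]  # four weeks, hourly
holdout_start = 3 * 7 * 24                               # score the final week

scores = {
    name: evaluate(series, fn, holdout_start) for name, fn in candidates.items()
}
best = min(scores, key=scores.get)
```

Because every candidate is scored on identical hold-out targets, the comparison is like for like; on real data the same loop would simply take the fitted models in place of these stand-ins.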
