Forecasting how a given metric will perform is a crucial problem for companies in every industry. A company that could predict the future accurately would be all but guaranteed success. Lacking paranormal abilities or an almanac from the future, we have to rely on mathematics and common sense to produce forecasts with a tolerable margin of error.
2020 was a dark year in many ways. For those involved in forecasting, a colour darker than black would perhaps be needed to describe it. What’s more, because of this anomalous year, 2021 and 2022 will also be very tough for those who forecast the future for a living.
The reason? Forecasting a trend over a future period is made feasible primarily by one factor: seasonality. If I want to predict the sales volumes of my goods, my revenues, or the average monthly spending of Italians, I can rely on historical data from past years that tell me, for example, that:
- In December we spend more on gifts
- Monday is a slow day
- On Saturday morning the shopping centres are crowded
- In August we go to the seaside
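As a minimal sketch of how such recurring patterns turn into a forecast, the snippet below averages the values that fall on the same day of the week and repeats that weekly profile into the future. The sales figures and the function names `seasonal_profile` and `seasonal_forecast` are made up for illustration; this is the simplest possible seasonal method, not any specific library's algorithm.

```python
from collections import defaultdict
from statistics import mean

def seasonal_profile(values, period):
    """Average the observations that share the same position in the cycle."""
    buckets = defaultdict(list)
    for i, v in enumerate(values):
        buckets[i % period].append(v)
    return [mean(buckets[k]) for k in range(period)]

def seasonal_forecast(values, period, horizon):
    """Forecast by repeating the seasonal profile, continuing the cycle."""
    profile = seasonal_profile(values, period)
    start = len(values)  # position in the cycle of the first forecast step
    return [profile[(start + h) % period] for h in range(horizon)]

# Hypothetical daily sales for three weeks, Monday first:
# slow Mondays, busy Saturdays.
sales = [
    50, 80, 82, 85, 90, 140, 100,
    54, 78, 84, 83, 92, 150, 104,
    52, 82, 80, 87, 94, 145, 102,
]

forecast = seasonal_forecast(sales, period=7, horizon=7)
print(forecast)  # one value per weekday of the coming week
```

The forecast is only as good as the assumption that next week looks like the average of past weeks, which is exactly the assumption an anomalous year breaks.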
A year as anomalous as 2020 inevitably shuffles the deck: forecasts built on what happened in 2019 and earlier can no longer be trusted to the same degree. Each of the main families of forecasting methods is affected:
- Regression-based methods are likely to fail, especially in the medium to long term, because they reproduce seasonal patterns that have suddenly disappeared or changed.
- Moving average methods pick up changes in the most recent values of the historical series quite well, but they lose accuracy in the long term.
- ARIMA models (autoregressive integrated moving average), which combine an autoregressive component with a moving average, can give decent results but still go wrong in the long term because they are poor at detecting new trends.
- Methods that put even more weight on seasonal factors, such as Holt-Winters, suffer from these anomalous behaviours.
- Methods that combine multiple approaches, such as Prophet, the open-source software released by Facebook’s Core Data Science team, still rest on trend and seasonal components and so are exposed to the same problem.
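The behaviour described for moving averages can be illustrated with a small sketch. It assumes a plain recursive moving average (each forecast is the mean of the last few points, with earlier forecasts fed back in), and the series and the name `moving_average_forecast` are invented for the example.

```python
from statistics import mean

def moving_average_forecast(values, window, horizon):
    """Forecast step by step: each new value is the mean of the last
    `window` points, treating earlier forecasts as if they were observed."""
    history = list(values)
    forecasts = []
    for _ in range(horizon):
        nxt = mean(history[-window:])
        forecasts.append(nxt)
        history.append(nxt)
    return forecasts

# Hypothetical series: a stable level around 100, then an abrupt drop to about 40.
series = [100, 102, 98, 101, 99, 40, 42, 38, 41, 39]

near = moving_average_forecast(series, window=3, horizon=1)
far = moving_average_forecast(series, window=3, horizon=30)

print(near[0])          # already close to the new level of about 40
print(far[-2], far[-1])  # far ahead, the forecast settles to a flat line
```

The one-step forecast follows the new level almost immediately, but far into the horizon the method can only output a near-constant value, so any seasonal shape in the real series is lost.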
Moreover, 2020 has changed volumes as well as seasonality, often turning the time series under analysis from stationary to non-stationary. For example, a period of the year that in past years showed a cyclical pattern may now show strictly increasing values. For this problem, at least, there are effective time-series normalization techniques that prepare the data before it is analyzed.
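One common preparation step of this kind is differencing, shown here as a sketch; the text does not say which technique the author has in mind, so treating it as differencing is an assumption. The helper names and the synthetic series (a linear trend plus a repeating 4-step pattern) are illustrative only.

```python
def difference(values, lag=1):
    """Replace each value with its change relative to `lag` steps earlier."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]

def undo_difference(diffs, initial, lag=1):
    """Rebuild the original series from its differences and the first `lag` values."""
    restored = list(initial)
    for d in diffs:
        restored.append(restored[-lag] + d)
    return restored

# Synthetic non-stationary series: upward trend plus a pattern that repeats every 4 steps.
pattern = [0, 5, -3, 2]
series = [10 * i + pattern[i % 4] for i in range(12)]

seasonal_diff = difference(series, lag=4)  # removes the repeating pattern
detrended = difference(seasonal_diff)      # removes what is left of the trend

print(seasonal_diff)
print(detrended)
print(undo_difference(seasonal_diff, series[:4], lag=4) == series)
```

A model is then fitted to the differenced series, and its forecasts are mapped back to the original scale with the inverse step.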
But what can be done about seasonality? Without wishing to dash hopes, it is unlikely that the next few years will bring accurate forecasts with a low level of error. Algorithms, however, help us make the best of the situation. With a good amount of data available, meaning substantial historical series and not just 10 values, it is possible to try artificial intelligence, in particular Machine Learning or, if necessary, Deep Learning. Artificial Neural Networks and Decision Trees are very large families of algorithms that can learn hidden relationships in the data they are given and use them to make inferences about new observations. This gives a higher probability of catching abrupt changes in the seasonal behaviour of the series, but at a price: the choice of algorithm becomes a decisive factor in the success of the predictions.
Choosing between a Machine Learning and a Deep Learning approach simply means screening more or less complex algorithms. If a “Machine” Neural Network, i.e. one with at most 3 layers, does not work well, you can try a “Deep” Neural Network. These algorithms, however, are only one more thing to try. Methods like ARIMA and Holt-Winters may work better. You have to try. Every dataset is a discovery.
This is because every prediction is affected by random behaviour, called noise. For example, imagine that we predict 103 purchases from Milan on our e-commerce site for tomorrow. Because of a fault on the internet line, the actual purchases are only 16. Was it possible to foresee such an effect? No. What repercussions will it have over the next few days? Hard to say: there could be a boom in purchases, or customers could have gone to competitors. It is therefore very difficult, if not impossible, to predict all the effects at play. Unfortunately, 2020 has brought nothing but more noise.
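The advice to try several methods and see which one works can be sketched as a small backtest: hold out the last part of the series, let each candidate forecast it, and compare the errors. The three candidates below are deliberately simple stand-ins (not ARIMA, Holt-Winters, or a neural network), and the data, function names, and the use of mean absolute error are assumptions made for the example.

```python
from statistics import mean

def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon

def seasonal_naive(history, horizon, period=7):
    """Repeat the last full cycle (assumes at least `period` observations)."""
    return [history[-period + (h % period)] for h in range(horizon)]

def moving_average(history, horizon, window=7):
    """Repeat the mean of the last `window` observations."""
    return [mean(history[-window:])] * horizon

def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])

def backtest(series, candidates, holdout):
    """Score each candidate on the last `holdout` points, which it never sees."""
    train, test = series[:-holdout], series[-holdout:]
    return {name: mae(test, f(train, len(test))) for name, f in candidates.items()}

candidates = {
    "naive": naive,
    "seasonal_naive": seasonal_naive,
    "moving_average": moving_average,
}

week = [52, 80, 82, 85, 92, 145, 102]  # hypothetical weekly pattern
regular = week * 4                     # four ordinary weeks
shocked = week * 3 + [40] * 7          # last week hit by an unforeseen shock

regular_scores = backtest(regular, candidates, holdout=7)
shocked_scores = backtest(shocked, candidates, holdout=7)

print(min(regular_scores, key=regular_scores.get), regular_scores)
print(min(shocked_scores, key=shocked_scores.get), shocked_scores)
```

On the regular series one candidate clearly wins; on the shocked series every candidate has a large error, because none of them could have seen the shock coming from the training data. That is the noise problem in miniature, and it is why the winner on one dataset says little about the next.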