Easy Guide on Time Series Forecasting
Guide on Time Series Forecasting Problems
Time series forecasting is an important area of machine learning that is often neglected.
It is important because so many prediction problems involve a time component. These problems are often neglected because the time component is exactly what makes them more difficult to handle.
Before getting started with Time Series Analysis, let’s get our basics clear on Anomaly Detection
WHAT IS TIME SERIES ANALYSIS?
A time series is a set of observations taken at specified times, usually at equal intervals. Time series analysis is used to predict future values based on previously observed values.
Consider a graph where ‘x’ is time: if the dependent variable ‘y’ depends on the time parameter, the data is called a time series.
COMPONENTS OF TS ANALYSIS:
WHEN NOT TO USE TS ANALYSIS?
- If the values are constant
- If the values follow a known function, since future values can then be computed directly from that function
WHAT IS STATIONARITY?
In a stationary TS, the mean, variance and covariance of the series should not be functions of time; they should be constant. In the image below, the left-hand graph satisfies this condition, whereas the graph in red has a time-dependent mean.
TS data can be made stationary by removing:
- Trend – The mean varying over time
- Seasonality – Variations at specific, recurring times
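One common way to remove trend and seasonality is differencing. The sketch below is only an illustration on a synthetic series: the data, the 12-step period and the variable names are assumptions, not part of the original project.

```python
import numpy as np

# Hypothetical monthly series: linear trend + yearly seasonality + noise.
rng = np.random.default_rng(0)
t = np.arange(120)
series = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, t.size)

# First-order differencing removes a linear trend: y'(t) = y(t) - y(t-1).
detrended = np.diff(series, n=1)

# Seasonal differencing removes a fixed-period pattern: y'(t) = y(t) - y(t-12).
period = 12  # assumed season length
deseasonalized = series[period:] - series[:-period]

print(len(series), len(detrended), len(deseasonalized))
```

Each differencing step shortens the series (by 1 and by the period, respectively), so forecasts made on the differenced data have to be transformed back to the original scale.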
A stationary TS has the following properties:
- Constant mean – Average
- Constant variance – Spread around the mean
- Auto covariance that does not depend on time
HOW TO TEST STATIONARITY?
Rolling Statistics: Visual technique
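A minimal sketch of the rolling-statistics check, assuming pandas is available. The synthetic trending series and the 30-day window are illustrative choices, not values from the original project.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with an upward trend plus noise.
rng = np.random.default_rng(1)
idx = pd.date_range("2020-01-01", periods=200, freq="D")
trending = pd.Series(0.3 * np.arange(200) + rng.normal(0, 1, 200), index=idx)

window = 30  # assumed window size
rolling_mean = trending.rolling(window=window).mean()
rolling_std = trending.rolling(window=window).std()

# The first (window - 1) values are NaN because the window is not yet full.
print(rolling_mean.dropna().iloc[[0, -1]])
```

Plotting `rolling_mean` and `rolling_std` against the original series gives the visual check: if either curve drifts over time, as the rolling mean does here, the series is unlikely to be stationary.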
Augmented Dickey Fuller Test:
Here the null hypothesis is that the TS is non-stationary.
If the ‘Test Statistic’ is less than the ‘Critical Value’, we can reject the null hypothesis and conclude that the TS is stationary. Conversely, n=1 in the Dickey-Fuller test means p > 0.05, and we cannot reject the null hypothesis.
Auto Regression: Auto-regressive lags. If there is a correlation between x(t) and x(t-5), the series can be described by an autoregressive model. If p is 5, then the predictors of x(t) are x(t-1), …, x(t-5).
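To make the role of p concrete, here is a rough sketch that fits an AR(p) model by ordinary least squares with NumPy. The helper names `fit_ar` and `predict_next` and the simulated data are hypothetical; a library such as statsmodels would normally be used instead.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares fit of x(t) = c + a1*x(t-1) + ... + ap*x(t-p)."""
    x = np.asarray(series, dtype=float)
    # Column k holds lag k+1, i.e. x(t-1), x(t-2), ..., x(t-p) for t = p..n-1.
    lags = np.column_stack([x[p - k - 1 : len(x) - k - 1] for k in range(p)])
    design = np.column_stack([np.ones(len(lags)), lags])
    target = x[p:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef  # [c, a1, ..., ap]

def predict_next(series, coef):
    """One-step-ahead prediction from the last p observations."""
    p = len(coef) - 1
    recent = np.asarray(series, dtype=float)[-1 : -p - 1 : -1]  # x(t-1), ..., x(t-p)
    return coef[0] + coef[1:] @ recent

# Hypothetical data: an AR(1) process with coefficient 0.7.
rng = np.random.default_rng(3)
x = np.zeros(2000)
for t in range(1, len(x)):
    x[t] = 0.7 * x[t - 1] + rng.normal()

coef = fit_ar(x, p=5)
next_value = predict_next(x, coef)
print(np.round(coef, 2), round(float(next_value), 3))
```

With p = 5 the model has five lag predictors, but because the simulated data only depends on one lag, the first lag coefficient should come out near 0.7 and the remaining ones near zero.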
Moving Average: Lagged forecast errors in the prediction. If q is 2, the predictors of x(t) are e(t-1) and e(t-2), where e(i) is the difference between the moving average at the ith instance and the actual value.
ACF: Auto Correlation Function. It measures the correlation between the TS and a lagged version of itself.
PACF: Partial Auto Correlation Function. It measures the correlation between the TS and a lagged version of itself after removing the variation already explained by the shorter, intervening lags.
LSTM (Long Short Term Memory)
Recurrent Neural Network:
- Vector to Sequence – I/P (Image), O/P describes the image. Example: Image Captioning
- Sequence to Vector – I/P (Product Reviews), O/P is in the form of a vector [0.9 0.1] of positive : negative. Example: Sentiment Analysis
- Sequence to Sequence – I/P (Sequence), O/P (Sequence). It’s based on the Encoder-Decoder Architecture. Example: Translation
- TS data is actually sequences.
- Weather data, for example, includes features such as precipitation, rain and temperature.
- Since several of these features can be relevant for forecasting, the weather at each time step is treated as a whole as a vector and is the i/p to the neural network.
- TS can be modelled as a sequence to sequence problem
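Before a series can be fed to an LSTM as a sequence-to-sequence problem, it is usually cut into input and output windows. The sketch below shows one way to do that with NumPy. The helper name `make_windows` and the window sizes are hypothetical, and the model itself is deliberately left out.

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Split a 1-D series into (input window, output window) pairs."""
    x = np.asarray(series, dtype=float)
    n_samples = len(x) - n_in - n_out + 1
    X = np.stack([x[i : i + n_in] for i in range(n_samples)])
    y = np.stack([x[i + n_in : i + n_in + n_out] for i in range(n_samples)])
    # Recurrent layers usually expect inputs shaped (samples, timesteps, features).
    return X[..., np.newaxis], y

series = np.arange(10)  # toy data: 0, 1, ..., 9
X, y = make_windows(series, n_in=4, n_out=2)
print(X.shape, y.shape)   # (5, 4, 1) (5, 2)
print(X[0, :, 0], y[0])   # [0. 1. 2. 3.] [4. 5.]
```

With several features per time step, such as the weather example, the last axis would hold one value per feature instead of a single value.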
When does fbProphet shine?
- Hourly, daily or weekly observations with at least a few months of history
- Strong multiple “human-scale” seasonalities
- Holidays that occur at irregular intervals
- Reasonable number of missing observations
This is a simple use case in which I tried 3 different models on the same dataset and calculated the RMSE value for each model.
Below are the results:
Test MSE: 94.076
Test RMSE: 98.289
Test MSE: 49.039
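For reference, this is how MSE and RMSE are typically computed from actual and predicted values. The numbers in the sketch are made-up toy values, not the results of the project above.

```python
import numpy as np

def mse(actual, predicted):
    """Mean squared error: the average of the squared differences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

def rmse(actual, predicted):
    """Root mean squared error: the square root of the MSE."""
    return float(np.sqrt(mse(actual, predicted)))

# Hypothetical values for illustration only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

print(mse(actual, predicted))             # 1.5
print(round(rmse(actual, predicted), 4))  # 1.2247
```

RMSE is expressed in the same units as the data, which makes it easier to interpret than MSE. The two metrics should not be compared against each other across models.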
Below are the actual vs. predicted graphs from the various models:
Based on the MSE values, we can conclude that fbProphet has given us better results. With more data and proper fine-tuning, however, LSTM and ARIMA may perform even better.
Go through various features of Prophet here : https://facebook.github.io/prophet/docs/quick_start.html
Download the entire project here:
Reach out to us for any queries.