I want to know the basic steps for checking or transforming data before undertaking a time series analysis. Is there something I need to check or test before just picking a method, e.g. ARIMA? I'm thinking of something similar to testing data for normality before using a parametric analysis. Is there an equivalent check that would help me select an appropriate method?

My data is annual crash data and I have found that:

a) The GFC (2008 - 2012) affected the crash data: counts fell below the 95% CI and then bounced back, before COVID, to just above the 95% CI. Should I exclude these periods, and if so on what basis? E.g. values more than 2 standard deviations from the mean?
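To make the 2 x StdDev idea concrete, this is roughly the rule I have in mind. It is only a minimal Python sketch: the counts are made-up numbers, not my real data, and `flag_outliers` is a name I invented for illustration.

```python
from statistics import mean, stdev

# Made-up annual crash counts, for illustration only (not real data).
crashes = {
    2004: 1000, 2005: 1010, 2006: 990, 2007: 1005,
    2008: 1000, 2009: 995, 2010: 700, 2011: 1005,
    2012: 1000, 2013: 1010, 2014: 990, 2015: 995,
}

def flag_outliers(series, k=2.0):
    """Return the years whose value is more than k sample SDs from the mean."""
    mu = mean(series.values())
    sd = stdev(series.values())
    return [year for year, value in series.items() if abs(value - mu) > k * sd]

flagged = flag_outliers(crashes)
print(flagged)
```

One thing I am unsure about: the unusual years themselves pull the mean and inflate the standard deviation, so with several low years in a row a rule like this might fail to flag any of them.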

b) When applying a suitable method, how do I use another (independent) variable to help with the forecast? I have found a variable, not used before, that mimics the original crash series quite well, and I want to use it to help forecast future crash trends. This is similar to vehicle kilometers travelled being correlated with the rise and fall of annual crashes, which in turn is linked to economic measures such as GDP. Those two are difficult to predict, though, because they are consequences of the economic activity driven by this other variable, which is what creates the economic upturns and downturns. This variable follows the annual crash trend much more closely and is easier to predict, so it should help produce more accurate crash forecasts.
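The simplest version of what I mean in b) is sketched below: fit a straight line of crashes against the other variable, then plug in a projected value of that variable. This is only a sketch under my own assumptions. The numbers are made up (and deliberately perfectly linear), and `fit_line` and `forecast` are hypothetical helper names. I assume a proper time series method would also have to deal with autocorrelation in the errors, which this ignores.

```python
from statistics import mean

# Made-up values, for illustration only (not real data).
indicator = [10.0, 12.0, 11.0, 9.0, 13.0, 14.0]          # the other variable
crashes = [1020.0, 1060.0, 1040.0, 1000.0, 1080.0, 1100.0]  # annual crashes

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

def forecast(intercept, slope, future_x):
    """Predict crashes from a projected value of the other variable."""
    return intercept + slope * future_x

intercept, slope = fit_line(indicator, crashes)
next_year = forecast(intercept, slope, 15.0)  # 15.0 is an assumed projection
print(intercept, slope, next_year)
```

The forecast is only as good as the projection of the other variable, which is why I care that it is easier to predict than GDP or kilometers travelled.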
