I'm predicting and explaining footfall numbers in 100 shopping centers. The AR(1) model is better than a linear regression model (in terms of R-squared). The question is whether the AR(1) CO model is the best or whether there are better alternatives.
First of all, if you have such strong evidence, it means that your research design matches your method and tools. If you are now looking for other methods, that could be a mistake, because you did not plan for them in advance, and trying out methods and tools tentatively is essentially the wrong approach. So why are you looking for something that was not part of your design?
Thank you for your quick answer. I was looking for alternatives because I wanted to make sure that this is the right method. I know that different lag lengths can be used for the autocorrelation, and I guess trend analysis is also a possibility.
I have daily data on the number of people visiting shopping centers.
I can split the file by shopping center and build a model for each center, with these predictors: day of the week, month of the year, weather variables and economic variables.
I can see that in some centres the residuals are correlated because they show trends. This suggests that a linear regression model cannot be used. Using an AR(1) model increases R-squared and doesn't violate the uncorrelated-residuals assumption, because all Durbin-Watson values are close to 2.0.
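For reference, the Durbin-Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals. Values near 2 point to no first-order autocorrelation, values near 0 to positive and values near 4 to negative autocorrelation. A minimal sketch in plain Python (the residual series below are made up for illustration, not taken from the shopping center data):

```python
import random

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences / sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residual series, for illustration only.
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(1000)]  # uncorrelated: DW close to 2
trending = [t - 49.5 for t in range(100)]          # smooth drift: DW close to 0
alternating = [1.0, -1.0] * 50                     # sign flips every step: DW close to 4

print(durbin_watson(noise), durbin_watson(trending), durbin_watson(alternating))
```

A value close to 2 only rules out first-order autocorrelation; it does not by itself rule out a slow trend in the residuals.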
The problem is that trends can still be observed when plotting the residuals of the AR(1) model over time, so I guess I have non-stationary series. That is where ARIMA(p,d,q) (with d=1 for a linear trend) comes in, right? I'm not sure about this or how to use it correctly (using SPSS).
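To illustrate what d=1 does: the series is replaced by its first differences y_t - y_(t-1) before the ARMA part is fitted, which removes a linear trend. A small sketch in plain Python, where the footfall numbers, the trend of 5 visitors per day and the weekly pattern are all invented for the example:

```python
def difference(series, d=1):
    """Apply first differencing d times (the 'd' in ARIMA(p,d,q))."""
    for _ in range(d):
        series = [series[t] - series[t - 1] for t in range(1, len(series))]
    return series

def slope(series):
    """OLS slope of the series regressed on its time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

# Hypothetical daily footfall: level 1000, linear trend of 5 per day, fixed weekly pattern.
weekly = [0, 20, 40, 60, 80, 300, 250]
footfall = [1000 + 5 * t + weekly[t % 7] for t in range(364)]

diffed = difference(footfall, d=1)
print(slope(footfall))  # roughly 5: the trend is visible in the levels
print(slope(diffed))    # roughly 0: the trend is gone after differencing
```

Note that the weekly pattern is still present in the differenced series, so day-of-week predictors (or seasonal terms) remain necessary.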
The autoregressive term was already added, but I can still see non-stationarity (so not just temporary fluctuations, but trends, or no full recovery after a random shock), which is not captured by the autoregressive b_2*y_(i-1) term, right?
I knew that's an option, but the real challenge is to do this for 100 different cases. I can apply your method 100 times, but that is very time-consuming, and there may be more cases in the future.
If I understand your problem correctly, the issue is the large number of shopping centers, which would be too many to model individually, right? You might run a Principal Component Analysis first. After transforming the 100 time series into their principal components, you could run your time series analysis on the important principal components first. This way you will very likely need to model only 5-15 time series rather than 100.
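A rough sketch of the idea, in plain Python and for the first principal component only (found by power iteration on the covariance matrix). The six centers, their loadings and the common weekly cycle are hypothetical; in practice one would use a statistics package and keep several components:

```python
import math
import random

def first_principal_component(rows, iterations=200):
    """Return (loadings, explained_share) of the first principal component.

    rows: one list per day, holding one footfall value per center.
    """
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(k)]
           for a in range(k)]
    # Power iteration; the start vector must not be orthogonal to the leading eigenvector.
    vec = [1.0] * k
    for _ in range(iterations):
        new = [sum(cov[a][b] * vec[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    eigenvalue = sum(vec[a] * sum(cov[a][b] * vec[b] for b in range(k)) for a in range(k))
    total_variance = sum(cov[j][j] for j in range(k))
    return vec, eigenvalue / total_variance

# Hypothetical data: 6 centers that share one weekly cycle, plus independent noise.
random.seed(0)
loadings = [0.5, 0.8, 1.0, 1.2, 1.5, 0.7]
rows = []
for t in range(140):
    common = 100 * math.sin(2 * math.pi * t / 7)
    rows.append([1000 + w * common + random.gauss(0, 5) for w in loadings])

pc1, share = first_principal_component(rows)
print(pc1, share)  # share of total variance carried by the first component
```

If the centers differ a lot in size, standardising each series first (PCA on correlations) keeps the largest centers from dominating the components.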
Regarding the stochastic term, in my mind, graphical analysis tells you more than statistical tests. Look at the time series of the errors; usually you get an idea of whether they are non-stationary. The augmented Dickey-Fuller test also helps very much (although it is not perfect either for small sample sizes). If your time series is non-stationary, you should difference it.
The "best" orders of your ARMA(p,q) model cannot be derived easily. You might take a look at the ACF and the PACF, like Chenying said, but be careful with the interpretation! It takes a lot of experience to read the orders off the ACF and PACF; it's not that easy. I would rather use them as an indication of the maximum values of p and q and from there try different ps and qs. Better to base your decision on one or more information criteria: AIC, BIC, HQC.
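As a simplified illustration of order selection by information criteria, the sketch below compares pure AR(p) candidates only (no MA terms), fitted through the Yule-Walker equations with the Levinson-Durbin recursion, and uses the common Gaussian approximations AIC = n*ln(sigma^2) + 2p and BIC = n*ln(sigma^2) + p*ln(n). The simulated AR(1) series is hypothetical; a full ARMA or ARIMA search would normally be done in a statistics package:

```python
import math
import random

def autocovariances(series, max_lag):
    """Biased sample autocovariances for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n for k in range(max_lag + 1)]

def ar_order_table(series, max_p):
    """Fit AR(0..max_p) by Levinson-Durbin; return rows of (p, sigma2, aic, bic)."""
    n = len(series)
    gamma = autocovariances(series, max_p)
    sigma2 = gamma[0]  # innovation variance of the AR(0) model
    phi = []           # coefficients of the current AR fit
    table = []
    for p in range(max_p + 1):
        if p > 0:
            # Reflection coefficient = partial autocorrelation at lag p.
            k = (gamma[p] - sum(phi[j] * gamma[p - 1 - j] for j in range(p - 1))) / sigma2
            phi = [phi[j] - k * phi[p - 2 - j] for j in range(p - 1)] + [k]
            sigma2 *= 1 - k * k
        aic = n * math.log(sigma2) + 2 * p
        bic = n * math.log(sigma2) + p * math.log(n)
        table.append((p, sigma2, aic, bic))
    return table

# Hypothetical series: simulated AR(1) with coefficient 0.7.
random.seed(1)
y = [0.0]
for _ in range(1999):
    y.append(0.7 * y[-1] + random.gauss(0, 1))

table = ar_order_table(y, max_p=5)
best_by_aic = min(table, key=lambda row: row[2])[0]
best_by_bic = min(table, key=lambda row: row[3])[0]
print(best_by_aic, best_by_bic)
```

Because the selection is automatic, the same loop can be run over all 100 centers (or over the retained principal components) without inspecting each ACF and PACF by hand.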