@Aya Galal The most appropriate econometric model for time series analysis, which could be VAR, VECM, ARIMA, or others, depends on the specific purpose of the study. Therefore, the choice of model should be based on the research question and the characteristics of the data.
The most suitable econometric model for time series analysis of 27 years and 11 variables will depend on the specific characteristics of the data and the research question. However, there are some common econometric models that are often used in time series analysis, including:
1) Vector Autoregression (VAR): VAR is a popular model for analyzing the interdependencies among multiple time series variables. It can capture both the short-term and long-term relationships between variables, allowing for the analysis of how changes in one variable affect the others.
2) Autoregressive Integrated Moving Average (ARIMA): ARIMA is a commonly used model for modeling time series data that exhibits trends, seasonality, and autocorrelation. It is a flexible model that can capture a wide range of time series patterns.
3) Vector Error Correction Model (VECM): VECM is an extension of the VAR model that accounts for the presence of cointegration among the variables. It is particularly useful when there are long-run relationships among the variables.
4) Bayesian Structural Time Series (BSTS): BSTS is a flexible and powerful model that can handle a wide range of time series data, including non-stationary and non-linear data. It uses Bayesian methods to estimate the model parameters, allowing for the incorporation of prior knowledge and uncertainty.
Ultimately, the choice of model will depend on the specific characteristics of the data and the research question. It may be useful to consult with an econometrician or statistician to determine the most appropriate model for your analysis.
The econometric model is mainly determined by the stationarity of your time series data. If you run stationarity tests, you will get one of the following outcomes:
All variables are integrated of order zero, I(0), that is, stationary in levels
All variables are integrated of order one, I(1), that is, stationary after first differencing
A mixture of I(0) and I(1) variables
Some variables are integrated of order two, I(2), that is, stationary only after second differencing
My advice is to run stationarity tests, such as the Augmented Dickey-Fuller (ADF) test, the Phillips-Perron test, or others, to determine the order of integration of each variable.
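As a rough illustration of what such a test does, the sketch below hand-codes the simplest Dickey-Fuller regression (constant, no lagged differences) in plain Python and differences a series until the unit-root null is rejected. It is a teaching sketch, not the full ADF or Phillips-Perron procedure. The function names and toy series are invented for illustration, and the 5% critical value of -3.00 is the tabulated Dickey-Fuller value for about 25 observations with a constant. In practice a library routine, for example adfuller in Python's statsmodels, would be the usual choice.

```python
import math
import random


def dickey_fuller_t(y):
    """t-statistic on rho in: diff(y)_t = alpha + rho * y_(t-1) + e_t (no lagged differences)."""
    x = y[:-1]                                   # lagged levels
    d = [b - a for a, b in zip(y[:-1], y[1:])]   # first differences
    n = len(d)
    x_bar = sum(x) / n
    d_bar = sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxd = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    rho = sxd / sxx
    alpha = d_bar - rho * x_bar
    rss = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, d))
    se_rho = math.sqrt(rss / (n - 2) / sxx)
    return rho / se_rho


def order_of_integration(y, crit=-3.00, max_d=2):
    """Difference y until the unit-root null is rejected at `crit`.

    Returns the number of differences taken, or None if the null is
    still not rejected after max_d differences.
    """
    series = list(y)
    for d in range(max_d + 1):
        if dickey_fuller_t(series) < crit:
            return d
        series = [b - a for a, b in zip(series[:-1], series[1:])]
    return None


# Toy series with 27 observations each (invented data, for illustration only)
random.seed(0)
mean_reverting = [math.sin(1.7 * t) for t in range(27)]
random_walk = [0.0]
for _ in range(26):
    random_walk.append(random_walk[-1] + random.gauss(0, 1))

print("mean-reverting series, estimated order:", order_of_integration(mean_reverting))
print("random walk, estimated order:", order_of_integration(random_walk))
```

With only 27 observations unit-root tests have low power, so it is worth cross-checking the result with a second test before settling on an order of integration.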
For example, suppose you want to test for a long-run relationship (cointegration) among series with a mixed order of integration, that is, both I(0) and I(1) variables. This rules out the standard Johansen test for cointegration, since it requires all variables to be I(1). In this case you need to employ the Autoregressive Distributed Lag (ARDL) approach, which accommodates a mixture of I(0) and I(1) variables (but not I(2) variables).
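The rule of thumb described in this answer can be written down as a small lookup. This is only a sketch of the conventional textbook mapping from integration orders to a starting approach. The function name, the returned wording, and the variable names in the example are all hypothetical, and another reply in this thread argues that the Johansen methodology can in fact handle a mixture of I(0) and I(1) variables.

```python
def suggest_approach(orders):
    """Map integration orders to a starting approach.

    orders: dict of variable name -> order of integration (0, 1 or 2),
    as found from unit-root tests. Returns a short text suggestion.
    """
    distinct = set(orders.values())
    if any(d >= 2 for d in distinct):
        # The ARDL bounds test is not valid with I(2) variables
        return "I(2) present: difference further or rethink the specification"
    if distinct == {0}:
        return "all I(0): model the series in levels"
    if distinct == {1}:
        return "all I(1): test for cointegration (e.g. Johansen), then VECM if cointegrated"
    return "mixed I(0)/I(1): ARDL bounds test"


# Hypothetical test results for three made-up variables
example = {"gdp": 1, "inflation": 0, "exports": 1}
print(suggest_approach(example))
```

The mapping is a starting point for discussion, not a substitute for checking the assumptions of whichever method is chosen.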
The choice of model will depend on the specific characteristics of the data and the research question being addressed. It may be helpful to consult with a statistician or econometrician to determine the most appropriate model for your analysis. One of the most popular models for time series analysis is the Autoregressive Integrated Moving Average (ARIMA) model. ARIMA models are used to analyze time series data that exhibit trends, seasonality, and other non-stationary patterns. They can capture both short-term and long-term dynamics in the data and can be used to make forecasts for future time periods.
Another commonly used model for time series analysis is the Vector Autoregression (VAR) model. VAR models are used to analyze the dynamic relationships between multiple time series variables. They can capture both direct and indirect effects between variables and can be used to forecast the behavior of one variable given the behavior of others.
Other options include the Vector Error Correction Model (VECM), which is a special case of the VAR model that is appropriate for analyzing cointegrated time series data, and the Structural Time Series Model, which is a flexible model that can capture trends, seasonality, and other patterns in the data while also allowing for the inclusion of exogenous variables.
Eleven (11) variables are far too many for a small sample period of 27 years (n = 27) and will lead to over-parameterization of the model.
Based on your research objective(s), I suggest you revisit the economic theory on how your chosen dependent variable relates to each of the regressors/explanatory variables, and keep those with the highest relevance to the theories being tested.
If the number of explanatory variables remains large, I suggest you reduce it using Principal Component Analysis (PCA), and retain the independent variables that have the highest loadings across components. You can use your expert judgement to set the cut-off for the loadings of the variables to retain.
Given the small sample size (n = 27) of your dataset, I suggest you consider running a cointegrating regression. There are three options to choose from, depending on your sample properties:
(a). Dynamic OLS (DOLS)
(b). Fully-Modified OLS (FMOLS)
(c). Canonical cointegrating regression (CCR).
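These single-equation estimators correct for endogeneity and serial correlation in the cointegrating regression, and they estimate far fewer parameters than a system approach, which helps in small samples.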
The most suitable econometric model may be a structural one if the aim is to study causal relationships and evaluate policy. However, ARIMA models are popular for modelling, analyzing and forecasting time series data.
Please use criteria such as the coefficient of determination (R²), Residual Mean Square (RMS), Mallows' Cp, the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC), or apply Bayesian Model Averaging (BMA), to select the optimal multivariate regression model.
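To make the information criteria concrete, here is a minimal sketch using the usual Gaussian least-squares forms, AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n), which drop an additive constant that is the same for every model fitted to the same data. The residual sums of squares and parameter counts below are invented numbers, not results from any dataset.

```python
import math


def aic(rss, n, k):
    """AIC for a least-squares fit: n observations, k estimated coefficients."""
    return n * math.log(rss / n) + 2 * k


def bic(rss, n, k):
    """BIC for a least-squares fit; the penalty per coefficient is ln(n) instead of 2."""
    return n * math.log(rss / n) + k * math.log(n)


n = 27  # sample size from the question
# Hypothetical candidates: name -> (residual sum of squares, number of coefficients)
candidates = {"small model": (10.0, 3), "large model": (8.5, 8)}

for name, (rss, k) in candidates.items():
    print(f"{name}: AIC = {aic(rss, n, k):.2f}, BIC = {bic(rss, n, k):.2f}")

best_by_bic = min(candidates, key=lambda m: bic(*candidates[m][:1], n, candidates[m][1]))
print("preferred by BIC:", best_by_bic)
```

Lower values are better. With n = 27, ln(n) is about 3.3, so BIC penalizes each extra coefficient more heavily than AIC does, which matters when the number of candidate regressors is large relative to the sample.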
With 11 variables, a VAR with one lag has 11 to 13 coefficients in each equation, depending on whether you include a constant and a trend term. Thus there are 121 to 143 coefficients to be estimated in your system. With annual or quarterly data this will lead to problems. You will also have problems identifying such a system if you want to calculate impulse response functions. Suggestions that you use a VAR are almost certainly unworkable.
Similar comments will apply to estimating a VECM.
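The coefficient count above can be reproduced with a few lines of arithmetic. The sketch assumes 27 annual observations (the thread does not state the data frequency), and the function name is made up for illustration.

```python
def var_coefficients(n_vars, n_lags, n_deterministic):
    """Return (coefficients per equation, coefficients in the whole system) for a VAR."""
    per_equation = n_vars * n_lags + n_deterministic
    return per_equation, per_equation * n_vars


n_obs, n_vars, n_lags = 27, 11, 1   # n_obs = 27 assumes annual data
usable = n_obs - n_lags             # one observation is lost per lag

for label, det in [("no constant", 0), ("constant", 1), ("constant + trend", 2)]:
    per_eq, total = var_coefficients(n_vars, n_lags, det)
    print(f"{label}: {per_eq} per equation, {total} in total, "
          f"{usable - per_eq} residual degrees of freedom per equation")
```

Adding a second lag makes the problem plain: with a constant, var_coefficients(11, 2, 1) gives 23 coefficients per equation against only 25 usable observations.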
@Tatenda Zibizapanzi The Johansen methodology can be used when you have a mixture of I(1) and I(0) variables. The process treats them differently and therefore you must first specify which variables are I(1) and which are I(0). You can also mix exogenous and endogenous variables. It may be the case that your software does not support these options. You may have to do some programming.
The ARDL bounds test also requires that the explanatory variables be at least weakly exogenous. I have seen many applications of the ARDL procedure where obvious feedback from the dependent variable to an explanatory variable(s) has been ignored. This invalidates the results.
@Aya Galal To return to your question, I think that you must return to economic theory and common sense and specify a system of (probably simultaneous) equations as a model. Perhaps break it down into identified sub-models using theory. Some sector models may involve non-stationary data and error correction mechanisms, which may affect other sectors. My suggestion is that you have a reasonable economic model (or models) and data before you start your econometrics.
27 years and 11 variables? Model estimation will consume a lot of degrees of freedom and, in the process, affect your estimates. A Bayesian VAR will be the most appropriate model for your analysis.
I recommend fully-modified ordinary least squares (FMOLS), dynamic ordinary least squares (DOLS), and canonical cointegrating regression (CCR), with the "linear trend" option activated.
It all depends on your research objectives. The sample size is small, and if your specific objective has nothing to do with the relationships between the variables, think about factor analysis.
Selecting the most suitable econometric model for time series analysis with 27 years of data and 11 variables depends on the specific objectives of your analysis and the characteristics of your data. Here are some common econometric models and techniques that you may consider for such a dataset:
Autoregressive Integrated Moving Average (ARIMA) Model: ARIMA models are widely used for univariate time series forecasting. If you are interested in analyzing and forecasting one variable at a time, you can apply ARIMA models to each of the 11 variables separately.
Vector Autoregression (VAR) Model: VAR models are suitable when you want to analyze the interdependencies and dynamic relationships among multiple time series variables simultaneously. VAR models can capture both short-term and long-term interactions among the variables.
Vector Error Correction Model (VECM): VECM extends VAR models to account for cointegration among variables, which implies a long-term relationship. VECM is appropriate when you suspect that some variables are co-integrated, meaning they move together in the long run.
Dynamic Regression Models: Dynamic regression models allow you to incorporate exogenous variables (variables external to your time series) into your analysis. If you have additional economic or financial variables that may impact your time series, this approach can be valuable.
State Space Models: State space models, including the Kalman filter and Bayesian structural time series (BSTS) models, can handle complex time series data with multiple variables. They allow for modeling seasonality, trends, and structural breaks.
Cointegration Analysis: If you suspect that some variables are cointegrated (i.e., they have a long-term relationship), you can perform cointegration tests like the Johansen test and apply cointegration-based models such as the Vector Error Correction Model (VECM).
Machine Learning Methods: Machine learning algorithms, such as neural networks, random forests, and support vector machines, can be applied to time series data for forecasting and modeling complex relationships. However, they may require substantial data preprocessing and tuning.
Bayesian Time Series Models: Bayesian time series models, like Bayesian structural time series (BSTS) models, allow for incorporating prior information and handling uncertainty in time series modeling.
Granger Causality Analysis: If you want to explore causal relationships between variables, Granger causality tests can help identify lagged dependencies.
The choice of the most suitable model depends on your research objectives, the nature of the data (stationary or non-stationary, cointegration), the presence of seasonality or trends, and your familiarity with and access to econometric software or programming languages (e.g., R, Python). It's often beneficial to start with exploratory data analysis, identify patterns and correlations, and then select or tailor a model that best fits your specific research questions and data characteristics. Additionally, seeking guidance from a statistician or econometrician can be valuable in model selection and analysis.