What are the main reasons when you see the attached model, which includes FDI and the Governance index components? Is it only because of multicollinearity, or something else?
Divyakant Tehlyan has a good answer, and I would speculate that his Answer #2 is the most likely explanation. It would help us if you would share a few bits of the sample data and the names of the variables. For example, if we knew the means of each variable, we could be more confident in offering different interpretations.
It looks like, as VA increases by 1, the dependent variable increases by 5504.31. If the dependent variable is an index number, which is generally close to 100, then my conjecture above would be incorrect.
Zeravan Asaad shared more detailed results, and his dependent variable appears to have a mean of about 2140 with a standard deviation of 1672, whereas the independent variables have means ranging from -2.4 to 0.65. This seems consistent with Answer #2.
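A minimal sketch of the scale point being made here, on simulated data (the numbers are chosen only to resemble the rough magnitudes quoted above, not to reproduce the posted results): when the dependent variable is in the thousands and the regressor is an index-type variable near zero, the raw OLS slope is naturally in the thousands, while the unit-free (standardized) slope is an ordinary number below 1.

```python
# Hypothetical illustration of "Answer #2": coefficient size driven by units.
import numpy as np

rng = np.random.default_rng(0)
va = rng.normal(0.0, 0.25, 200)                      # governance-style index, mean near 0
y = 2140 + 5504.31 * va + rng.normal(0, 900, 200)    # dependent variable measured in thousands

raw_slope = np.cov(va, y)[0, 1] / np.var(va, ddof=1)  # the OLS slope of y on va
beta = raw_slope * va.std(ddof=1) / y.std(ddof=1)      # standardized (unit-free) slope

print(raw_slope)   # in the thousands, purely because of the units of y
print(beta)        # a number below 1: the scale, not the substance, is what is large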
Answer #2 explains the coefficient values you have got. It also appears that one or more of your explanatory variables is strongly trended, which is an indication of non-stationarity. The very high value of R² is one such indication.
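A hedged sketch of one standard way to check whether a regressor is trended/non-stationary, using an Augmented Dickey-Fuller test from statsmodels. The series here is simulated; in practice you would pass your own FDI or governance series instead.

```python
# Check a (simulated) trending series for a unit root with the ADF test.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
trended = np.cumsum(rng.normal(0.1, 1.0, 100))   # random walk with drift: non-stationary

stat, pvalue, *_ = adfuller(trended, regression="ct")   # constant + trend in the test regression
print(f"ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")
# A large p-value means the unit-root null cannot be rejected: the series may be
# non-stationary, which is one route to an inflated R² in a levels regression.
```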
It would be better to drop the variables whose t-statistics are not significant at the 1% level. Then, in the re-estimated equation, also check the DW statistic or, better, calculate the more general Lagrange multiplier test for serial correlation.
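For concreteness, a minimal sketch of the two serial-correlation checks mentioned here (Durbin-Watson and the Breusch-Godfrey LM test), run on simulated data; you would replace y and x with the actual series from the re-estimated equation.

```python
# Durbin-Watson and Breusch-Godfrey checks on an OLS fit (simulated data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(2)
x = rng.normal(size=80)
y = 1.0 + 2.0 * x + rng.normal(size=80)

fit = sm.OLS(y, sm.add_constant(x)).fit()

print("Durbin-Watson:", durbin_watson(fit.resid))        # values near 2 suggest little AR(1) correlation
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(fit, nlags=2)
print("Breusch-Godfrey LM p-value:", lm_pvalue)          # a small p-value indicates serial correlation
```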
FDI has a high coefficient when you invest in a country with a regulated economy that offers a skilled workforce and above-average growth prospects for the investor. This is not true in open economies.
It is, of course, simple to reduce the magnitude of the coefficients by dividing the dependent variable by, e.g., 1000 (then you get 5.504 instead of 5504.310). Nevertheless, this regression is nonsense. It is a bad econometric habit to throw a set of data into an estimation program (package) without prior theoretical considerations and without analysing the information content of the data and the (possible) relations between the variables.
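A small sketch of the rescaling point, on simulated data: dividing the dependent variable by 1000 shrinks the coefficients by the same factor but changes nothing substantive, since the t-statistics, p-values and R² are identical.

```python
# Rescaling the dependent variable only rescales the coefficients.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.normal(0, 0.5, 60)
y = 2000 + 5504.31 * x + rng.normal(0, 700, 60)

X = sm.add_constant(x)
fit_raw = sm.OLS(y, X).fit()
fit_scaled = sm.OLS(y / 1000, X).fit()

print(fit_raw.params[1], fit_scaled.params[1])     # e.g. ~5504 vs ~5.504
print(fit_raw.tvalues[1], fit_scaled.tvalues[1])   # identical t-statistics
print(fit_raw.rsquared, fit_scaled.rsquared)       # identical R²
```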
I generally agree with Anton: of course, the selection of the IVs should not be just a "throw" of a set of variables onto the RHS without any theoretical consideration. I would suggest, however, that we do not know whether Z.A. has selected the variables without such considerations, or without following some "authority" paper.

I tried to respond to the question: "What are the main reasons when you see the attached model, which includes FDI and the Governance index components? Is it only because of multicollinearity, or something else?" Ignoring the language lapsus (reasons for what?), the thing that naturally comes to mind is that the scale of the parameter estimates is simply inflated: one is minus 40 thousand, another is even 65 thousand! My remark was ONLY in this respect.

Otherwise, I am not sure that we can judge, e.g., that "this regression is nonsense"; it could really be, or maybe not, as we do not have enough information to evaluate the logic of this model. And I absolutely agree that we also need to analyse the "information content of the data and the (possible) relations between the variables", and that for any model the author should check specification issues (including omitted relevant variables and nonlinearities), diagnostics of the residual component, the need for HAC standard errors, etc.
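Since the original question explicitly asks about multicollinearity, here is a hedged sketch of one standard check, variance inflation factors from statsmodels. The DataFrame and the column names (VA, RQ, CC) are placeholders built from simulated data, not the poster's actual governance indicators.

```python
# Variance inflation factors for a set of (deliberately correlated) regressors.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
common = rng.normal(size=100)                 # governance indicators tend to move together
gov = pd.DataFrame({
    "VA": common + rng.normal(scale=0.3, size=100),
    "RQ": common + rng.normal(scale=0.3, size=100),
    "CC": common + rng.normal(scale=0.3, size=100),
})

X = sm.add_constant(gov)
vifs = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vifs)   # ignore the constant's VIF; regressor VIFs well above ~10 are a common warning sign
```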
The first step is to plot your data. If you had plotted the series on the same graph, you would see that there is something strange with the scaling. In addition, you could do an eyeball test of possible correlations. The wrong approach is to start with the results that the software spits out.
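A minimal "plot it first" sketch along these lines, using pandas and matplotlib on placeholder data; the point is simply that series on wildly different scales, and their pairwise correlations, are visible at a glance before any regression is run.

```python
# Eyeball the scales and pairwise relations before estimating anything.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
df = pd.DataFrame({
    "DV":  2140 + 1672 * rng.normal(size=40),   # dependent variable, scale in the thousands
    "VA":  rng.normal(0.0, 0.5, 40),            # governance-style index, scale near unity
    "FDI": rng.normal(0.3, 0.8, 40),
})

df.plot(subplots=True, figsize=(6, 6), title="Eyeball the scales first")
print(df.corr().round(2))                       # quick pairwise correlation check
pd.plotting.scatter_matrix(df, figsize=(6, 6))
plt.show()
```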
Bojan is right. It is important to look at the (cor)relations variable by variable before the final regressions. Obviously this step was not done in the case of Zeravan's estimations; otherwise he would have detected that there is a measurement problem.
But I think there should be a step before that. First, one has to think through the problem one wants to solve (in this case: explaining or forecasting a variable). Then one gets an impression of which variables could be important and can look for data that best capture the influencing factors. Already at this stage, one should have hypotheses about the principal functional form of the relations. It may well be that one cannot find suitable data for one or several of the variables. In that case, one can try to find data for a variable that can be assumed to be a good indicator. Otherwise, one can only do a partial analysis (regression) with the available data.
Fausto, I am not sure if I understand your question correctly. I think that Bojan's proposal for how to proceed also includes the examination of correlations between the IVs (your question). As I do not know what BMW means, I may have overlooked the meaning of your next sentence ("without data"). It seems to mean that one can, or even should, build a model before collecting data (or if there are no or too few data available).
A larger regression coefficient means a larger effect of the predictor variable on the response variable, provided the model is statistically significant. The results show large p-values, which suggest that changes in the predictors are not associated with changes in the response; in other words, the coefficients are insignificant and have no meaningful interpretation.
In this case it is obvious that something is wrong with the model specification. The p-values point to insignificance; that is one of the problems. But insignificance in combination with such large coefficients points to the initial problems (scaling, collinearity, and so on, as has been mentioned). We can go deeper into spurious regression, ergodicity, etc., but everything has already been explained in the previous posts.