I don't think it is a great idea to drop the insignificant coefficients. Look at the point estimates: if they are large in magnitude, you are better off not dropping them.
The point estimates are not large. I prefer to drop the insignificant coefficients, as I am estimating a variable that is not complete for all the observations.
Why not drop the insignificant variables from the regression? But you should be aware that the precision of predictions does not have much to do with the significance of the variables. See my papers:
“The Prediction Market for the Australian Football League”, in Vaughan Williams, L. (ed.), Prediction Markets, Routledge, 2011, pp. 221-234;
and
“The Regression Tournament: A Novel Approach to Prediction Model Assessment” (with Janez Sustersic), Journal of Prediction Markets, Vol. 5, No. 2, 2011, pp. 32-43.
Roughly speaking, a variable is statistically insignificant if zero is a plausible value for its true slope parameter. But whenever a variable is statistically insignificant, some non-zero (possibly small) values are also plausible for that parameter. I suggest that your first decision is to identify the variables that the diagnostic statistics lead you to believe have zero influence. Drop those from the model, and keep any that are statistically insignificant but that you believe may nevertheless have non-zero influence. Then re-estimate and predict, as in the sketch below. (Or take a Bayesian approach?)
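A minimal sketch of that workflow on simulated data (statsmodels and the variable names here are my own assumptions, not from this thread):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)   # believed to have real influence
x2 = rng.normal(size=n)   # weak effect, may come out "insignificant"
x3 = rng.normal(size=n)   # believed to have zero influence
y = 1.0 + 2.0 * x1 + 0.3 * x2 + rng.normal(scale=2.0, size=n)

X_full = sm.add_constant(np.column_stack([x1, x2, x3]))
full = sm.OLS(y, X_full).fit()
print(full.pvalues)   # x2 and x3 may both look insignificant here

# Drop only x3 (believed zero); keep x2 despite its p-value, then re-estimate.
X_kept = sm.add_constant(np.column_stack([x1, x2]))
kept = sm.OLS(y, X_kept).fit()
print(kept.predict(X_kept[:5]))   # predictions from the re-estimated model
```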
Beware Mina: from a mechanical point of view, if you simply set some of the estimated coefficients to zero without re-fitting, then the fitted values no longer average to mean(y), i.e. the residuals no longer have mean zero.
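A quick numerical check of this point (a sketch with simulated data, numpy only):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(loc=3.0, size=n)          # non-zero mean makes the effect visible
X = np.column_stack([np.ones(n), x1, x2])
y = X @ np.array([1.0, 2.0, 0.2]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print((y - X @ beta).mean())              # ~0: OLS residuals have mean zero

beta[2] = 0.0                             # "drop" a coefficient without re-fitting
print((y - X @ beta).mean())              # no longer zero (about 0.2 * mean(x2))

beta2, *_ = np.linalg.lstsq(X[:, :2], y, rcond=None)
print((y - X[:, :2] @ beta2).mean())      # re-fitting with an intercept restores ~0
```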
There is a chapter in this book (http://www.stat.columbia.edu/~gelman/arm/) where Prof. Gelman explains why keeping statistically insignificant covariates is sensible.
Because the relative "significance" of regression coefficients changes with the mix of predictors used, significance is not a good way to determine which predictors to use. (Also, if you change the sample size, then at a given "significance" level you would even change the number of predictors kept!) A small simulated illustration follows.
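To illustrate both caveats on simulated data (a sketch, assuming statsmodels; not from the thread): the same predictor's p-value moves when a correlated predictor enters the model, and it also moves with the sample size.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
for n in (50, 5000):
    x1 = rng.normal(size=n)
    x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
    y = 1.0 + 0.5 * x1 + rng.normal(size=n)
    p_alone = sm.OLS(y, sm.add_constant(x1)).fit().pvalues[1]
    p_joint = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().pvalues[1]
    print(n, p_alone, p_joint)   # x1's "significance" depends on the mix and on n
```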
If you compare models using a "graphical residual analysis," you can see which fits better. If the models being compared differ only by the presence or absence of a single predictor, you can see how much difference it makes, but ONLY for that set of predictors, and ONLY for that sample. Regarding the sample, "cross-validation" can be used to try to avoid fitting more closely to a particular sample than can be supported by the population or subpopulation to which the model is supposed to apply.
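For instance, a minimal cross-validation sketch (scikit-learn assumed; the data and the predictor split are hypothetical) comparing out-of-sample error with and without a single candidate predictor:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n = 300
X = rng.normal(size=(n, 3))                  # third column is a candidate predictor
y = 1.0 + 2.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)

mse_full = -cross_val_score(LinearRegression(), X, y,
                            scoring="neg_mean_squared_error", cv=5).mean()
mse_reduced = -cross_val_score(LinearRegression(), X[:, :2], y,
                               scoring="neg_mean_squared_error", cv=5).mean()
print(mse_full, mse_reduced)   # compare predictive error across the two models
```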
To keep this visual (a picture being worth a thousand words ... or statistics), you might, as one option, do a graphical residual analysis with more than one model represented on the same scatterplot for a given sample, and then do the same on another scatterplot for another sample. Performances can then be compared between the models in each case; one way to draw this is sketched below.
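One possible way to draw that comparison (matplotlib and statsmodels assumed; simulated data, not from the thread):

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)

m_full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
m_small = sm.OLS(y, sm.add_constant(x1)).fit()

# Residuals from both candidate models on one scatterplot, against fitted values.
plt.scatter(m_full.fittedvalues, m_full.resid, label="with x2", alpha=0.6)
plt.scatter(m_small.fittedvalues, m_small.resid, label="without x2", alpha=0.6)
plt.axhline(0, color="grey")
plt.xlabel("fitted values"); plt.ylabel("residuals"); plt.legend()
plt.show()
```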
Further, please note that graphical residual analyses can also be used to study heteroscedasticity. Please see
https://www.researchgate.net/project/OLS-Regression-Should-Not-Be-a-Default-for-WLS-Regression, and the various updates there, in reverse chronological order.
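As a rough sketch of that theme (simulated data; the 1/x^2 weights are assumed known purely for illustration, which real applications would have to estimate): an OLS residual plot exposes the widening spread, and a WLS re-fit changes the standard errors.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 300
x = rng.uniform(1, 10, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=x, size=n)   # error spread grows with x

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()
# Plotting ols.resid against x would show the characteristic widening "fan".
wls = sm.WLS(y, X, weights=1.0 / x**2).fit()      # weights proportional to 1/variance
print(ols.bse)   # OLS standard errors
print(wls.bse)   # WLS standard errors under the assumed weights
```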