Thus GARCH is more parsimonious: it uses only a few parameters to achieve what an ARCH model would need an infinite number of parameters for. The argument is essentially the same as why an ARMA model is more parsimonious than a pure AR or a pure MA model.
In practice, a GARCH(1,1) usually works well, and its variance equation has only 3 parameters. To get a comparable fit from an ARCH model, you would need an ARCH(q) with a fairly large q, so the variance equation would have q + 1 parameters, far more than 3.
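The parameter-count argument can be made concrete with a small sketch. It assumes a GARCH(1,1) variance equation of the form var(yt) = alpha0 + alpha1*e(t-1)^2 + beta1*var(yt-1), which, after repeated substitution, puts a weight of alpha1*beta1^(k-1) on the squared residual k periods back. The function names and the parameter values (alpha1 = 0.1, beta1 = 0.85, cutoff 0.001) are illustrative assumptions, not estimates from any data set.

```python
def implied_arch_weights(alpha1, beta1, n_lags):
    """Weights on e(t-k)^2, k = 1..n_lags, implied by a GARCH(1,1)."""
    return [alpha1 * beta1 ** (k - 1) for k in range(1, n_lags + 1)]


def lags_needed(alpha1, beta1, tol):
    """Number of leading lags whose implied weight is at least tol."""
    if not 0 <= beta1 < 1:
        raise ValueError("beta1 must lie in [0, 1) for the weights to die out")
    q = 0
    # The weight at lag q + 1 is alpha1 * beta1**q; stop once it drops below tol.
    while alpha1 * beta1 ** q >= tol:
        q += 1
    return q


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(implied_arch_weights(0.1, 0.85, 5))
    print(lags_needed(0.1, 0.85, 1e-3))
```

With persistent volatility (beta1 close to 1) the weights shrink slowly, so an ARCH(q) that matches them needs a large q, while the GARCH(1,1) still has only 3 parameters.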
The generalized autoregressive conditional heteroscedasticity (GARCH) model is more parsimonious than the autoregressive conditional heteroscedasticity (ARCH) model because its variance equation also includes the lagged conditional variance, which summarizes the effect of all earlier squared residuals in a single term.
In the ARCH(p) model, the variance equation is a function of the past p squared residuals:
var(yt) = alpha0 + alpha1*e(t-1)^2 + ... + alpha(p)*e(t-p)^2
where e(t) is the residual at time t and alpha0, alpha1, ..., alpha(p) are parameters to be estimated.
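As a minimal sketch, the ARCH(p) variance equation can be evaluated as follows, assuming it has the usual form var(yt) = alpha0 + alpha1*e(t-1)^2 + ... + alpha(p)*e(t-p)^2. The function name and the argument layout (most recent residual first) are assumptions made for this example.

```python
def arch_variance(alpha0, alphas, past_residuals):
    """Conditional variance of an ARCH(p) model.

    alphas         -- [alpha1, ..., alpha(p)]
    past_residuals -- [e(t-1), e(t-2), ...], most recent first
    """
    if len(past_residuals) < len(alphas):
        raise ValueError("need at least p past residuals")
    # Only the first p residuals are used; anything older has no effect.
    return alpha0 + sum(a * e ** 2 for a, e in zip(alphas, past_residuals))


if __name__ == "__main__":
    # Hypothetical ARCH(2) coefficients and residuals.
    print(arch_variance(0.1, [0.5, 0.2], [1.0, 2.0, 5.0]))
```

Note that the third residual is ignored by the ARCH(2) example: the model has no memory beyond p lags unless more alpha coefficients are added.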
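This specification is restrictive because only the last p squared residuals affect the current variance, and each lag needs its own coefficient. Capturing volatility that persists over many periods therefore requires a large p and many parameters.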
In contrast, the GARCH model adds the lagged conditional variance to the variance equation, which implies weights on past squared residuals that decay exponentially over time:
var(yt) = alpha0 + alpha1*e(t-1)^2 + ... + alpha(p)*e(t-p)^2 + beta1*var(yt-1)
where var(yt-1) is the conditional variance at time t-1 and beta1 is a parameter to be estimated.
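The role of the beta1 term can be checked numerically. The sketch below assumes the GARCH(1,1) case, var(yt) = alpha0 + alpha1*e(t-1)^2 + beta1*var(yt-1), and compares the recursion with its unrolled form, in which every past squared residual appears with a weight of alpha1*beta1^(k-1). The function names, the starting variance and the residual values are hypothetical.

```python
def garch11_variances(alpha0, alpha1, beta1, residuals, initial_variance):
    """Return [var_0, ..., var_n], where residuals[t] holds e(t), t = 0..n-1,
    and var_t = alpha0 + alpha1 * e(t-1)^2 + beta1 * var_(t-1)."""
    variances = [initial_variance]
    for e in residuals:
        variances.append(alpha0 + alpha1 * e ** 2 + beta1 * variances[-1])
    return variances


def unrolled_variance(alpha0, alpha1, beta1, residuals, initial_variance):
    """var_n written as a weighted sum of all past squared residuals."""
    n = len(residuals)
    constant = alpha0 * sum(beta1 ** k for k in range(n))
    # e(n-k) enters with weight alpha1 * beta1**(k-1), for k = 1..n.
    shocks = sum(
        alpha1 * beta1 ** (k - 1) * residuals[n - k] ** 2
        for k in range(1, n + 1)
    )
    return constant + shocks + beta1 ** n * initial_variance


if __name__ == "__main__":
    e = [0.5, -1.2, 0.3, 2.0]  # made-up residuals
    path = garch11_variances(0.05, 0.1, 0.85, e, 1.0)
    print(path[-1], unrolled_variance(0.05, 0.1, 0.85, e, 1.0))
```

The two numbers agree up to floating-point rounding, which is the sense in which one extra parameter, beta1, stands in for an arbitrarily long list of ARCH coefficients.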
Because the lagged variance itself depends on all earlier squared residuals, a single extra parameter lets shocks to the variance persist over many periods. As a result, a low-order GARCH model can typically match the fit of a much higher-order ARCH model while using far fewer parameters, which is what makes it more parsimonious.