The augmented UKF was the original form proposed in Julier and Uhlmann's paper ("A General Method for Approximating Nonlinear Transformations of Probability Distributions"). Section 5 ("The New Filter") of that paper describes this method clearly - having a look there might help.
What is going on in this augmented form is that the components of the noise vector (which I'll call "noise variables") are added to the state space. This is done so that the unscented transform takes nonlinearities in the noise model (e.g. multiplicative noise) into account.
The "augmented" fa function takes in:
(1) the value of the original state variables
(2) the value of the noise variables (potentially for the noise on both the state and measurement processes)
(3) any inputs
and spits out the value of the state variables at the next time step.
(In the paper discussed above, (1) and (2) are wrapped up in \chi_{i}; I'm looking at equation 45.)
For very simple noise models (i.e. additive white Gaussian noise), it is common to think of the f function as taking in only (1) and (3). So the difference between f and fa is that fa also takes the noise variables as an input and accounts for them inside the function itself.
ha similarly takes in the augmented state vector (i.e. (1) and (2)) and produces the predicted measurement.
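To make the difference between f, fa and ha concrete, here is a minimal sketch. The two-state model, the way the noise enters it, and all numeric constants are made up for illustration; they are not taken from either paper.

```python
import numpy as np

def f(x, u):
    """Plain process model: takes only the state (1) and the input (3)."""
    return np.array([x[0] + 0.1 * x[1],
                     0.9 * x[1] + u])

def fa(x, v, u):
    """Augmented process model: also takes the process-noise variables v (2).

    Hypothetical noise model: v[0] perturbs the damping coefficient
    (multiplicative noise), v[1] is ordinary additive noise.
    """
    return np.array([x[0] + 0.1 * x[1],
                     (0.9 + v[0]) * x[1] + u + v[1]])

def ha(x, n):
    """Augmented measurement model: state plus measurement-noise variables n.

    Hypothetical sensor with a noisy scale factor on the first state.
    """
    return np.array([x[0] * (1.0 + n[0])])

x = np.array([1.0, 2.0])
u = 0.5

# With all noise variables set to zero, fa reduces to f.
x_next_nominal = fa(x, np.zeros(2), u)
# With a nonzero noise sample, the multiplicative term changes the result.
x_next_noisy = fa(x, np.array([0.05, -0.01]), u)
z_nominal = ha(x, np.zeros(1))
```

Note that fa returns only the n state variables; the noise variables are inputs, not outputs.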
Another possibly interesting paper to have a look at:
"Unscented Kalman filtering for additive noise case: augmented vs. non-augmented" by Wu et al. (from ACC 2005). They discuss the benefits of the augmented UKF - it does a better job of propagating odd moment information about the underlying pdf.
While f is a function from R(n) to R(n), fa will be a function from R(n+q) to R(n+q), n being the number of state variables and q being the number of noise variables.
Now, how do I get the extra q equations? In other words, how will I include the noise models in f to get fa?
I apologize, my original answer was incorrect; I have edited it.
fa is from R(n+q) to R(n) - that is, it takes in explicit values for the noise variables and produces the state at the next time step considering those noise values. It does *not*, as I originally stated, produce values of the noise variables at the new time t+1. This is made clearer by "The Unscented Kalman Filter" by Wan and van der Merwe, in the book "Kalman Filtering and Neural Networks". Equation 7.40 table 7.3.1 is what I was missing.
I also made it clearer that the noise parameters include the process and the measurement noise, as shown, for example, in eq. 7.38 of that same reference.
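As a rough sketch of how this plays out in the prediction step: the sigma points are drawn in the augmented space (state, process noise, measurement noise), but each one is pushed through fa to give a point in the original n-dimensional state space. The model, the covariances and the choice kappa = 1 are assumptions for illustration, and the basic kappa-scaled sigma-point set is used here rather than the exact parameterization of either reference.

```python
import numpy as np

# Illustrative dimensions: n states, q process-noise variables, r measurement-noise variables.
n, q, r = 2, 2, 1
L = n + q + r  # dimension of the augmented vector

def fa(x, v, u):
    """Hypothetical augmented process model, mapping (state, noise, input) to the next state."""
    return np.array([x[0] + 0.1 * x[1],
                     (0.9 + v[0]) * x[1] + u + v[1]])

# Made-up mean and covariances.
x_mean = np.array([1.0, 2.0])
P = np.diag([0.1, 0.2])     # state covariance
Q = np.diag([0.01, 0.02])   # process-noise covariance
R = np.diag([0.05])         # measurement-noise covariance

# Augmented mean (noise means are zero) and block-diagonal augmented covariance.
xa = np.concatenate([x_mean, np.zeros(q), np.zeros(r)])
Pa = np.zeros((L, L))
Pa[:n, :n] = P
Pa[n:n + q, n:n + q] = Q
Pa[n + q:, n + q:] = R

# 2L+1 sigma points from a Cholesky factor of (L + kappa) * Pa.
kappa = 1.0  # assumed tuning value
S = np.linalg.cholesky((L + kappa) * Pa)
sigma = np.column_stack([xa]
                        + [xa + S[:, i] for i in range(L)]
                        + [xa - S[:, i] for i in range(L)])  # shape (L, 2L+1)

w = np.full(2 * L + 1, 1.0 / (2.0 * (L + kappa)))
w[0] = kappa / (L + kappa)

# Propagate: each augmented sigma point goes in, an n-dimensional state comes out.
u = 0.5
X_pred = np.column_stack([fa(s[:n], s[n:n + q], u) for s in sigma.T])  # shape (n, 2L+1)

# Predicted mean and covariance. No separate "+ Q" term is added here,
# because the process noise already entered through the sigma points.
x_pred = X_pred @ w
d = X_pred - x_pred[:, None]
P_pred = (d * w) @ d.T
```

The measurement-noise part of each sigma point is unused in this step; it would be passed to ha in the measurement update.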