In a self-consistent electronic calculation, convergence can be accelerated considerably by not feeding the output charge density of one iteration directly into the next. Instead, the next input density is a mixture of the previous input density and the last output density. As far as I know, all modern DFT codes use some form of this mixing.
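To make the idea concrete, here is a minimal sketch of plain linear mixing on a toy problem. The function toy_output_density is a hypothetical stand-in for one electronic step (it is not taken from any DFT code), and the mixing fraction of 0.3 is just an illustrative value.

```python
def toy_output_density(n_in):
    """Hypothetical stand-in for one SCF step.

    A linear map with fixed point [1.0, 2.0]; the first component
    overshoots strongly (slope -3), the second mildly (slope -0.5).
    """
    return [1.0 - 3.0 * (n_in[0] - 1.0), 2.0 - 0.5 * (n_in[1] - 2.0)]


def iterate(mix, n_start, steps):
    """Repeat n_in <- (1 - mix) * n_in + mix * n_out for a fixed number of steps."""
    n_in = list(n_start)
    for _ in range(steps):
        n_out = toy_output_density(n_in)
        n_in = [(1.0 - mix) * a + mix * b for a, b in zip(n_in, n_out)]
    return n_in


# mix = 1.0 means "use the output directly as the next input".
plain = iterate(1.0, [0.0, 0.0], 60)
# mix = 0.3 keeps 70% of the previous input at every step.
mixed = iterate(0.3, [0.0, 0.0], 60)

print("no mixing :", plain)
print("mix = 0.3 :", mixed)
```

In this toy case the unmixed iteration blows up because the first component is multiplied by -3 at every step, while with a mixing fraction of 0.3 the same component is multiplied by -0.2 and settles quickly.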
VASP offers several mixing schemes (see the IMIX tag), but I do not know them in depth. In my experience, if your calculation seems to be on the right track to self-consistency (that is, dE decreases reasonably at each step), simply increase the maximum number of electronic steps VASP performs before it stops without having reached self-consistency. This is set by the NELM tag. If dE does not go down, try reducing AMIX. If it does go down but very slowly, you can increase it. Sometimes it is also worth trying a different minimisation algorithm (see the ALGO tag).
I would also be glad if someone could explain the mixing schemes in depth.
I'm not familiar with VASP, but I write a plane-wave DFT code called CASTEP which uses similar principles, so I will try to shed some light on it.
Overview of methods
At present, there are essentially two classes of methods to solve the Kohn-Sham equations: variational (self-consistent) methods; and density/potential-mixing methods (non-self-consistent).
1. Fully variational methods (self-consistent)
In a variational method the density is updated whenever the wavefunctions are changed, so that they are always consistent with each other and the solution is said to be self-consistent. In other words, at every point in the calculation
n(r) = \sum_{bk} f_{bk} |\psi_{bk}(r)|^2
where n(r) is the density at point r, f_{bk} is the occupation number for band b at k-point k, and \psi_{bk} is the corresponding single-particle wavefunction.
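As a small illustration of this sum, the sketch below builds a density from occupied orbitals on a real-space grid. It assumes a single k-point and uses particle-in-a-box orbitals as hypothetical stand-ins for the \psi_{bk}; the occupations and grid size are arbitrary illustrative choices.

```python
import math


def box_orbital(band, x, length):
    """Normalised particle-in-a-box orbital (band = 1, 2, 3, ...)."""
    return math.sqrt(2.0 / length) * math.sin(band * math.pi * x / length)


def density_on_grid(occupations, n_points, length):
    """Return the grid, n(x) = sum_b f_b |psi_b(x)|^2 on it, and the spacing."""
    dx = length / n_points
    grid = [j * dx for j in range(1, n_points)]  # interior points only
    density = [
        sum(f * box_orbital(b, x, length) ** 2
            for b, f in enumerate(occupations, start=1))
        for x in grid
    ]
    return grid, density, dx


occupations = [2.0, 2.0, 1.0]  # f_b for bands 1..3 (illustrative values)
grid, density, dx = density_on_grid(occupations, 200, 10.0)

# Integrating the density should recover the total occupation.
n_electrons = dx * sum(density)
print(n_electrons)
```

The integrated density equals the sum of the occupation numbers, which is a quick sanity check on any density built this way.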
An example of a variational method is the Ensemble Density Functional Theory method of Marzari, Vanderbilt and Payne, Phys. Rev. Lett. 79, 1337 (1997). This is the variational method commonly used in CASTEP; it tends to be computationally demanding, but extremely stable and robust.
In CASTEP this method is selected by the line
elec_method : edft
in the param file.
2. Mixing methods (non-self-consistent; SCF)
In a density/potential-mixing method the wavefunctions are updated for a fixed potential, so that the density and wavefunctions are partially decoupled. The density that was used to construct the potential is called n_{in} and the density which is obtained from the updated wavefunctions is called n_{out}, i.e.
n_{out}(r) = \sum_{bk} f_{bk} |\psi_{bk}(r)|^2
If n_{in} and n_{out} are the same then we have solved the Kohn-Sham equations and found the ground state density and single-particle wavefunctions \psi_{bk}.
In general n_{in} and n_{out} are not the same (the system is non-self-consistent), and so we need to somehow change n_{in} to bring it closer to the ground state density. The algorithm used to do this is (usually) a density-mixing algorithm. These methods tend to be less computationally demanding than variational methods, sometimes an order of magnitude quicker per iteration, but are far less robust and can fail to converge.
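The loop described above can be sketched on a toy model. This is only an illustration under stated assumptions: an 8-site tight-binding chain with 4 spinless electrons, a hypothetical on-site potential u * n_in standing in for the Kohn-Sham potential, and simple linear mixing. None of the names or parameter values come from CASTEP or VASP.

```python
import numpy as np


def output_density(n_in, n_elec, hopping=1.0, u=2.0):
    """Fix the potential from n_in, diagonalise, and fill the lowest states."""
    n_sites = len(n_in)
    h = np.diag(u * n_in)  # density-dependent on-site potential
    off = -hopping * np.ones(n_sites - 1)
    h += np.diag(off, 1) + np.diag(off, -1)  # nearest-neighbour hopping
    _, psi = np.linalg.eigh(h)  # eigenvalues come back in ascending order
    occupied = psi[:, :n_elec]
    return np.sum(occupied ** 2, axis=1)  # n_out


def run_scf(n_start, n_elec, mix=0.3, tol=1e-10, max_iter=200):
    """Iterate until n_in and n_out agree to within tol."""
    n_in = np.array(n_start, dtype=float)
    for iteration in range(1, max_iter + 1):
        n_out = output_density(n_in, n_elec)
        residual = np.linalg.norm(n_out - n_in)
        if residual < tol:
            return n_in, iteration, residual
        n_in = (1.0 - mix) * n_in + mix * n_out  # linear density mixing
    return n_in, max_iter, residual


n_sites, n_elec = 8, 4
guess = 0.5 + 0.1 * (-1.0) ** np.arange(n_sites)  # staggered starting density
density, iterations, residual = run_scf(guess, n_elec)
print(iterations, residual)
print(density)
```

For this particular model the self-consistent density is uniform (0.5 electrons per site), so it is easy to see whether the loop has found the right answer.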
In CASTEP this is selected by the line
elec_method : dm
in the param file.
Density mixing algorithms
The aim is to find the best approximation to the ground-state density in the basis of the previous input and output densities (n_{in} and n_{out}). How can you do this? Suppose we assume a linear relationship between n_{in} and n_{out}, so that
M n_{in}(r) = n_{out}(r)
for some matrix (actually, tensor) M. The ground-state density n(r) is then the solution of
M n(r) = n(r)
in other words, the eigenstate of this matrix with eigenvalue 1. So now if we use information from all the different n_{in} and n_{out} from each previous iteration to construct a good approximation for M, the eigenstate gives us a good approximation to the ground-state density. (NB the actual implementation of these methods usually works on the changes in n_{in} and n_{out} between successive iterations.)
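One common family of schemes that uses the history in this way is Anderson/Pulay-type mixing. The sketch below is a generic textbook-style version working on the changes between successive iterations. It is not the CASTEP or VASP implementation, and scf_map, J and b are hypothetical stand-ins for a real electronic step. For generality the toy map includes a constant offset b, so it is affine rather than strictly linear.

```python
import numpy as np

# Hypothetical response: n_out = J @ n_in + b. J has an eigenvalue below -2,
# so feeding n_out straight back in as n_in would diverge.
J = np.array([[-2.0, 0.5, 0.0],
              [0.5, -1.0, 0.3],
              [0.0, 0.3, -0.2]])
b = np.array([1.0, 2.0, 3.0])


def scf_map(n_in):
    return J @ n_in + b


def anderson_mix(f, n_start, alpha=0.3, history=5, tol=1e-10, max_iter=100):
    """Anderson-type mixing using differences between successive iterations."""
    n_in = np.array(n_start, dtype=float)
    inputs, residuals = [], []
    for iteration in range(1, max_iter + 1):
        r = f(n_in) - n_in  # residual n_out - n_in
        if np.linalg.norm(r) < tol:
            return n_in, iteration
        inputs = (inputs + [n_in.copy()])[-history:]
        residuals = (residuals + [r.copy()])[-history:]
        if len(residuals) == 1:
            n_in = n_in + alpha * r  # first step: plain linear mixing
            continue
        # Columns hold the changes in n_in and in the residual.
        d_n = np.array([inputs[i + 1] - inputs[i]
                        for i in range(len(inputs) - 1)]).T
        d_r = np.array([residuals[i + 1] - residuals[i]
                        for i in range(len(residuals) - 1)]).T
        # Least-squares fit of the current residual in the history basis.
        gamma, *_ = np.linalg.lstsq(d_r, r, rcond=None)
        n_in = n_in + alpha * r - (d_n + alpha * d_r) @ gamma
    return n_in, max_iter


density, iterations = anderson_mix(scf_map, np.zeros(3))
exact = np.linalg.solve(np.eye(3) - J, b)  # fixed point of the toy map
print(iterations, density, exact)
```

Because the toy map is exactly linear in the density, a few stored iterations are enough to pin down the fixed point. A real Kohn-Sham map is only approximately linear near the solution, so in practice the history gives an improving estimate rather than an exact answer.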
There are many possible methods to construct M. Usually a good initial approximation to M is important, and for plane-wave DFT this is usually taken from a jellium model, as proposed by Kerker[1] based on the original work by Manninen et al.[2]. An alternative viewpoint is to consider that the Hartree potential is the most problematic term in the Kohn-Sham potential, so it is most important to capture that contribution to M. Whether you consider the jellium- or Hartree-inspired method, M is most conveniently expressed in reciprocal space and you end up with a diagonal form for M
M(G, G') = A \delta_{G,G'} G^2 / (G^2 + G_0^2)
where G and G' are reciprocal lattice vectors, A is a linear mixing amplitude and G_0 is a characteristic wavevector which may be identified with the inverse Thomas-Fermi screening length; A and G_0 are parameters of this model, though usually 0.1=