The picture on the left usually does not show an earthquake event, but a statistical result of relatively stable "noise", that is, passive vibration of very small amplitude. During a seismic event, the record is usually governed by the characteristics of the wave itself, rather than by the characteristics of the geological structure near the receiving point.
The predominance of ground motion at a period of around 2 s on the clay layer should be due to the dispersion and filtering of surface waves in the soil layer, so that vibrations of a certain frequency are reinforced while other frequencies are suppressed. The characteristic frequency f0 usually corresponds to a wavelength equal to four times the thickness of the soil layer, i.e. the layer thickness H is a quarter of the wavelength: H = λ/4 = V/(4*f0), or equivalently f0 = V/(4*H), where V is the wave velocity in the soil layer.
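As a rough illustration of the quarter-wavelength relation above, here is a minimal Python sketch. The function names and the numbers (V = 100 m/s, H = 50 m) are my own assumptions chosen only to land on a 2 s period; they are not taken from the figure.

```python
def fundamental_frequency(velocity_m_s: float, thickness_m: float) -> float:
    """Quarter-wavelength estimate f0 = V / (4 * H), in Hz."""
    if velocity_m_s <= 0 or thickness_m <= 0:
        raise ValueError("velocity and thickness must be positive")
    return velocity_m_s / (4.0 * thickness_m)


def layer_thickness(velocity_m_s: float, f0_hz: float) -> float:
    """Inverse relation H = V / (4 * f0), in metres."""
    if velocity_m_s <= 0 or f0_hz <= 0:
        raise ValueError("velocity and frequency must be positive")
    return velocity_m_s / (4.0 * f0_hz)


# Hypothetical soft-clay values, for illustration only (not from the figure).
velocity = 100.0   # wave velocity in the soil layer, m/s
thickness = 50.0   # soil layer thickness, m

f0 = fundamental_frequency(velocity, thickness)
print(f"f0 = {f0:.2f} Hz, period = {1.0 / f0:.1f} s")  # f0 = 0.50 Hz, period = 2.0 s
```

With these assumed values the estimate gives 0.5 Hz, i.e. a period of 2 s. A real site would need a measured velocity and layer thickness.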
The picture on the right does not have enough annotations, so I am not sure what it shows.