I illustrate the decrease of the maximum likelihood with a very simple case. Suppose a random binary draw ("success" or "failure") is made N times, and K of the draws are successes. Let P be the unknown probability of success. Of course, the probability distribution of K depends on P.
By definition, the likelihood of P is:
L(P) = P^K times (1-P)^(N-K)
The maximum L(P) is at P = K/N.
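That the maximum lies at P = K/N can be verified by setting the derivative of the log-likelihood to zero (a standard calculus step, added here for completeness):

\log L(P) = K \log P + (N-K) \log(1-P)

\frac{d}{dP} \log L(P) = \frac{K}{P} - \frac{N-K}{1-P} = 0 \quad\Longrightarrow\quad P = \frac{K}{N}

Since log is monotone increasing, the maximizer of log L(P) is also the maximizer of L(P).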
Now let the number of draws be doubled, to 2N. If success occurs (coincidentally) 2K times, the likelihood becomes:
L(P) = P^(2K) times (1-P)^(2N-2K)
The maximum is again at P = 2K/(2N) = K/N, as before. But note that this new likelihood is exactly the square of the previous one, so its value at the maximum is smaller than before.
Basically all this is about the following: the likelihood is a product of factors P and (1-P), each strictly between 0 and 1 (unless P is 0 or 1), so multiplying in more such factors makes the product smaller. In particular, squaring a number strictly between 0 and 1 decreases it.
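The effect can be checked numerically. The sketch below uses the likelihood formula from the text, with illustrative sample values N = 10 and K = 3 (these specific numbers are my assumption, not from the original):

```python
def likelihood(p, n, k):
    # Likelihood of success probability p given k successes in n draws,
    # with the binomial coefficient omitted, as in the text.
    return p**k * (1 - p)**(n - k)

# Illustrative values (assumed): 3 successes in 10 draws.
N, K = 10, 3
p_hat = K / N  # maximum-likelihood estimate, K/N

L1 = likelihood(p_hat, N, K)          # likelihood with N draws
L2 = likelihood(p_hat, 2 * N, 2 * K)  # likelihood with 2N draws, 2K successes

print(L1, L2)  # L2 equals L1 squared, hence L2 < L1
```

Running this shows that the maximum of the doubled-sample likelihood is the square of the original maximum, and therefore strictly smaller.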