Standard textbooks on extreme value theory cover this, not only for the case of a general distribution but also for the normal distribution in particular. Wikipedia can at least point you to some sources. The theory points to a Gumbel distribution as the limiting distribution, after shifting and scaling by certain constants, for which formulae are given. Since the properties of the Gumbel distribution are known, this might answer your immediate question and possibly other ones. But note that the asymptotic approximation is a poor one, and it can be improved by choosing the shift and scale so that two percentage points (such as 10% and 90%) of the known distribution of the maximum agree with those of the approximating Gumbel distribution.
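The two-point matching described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the observations are taken to be i.i.d. standard normal, the function names are hypothetical, and the Gumbel CDF is taken in the form exp(-exp(-(x - loc)/scale)).

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def max_quantile(p, n):
    """Exact p-quantile of the maximum of n i.i.d. standard normals."""
    # P(max <= x) = Phi(x)**n, so x = Phi^{-1}(p**(1/n)).
    return NormalDist().inv_cdf(p ** (1.0 / n))


def gumbel_quantile(p, loc, scale):
    """p-quantile of a Gumbel law with CDF exp(-exp(-(x - loc)/scale))."""
    return loc - scale * math.log(-math.log(p))


def fit_gumbel_two_points(n, p_lo=0.10, p_hi=0.90):
    """Pick loc and scale so the Gumbel quantiles at p_lo and p_hi are exact."""
    x_lo, x_hi = max_quantile(p_lo, n), max_quantile(p_hi, n)
    g_lo = -math.log(-math.log(p_lo))  # standard Gumbel quantile at p_lo
    g_hi = -math.log(-math.log(p_hi))  # standard Gumbel quantile at p_hi
    scale = (x_hi - x_lo) / (g_hi - g_lo)
    loc = x_lo - scale * g_lo
    return loc, scale


loc, scale = fit_gumbel_two_points(1000)
# Mean of the fitted Gumbel, used as an approximation to the mean of the maximum.
approx_mean = loc + EULER_GAMMA * scale
print(loc, scale, approx_mean)
```

The choice of 10% and 90% follows the suggestion above; other pairs of percentage points can be passed as p_lo and p_hi.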
Depending on what you are actually doing, you might consider looking at the median rather than the mean. Then you can get an exact formula for the median (which you might or might not consider "simple") by inverting the known distribution function of the maximum (the n'th power of the original distribution function). You might be able to simplify this by looking at formulae for the tails of the inverse normal distribution function.
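As a small illustration of the inversion just described, assuming i.i.d. standard normal observations: the distribution function of the maximum is Phi(x)^n, so setting it equal to 1/2 gives the median as Phi^{-1}(2^(-1/n)). The function name below is hypothetical.

```python
import math
from statistics import NormalDist


def median_of_max(n):
    """Exact median of the maximum of n i.i.d. standard normals."""
    # Phi(m)**n = 1/2  =>  m = Phi^{-1}(2**(-1/n))
    return NormalDist().inv_cdf(0.5 ** (1.0 / n))


# Print the exact median next to the leading asymptotic term sqrt(2 ln n).
for n in (10, 1000, 10**6):
    print(n, median_of_max(n), math.sqrt(2.0 * math.log(n)))
```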
I suggest you have a look at Extreme Value Theory. If I'm not mistaken, T(n) ~ (2 log n)^(1/2). Find attached the response to your problem (slide 46) in the course by Christian Y. Robert (in French). Of course, this is only correct when you consider independent observations. Please note that, as n goes to infinity, the maximum becomes deterministic (this holds in your situation, but it is not true in general).
Hi, Geoffrey. I believe the article you recommend is not enough, in its 160 pages, to explain the crisis of Extreme Value Theory. It cannot predict a plausible unique model that fits the maximum and minimum extreme values, at the same time, of a univariate dataset with its own distribution functions. Perhaps that is the reason why the article is only interested in finding a rough solution for the maximum and ignores the minimum. I have found a solution that can be explained in 10 pages and it may be found here in RG.
Suppose a representative dataset of size N=1000, with Maximum=120, Minimum=40, and mean U=100. Can you propose a coherent statistical model that fits the two extremes perfectly?
Using the values a = 120/100 - 1 = 0.2 and b = 1 - 40/100 = 0.6 and the exponent e = b/a = 3, there is the following solution for mean U=100, assumed correct:
Lorenz Curve ordered from top to bottom, and population fraction P from 0 to 1:
L = (a+1)*P - a*P^(e+1), that is, L(P) = 1.2*P - 0.2*P^4
Its derivative gives the cumulative distribution in units of the mean:
K(P such that K>=Ki) = (a+1) - (a+b)*P^e = 1.2 - 0.8*P^3
If P=1/2, the K value of the median would be K(1/2) = 1.2 - 0.8*0.125 = 1.2 - 0.1 = 1.1 units of the mean
Or median = 100*1.1 = 110 units
Maximum = 100 * K(P=0) = 100*(1.2) = 120 units
Minimum = 100 * K(P=1) = 100*(1.2-0.8) = 40 units
This function K(P) has an inverse function P(K>=Ki) that is easy to obtain.
I invite you and the author you recommended to give your solutions using traditional Extreme Value Theory, explaining it step by step.
I accept that this induction obtained from the dataset is a proxy model, but it is much better and simpler than those made from transformations that never return to their original values. At least it fits the extreme values, and if you measure its goodness of fit, the results are quite good. It is very easy to graph L(P), K(P) and P(K) either in original units or in units of the mean. The main advantage is that it requires neither N = infinity, nor asymptotic premises, nor "normal" models with bell-shaped pdfs. It uses only one parameter: the mean obtained from the sample.
Thanks. I hope EVT develops better models so that young people may understand this important problem, which has remained unsolved for a century. Thanks, Emilio
@ David A. Jones: Thank you for your suggestions. I had not previously considered whether or not the median would be good enough for my purposes. I do not think it is, but in any case I did as you suggested and, as you predicted, found that the median of the maximum of n samples drawn from a zero-mean, unit-variance normal distribution has a straightforward asymptotic behavior. It goes as c*(log n)^(1/2) for large n, where c is approximately constant.
@Geoffrey Laird: This is the form given in the slides you kindly provided. However the claim in that work is that c*(log n)^(1/2) is the asymptotic behavior of the expectation of the maximum (T(n) as defined above), not the median, leading me to doubt the result. More so due to numerical results (see below).
Regarding the Gumbel distribution (@David A. Jones:): Since I am interested only in the expectation T(n), I cannot see how knowing that Max(z1, z2, z3, ..., zn) follows a Gumbel distribution helps, even if one could then perform the integration to compute an expectation. The 2 free parameters of the Gumbel distribution require (prior) knowledge of the mean and variance, the former being the very thing being sought.
Returning to T(n), and the fact that Median(Max(z1, z2, z3, ..., zn)) -> c*(log n)^(1/2), it would be useful to know under what circumstances the median might be a useful approximation to the mean. This is straightforward when the distribution has a mirror symmetry (e.g. a normal distribution). But the distribution of Max(z1, z2, z3, ..., zn) clearly does not have a mirror symmetry. Can anything useful be said about the relationship between the two statistics in this case?
Lacking any insight into the mean, and knowing that the median ~ c*(log n)^(1/2), I ran some numerical trials attempting a regression onto the model
T(n) = a*(ln(n))^b
in order to find a and b. A file of the results is attached. It shows several runs with n=10^5, and one run with n=10^6. The method for each of these trials involved computation of Max(z1, z2, z3, ..., zn) as n was increased from 1 to 10^5 or 10^6, using a different set of Gaussian random variables (GRVs) for each n. (Thus, regression of the model onto n in [1,10^6] required the generation of ~ 10^12 GRVs. Generation and regression onto the model to find a and b took ~ 1 day on a standard desktop in that case.)
The findings were a ~ 0.72 and b ~ 0.81, which seem stable over the ranges tested.
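A much smaller version of this kind of fit can be sketched as below. Note the swapped-in technique: instead of simulating maxima, the sketch computes the expectation by numerical integration of the exact distribution function Phi(x)^n, and it fits the model by least squares on a log-log scale over a handful of values of n. It is therefore not a reproduction of the runs described above, and the function names, the grid of n, and the integration settings are all assumptions made for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def expected_max(n, half_width=12.0, steps=6000):
    """E[max of n i.i.d. N(0,1)] by the trapezoid rule.

    Uses E[M] = int_0^inf (1 - F(x)) dx - int_-inf^0 F(x) dx with F = Phi**n.
    """
    h = half_width / steps
    pos = neg = 0.0
    for i in range(steps + 1):
        w = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        x = i * h
        pos += w * (1.0 - STD_NORMAL.cdf(x) ** n)
        neg += w * STD_NORMAL.cdf(-x) ** n
    return h * (pos - neg)


def fit_power_of_log(ns):
    """Least-squares fit of ln E = ln a + b * ln(ln n); returns (a, b)."""
    xs = [math.log(math.log(n)) for n in ns]
    ys = [math.log(expected_max(n)) for n in ns]
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b


ns = [10**k for k in range(1, 7)]  # n = 10, 100, ..., 10^6
a, b = fit_power_of_log(ns)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Because the fitted values depend on the range of n and on how the points are weighted, they need not agree with the values quoted above.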
Some comments on these results:
The expectation of Max(z1, z2, z3, ..., zn) does not go as sqrt(ln(n)) when n
I did say that the usual asymptotic formulae are poor approximations. I find it difficult here to provide formulae directly, but if you go to http://stats.stackexchange.com/questions/105745/extreme-value-theory-show-normal-to-gumbel and scroll down to the end or search for "It is tempting to emulate the Central Limit Theorem" you will find the next-order adjustment for the formula for the location parameter ... the result is not of the form you are trying in your regression.
As for the mean of the Gumbel distribution, the properties of this distribution are "well-known" and for example can be found at https://en.wikipedia.org/wiki/Gumbel_distribution.
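To make the use of the Gumbel mean concrete, here is a minimal sketch. It assumes the textbook normalizing constants for the maximum of n i.i.d. standard normals (location sqrt(2 ln n) - (ln ln n + ln 4*pi) / (2 sqrt(2 ln n)), scale 1/sqrt(2 ln n)), which this thread does not spell out, together with the known Gumbel mean, location + gamma * scale, where gamma is the Euler-Mascheroni constant. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_constants(n):
    """Textbook location and scale for the maximum of n standard normals."""
    root = math.sqrt(2.0 * math.log(n))
    loc = root - (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * root)
    scale = 1.0 / root
    return loc, scale


def gumbel_mean(n):
    """Mean of the approximating Gumbel distribution: loc + gamma * scale."""
    loc, scale = gumbel_constants(n)
    return loc + EULER_GAMMA * scale


# Compare with the leading term sqrt(2 ln n) alone.
for n in (10**2, 10**4, 10**6):
    print(n, gumbel_mean(n), math.sqrt(2.0 * math.log(n)))
```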
Michael. Yes, I believe it is an open question, not yet solved in a satisfactory way.
If I have made 2 plausible proposals for a solution (both using only the U parameter, with different methodological premises), it means that we may expect new and better proposals in the future.
We are dealing with the fundamentals of inductive inferential methods from empirical datasets. This is a key research method in teaching and training young researchers. I am neither a statistician nor a mathematician, but I understand that the solution requires innovative approaches and simpler premises than those mentioned by Extreme Value Theory books and articles, which take the pdfs of multiparametric curves as a priori starting premises. So my request is to reconsider the whole method behind it and to apply the principle of parsimony before studying multivariate "linear" correlations.
Look up Gumbel Distribution to see the general form such a distribution would have.
F(x) = Exp (-Exp (-x)).
Here is a simple derivation following the classic paper by Fisher and Tippett (1928) https://www.researchgate.net/publication/234318580_First_ranked_galaxies_in_groups_and_clusters
Article First ranked galaxies in groups and clusters
@ David A. Jones: Thank you for the very useful link to stats.stackexchange. I used the result at the (current) end of the thread to compute the asymptotic mean and found (provided I understood the reasoning properly) that the leading term is sqrt(2 ln(n)) after all. This (I know) is also the leading term for the median. I had earlier discounted the possibility that these would converge, but now realize that, due to the form of the Gumbel distribution, they must do so (in this case):
A CDF that goes as
F(x) ~ exp(-exp(-(x-a)/b))
where (as in this case)
b->0 as n->infinity
tends to a step function at x = a, for which the mean and median are therefore the same. (Is this reasoning correct?) Apologies to Geoffrey Laird for doubting your result. It seems from my numerical efforts that n ~ 10^6 is nowhere near big enough to see this behavior emerge.
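The step-function argument can be checked against the known mean and median of a Gumbel distribution with location a and scale b. This is a short worked sketch, with gamma denoting the Euler-Mascheroni constant:

```latex
\begin{align*}
F(x) &= \exp\!\left(-e^{-(x-a)/b}\right), \\
\text{mean} &= a + \gamma b, \qquad \gamma \approx 0.5772, \\
\text{median} &= a - b \ln(\ln 2) \approx a + 0.3665\, b, \\
\text{mean} - \text{median} &= b\,\bigl(\gamma + \ln(\ln 2)\bigr) \approx 0.2107\, b \;\to\; 0 \quad \text{as } b \to 0 .
\end{align*}
```

So, to the extent that the Gumbel approximation holds, the gap between mean and median is proportional to b and vanishes as b does.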
The expressions for the mean, standard deviation and median are derived (equations 15-17) for the probability distribution (the standard Gumbel distribution) given in equation 11 in the paper I have cited above. https://www.researchgate.net/publication/234318580_First_ranked_galaxies_in_groups_and_clusters
Can any of you give me the maximum value, the minimum value, and the mean measured for just one of your group-cluster observations? In the model I explained, the mean does not have to be equal to the median, no matter how big the sample size is. I also found that, given values a, b, e, the cumulative probability that the variable is greater than or equal to one mean is given by
P(K>=1 mean) = (1+e)^(-1/e)
The interesting point is that two distributions with different values of a and b may have the same exponent e=b/a, and therefore the same P(K>=1 mean). When e=1, the distribution is linear, and it is the only case in which mean = median at P=1/2. I do not know whether this applies to space clusters and groups like those you have measured. Thanks, Emilio