From a frequentist's perspective, "probability" is the limiting relative frequency of an event. In this philosophy, only outcomes of repeatable processes can have probabilities. Observed data are fixed and have nothing like a "probability". But for the data-generating process, the possible outcomes do have probabilities. This is where sampling enters the game:

The probability (= limiting relative frequency = f) of an item (or individual or event) being in a sample depends on the way the sampling is performed. The limiting relative frequency of a kind of item in the sample will be identical to its frequency in the population - but only if f is identical for each individual item in the population. This means the "probability" of being sampled must be identical for each item in the population. AFAIK this is called "random sampling".

But how is this ascertained? How can I say that a process has "equal sampling probabilities" when these probabilities are only established over infinitely many repetitions of the sampling process?

There is no doubt that the relative frequency of an item in repeated sampling tends to the frequency of that item in the population - when the sampling is random, i.e. when the sampling probability (= f) of each item in the population is equal. This follows from the law of large numbers. What I do not understand is how "random sampling" can create equal sampling probabilities for the items.
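To spell out the law-of-large-numbers step, here is a short sketch. The symbols are introduced only for illustration and are not from any particular textbook: the population consists of items $x_1, \dots, x_N$, item $i$ is drawn with probability $p_i$, the draws $X_1, X_2, \dots$ are assumed independent (sampling with replacement), and $N_A$ is the number of items of kind A.

```latex
\[
  \hat f_n(A) \;=\; \frac{1}{n}\sum_{k=1}^{n} \mathbf{1}[X_k = A]
  \;\xrightarrow[n\to\infty]{}\;
  P(X = A) \;=\; \sum_{i \,:\, x_i = A} p_i ,
  \qquad \sum_{i=1}^{N} p_i = 1 .
\]
\[
  \text{If } p_i = \tfrac{1}{N} \text{ for all } i:\qquad
  P(X = A) \;=\; \frac{N_A}{N}.
\]
```

So the sample frequency matches the population frequency when all $p_i$ are equal, but the equality of the $p_i$ is an input to this argument, not something it produces.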

Prototype-example:

The population contains two items (A and B). The frequency of "A" = frequency of "B" = 0.5. A sample of size n=1 will contain either an A or a B. If a larger and larger random sample (with replacement!) is taken, then the theory says that the frequency of "A" in the sample will approach 0.5. But this will only be the case when the sampling procedure itself guarantees that the limiting relative frequencies (= probabilities) of "selecting A" and "selecting B" are equal.

Just to make it clear: The population may contain 3 items: A, A, B. When the probabilities of sampling each item are equal (= 1/3), then and only then will the limiting relative frequency of "A" in the sample be 2/3 (and 1/3 for "B") and thus match the relative frequencies in the population.
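The A, A, B example can be tried out with a small simulation. This is only a sketch under stated assumptions: the function name `sample_frequencies`, the weights, the seed, and the number of draws are all made up for illustration, and Python's pseudo-random generator is simply taken on trust to deliver the requested weights.

```python
import random
from collections import Counter


def sample_frequencies(population, weights, n_draws, seed=0):
    """Draw n_draws items with replacement and return the relative
    frequency of each label. `weights` are the (assumed) relative
    sampling probabilities of the individual items."""
    rng = random.Random(seed)
    draws = rng.choices(population, weights=weights, k=n_draws)
    counts = Counter(draws)
    return {label: counts[label] / n_draws for label in sorted(set(population))}


population = ["A", "A", "B"]

# Equal sampling probability (1/3) for each of the three items.
equal = sample_frequencies(population, [1, 1, 1], n_draws=100_000)

# Unequal sampling probabilities (1/6, 1/6, 4/6): item B is favoured.
unequal = sample_frequencies(population, [1, 1, 4], n_draws=100_000)

print("equal weights:  ", equal)    # frequency of A should be near 2/3
print("unequal weights:", unequal)  # frequency of A should be near 1/3
```

Note that the simulation only shows the consequence of equal or unequal sampling probabilities. The weights are put in by hand, so it does not answer how equal probabilities are ascertained for a real sampling procedure.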
