Frequentists, as far as I understand, define probability as a limiting relative frequency. And they say that frequencies can only be defined for data (not for hypotheses), so probabilities can only be assigned to data and never to a hypothesis. The latter is either true or false, which is generally simply unknown.
I get lost in this line of argumentation when it comes to the problem of the reference set. The relative frequency must be defined with respect to a reference set (related to the "population"). The assumption is that any element of the reference set will be observed with the same relative frequency as n -> infinity. Von Mises introduced the place-selection criterion to further define the kind of "randomness" required to ensure that data will behave this way. I do not understand how place selection differs from the statement that we cannot know the order of the elements in the reference set and therefore have to expect similar frequencies from any subsequence. If there is no difference, then the whole frequentist approach essentially has an epistemic foundation and is, in principle, not different from approaches where probability is more directly linked to a "state of knowledge".
Now consider the archetypal example of a coin toss. Based on the binomial distribution with parameter p, one can calculate P(k times heads in n tosses). A typical argument of a frequentist is that k is a random variable and thus can have a probability assigned to it (which may be estimated from actual data). The parameter p, in contrast, is a hypothesis, somehow related to the reference set (or the population), which has a fixed (but unknown) value. It will not vary with different tosses and thus is not a random variable and thus has no probability.
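Just to fix notation, this is the standard binomial formula I have in mind:

$$P(K = k \mid n, p) = \binom{n}{k}\, p^k (1-p)^{n-k}$$

Here k gets a probability, while p is treated as a fixed constant.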
The definition of the reference set remains unclear. What is the infinite sequence of coin tosses? Under which conditions are the coins tossed? The typical answer is: under identical conditions. But if the conditions were identical, the results would be identical, too. So the key is that the conditions of the repetitions are just assumed not to be identical. But what are the actual and possible differences that are allowed (and required)? And what is the frequency distribution of the varying conditions in the infinite set of replications? And why is the parameter assumed to be constant across different replications (I really do not see a justification for this assumption)?
In the end, it looks to me as if the foundation of the whole argument is that we do not know what the frequency distribution is. Thus, the whole procedure around frequencies in "replicated" experiments is eventually about our uncertainty about the precise conditions, or, to put it differently and avoid a discussion about determinism: our inability to precisely predict the results/data.
After this very long (and possibly not very helpful) excursus, back to my question:
The frequentist interpretation requires the imagination of an infinite series of replications (with an unknown extent of variation), and uses this to assign a probability. I do not see a difference between replicating an experiment in this world ad infinitum under different conditions and "replicating" an experiment in infinitely many similar (but somewhat different) worlds. The latter is often used in cases where the event under consideration is practically impossible to replicate under sufficiently similar conditions (let alone what "sufficiently" means here), for instance the beginning of the Second World War, the extinction of the polio virus, or the eruption of Vesuvius.
But then: if I have some data from which I estimate a parameter (e.g. this p of the binomial distribution), could not an infinite number of (similar) worlds be imagined in which my copies all obtain some (more or less different) data and get (more or less different) estimates? And if so, where is the problem in assigning a probability (distribution) to the parameter (i.e. the hypothesis)?
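To make concrete what "assigning a probability (distribution) to the parameter" would amount to, here is a minimal sketch (the numbers are hypothetical, and using a uniform Beta(1, 1) prior is of course already the Bayesian reading rather than the frequentist one):

```python
from scipy import stats

# Hypothetical data: k heads observed in n tosses
n, k = 20, 13

# With a binomial likelihood and a uniform Beta(1, 1) prior,
# the posterior for p is Beta(k + 1, n - k + 1).
posterior = stats.beta(k + 1, n - k + 1)

# This is exactly a probability distribution over the hypothesis p:
print(posterior.mean())          # posterior mean of p
print(posterior.interval(0.95))  # central 95% credible interval for p
```

My question is why the imagined-replications picture above does not already license something like this.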
Looking forward to your comments and critiques.