'The notion of probability does not enter into the definition of a random variable.' (Ref.: page 43 of V. K. Rohatgi and A. K. Saleh, An Introduction to Probability and Statistics, Second Edition, Wiley Series in Probability and Statistics, John Wiley & Sons (Asia) Pte. Ltd., Singapore, 2001.) Here randomness has been defined in the measure theoretic sense.
On the other hand, it has also been said that 'A random variable is a set function whose domain is the elements of a sample space on which a probability function has been defined and whose range is the set of real numbers.' (Ref.: page 9 of J. D. Gibbons and S. Chakraborti, Nonparametric Statistical Inference, Third Edition, Marcel Dekker Inc., New York, 1992.)
One who is not conversant with measure theory would opine that a random variable must be probabilistic. But according to measure theory, a random variable need not be probabilistic, while a probabilistic variable is necessarily random by definition.
There should not be two different definitions of randomness. Perhaps a discussion is needed in this regard.
This is not the same thing. Randomness describes the process of random state changes, for example time series. Probability describes our knowledge about an individual event, the result of which will be announced in the future.
I think random goes with variable and probabilistic with models, as in random variables and probabilistic models. A random variable is a variable that takes on a value according to some probability measure, and a probabilistic model is a model that involves uncertainty. I have never heard of a random model or a probabilistic variable. So, they are not interchangeable. [Okay, 'random model' can be used, but it would mean a model that is randomly picked/constructed or something along those lines]
My interpretation of this would be that a probabilistic variable is a random variable that has been fitted with a probability function and now represents the probability of that variable.
To put it another way, a random variable is the raw data and the probabilistic variable is processed data.
If we define a random variable as a (real-valued) function that is defined on a set of possible outcomes (which is called the sample space), then in order to be admissible such a function must be limited to the family of those functions for which a probability distribution exists. This probability distribution is derivable from a probability measure that turns the sample space into a probability space. In other words, it must be theoretically possible to compute the probability that the value of the random variable is less than any given (real) number.
This would explain why a random variable must be probabilistic (otherwise it will not be measurable) despite the fact that, in general, a random variable need not be probabilistic. As Fristedt and Gray eloquently put it, “When we speak of a random variable, we think in addition of the probability distribution according to which it takes all possible values. As a result, a random variable is misnamed as it is neither random nor a variable”.
Dear Hemanta,
you are right, there is some confusion in the use of the two terms, the confusion mainly arises from the lack of appreciation of the difference between something we observe looking at nature and something we build for studying nature (or producing artificial devices).
If an event has no detectable known cause (or so many causes that they cannot be disentangled) at the microscopic level (single unit), then when we observe the series of events (think for example of the measurement errors of the weight of a given object) we obtain a random variable. Imagine an object having the TRUE weight of 1.000000 grams: if you put the object on the plate of a scale you will reasonably get something like 0.99999; 1.000001; 0.99997; 1.0003; 0.99996; 0.99995; ...
The data set constitutes a random variable in which EACH SINGLE OUTCOME is per se unpredictable (this is consistent with our naive idea of what random is: total absence of order), but if you collect a fairly long series of data, and if the errors are TRULY RANDOM (no systematic bias like a progressive loss of weight of the object), the corresponding PROBABILISTIC MODEL IS VERY PRECISE, FOLLOWING A DETERMINISTIC EQUATION (in this case the Gaussian distribution) whose parameters (mean and standard deviation) will be known with a precision going to infinity!
By the way, in the above sketched case, the mean will coincide with the TRUE WEIGHT of the object; this means we can be very precise at the population level THANKS to the randomness at the microscopic level (this is why thermodynamics works so well).
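A minimal simulation sketch (in Python with NumPy) of the weighing example above; the true weight of 1.0 g and the spread of the measurement errors are assumed values chosen only for illustration:

```python
# Each single reading is unpredictable, but the Gaussian model of the whole
# series is very precise, and the sample mean converges to the TRUE weight.
import numpy as np

rng = np.random.default_rng(0)
true_weight = 1.0        # grams (assumed)
error_sd = 0.00003       # spread of the measurement errors (assumed)

for n in (10, 1_000, 100_000):
    readings = true_weight + rng.normal(0.0, error_sd, size=n)
    print(f"n = {n:>6}: mean = {readings.mean():.7f}, sd = {readings.std(ddof=1):.7f}")
# The individual readings stay unpredictable, but the estimated mean gets
# closer and closer to 1.0000000 as n grows.
```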
Summarizing, we can say that random variables allow us to generate probabilistic models (which in turn are governed by strictly deterministic equations) to describe them. Clearly we can use probabilistic models even for 'apparently random' variables or for decidedly non-random variables. Imagine the fitting of a curve to experimental data: discovering that the microscopic variability of a random variable Y (the order of appearance of the values along a series) scales with the corresponding values of another variable X with a correlation r = 0.85 implies that the Y series has an embedded order (at least in part) dependent on X. To estimate the strength of the relation between X and Y we routinely use a probabilistic approach (e.g. analysis of variance). In any case we refer to Y (even if predictable by X) as being a 'random variable' because it is convenient to consider it as such for the estimation of the model.
I hope this could be of help to you...
A probabilistic variable/parameter/model/etc. is a random one with some information about its distribution/characteristics. Thus, we can have predictions on the behavior of a probabilistic one, while a random event is uncertain and unpredictable, although there are certain rules for generation of random numbers. So I agree with Phil that a probabilistic variable is a random one which is processed.
I am not sure why we have to make the question so difficult. Randomness does not have to be affiliated with a variable. An event could also be random without being quantified. It seems to me randomness means with equal chance, while probabilistic means with a chance (equal or unequal).
Dear All,
Let me ask the question a bit differently.
Is a 'random variable' different from a 'probabilistic variable'?
Measure theoretically speaking, a random variable is not necessarily probabilistic. On the other hand, in the statistical literature a random variable is usually described as a variable associated with some probability law. This is the point I am raising. Why should there be two different definitions of randomness?
Then, random variable has a standard definition. Please define "probabilistic variable".
Dear Aliakbar,
'Random variable' has been defined differently in measure theory and in statistics. That is why I have started this discussion. No word should have two different meanings in two different mathematical fields of study.
By 'probabilistic variable', I want to mean a variable that follows a probability law. When a variable follows a probability law, it is necessarily random according to the measure theoretic definition too. But when a variable is random measure theoretically, the notion of probability does not enter into it.
I have given references to two very standard books. I have mentioned the page numbers too. Please check the two references that I have cited in my question, if possible.
In a probability space (Ω, A, P), one can distinguish two kinds of "randomness": one expressed by the σ-algebra A, which we can call "qualitative", and the other expressed by the probability measure P, which we can call "quantitative". Usually we take the concept of probability as the core of randomness.
Now the "random variable" is just a measurable function. This property has as its main purpose to transfer the probability measure on the real line. Although the definition does not invoke the probability measure, the values of a random variable are really random since they are dependent on the particular ω that have been occur. In this sense, indirectly the concept of random variable depends on probability measure. There is no "probabilistic variable" distinct from "random variable".
The mathematical theory of probability arose from attempts to formulate mathematical descriptions of chance events. In this case chance is represented with a probability space.
Thus "probabilistic" is always connected with probability spaces.
On the other hand "random" refers not only to probability theory, but also, e.g to Algorithmic Information Theory (what constitutes a random sequence: Kolmogorov, Martin-Loef, etc.), Random events in quantum mechanics, etc. Thus "random" might be more general than "probabilistic", see e.g
http://en.wikipedia.org/wiki/Randomness
I'm no expert in the definition but I would tend to agree with Jefrey, whose explanation is the easiest for the brain to suck in :)
Professor Drossos,
I would like to come straight to the last paragraph of your comment. In 'random events in quantum mechanics', the term 'random' has actually been used as something associated with 'probability'! What I mean is that it has been used in the same sense in which the statistics fraternity commonly uses it. So the confusion was started by the statisticians, and that entered into all branches in which statistical analysis has made inroads. In 'random sequence', 'random search' and such other usages too, the term 'random' has been used in this same sense. In statistical mechanics for example, is it not in this meaning that the word 'random' is used?
But in measure theory, randomness is defined without reference to probability. This is what I am putting forward for discussions. Why should a standard mathematical word have two different definitions? In fact, mathematics must not be ambiguous anyway.
As for probability measure, please see whether it has anything to do with probability!
The simplest answer is that the two words mean exactly the same thing. For example, a random variable is one whose values can be described by a chance or simply probability law. Thus, calling a variable a probabilistic variable implies that one is referring to a variable that takes on values according to a chance or probability law.
That is a wrong definition of a random variable. A random variable is a function whose domain is the sample space.
No, the terms "random" and "probabilistic" have different meanings.
Dear Tayfun and Okhunov,
So you see, you have completely opposite views! If Okhunov is correct, then Tayfun cannot be correct! Indeed, that is why a thorough discussion has always been necessary in this context.
In my question, I have cited a book by Rohatgi and Saleh. According to them, the notion of probability does not have anything to do with the definition of a random variable. They have gone for the measure theoretic definition of randomness.
In my question, I have cited another book too, authored by Gibbons and Chakraborti. According to them, assumption of a probability law is inherent in defining randomness. They have gone for the statistical definition of randomness.
Let me cite from another book now: B. R. Bhat, Stochastic Models: Analysis and Applications, Reprint, New Age International (P) Ltd., New Delhi, 2004. In this book, two opposite views have been written!
On pages 1-2, Bhat mentions very clearly the following: 'Ideal situation envisaged in a deterministic model hardly exists in everyday life. Also the model may not fit the observation well because some essential features have been ignored. Many times, improvement can be achieved by introducing random variables or chance factors in the model.' In other words, he seems to agree with Gibbons and Chakraborti that random variables occur due to chance factors.
In the same book, on page 7, he puts forward the definition of a random variable, this time measure theoretically. In the definition on that page, there is no mention of probability!
On one hand, he says that randomness is synonymous with chance factors, while on the other hand, he cites the measure theoretic definition, where there is no reference to a chance factor!
Most people do not seem to understand that a probability measure does not have anything to do with probability at all. Naming this special case of the Lebesgue–Stieltjes measure the 'probability measure' was, it seems, the root of all the problems!
Regards.
To Aliakbar Haghighi
Any function on the sample space is only a special case of a random variable, which is called a statistic.
There is equivalence between the sample space of a random variable and the domain of its probability law. They are not expressed by using the same words, but they are equivalent, and they must be. It is analogous to writing or pronouncing "apple" in two different languages. An apple is an apple in either case. In any event, so far as the two definitions "random variable" and "probabilistic variable" go, these have the same meaning in engineering.
Hemanta, could you provide a source (preferably something that is available on the internet) for what you mean exactly with the statistical definition and the measure-theoretic definition of a random variable?
I think that the wikipedia pages on "random variable" and "randomness" give a clear description of how random variables are defined in both stochastics (which is pure mathematics) and statistics/machine learning. It is indeed based on a probability measure, and I have never seen otherwise*.
The definition given in the first quote in your original post is just unclear to me, I assume that the measure-theoretic definition would be by using a probability measure, and I do not understand how a probability measure does not involve probability. Also, the second quote leading to what you refer to as the statistical definition presumably leads to the same definition of a random variable by means of a probability measure.
http://en.wikipedia.org/wiki/Randomness
http://en.wikipedia.org/wiki/Random_variable
* Of course the more applied the research, the less explicit the mathematics.
'Random' differs from 'probabilistic': all future events in everyday life are probabilistic, but not all events are random.
Random is an indicator of probabilistic. In a population-based survey, for instance, a representative sample is also termed a probabilistic sample, and in this context it is assumed that conclusions can be probabilistically generalized to the population.
For a sampling to be termed probabilistic or representative, generally three main conditions should be met:
• The characteristics/diversity of the population are taken into consideration; if the population has males and females, rural and urban settings, people of different educational levels, etc., all these characteristics shall be included in the sample and all the strata shall be properly weighted.
• Every individual in the population is given an equal chance of being selected. This is only possible through simple random or systematic sampling, which are two sampling techniques free from bias. However, this bias can be corrected in convenience sampling, for instance, by using a higher design effect (DEFF).
• The sample shall be bigger for a bigger population. Use standard procedures/formulae to calculate the sample size. A design effect big enough to improve the variability is used when an unbiased sampling technique cannot be applied. In fact, it is difficult to apply simple random sampling or systematic sampling in population-based surveys in several countries, because this requires a complete database of the population whereby each individual can be identified and located. We generally apply a higher DEFF to correct this bias when using other sampling techniques that are subject to bias.
Some scholars agree that we should use a hypothesis only when a probabilistic sample is used. Generally, hypotheses are complementary to research questions and scientifically clarify what the researcher is out to verify probabilistically, though probabilistic appreciations can apply even when the researcher decides to present what he is out to prove, verify or argue using objectives or research questions.

The main difference is that when a hypothesis is used, a probabilistic argumentation (using a representative sample) is absolutely expected, which is not the case with research questions, where the argumentation can be backed probabilistically or inductively (logical reasoning whereby one proceeds from facts to draw conclusions). However, probabilistic inference or illation is always more valid than an inductive one because it follows well-defined statistical rules that improve the representativeness of the conclusion that can be extrapolated (generalized).

Conclusively, I recommend that research questions be used when one is not absolutely sure of the representativeness of the sample (non-probabilistic sampling), and a hypothesis when a representative sample is assured or when a Complete Randomized Block Design (or a well justified Restricted Randomized Block Design) is used in experimental research. In a non-probabilistic approach whereby a pseudo-random sampling such as convenience sampling is used, the validity and representativeness of the result are to some extent ensured when the sample size is corrected with a high design effect to improve representativeness; in this situation we can tolerate a hypothesized prediction while assuming a probabilistic sample.
I guess it is a problem of language: in Italian we usually say "variabile aleatoria" to mean random variable, while we say "casuale" to mean random in the sense of "campionamento casuale = random sampling". Often we also use "random" (e.g. "campionamento random"), but we never say "variabile random". Italians rarely use "variabile casuale", so the subtle difference between a random variable, such as the binomial or the normal, and a random sample, which is a sample in which every unit has the same chance to enter, is quite immediately grasped.
I'd like to add that, in my opinion, not all probabilistic sampling methods are also random sampling methods. Simple random sampling refers to sampling in which all probabilities are equal, that is, it refers to the uniform random variable: so we may conclude that only the uniform is an actual, sensu stricto, random variable...
Dear Jefrey,
I do not have any internet reference.
I have referred to two books in my question and one in my earlier post. I have cited the page numbers too. Please go through those pages in the three books if you can.
As for probability measure, the definition would be available in any classical book on measure theory. Indeed, in the definition of a probability measure, the notion of probability need not enter.
Just as the word 'men' does not really have anything to do with 'men of war', the definition of 'probability' does not enter into the definition of 'probability measure'!
Dear Tayfun,
You have mentioned about the usage of the word 'random variable' in Engineering.
In this regard, I would like to add the following. Actually, the statisticians have started this confusion initially. Thereafter, the confusion had entered into other branches of knowledge such as physics, chemistry, botany, zoology, geography, economics, medicine, engineering etc. After all, for statistical matters, people of such branches would go for books on statistics only. Why would they bother to think that there is something called measure theory in mathematics? Indeed, in all the fields mentioned above, the students need not know anything about measure theory!
A random event is an event which has more than one possible outcome. A probability may be associated with each outcome. The outcome of a random event is not predictable, only the probabilities of the possible outcomes are known.
My copy of the Random House Dictionary of the English Language does not list the word "probabilistic", probably because "random" is better, and to the extent one wants to use "probabilistic" as distinct from "random", one lands in a situation where common usage is unclear.
A sample can be drawn from a population either randomly or in a deterministic way. A random sample is drawn from a population by chance. In parametric statistics, it is assumed that a population distribution can be generated by a known probability density function. In this context, a random variable follows a probability density function on the population.
To complement, note that a series of n random variables each of which follows the same probability density function is called a random sample of size n.
Hemanta,
I went through three references:
* Probability and Random Processes with Applications to Signal Processing by Henry Stark and John Woods
* Measure Theory and Probability Theory by Krishna B. Athreya
In the first reference, a random variable is defined by assigning to every outcome f a real number X(f); if this assignment satisfies certain constraints, then X is called a random variable.
In the second reference it starts framing the problem in chapter 6 as "probability theory provides a mathematical model for random phenomena i.e. those involving uncertainty" and from there frames the definition problem.
Based on this reference I consulted a basic text book on mathematical statistics which is
* Mathematical Statistics with Applications by Wackerly, Mendenhall and Scheaffer
They defined a random variable as:
"a random variable is a real-valued function defined over a sample space. Consequently, a random variable can be used to identify numerical
events that are of interest in an experiment."
I think that the second reference substantiates Jefrey's definition and I find that the three references that I have do not fall into disagreement of the definition of random variable just the detail of the definition.
I think the problem with Gibbons and Chakraborti's definition is that they framed the definition with a constraint, by using "on which a probability function has been defined". This contrasts with Mendenhall et al.'s similar but looser definition, which uses "can be used to identify numerical events that are of interest in an experiment" (with emphasis on "can be").
One final comment clarifying Hamed's comment on random numbers is that the nature of randomness is that it lacks structure and all we have done is how to create "random numbers" which do pass statistical tests for non-randomness. This does not mean that we have generated truly random numbers.
My experience with the concept of "randomness" relates with sequences of values (usually integer or real). In this context, one of the few definitions of randomness goes back to Kolmogorov complexity: a finite sequence is considered random if listing its values is the shortest way to describe it. In practice one can hardly ever say for sure whether a given (long) sequence is random or not.
The issue of random sequences is of interest with so-called pseudo-random number generators. Such generators apply a fixed formula to modify the contents of an internal memory array, then output the first entry (usually with some "tempering"). Surely this will never produce a truly random sequence. The quality of a generator is judged by submitting its output to a battery of statistical tests derived from results in Discrete Mathematics (exact counting of certain phenomena). The often used test package TestU01 (by L'Ecuyer and Simard) is particularly impressive and handy for non-experts in statistics like me.
Modern generators are very fast (they produce one billion pseudo-random reals between 0 and 1 and return their sum in just a few seconds) and have astronomical periods (magnitudes like 2 to the power 19000 are quite common) making them useful for large-scale simulations (they can mimic stochastic variables and select samples). They have a smart design based on finite field theory and they are quite able to mislead the test batteries. This provides a cheap and high-quality substitute for physically produced "random" sequences (which are not certain to be random either).
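For readers who want to see the flavour of such testing, here is a toy sketch in Python; it is nothing like a full TestU01 battery, just a single chi-square test of uniformity applied to NumPy's default generator (both the generator and the test are my own illustrative choices, not from the post above):

```python
# Feed the output of NumPy's default generator (PCG64) through a simple
# chi-square test of uniformity. Passing one such test is far weaker evidence
# of quality than passing a full battery like TestU01.
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(12345)
u = rng.random(1_000_000)              # pseudo-random reals in [0, 1)
counts, _ = np.histogram(u, bins=100)  # observed counts per equal-width bin
stat, p_value = chisquare(counts)      # expected counts are equal by default
print(f"chi-square = {stat:.1f}, p-value = {p_value:.3f}")
# A p-value that is not extremely small is consistent with uniformity.
```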
In the context considered here, randomness and statistics are definitely related but quite distinct topics.
Dear Arturo,
In a simple way, the measure theoretic explanation of randomness is as follows.
For example, the integral of 3x^2 with respect to x from 0 to 1 is equal to 1. The variable x here will be called a 'random variable' for which 3x^2 is the 'density function'. This is what the measure theoretic definition says. Here there is no reference to 'probability'. Generally speaking, x defined on [a, b) will be called a random variable if there is a function f(x) associated with it such that the integral of f(x) with respect to x from a to b is equal to 1.
Now if some variable y follows a probability law with 'probability density function' g(y), for y in [c, d), the integral of g(y) with respect to y from c to d is 1 anyway. That is why a variable following a probability law is random automatically.
In other words, a probability density function is a density function by definition, but a density function is not necessarily a probability density function.
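A quick numerical check of the example above, given as a hedged sketch in Python with SciPy (the second, non-density function x^2 is my own added contrast, not from the post):

```python
# 3*x**2 on [0, 1) integrates to 1, so it qualifies as a density function;
# whether we then read it as a *probability* density is an extra
# interpretive step.
from scipy.integrate import quad

total, _err = quad(lambda x: 3 * x**2, 0.0, 1.0)
print(total)    # ~1.0, so f(x) = 3x^2 is a density on [0, 1)

# By contrast, x**2 on [0, 1) integrates to 1/3, so it is not a density there.
total2, _err2 = quad(lambda x: x**2, 0.0, 1.0)
print(total2)   # ~0.333...
```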
Dear Matts,
Uncertainties can be either probabilistic or fuzzy. Probabilistic uncertainty is studied using the theory of probability, while fuzzy uncertainty can be studied using the theory of fuzziness. That is how the word 'probabilistic' can be explained.
Hemanta,
Let me see if I got your posts correctly. Do you think it is more than a mere syntactical error on the part of statisticians (as I posted)? and more of an inferential error?
No. For example, a "random sequence" from a random number generator is not random, but deterministic. To point this out, the more precise term is "pseudo random numbers". However its behaviour is best studied by probilistic methods. This is the great achivement of Kolmogoroffs unification: to understand that randomness is not a necessary assumption for probabilistic (or stochastic) models.
I hope the following may help everyone who may want to get a clear insight about the difference between the term "random" and the term "probability".
1) Experiment: An experiment, such as tossing a coin or rolling a die, terminates with an outcome which cannot be predicted with certainty but which can be described prior to the performance of the experiment. Tossing a coin is an experiment, no matter whether the coin is fair or not. If a fair coin is given, the possible outcomes of the experiment, head and tail, or the events of tossing a head and tossing a tail, will be equally likely. Thus, "Probability of tossing a head or tail with a fair coin" is 1/2.
2) Random Variable: The set of possible outcomes of an experiment is called the sample space of the experiment. A variable that takes on a value from the sample space of an experiment is called a "random variable". A random variable follows the probability density function that describes the actual chance/probability of occurrence of the possible outcomes. If a fair coin is given, then such a random variable will follow the uniform probability density function that assigns an equal chance of occurrence to every possible outcome. Thus, "Probability that such a random variable takes on a value equal to head or tail" is 1/2.
3) Random Experiment: If an experiment can identically and independently be repeated under the same conditions, it is called a "random experiment" (random process). Tossing the same coin n times (or, equivalently, tossing n identical coins simultaneously) constitutes a random experiment, no matter whether the coin is fair or not. If a fair coin is given, then "Probability of observing k heads out of n identically and independently repeated coin tossing" is given by a binomial probability density function with p=1/2.
4) Random Sample-I: In the context of parametric statistics (as well as in the context of sampling theory), the set of n outcomes that would be yielded from a random experiment is called a "random sample" of size n (that could be drawn from the population of coins represented by the coin at hand). The population of fair coins can be generated by a binomial distribution (i.e., discovery of normal distribution). "Probability of drawing a random sample of size n with k heads from the population of fair coins" is therefore given by a binomial probability density function with p=1/2 (or can be approximated by a normal distribution because of the central limit theorem).
5) Random Sample-II: In the context of mathematical statistics, a random sample of size n represents a "set" of random variables each of which is associated with one of the n identically and independently repeated trials of the same experiment, i.e., independently and identically distributed (i.i.d) random variables.
6) Random Sample-III: In the context of stochastic process, a random experiment is referred to as a random process, in order to indicate the interaction between the successive trials of the same experiment. In this context, experiments are again repeated identically but usually not independent of each other. To emphasize this stochastic nature of repeated experiments, the term "set of random variables", which is used to refer to a random sample in the context of parametric statistics, turns out "series of random variables".
In summary, the term "probability" refers to the chance of occurrence of every event/outcome in the sample space of an experiment, or, equivalently, it refers to the chance that a random variable takes on a particular value from the sample space of an experiment. The term "random" basically refers to the characteristics governing the behavior of the successive trials of the same experiment. They are two terms that are used to refer to the two different but complementary parts of a probabilistic model that could be used to abstract a natural phenomenon (for example, Heisenberg Quantum Mechanics), which could alternatively be abstracted by a deterministic model (for the example, Newtonian Mechanics).
Note: This interpretation of "probability" is based on classical set theory, where events can span multiple outcomes of an experiment but they necessarily divide the sample space in mutually exclusive sub-sets. Whereas, in contrast, according to fuzzy set theory, events do not need to divide the sample space in mutually exclusive sub-sets: different sub-sets may contain the same outcome(s).
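To make points 1) to 4) above concrete, here is a small Python/SciPy sketch of the fair-coin case; n = 10 tosses and 100,000 repetitions are arbitrary illustrative choices:

```python
# Probability of k heads in n identical, independent tosses of a fair coin,
# compared with a simulated random experiment.
import numpy as np
from scipy.stats import binom

n, p = 10, 0.5
rng = np.random.default_rng(1)
heads = rng.binomial(n, p, size=100_000)   # repeat the n-toss experiment many times

for k in (3, 5, 7):
    theoretical = binom.pmf(k, n, p)       # binomial probability of exactly k heads
    empirical = np.mean(heads == k)        # relative frequency in the simulation
    print(f"k = {k}: binomial pmf = {theoretical:.4f}, simulated frequency = {empirical:.4f}")
```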
Dear Arturo,
It was an error anyway. I leave it to you to decide whether it was syntactical or inferential!
The total integral with respect to a Lebesgue–Stieltjes measure is a constant c. A probability measure is a special case of the Lebesgue–Stieltjes measure in the sense that the constant c is equal to 1 in this special case. The question of introducing the notion of probability while defining a probability measure does not therefore arise. However, when a variable follows a law of probability already, it must automatically follow the principles of a probability measure. The point is that if a variable follows the postulates necessary to define a probability measure, it is called a random variable in measure theory. In other words, if a variable is random, it need not follow a law of probability, while if a variable follows a law of probability, it is random anyway measure theoretically. The measure theoretic definition of randomness does not include the notion of probability, while the statistical definition of randomness includes the notion of probability. This is the difference between the two definitions, and I feel there must not be two different definitions of a mathematical concept.
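A minimal numerical sketch of the constant c point, in Python with SciPy; the weight function w(x) = x on [0, 2] is an arbitrary illustrative choice, not anything taken from the cited books:

```python
# A nonnegative weight function with finite total mass c defines a finite
# measure; dividing by c turns it into a probability measure (c = 1 is the
# special case where no rescaling is needed).
from scipy.integrate import quad

w = lambda x: x                      # nonnegative weight on [0, 2]
c, _ = quad(w, 0.0, 2.0)             # total mass; here c = 2
density = lambda x: w(x) / c         # normalised: integrates to 1

print(c)                             # 2.0
print(quad(density, 0.0, 2.0)[0])    # ~1.0, a probability density
```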
I always had in mind that a clarification of this is necessary. Through ResearchGate, this debate has now been going on! In the statistical definition of a random variable, the variable follows a law of probability. In the measure theoretic definition, the variable need not follow a law of probability. Why should such a double talk continue? That is the point.
Therefore when a statistician says that a random variable is one that follows a law of probability, it is apparent that he has defied measure theory! Indeed, a law of randomness need not be a law of probability, but a law of probability must anyway be a law of randomness.
Now it is up to you to decide whether the mistake was syntactic or inferential!
Dear Bekir,
You have raised a point on Stochastic Processes. I would like to explain something with an example. I sincerely understand that you are definitely an expert on Stochastic Processes. The following lines are not meant for you; I am interested to raise a different point at this juncture.
The trouble is that whenever people use some high-sounding concepts, they either do not know the basics, or they do not care to know that there are some basics after all. Ask anyone to exemplify a stochastic process. The obvious answers would be queues, time series, etc.
Why should we go for such examples to understand what a stochastic process means? Do such people know that the sample mean is a stochastic process? By definition, an index dependent variable that follows a probability law is a stochastic process. Accordingly, if for a sample of n observations the sample mean is Mn, then the probability law of Mn will be dependent on n, and therefore the sample mean, our simple 'sample mean', is a stochastic process already.
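To illustrate the point, here is a short Python/NumPy sketch showing that the law of the sample mean Mn depends on the index n; the exponential population and the sample sizes are arbitrary choices for illustration:

```python
# The distribution of M_n changes with n (its spread shrinks roughly like
# 1/sqrt(n)), which is the sense in which the family {M_n} indexed by n is
# already a stochastic process.
import numpy as np

rng = np.random.default_rng(7)
reps = 20_000                                  # how many samples we draw per n

for n in (5, 50, 500):
    samples = rng.exponential(1.0, size=(reps, n))
    M_n = samples.mean(axis=1)                 # one realisation of M_n per row
    print(f"n = {n:>3}: mean of M_n = {M_n.mean():.3f}, sd of M_n = {M_n.std(ddof=1):.4f}")
```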
"The question of introducing the notion of probability while defining a probability measure does not therefore arise."
The probability measure is what defines the laws of probability (Kolmogorov's axioms and the properties that constitute a probability measure are the same, right?), so I do not see how 'probability' is not relevant to the probability measure.
"However, when a variable follows a law of probability already, it must automatically follow the principles of a probability measure."
By definition, a probability measure follows the laws of probability, so that is rather obvious.
"The point is that if a variable follows the postulates necessary to define a probability measure, it is called a random variable in measure theory. In other words, if a variable is random, it need not follow a law of probability, while if a variable follows a law of probability, it is random anyway measure theoretically."
This is false, it contains a contradiction; a probability measure always follows the laws of probability. As random variables are defined by means of a probability measure, they must also follow the laws of probability.
"The measure theoretic definition of randomness does not include the notion of probability, while the statistical definition of randomness includes the notion of probability. This is the difference between the two definitions, and I feel, there must not be two different definitions of a mathematical concept."
No, the definitions are equivalent. Perhaps something that is not so obvious is that the definition of a probability measure does not involve randomness, but I'm not sure if that realisation is very important when merely applying statistics.
Dear Jefrey,
In that case, Rohatgi and Saleh (cited in my question) have to be wrong in saying that 'the notion of probability does not enter into the definition of a random variable'! Accordingly, all books on measure theory also must be wrong in this regard! Is that what you mean?
No, they are right, and that is the problem! Rohatgi and Saleh have stated what measure theory says.
In all books on Statistics that I could consult, other than this one by Rohatgi and Saleh, the definition of a random variable includes its association with a probability law.
Just because every other book that I could go through says that a random variable has to follow a probability law, it hardly means that Rohatgi and Saleh were wrong!
If a lie is repeated thousands of times, would that mean that it is true? Truth of a mathematical statement must not depend on popular votes!
Hemanta,
Thanks for your clarification. I will try to get some additional statistics books to see if the mistake is repeated, how widespread it is, and what causes it. Sometimes it is just the use of natural language, but from your first reply to my post it hints that it might be inferential. It will take me some time to look at it to determine where the flaw is and what its underlying reason is.
Thanks for pointing it out, it is a subtle point that if you had not brought it to my attention, I would have never thought about it. It is nice to have such talks.
Regards
@Jefrey,
with most of your previous contribution I agree, but with your statement
>
I strongly disagree:
A random variable (the name comes from times when it was common to say 'dependent variable' for a function, which was not too bad a terminology, after all) is a, usually real-valued, function on a probability space (a sample space X together with a probability measure P on X). The only thing that is probability-related about f is this: whenever 'event x ∈ X happened', the value f(x) is determined (e.g. f is the function defined on a target returning the distance of a bullet hole from the center of the target; each randomly arriving bullet then gives a value, a 'random value'). Since the events x happen in accordance with a probability law P, the values f(x) happen according to the probability law f(P), defined by f(P)(S) := P(f⁻¹(S)), where S is a subset of R. If events x happened following a deterministic plan, the role of f would be the same; nothing is probabilistic about f itself. In our target example it is a strictly geometric function.
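A small Python/NumPy sketch of the target example above; the two-dimensional Gaussian scatter of the bullet holes is an assumed choice of the law P, used only to illustrate the pushforward f(P):

```python
# f is a purely geometric function (distance from the centre); the
# 'randomness' of f(x) comes only from the law P that governs where the
# bullets land.
import numpy as np

rng = np.random.default_rng(3)
hits = rng.normal(0.0, 1.0, size=(100_000, 2))   # bullet holes x ~ P on the plane

f = lambda xy: np.sqrt((xy ** 2).sum(axis=1))    # deterministic geometric function
distances = f(hits)                              # values distributed according to f(P)

# f(P)(S) = P(f^-1(S)): the probability that the distance lies in S = [0, 1]
# equals the probability that the hit falls in the unit disc.
print(np.mean(distances <= 1.0))                 # ~0.39 for this choice of P
```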
They are not the same. Probabilistic describes a situation or model where there are multiple possible outcomes, each having varying degrees of certainty or uncertainty of its occurrence. Probabilistic is directly related to probabilities and therefore is only indirectly associated with randomness.
Probabilistic is often taken to be synonymous with stochastic but, strictly speaking, stochastic conveys the idea of (actual or apparent) randomness whereas, as mentioned above, probabilistic is only indirectly associated with randomness. Thus it might be more accurate to describe a natural event or process as stochastic, and to describe its mathematical analysis (and that of its consequences) as probabilistic.
Dear Arturo,
In fact, I have been asking this question to many a person from the statistics fraternity during the last thirty years or so. Thirty years ago, the book by Rohatgi and Saleh was not there. I happened to have come across this book in 2010. As soon as I have observed that these two authors have actually written what I had been asking whosoever from Statistics I had met, I was mildly surprised. After all, at least in one book such a statement has been openly made! Meanwhile, you may please search for other books in this regard.
As for the measure theoretic definition of randomness, it is available in that same book. I believe, you will find the description very well written.
@Ulrich, Hemanta: Okay, now I think I lost track as well.
f is just a function, I agree with that. However, I wouldn't call a variable random unless f is a probability measure, in which case it clearly involves probability. Are you implying that this view is the statistical interpretation, while in measure theory it is also used in non-probabilistic contexts?
It seems to me that random is in the meta realm -- an event, however we define that, manifests in an outcome, however we define that. To my understanding, measure theory maps these onto the real numbers and we begin to speak in terms of "random variable", etc in an objective language. (You will begin to notice my bias that mathematics is humanly constructed which influences how I argue below.)
I view probability as the result of experience. From across all experience we derive theorems (models) of how things "should be". From the way we model a coin toss, we choose the binomial theorem to test whether or not the coin is fair. For some other process, given its model we choose a theoretical distribution as a "seed" or "kernel" against which to treat our observations.
I'm having some trouble with words ... With random our focus is event --> outcome, almost in a looking-forward manner. With probable our focus is on the relationships within a collection of outcomes from an identical event, a looking backward at what has been. Of course, our ultimate purpose is to predict -- why else would we care about what has been?
From a slightly different aspect: It seems to me that events are complex objects and thus may share properties (perhaps to varying degrees). And it seems to me that when we map events onto the real numbers that we tend to map similar events (those with properties in common) near one another. Thus, in the process of mapping events, we distribute a property (of various intensity?) among the real numbers. So, are we observing events as outcomes or are we observing properties of events as outcomes?
A slightly different point: Is all the variation in the event-outcome (i.e., we have perfect measure) or is all the variation in our measure (i.e., we have a perfect event)? I think the answer is both.
These considerations lead me to conjecture that randomness deals with distribution of events (of which outcomes are a subset) as mapped to the real numbers whereas probability deals with the distribution of the measurement of a (complex) property. Perhaps I'm not saying this quite right.
Dear Ulrich,
Please post a reply to the question thrown to you and me by Jefrey. I would like to go through your answer to his question first.
Probabilistic represents a "closed circuit" in that we know exactly the probability curves associated with the motion or activity or chance; they are not random. Random is an open circuit where you have no formula to describe the event at all and you may not even know that an event like this exists. I call that uncertainty. Risk, for example, is probabilistic, as is the weather: it will either rain or not, but you know there is rain, you just don't know when it will fall.
The best way to think of this is perhaps through physics: while we never know where exactly a particular electron is at any moment, it is probabilistic since we know its path only we don't know where it is at every single moment and there is a chance that it is here or there. This is not random. Everything that can be defined in some way and be bounded within the 0% and 100% chance would be probabilistic. A random event will fall outside the range of this 0%-100% since you have no clue that it exists.
@Jefrey,
f a probability measure?
I did not expect that you are so ignorant about the rules of the game.
My apologies if I have offended you, I was merely trying to be open, honest and helpful. We all have limited knowledge and I am well aware of my own limitations.
I only meant to ask if people actually discuss random variables that do not satisfy any of the properties of a probability measure, which are mathematically equivalent to, for example, Kolmogorov's axioms of probability. There are many ways to write down the same thing.
Jefrey,
sorry if I was harsh, but you drive me to despair if you neither change nor defend your misconceptions, as your last response shows again.
To be pragmatic, consider for example sampling:
you have "random sampling" which mainly stress the randomness of elements estraction and often (but not always) refers to equal probability.
Moreover, you have special cases of "probabilistic sampling" where randomness is specified by particular probabilities.
A probabilistic variable or event is a deterministic one and takes a finite amount; a probabilistic variable is predictable. A random variable is a non-deterministic one and takes unpredictable values; no functional can be correlated to it. Perhaps it is only a biased terminology in measure theory.
As I know that the "random numbers" are generated by many simulation models i.e. Monte Carlo event generator but one can not generate the "probabilistic numbers". Only the probabilistic numbers are identified only in any experimental / theoritical data, it can not be generate by any model.
Thank you, Hemanta, for this wonderful question and thanks to all contributors. I read through the posts, struggling seriously several times (I am not a mathematician), and surely became quite insecure here and there. Finally I have the strong feeling that a good part of this discussion is going in circles. I won't be able to resolve the problem, but I can possibly make a contribution that may help others - or provoke feedback about my possibly wrong understanding (which is well appreciated!).
* A random experiment is an experiment with an outcome that is one of several possible different outcomes, where it is NOT known a priori which of the outcomes will be observed.
This is (I think) related to the sample space (-> different possible outcomes). The key here is the ignorance, the lack of knowledge. It does not matter in any way if the outcome is deterministic or non-deterministic. ["Determinism" is a different philosophical problem that won't help here; most of the examples in stats books (and in research!) are clearly deterministic, so the definitions of "random"/"probabilistic" must be applicable to deterministic outcomes.] Thus:
* Randomness is not related to determinism
A simple example to illustrate that "random" is related to ignorance: consider that I have placed a coin on the desk. This is not a random experiment. I see what side is up; I possibly wanted a certain side to be up. For you, the observation of the coin is a random experiment, because you don't have the information to predict with certainty which side is up. This lack of certainty is called randomness. If you had known that I always, certainly, place heads up, then this would not be random - for you!
* Uncertainty is not an all-or-nothing thing
Among the possible outcomes we may not be able to say with certainty what will be the case, but - based on other information/knowledge - we may prefer one possibility over the other. When I am searching for my keys as I leave the house, there are many different places where they could be. I can't tell the place where they are with certainty. However, knowing my habits, I expect them most likely to be at the key holder, slightly less likely I expect them to lie on the shoe bin, and so on. I would almost exclude the possibility of finding my keys in the fridge or in the bedside table. I may somehow quantify these expectations relative to each other. This is regulated by measure theory (I think), to give a function assigning real values to my expectations:
* Probability is a measure for the (relative) strength of an expectation.
A probability distribution thus describes my state of knowledge (or ignorance) about the possible outcomes. The more even the distribution is, the less I can prefer one event over another, the higher the entropy, and the larger my uncertainty. At the other extreme, I may have enough knowledge to exclude all possible events except one. Hence I am certain to observe exactly this event. Then I don't need to distinguish different events, I don't need probabilities, and the whole experiment is no longer random.
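As a small illustration of the entropy remark, here is a Python/SciPy sketch; the three example distributions over five possible 'places for the keys' are invented for the purpose:

```python
# The more even the distribution over the possible outcomes, the higher the
# entropy (the larger the uncertainty); a distribution concentrated on one
# outcome has entropy 0 and leaves nothing random.
import numpy as np
from scipy.stats import entropy

uniform = np.array([0.2, 0.2, 0.2, 0.2, 0.2])        # no preference at all
peaked  = np.array([0.70, 0.15, 0.10, 0.04, 0.01])   # strong expectation (key holder)
certain = np.array([1.0, 0.0, 0.0, 0.0, 0.0])        # no uncertainty left

for name, p in [("uniform", uniform), ("peaked", peaked), ("certain", certain)]:
    print(f"{name:>7}: entropy = {entropy(p, base=2):.3f} bits")
# Maximum (log2(5) ~ 2.32 bits) for the uniform case, 0 for the certain case.
```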
As probabilities describe expectations, they depend on the information/knowledge that is available and required to even formulate the different possible observations:
* Probabilities are always conditioned on prior knowledge
This was already illustrated in the coin example above. Since probability describes/quantifies expectation, the term "knowledge" can be substituted by "assumption". It would not matter whether you knew or just assumed that I prefer placing coins with their heads facing up. In the same way, it is identical whether you have no reason to prefer any side and omit any assumption about my preferences, or whether you assume that I do not have any preference.
I can directly assign probability distributions for very simple experiments where I can reasonably well define the minimum required information. The simplest is the Bernoulli experiment with just two different possible events. Using the strict mathematical tools of the probability calculus, this can be used to derive other probability distributions like the binomial, Poisson, exponential, gamma, beta and so on. The derivation of the normal distribution by Maxwell, Herschel, and others is very interesting in this context, since it was derived from only two simple statements: (i) the values scatter symmetrically around a common center and (ii) knowing the deviation of one of the values does not provide any information about the deviation of any other value.
* A variable is an operationalized attribute. A random variable is a variable where we are uncertain about the (yet un-)observed values.
So far for this. The term "probabilistic" has, as far as I understand, a quite simple definition: "having to do with probabilities" or "being associated with uncertainties". A model is some simplifying caricature, sketching some aspect of something too complex to be grasped or of something that one cannot directly experience. We use mathematical models to sketch relations between variables and to make predictions. F = m*a is such a model, for instance. Often, it is obvious that the model at hand is not perfect, missing predictors and relations, assuming wrong relations, and so on. Often, we have no idea and no means to get a model that would be so good that discrepancies with actual observations are clearly negligible. The response is then a random variable (because its value is not known to us with certainty), and the model is associated with (significant) uncertainty. Instead of saying "Here I have a mathematical model of the relation between X1, X2, and Y, where the values of Y are not predicted with certainty" we just call it a "probabilistic model".
In my opinion, the terms "random variable" and "probabilistic variable" would thus be synonyms. However, the term "random model" would be misleading, or at least ambiguous, because it would imply that we are uncertain about the model itself instead of about the response values. Therefore, "probabilistic model" is the better term.
Yes, but randomness implies a measure of uncertainty which means no well defined causality.
"Many mathematical models of physical systems are deterministic. This is true of most models involving differential equations (notably, those measuring rate of change over time). Mathematical models that are not deterministic because they involve randomness are called stochastic. Because of sensitive dependence on initial conditions, some deterministic models may appear to behave non-deterministically; in such cases, a deterministic interpretation of the model may not be useful due to numerical instability and a finite amount of precision in measurement. Such considerations can motivate the consideration of a stochastic model even though the underlying system is governed by deterministic equations.[51][52][53] " Source : http://en.wikipedia.org/wiki/Determinism
"In probability theory, a stochastic system is one whose state is non-deterministic. The subsequent state of a stochastic system is determined both by the system's predictable actions and by a random element. A stochastic process is one whose behavior is non-deterministic; it can be thought of as a sequence of random variables. Any system or process that can be analyzed using probability theory is stochastic.[1][2] Stochastic systems and processes play a fundamental role in mathematical models of phenomena in many fields of science, engineering, and economics" Source : http://en.wikipedia.org/wiki/Stochastic
"Interpreting causation as a deterministic relation means that if A causes B, then A must always be followed by B. In this sense, war does not cause deaths, nor does smoking cause cancer. As a result, many turn to a notion of probabilistic causation. Informally, A probabilistically causes B if A's occurrence increases the probability of B. This is sometimes interpreted to reflect the imperfect knowledge of a deterministic system but other times interpreted to mean that the causal system under study has an inherently in-deterministic nature. (Propensity probability is an analogous idea, according to which probabilities have an objective existence and are not just limitations in a subject's knowledge).[1]"
" Prigogine In his 1997 book, The End of Certainty, contends that determinism is no longer a viable scientific belief. "The more we know about our universe, the more difficult it becomes to believe in determinism." This is a major departure from the approach of Newton, Einstein and Schrödinger, all of whom expressed their theories in terms of deterministic equations. According to Prigogine, determinism loses its explanatory power in the face of irreversibility and instability.[26] Prigogine traces the dispute over determinism back to Darwin, whose attempt to explain individual variability according to evolving populations inspired Ludwig Boltzmann to explain the behavior of gases in terms of populations of particles rather than individual particles.[27] This led to the field of statistical mechanics and the realization that gases undergo irreversible processes. In deterministic physics, all processes are time-reversible, meaning that they can proceed backward as well as forward through time. As Prigogine explains, determinism is fundamentally a denial of the arrow of time. With no arrow of time, there is no longer a privileged moment known as the "present," which follows a determined "past" and precedes an undetermined "future." All of time is simply given, with the future as determined or undetermined as the past. With irreversibility, the arrow of time is reintroduced to physics. Prigogine notes numerous examples of irreversibility, including diffusion, radioactive decay, solar radiation, weather and the emergence and evolution of life. Like weather systems, organisms are unstable systems existing far from thermodynamic equilibrium. Instability resists standard deterministic explanation. Instead, due to sensitivity to initial conditions, unstable systems can only be explained statistically, that is, in terms of probability.
Prigogine asserts that Newtonian physics has now been "extended" three times, first with the use of the wave function in quantum mechanics, then with the introduction of spacetime in general relativity and finally with the recognition of indeterminism in the study of unstable systems.
Against Einstein and others who advocated determinism, indeterminism — as championed by the English astronomer Sir Arthur Eddington — says that a physical object has an ontologically undetermined component that is not due to the epistemological limitations of physicists' understanding. The Uncertainty Principle, then, would not necessarily be due to hidden variables but to an indeterminism in nature itself.[30]
Determinism and indeterminism are examined in Causality and Chance in Modern Physics by David Bohm. He speculates that, since determinism can emerge from underlying indeterminism (via the law of large numbers), and that indeterminism can emerge from determinism (for instance, from classical chaos), the universe could be conceived of as having alternating layers of causality and chaos.[31] Source: http://en.wikipedia.org/wiki/Indeterminism
So the question is: could randomness be deterministic in reality?
Dear Professor Fairouz, your explanation is really very nice. I agree with you.
Fairouz, determinism and causality are philosophical, well, metaphysical concepts (as is "truth"). It is actually unscientific(*) to argue about such things. In the empirical sciences, the aim is to construct *working*, *useful* models. That's it. Some specifications of the models can be uncertain, and probability distributions are a means or a measure for this uncertainty. Again, it makes no difference if the state of a system can't be predicted because it is non-deterministic (in the philosophical sense) or if it is deterministic but we don't know the rules and influencing factors well enough.
When I ask you for the direction in which my pencil lying in front of me is pointing, then this is a random variable for you, but not for me (I can see it). If the universe is deterministic (philosophically; there are only cause-effect relationships) then one could in principle (but never ever practically) predict this direction. If the universe was non-deterministic, one could not even in principle predict the direction. However, from the point of probability, the result is the same: you are uncertain, so the best you can do is rank all the possible directions for your expectancy. Unless you have some better idea there should not be any reason to prefer any direction, so the probability distribution reflecting your state of knowledge is uniform in [0°...360°]. In my case, the probability distribution collapses to 0 for alpha < the actual direction (A) and 1 for alpha >= A. All this is independent of whether or not the direction is somehow physically determined, nor does this require the assumption of a "true direction" of the pencil, or the assumption of the existence of a god. When it comes down to the point of "minimum number of assumptions", then our expectations is all we have, and they are not related or based or dependent on the existence of a "truth" or a "determinism" or a "causation".
(*) All these concepts like "truth", "causation", "determinism" and so on heavily depend on a few concepts: time, space, and energy. They are only defined *within* this conceptualization. But these are only mental constructs! Very, very useful ones, but still no more than mental constructs. It is, to my understanding, not scientific to postulate the "real" existence of something just because it is a useful concept.
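To make the pencil example above concrete, here is a minimal Python sketch (the 137° direction is a purely hypothetical value): the observer who cannot see the pencil assigns a uniform distribution over [0°, 360°), while the observer who can see it assigns a point mass at the actual angle. Same question, two probability assignments, each reflecting a state of knowledge.

```python
# Minimal sketch of the pencil example; the actual direction is a hypothetical value.

ACTUAL_DIRECTION = 137.0  # degrees; known only to the person who can see the pencil

def prob_in_interval_uninformed(lo, hi):
    """P(direction in [lo, hi)) under a uniform distribution over [0, 360)."""
    return (hi - lo) / 360.0

def prob_in_interval_informed(lo, hi):
    """P(direction in [lo, hi)) when the direction is known exactly (point mass)."""
    return 1.0 if lo <= ACTUAL_DIRECTION < hi else 0.0

# The same question answered under the two states of knowledge:
print(prob_in_interval_uninformed(90, 180))  # 0.25
print(prob_in_interval_informed(90, 180))    # 1.0
```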
Thank you Jochen for your good explanation. I'm not a statistician, so my view could be different; I think that all the known mathematical and physical theories and laws are based on the philosophical notions of causality and determinism (or indeterminism). As a researcher involved in NDT informatics, I sometimes have to deal with stochastic processes, and randomness is a key question. How can one assign this from a practical (or, as you said, working) point of view without taking into account the basic theory and its concept of causation?
As for the definition of the truth (or the absolute truth), I think it is a mental concept with a lot of uncertainty and complexity, and so far nobody and no theory has been able to define it. It may be philosophical, but I agree with you that it is not a scientific concept; maybe it is a cultural, sociological or religious belief, but not one scientifically demonstrated by theorems, algorithms or equations.
So the question of whether randomness is deterministic or indeterministic still has no definite (exact or explicit) answer.
Jochen, in reading one of your sentences I got stuck. You wrote: "When I ask you for the direction in which my pencil lying in front of me is pointing, then this is a random variable for you, but not for me (I can see it)."
The pencil's position for Fairouz at this point is not random but probabilistic, since Fairouz knows there is a pencil. The question is which way it is pointing, and the equation for which way it may be pointing is a probabilistic one, since we know the pencil exists and that it can only point in a limited number of directions, bounded by our 3 dimensions.
If you put the question in such a way that: do pencils exist? do I have a pencil? and if I do, which way does it point? it is still probabilistic, since there is a chance for the pencil to exist (it exists or not), plus there is a chance that you have a pencil (you do or not), plus there is a chance that it is pointing in some direction (within our 3D world).
I have a book chapter, with a coauthor, in a book I edited with two other editors, in which I define the difference between risk (probabilistic) and ambiguity (more like random but even more exotic). The chapter is Chapter 2 of Neuroeconomics and the Firm, titled "Risk and ambiguity: entrepreneurial research from the perspective of economics".
I think the confusion between probabilistic and random comes from a misconception of what probability is: we know the thing exists, we just don't know where or when it will show up, but we do have a probability curve describing where and when it may. In the case of random, we have no formula: the thing is random, so it does not obey any rules or equations and is not measurable. A "random variable" in a mathematical sense simply means a variable that changes; it says nothing about it being statistically random.
Under the same cap I can bring in your point about "uncertain": if something is uncertain, then by definition it is not probabilistic, since we have no model to describe it. If tomorrow the earth opens up and out come little green men greeting us, that is uncertain and random, since we had no clue that there was such a thing (well... we know now... lol). If we know that there are little green men somewhere in the middle of the earth and we just don't know if and when they will come visit us, then it is neither random nor uncertain, but probabilistic.
Angela, so if I get you correctly, you say that
random = we don't even have a model, or we don't know the sample space
probabilistic = we have a sample space and a model, but we cannot make perfect predictions
Thus, in your framework, something like "species living on earth" would be a random variable (we don't know the sample space; even the words "life" and "species" are not well defined if one takes a closer look!), whereas "number of women in a group of 10 patients" is a probabilistic variable, because the sample space is well-defined and we can model the probability function for such variables.
If this is correct, then the usage of "random" in "random sampling" totally confuses me.
Fairouz, it is the very purpose of these concepts (space, time, ...) to give a frame(work) for operationalizing many physical attributes. And it is clear that all derived variables are intimately linked to these concepts. The problem I am pointing to is taking these concepts as "real" in order to fit interpretations of something that is not a variable derived from within this conceptualization. The "propensity" was such an attempt.
For me, it looks like mixing the psychological meaning of "color" with the wavelength of electromagnetic waves. They do not need to be related. However, for many purposes it is advantageous, reasonable, useful, *if* there is some kind of correlation. For "probability", for example, there are some scenarios conceivable where such a reasonable correlation may be defined by a similarity between probability and relative frequency.
@all participants: I am happy that I seem to have provoked some discussion, and I am grateful for your constructive feedback.
Interesting points you bring up Jochen--indeed, quite thought provoking.
Your initial summary of what I suggested is completely correct. The one about species is not the way I imagined and here is why.
Recall your pencil problem earlier: you said "you know", only the person having to choose the direction does not know. As long as "someone" knows of the existence of all living organisms or their possibilities on earth, we are talking probability and not uncertainty. There are a lot of species out there we have never ever encountered, but they are not random. They are here, and many are evolving as I write this; we simply have not yet found them.
If you run a biological and chemical calculation over the table of organic elements, you can (and probably someone already did) come up with a percentage of how many species are out there that we have not yet seen or heard of. Probability.
As long as we can define it in some way, such as by a mathematical probabilistic formulation, it is not random but probabilistic. There was a time people thought the earth was flat (some still do), and there are people still living with fig leaves around their private parts, and here we are discussing probabilistic mathematical concepts at the highest level. It is entirely possible to state that for those in the know (like us) most everything is probabilistic, since everything in the universe has a limited number of elements in it and is quite predictably expanding and moving in a particular direction. If we had the equipment to count every single element--and we are pretty close to that if not already there--including dark matter, etc., we could confidently say that everything in our universe is probabilistic and uncertainty doesn't exist, unless you believe in God, whose existence then is probabilistic to those who believe and random to those who don't.
I hope my thoughts are clearer now--I am not the queen of statistics but I sure am a pain for those who submit articles for publishing, since I reject many for the very reasons we have just come to conclude. It seems to me that a very large percentage of academicians forget that in probability, by definition, we MUST know everything except the actual outcome. In randomness or uncertainty we know absolutely nothing and no one does--it is not within our universe but must come from outside of what we are familiar with.
I will be taking a trip for 2 weeks in 2 days into an internet-less and cellphone cell-less world so I will not be able to contribute until the first week of October.
I am very much looking forward to further brain enlargement upon my return and hugging my computer, cell phone, and internet so I can get my addiction withdrawal over with and start participating again in this amazingly exciting conversation to me.
Best of times and good wishes to all of you until I return!
Angela
Jochen, I think that a unified terminology is needed in the fields of pure statistics and applied statistics; for me, colors are not a psychological notion but real, measurable facts with a waveform, wavelength and frequency. As an informatician (informatics needs clear and logical reasoning for building reliable and exact software), I do not insert psychological tools or approaches into the systems that I develop. I think a lot of uncertainty could arise and the system couldn't converge.
Angela gives a good explanation of 'random vs. probabilistic' from a space-sampling scheme; maybe uncertainty must be linked to a time-sampling approach?
Anyway, the correlation between uncertainty, randomness and predictability is still a key question for deterministic systems regulated by some chaotic properties; so my question remains posed: is randomness deterministic or non-deterministic from the viewpoint of statisticians?
Fairouz, I think the statistician is clearly the wrong person to answer this. Determinism is a philosophical concept. Randomness eventually and practically reduces to the inability to make certainly correct predictions (it does not matter why). The statistician knows tools to quantify the uncertainty and to calculate with such uncertainties.
Angela,
"As long as "someone" knows of the existence of all living organisms or their possibilities on earth, we are talking probability and not uncertainty."
Hmm... For this someone there are no open questions regarding the living organisms. I agree that there is no uncertainty for this person. But then the term probability is useless too. Why should a probability be assigned to the existence of, say, "Quebbles", if the person knows well that this species does in fact exist?
Only if this person did not know of the (non-)existence of this species could he express an expectation about it: is it more likely that it exists than that it doesn't? This is an impossible task as long as no conditions are specified. But this is rarely the case. A very fundamental -and obvious- condition is that the species must have properties that are in accordance with what we already know about the world. For instance, no function or property should be incompatible with the physical laws. If such incompatibilities are recognized, the person would assign a probability of "next to impossible". Further, it might be asked whether there is a known (or a likely) environment for such a species. If this is the case, its existence would be expected to be more likely, or in other words: the probability of its existence gets a higher value.
In particular, there doesn't need to be a mathematical model defining the probability function. A biologist would define a different function than a bricklayer or a politician or a nurse...
If we would agree on a common set of conditions, relations, natural laws, evolutionary principles and so on... and further agree on their interactions, and finally agree on probability assignments for all the different possible cases, then probability calculus can be used to derive a probability function for the existence of this species. Given the same initial assumptions (let it be knowledge, guesses or assumptions - it does not matter), the result is the same for all people. This is the domain of statistics: it defines an objective (and reasonable) way of integrating initial assumptions and expectations and data(!) to derive (modified) expectations about something.
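As a hedged illustration of "integrating initial assumptions and expectations and data", here is a minimal Python sketch of a single Bayesian update with entirely made-up numbers: a prior probability that some species exists, and likelihoods for an observed piece of evidence under existence and non-existence.

```python
# Hypothetical numbers throughout; the point is only the mechanics of the update.
prior_exists = 0.10          # prior expectation that the species exists
p_evidence_if_exists = 0.60  # chance of finding this trace if it exists
p_evidence_if_not = 0.05     # chance of finding this trace anyway

# Bayes' rule: P(exists | evidence) is proportional to P(evidence | exists) * P(exists).
numerator = p_evidence_if_exists * prior_exists
denominator = numerator + p_evidence_if_not * (1.0 - prior_exists)
posterior_exists = numerator / denominator

print(round(posterior_exists, 3))  # about 0.571 with these made-up numbers
```

Given the same prior and likelihoods, everyone arrives at the same posterior, which is the point about objectivity made above.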
Next point:
"As long as we can define it in some way, like by a mathematical probabilistic way, it is not random but probabilistic. "
But then the heads and tails in a series of coin tosses is not a random variable. Right?
And random variables may turn into probabilistic ones as soon as we invent/find a mathematical way to describe the probabilities?
Further:
"[...] and uncertainty doesn't exist unless you believe in God whose existence then is probabilistic to those who believe and random to those who don't"
As I understood you, as long as there is some at least imaginable way of giving a definition, there is nothing like uncertainty? So nothing in research based on physics, chemistry, biology, etiology, sociology, psychology, economics, ... is uncertain? Results, predictions and expectations are certain, but probabilistic?
Then I think this is a shift in the layer we are talking about. To make it clear by example: I measure the pH of a solution. I get a value of 7.2. I know that if I measured it again, I would get a slightly different result. So what can I say about the pH? Having a physico-chemical model and also a probabilistic model, I can derive a probability distribution for the pH, given my measurement of 7.2. I would say: my best guess is 7.2, but it is a guess, so I can't be certain that this is the correct value. The probability distribution tells me that, for instance, a value of 7.3 is also quite a good guess, 7.8 would be a poor guess, and so on. In contrast, you seem to say that the entire distribution is given/known, and this says all there is to be said. Therefore, there is no uncertainty in giving this particular probabilistic answer.
However, I do not at all understand the second part about the belief in God. If I do not believe in God, why should his/her existence be random? Let A be the existence of God. The believer would set P(A)=1 and P(not A)=0. The atheist would set P(A)=0 and P(not A)=1. The indifferent person (is there a name for this?) would set P(A)=a with 0 < a < 1.
Thank you Jochen for your reply, and for your good contributions that enrich the debate on this pertinent question posed by Hemanta Baruah. The tools to quantify and calculate uncertainty are derived from mathematics, and mathematics is mostly built on the causality principle; stochastic processes (in probability theory) are non-deterministic and are fundamental in the mathematical modeling of several natural phenomena. So, I don't believe that statisticians miss the mathematical fundamentals and their philosophical concepts. I think that maybe there is a mix (perhaps a little ambiguity) in the terminology.
No specific link to the ongoing thread, but an interesting paper (which, by the way, serves as a reminder that randomness can be defined with no reference to probability!).
The terms random and probabilistic are usually not used interchangeably, but at the same time they are very closely related and suggest similar meanings. Both terms describe the nature of an uncertain situation where probabilities can be attached to the various possibilities in the situation. Well, if one goes by the abstract definition of a random variable, then apparently probability doesn't come into the picture. But if one looks at the role of the notion of a random variable, then probability is right up there. What a (real-valued) random variable does is transfer the inherent randomness in a sample space to the real line. This enables better mathematical handling of the situation. In fact, one may define a (real-valued) random variable X as a function from the sample space to the real line such that Prob{X ≤ x} can be computed for every real number x.
By the way, the paper Vitanyi - Randomness.pdf by our colleague Clerot Fabrice makes interesting reading. Thank you Sir.
I totally support the response of Boris. There is much confusion out there about "random" versus "probability", but "random" describes a variable and "probability" describes a function. We are comparing apples to oranges.
I can see where the confusion is; I just read up on Wikipedia about the definition of probability and I nearly fainted from what I read--as I almost got my PhD in statistics originally (left it since I found it kind of sort of... well... ugly)--I am still familiar with the differences and the confusion as I can see it here.
In statistics we deal with "random variables" that are within a population that we describe in an equation of probability. Whether I toss a coin or not is not random but probabilistic, since I may or may not toss it... there is nothing random about that. There is a random variable in the equation--as there should be, else we have no equation... correct?
When one goes to gamble in a casino and sits down at the blackjack table, we know the number of cards, and we know the chance (probability) of each card showing up. We simply don't know when it will show its face, and that is a random "variable" but not a random event at all. We know everything there is to know about playing blackjack or roulette or any other game, since they have been defined by sets of rules from which one can formulate a probability equation with the particular random variable of choice: when is the Ace going to show up? When is the Ace of Hearts going to show up? etc. We know exactly how many times they CAN show up, only we don't know when that will occur.
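A minimal Python sketch of that card example, assuming a standard, well-shuffled 52-card deck: everything about the deck is known, and the only unknown is which card comes next.

```python
from fractions import Fraction

DECK_SIZE = 52
ACES = 4

p_next_is_ace = Fraction(ACES, DECK_SIZE)         # 4/52 = 1/13
p_next_is_ace_of_hearts = Fraction(1, DECK_SIZE)  # 1/52
print(p_next_is_ace, p_next_is_ace_of_hearts)

# After one Ace has been dealt and seen, the same rules give the updated chance:
p_next_is_ace_after_one_seen = Fraction(ACES - 1, DECK_SIZE - 1)  # 3/51 = 1/17
print(p_next_is_ace_after_one_seen)
```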
This is pure probability and easy to follow. Everything we know of is describable by an equation with one or more random variable(s). Everything that we are familiar with thus has some probability associated with it happening or not, when, etc. There are no random events, only probabilistic ones. If you look up "randomness" in Wikipedia, here is what you get in the introduction (and I copy-pasted this since this is close to how I would put it):
"Randomness means different things in various fields. Commonly, it means lack of pattern or predictability in events. The Oxford English Dictionary defines "random" as "Having no definite aim or purpose; not sent or guided in a particular direction; made, done, occurring, etc., without method or conscious choice; haphazard." This concept of randomness suggests a non-order or non-coherence in a sequence of symbols or steps, such that there is no intelligible pattern or combination. Applied usage in science, mathematics and statistics recognizes a lack of predictability when referring to randomness,..."
Note "lack of predictability" toward the end of the sentence implies "lack of probability". If you read further, you will read that there are some patterns possible in randomness but once pattern is observed, it can be described by a mathematical equation and hence becomes probabilistic.
Random and probability exclude one another.
So neither random variables nor random events are random and have nothing to do with random(ness)? Why are they called "random" then? I am confused.
Can it be that you are saying that randomness is something about which we cannot say *anything*? In this case, talking about "randomness" would be no better than "divine intervention" or "god's thoughts"... how should it be part of a scientific understanding of the world?
If "random" is defined like I understood from the paper of Vitanyi (linked some posts above by Clerot), it is of no use at all, since the universe is not infinite. Randomness is degraded to an ideal (of neccessarily infinite sequences) and the distinction between "random" and "probabalistic" is an academic debate since no finite sequence can really be random and thus must be, at best, probabalistic. I have not understood why not any (random or not) sequence shouldn'd theoretically be describable in form of giving the position in the decimal places of Pi where this sequence starts. Pi is theoretically known, and so is any arbitrary sequence. Does this not prove that there can't be anything real that is random by this definition?
Can you please give precise and correct definitions of "random variable", "random event" and "random sampling"? Why do mathematicians promote and use these words when - as I understood so far - "probabilistic variable", "probabilistic event" and "probabilistic sampling" would be the more correct terms?
Finally, I don't get your point with tossing (or not tossing) the coin. Would be nice if you could explain it.
Just a side remark: "random sampling" is a term used in population studies, and a lot of statistics terminology originates from population studies. In physics this is not used, since it is awkward to call a set of physical measurements a "population".
A random variable is actually a function from a space of events to the interval zero to one. This is how the probability of occurrence of events is represented numerically. The more complex issue, sometimes called construct operationalization, is how events are recognized. Consider for example happiness or anger. For chemical and physical measurements a document called the VIM explains this.
Dear Professor Angela Stanton,
You gave a very nice explanation; yes, I agree with your opinion.
Thank you.
Almost all experiments are random experiments in the sense that (i) the set of possible outcomes of the experiment is known beforehand, (ii) the outcome of a specific trial cannot be predicted, and (iii) a large number of repetitions of the experiment shows a certain pattern in the proportion of occurrences of each possible outcome. The unpredictability mentioned at (ii) is the uncertain nature of the outcomes of the experiment, and by virtue of (iii) the experimenter can attach numerical values in [0,1] as probabilities to various events. It is in this sense that the nature of the outcomes is random or probabilistic.
Recall that even at high school level, students (in the physics lab) conduct the same experiment at least 3 times and get 3 different readings. There they are satisfied with the mean of the values, but in college they go for the variance in the readings as well. The underlying thinking is statistical.
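A minimal simulation sketch (Python, hypothetical numbers) of points (ii) and (iii) above: individual tosses are unpredictable, yet the proportion of heads settles near 0.5 as trials accumulate, and repeated readings of the same quantity are summarized by their mean and variance.

```python
import random
import statistics

random.seed(1)  # fixed seed so the illustration is reproducible

for n in (10, 100, 10_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(n, heads / n)  # the proportion of heads drifts toward 0.5 as n grows

# Three repeated readings of the "same" measurement (hypothetical values):
readings = [9.81, 9.79, 9.83]
print(statistics.mean(readings), statistics.variance(readings))
```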
Please note that the original query is about the terms random and probabilistic, not about probability, and our discussion has drifted to random variables and probability.
The description given by one colleague in the discussion, that a random variable is actually a function from a space of events to the interval zero to one, is not correct, I am afraid. It is the range of the probability function that is always [0,1]. For a random variable there is no such restriction.
The values taken by the random variable vary; they can be discrete or continuous. The probabilities of taking these values are always between 0 and 1. The coin falling heads can correspond to a value of 1 (or 999 or AXYZ). For a fair coin the probability is 0.50. This is basic stuff, and most of the above discussion is due to misunderstandings of such basic things.
Random variable
A random variable is defined as a function that associates a real number (the probability value) to an outcome of an experiment.
In other words, a random variable is a generalization of the outcomes or events in a given sample space. This is possible since the random variable by definition can change so we can use the same variable to refer to different situations. Random variables make working with probabilities much neater and easier.
A random variable in probability is most commonly denoted by capital X, and the small letter x is then used to ascribe a value to the random variable.
For example, given that you flip a coin twice, the sample space of possible outcomes is the following: S = {HH, HT, TH, TT}.
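A minimal Python sketch of this two-flip example, assuming a fair coin: the sample space, the random variable X = "number of heads" as a function on that space, and the probability distribution it induces.

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# The sample space for two flips of a coin.
sample_space = ["".join(pair) for pair in product("HT", repeat=2)]  # ['HH', 'HT', 'TH', 'TT']

def X(outcome):
    """The random variable: maps an outcome to a real number (here, the number of heads)."""
    return outcome.count("H")

# For a fair coin each outcome has probability 1/4; collect P(X = x).
pmf = Counter()
for outcome in sample_space:
    pmf[X(outcome)] += Fraction(1, len(sample_space))

print(sample_space)
print(dict(pmf))  # X = 0 and X = 2 each with probability 1/4, X = 1 with probability 1/2
```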
'Probabilistic’
A situation or model where there are multiple possible outcomes, each having varying degrees of certainty or uncertainty of its occurrence. Probabilistic is often taken to be synonymous with stochastic but, strictly speaking, stochastic conveys the idea of (actual or apparent) randomness, whereas probabilistic is directly related to probabilities and therefore is only indirectly associated with randomness. Thus it might be more accurate to describe a natural event or process as stochastic, and to describe its mathematical analysis (and that of its consequences) as probabilistic.
I am still puzzled by the fact that the word "random" seems to have (as I understood) an entirely different meaning in "random variable" and in "random event".
Jochen, a random variable is, in fact, neither random nor a variable. Look carefully at its definition.
@Janamejay
The sentence without the bracketed addition is OK, but this bracketed addition is the utmost nonsense as was stated several times in this thread.
It suggests that you have never worked out in detail an example in which random variables play a role. Given this, the right order would be to teach yourself first and then try to teach others.
I could not agree more with Ulrich, as I gave the definition at a very early stage of the responses. It is very unfortunate that standard terminologies and definitions are altered to make nonsensical points. If a particular term is intended to be introduced, that would be okay as long as the well-established terms are preserved and respected.
The term random - typically used in conjunction with variable or process - is a descriptor used within probability theory to denote an object with a probabilistic description. The distinction is probably best seen by examining the mapping between probability theory and measure theory (probability theory being a special case of measure theory involving a specific type of measure). Expected value maps to integral; random variable maps to function.
I'll add that the term random came into the lexicon of probability theory due to Joe Doob and William Feller, who were debating whether the proper term should be "random variable" (Feller) or "chance variable" (Doob). They settled the matter via a coin toss (natch), and the common usage ever since has been random. Part of the confusion expressed in the question arises from the usage of random in the common vernacular. Within probability theory its usage is restricted to describing a function (or associated process) that assigns a number to an experimental outcome.
From the discussions, it seems the confusion regarding the definition of randomness may continue to remain. A mathematical term should not have two different meanings. Anyway, perhaps we will ultimately be able to arrive at some sort of agreement regarding the exact definition of randomness.
Respecting all the other views expressed by learned scholars, I strongly go with the correct and clear distinction between randomness and probabilistic made by Jefrey. And I very strongly resent the behaviour of the person who unwittingly down-voted this correct answer. It is expected that a scholar who does not agree with a response may ignore it and write a better response, as Ulrich Mutze did in a deserving manner.
I do believe that randomness is associated with variables that appear to have no purpose, no connections with each other, and that, for that matter, do not occur or appear in any orderly and systematic trend/pattern. Probabilistic is a prefix used before model. If one cannot find meaning in random observations using deterministic analytical or other mathematical solutions, one is bound to use probabilistic methods, which have roots in deterministic mathematics but give no unique solution. There are a number of possible solutions.
According to Professor Julia Davidson:
Any method of sampling that uses some form of random selection, that is, one that will ensure that all units in the population have an equal probability or chance of being selected. Random selection is an assumption of probability theory and the ability to draw inferences from samples to populations. Random sampling techniques include: the simple random sample; the stratified random sample; the systematic random sample; and multi-stage cluster samples. Probability sampling is most closely associated with quantitative research, and particularly with survey research. In simple random sampling all units within the sampling frame have an equal chance of being selected. Computer software is often used to generate random numbers. MS Excel, for example, has a random number generator facility.
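A minimal Python sketch of simple random sampling along the lines described above, using a hypothetical frame of 10 units: every unit has the same chance of ending up in the sample, which is exactly what "random" means in "random sampling".

```python
import random

random.seed(42)  # for a reproducible illustration

sampling_frame = [f"unit_{i}" for i in range(1, 11)]  # hypothetical population frame
sample = random.sample(sampling_frame, k=3)           # each unit equally likely to be selected
print(sample)

# In a simple random sample of size k from N units, each unit's inclusion probability is k/N:
print(3 / len(sampling_frame))  # 0.3
```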
Random is a sampling method where every element in the set has an equal chance of being chosen, while probability is a measure of the chance of an event occurring or being chosen. I agree with the explanation given by Mohammad Ayaz Ahmad.
It's all right, Dear Professor Zuhaimy.
Your opinion is also very meaningful.
I also agree with you.
So in random sampling there is an equal probability value for all events in the set {"take item 1", "take item 2", ... "take item n"}. From this I would conclude that probability is a measure of ignorance (as we don't know which item is actually taken). And I am again lost when seeing the variable "selected item" as a random variable...
A random variable represents a quantity measured from the experiments corresponding to the elements of the sample space S (which is the set of all possible outcomes of the experiment), and it is mathematically expressed as a function X: S -> R, where R is the set of real numbers (the possible values of the random variable).
On a sample space S the probability is also defined by assigning a function p: S -> [0,1]. It should also be mentioned that usually not every subset of S can be assigned a probability, therefore p is defined as a function on a collection A of subsets of S, p: A -> [0,1].
The random variable and probability relate to each other once we define a distribution of the random variable; we can associate each value of the random variable with a probability (which can be either P(X = x) in the discrete case or, more generally, P(X ≤ x) through the distribution function).
I think that the concepts of variable, model, process and event should be well defined for the randomness-versus-probabilistic issue. Maybe then the ambiguity could be removed.
Fairouz,
In other words, you have accepted that ambiguity is actually there. Perhaps this ambiguity can be removed. Let us see what the conclusion of these discussions would be.
I'd argue that you cannot expect the adjectives "random" and "probabilistic" to have unambiguous meanings without consideration of context. In your original statement the context seemed to refer to variables. I am not familiar with the term "probabilistic variable" as a standard definition. The historical choice of the term "random variable", whether a good choice or not, is clearly defined and well established. Finally, I'd argue that whether a "random variable" has "probabilistic properties" is a different question from whether probability enters into its definition.
In modern computing, random numbers in so-called Monte Carlo simulations are of vital importance for the sake of compressing any experimental data set, but the probabilistic numbers come out by default.
Random and probabilistic are largely similar when referred to entirely within the field of unapplied mathematical statistics. However, the difference seems to come when distinguishing them in the so-called "real" world. Random variables are those that have no known (or even measurable) associations with other variables. They are the error term in many models. Probabilistic variables, by contrast, fluctuate, within some framework of variability, with other variables -- and all can be measured. Probabilistic variables (and statements) allow estimates of previously unknown present conditions, or estimates of future (or past) conditions if governed by similar parameters.
Jochen, I am just back from my trip and starting to read from where I left off. You state: "But then the heads and tails in a series of coin tosses is not a random variable. Right?"
I believe you are confusing the terms "random event" and "random variable". A "random variable" is a necessity in a statistical probabilistic equation, such as for a coin toss--we know a fair coin has exactly a 50% chance of coming up heads versus tails. Pure probability that can be described mathematically.
By contrast a "random event" is not a variable and is not part of the statistical equation since it is an event that is random and we have no clue of its existence. A coin toss is not a random event since we have the coin in our hand, we can see it is either heads or tails and we can define the mathematical formula that gives us the probability for the next toss to be heads or tails. There is absolutely nothing random about this as an event--we know everything about it except the outcome: heads or tail.
There are a lot of confusions around the terms: random, probabilistic, uncertain, and ambiguous. Each has a completely different meaning and each is defined precisely in terms of whether it can or cannot be formulated in a mathematical equation.
An event that is "random" (not a random variable but an event that is random) has no mathematical formula and is impossible to define because we do not have a random variable; in fact we know nothing of the event that it even exists--I very much agree that this is a hard concept to relate to but in terms of mathematics it is pretty simple: if I know nothing about it, it is random event; if I know something about it, it is probabilistic with at least one random variable; if I know everything about it, it is a law.
You noted above--and sorry I only got so far and have not yet read the other responses--that in terms of species we may have a problem but actually we don't.
We can now calculate with high precision, for example, the mutation of the flu virus such that when it hits, the vaccines prepared in advance will protect us, even though at the time the vaccines were mixed, we use probability to estimate what strain the virus might be. The match is not always perfect but typically pretty close--pure probability!
Life is relatively easy to place into probability because now that we understand DNA and the mutation mechanism, we can backward engineer with great precision, for example, where the human race came from as well as what species may have been or are now that we do not see. We can also calculate the relations between species, etc.
I know that in some fields probability is a lot looser in its definition than what I am pushing here, but I am pushing for the stronger definition to try to unite our understanding across the many fields of science, so that when we say something has a probability associated with it, we all know and understand the same thing. Today we clearly do not. A cross-disciplinary process can work well at this point, but eventually it would be nice if every scientist learned the same way and the same things about statistics... a nice dream that is not likely to come true in the near future!