Can sociology be based exclusively on nonprobability samples? Is a frame of identified units necessary before selecting a sample? Are there reasons to base social research on probabilistically selected samples? What are these reasons?
Random sampling is more rigorous, while nonprobability sampling usually cannot be generalized. You can use either one depending on the nature of your research questions. In my experience, in social research you may sometimes have to settle for nonprobability sampling because of the controversial nature of your study, and this method of sampling may be the only way to access your population.
Many thanks,
Debra
Can polls used in sociology ever be random, given that their outcome depends on the willingness of people to join them?
RANDOM SAMPLING: All elements in the population must have an equal chance of being selected, or at least all elements within the sampling frame must. This is a difficult standard to meet in social science unless we are sampling a captured population or data set, for example prisoners in a confined prison, or past records in an accounting audit. In a world where the population is dynamic (births, deaths, drop-outs, etc.), random sampling may be difficult.
BIAS TESTING: If random sampling is not possible or not practicable, the researcher should opt for another sampling method. The rationale for preferring random sampling is the argument that "randomness" does not contain bias. If a non-random sampling method is used, one should test for bias both in the selection procedures and in the data itself.
SAMPLE FRAME: Generally, this is defined before sampling is effected. It would be neither appropriate nor methodologically honest to construct a sampling frame after the sample has been collected. The procedure comes before the operation, not after. To reconstruct the procedure after the data collection is to cover one's tracks; scientific objectivity in such a case may be lacking.
PROBABILITY SAMPLING: This is a case where all elements in the population have a known probability of being included in the sample. This is not the same as "having an equal chance of being selected," as in random sampling. If the probability of each element is known, the design may be defensible against a charge of bias, provided the probabilities do not differ too much from one another. For a non-finite population, this type of sampling might be difficult to accomplish.
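As a rough sketch of what "known but not necessarily equal" probabilities look like, here is an illustrative calculation (the numbers and function name are my own, not from this thread) of first-order inclusion probabilities under sampling with probability proportional to size (PPS):

```python
# Hypothetical illustration of "known but unequal" inclusion
# probabilities under probability-proportional-to-size (PPS) sampling,
# using the usual first-order approximation pi_i = n * x_i / sum(x).
def pps_inclusion_probabilities(sizes, n):
    """Inclusion probability of each unit when n units are drawn with
    probability proportional to the size measure x_i (valid while
    n * x_i <= sum(x) for every unit)."""
    total = sum(sizes)
    return [n * x / total for x in sizes]

# Five districts with different population counts; sample n = 2 of them.
sizes = [100, 300, 50, 250, 300]
pi = pps_inclusion_probabilities(sizes, 2)
# Each pi is known and nonzero, but the values are not equal,
# and they sum to the sample size n.
```

The point of the sketch is exactly the distinction made above: every unit's probability is known in advance, yet the probabilities are unequal, unlike simple random sampling.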
WHICH METHOD OF SAMPLING? Knowing the various types of sampling method is good, so that we can properly select the method that fits the target population we are working with. The "nature of the study" does not dictate the sampling method; in practice, it is the population, and the availability of and accessibility to the data, that are the controlling factors in choosing the sampling method. At times, these factors also influence the direction of the study.
LINKS: The links below may be of interest.
http://www.socialresearchmethods.net/kb/sampprob.php
http://stattrek.com/statistics/dictionary.aspx?definition=Probability_sampling
http://apps.who.int/medicinedocs/en/d/Js6169e/7.4.html
Dear Mariano Ruiz Espejo,
A random sample is the part of the general set of elements that is covered by the experiment (observation, survey).
Characteristics of the sample:
Qualitative characterization of the sample: what we select, and what sampling methods we use for this purpose.
Quantitative characterization of the sample: how many cases we select, in other words, the sample size.
The need for sampling arises when:
The object of study is very extensive. For example, the consumers of a global company form a huge number of geographically dispersed markets.
There is a need to collect primary data.
The sample size is the number of cases included in the sample.
Samples can be divided into large and small, since mathematical statistics uses different approaches depending on the sample size; samples of more than 30 cases are usually considered large.
The set of all possible elements of the research object that is to be studied within a particular study, and to which the findings of the study will apply, is called the general population.
Because the object of study can be numerically very large (tens or hundreds of thousands of people making up the population of a region, province, or city; thousands of workers of an industrial enterprise, research institute, or school), the researcher does not have the opportunity to question all members of the general population, i.e., to conduct a complete enumeration. It is believed that if the object of study consists of more than 500 people, it is expedient to conduct not a complete enumeration but a sample survey, i.e., to study only a part of the population, selected by a special scientific methodology: the sampling frame. Sample surveys are used where information about each element of the population is impossible or too costly to obtain, i.e., in most actual sociological studies, whose subjects are large social groups, phenomena, or processes. The widespread use of sample surveys is explained by their benefits: economy, rapidity, flexibility, and higher quality.
Regards, Shafagat
I understand random sampling as sampling obtained by chance but not in a probabilistic form. Probabilistic selection of the sample is necessary for the statistical properties studied in theory to hold. A merely random sample does not guarantee, for example, the unbiasedness of an estimator. But a probabilistic sample selection can assure the unbiasedness of a correctly constructed estimator.
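The unbiasedness claim above can be checked empirically. A minimal simulation sketch (my own illustrative setup, not from the thread): under simple random sampling, a probabilistic selection, the average of the sample mean over many repeated samples recovers the population mean.

```python
import random

# Illustrative simulation: under simple random sampling (SRS) the
# sample mean is an unbiased estimator of the population mean, so
# averaging the estimator over many repeated samples approaches the
# true mean.
random.seed(1)
population = [random.gauss(50, 10) for _ in range(1000)]
true_mean = sum(population) / len(population)

def srs_mean(pop, n):
    """Sample mean under simple random sampling without replacement."""
    return sum(random.sample(pop, n)) / n

reps = 20000
avg_estimate = sum(srs_mean(population, 25) for _ in range(reps)) / reps
# avg_estimate lands very close to true_mean (unbiasedness); a
# selection mechanism favoring, say, large values would not.
```

A haphazard selection with unknown probabilities gives no such guarantee, which is the distinction the post draws.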
Are participants here trying to make a distinction between simple random sampling and more complex probability-of-selection designs? Or is this a frame question, where the distinction concerns something like list vs. area, or unknown? I cannot see what is being discussed to contrast "random" and "probability." Perhaps some are thinking "nonprobability"? Or perhaps this is a matter of semantics that I just have not run across in many years of reading articles, papers, and textbooks, though that seems doubtful. But I have dealt with industry rather than social settings, so perhaps I have missed some specialized language use? (I do have a note on the problem of statistical jargon being reviewed by a statistical magazine editor right now, who recently published a note I wrote on a particular semantics problem.)
Where, for example, would you see something like this distinction in a classic like Cochran, W.G. (1977), Sampling Techniques, 3rd ed., John Wiley & Sons?
Where, say, in Lohr, S.L. (2010), Sampling: Design and Analysis, 2nd ed., Brooks/Cole?
I cannot think of any place. A reference and page numbers might help, especially if legally available on the internet, so I guess that would mean a paper somewhere.
At any rate, for fun, consider the following from the 2013 winner of the Waksberg Award:
Brewer, K.R.W. (2014), “Three controversies in the history of survey sampling,” Survey Methodology,
(December 2013/January 2014), Vol 39, No 2, pp. 249-262. Statistics Canada, Catalogue No. 12-001-X.
http://www.statcan.gc.ca/pub/12-001-x/2013002/article/11883-eng.htm
Ken Brewer's paper contains an interesting history of the development of survey sampling and estimation, which many may find fascinating.
I don't think the question is on probability vs nonprobability sampling and estimation, but I think many people think that is what was asked, so, again let me point you to Ken Brewer's Waksberg article:
Brewer, K.R.W. (2014), “Three controversies in the history of survey sampling,” Survey Methodology,
(December 2013/January 2014), Vol 39, No 2, pp. 249-262. Statistics Canada, Catalogue No. 12-001-X.
http://www.statcan.gc.ca/pub/12-001-x/2013002/article/11883-eng.htm
He believed in using probability sampling and models together, but he explains the different approaches.
And for more on nonprobability considerations, these may be of interest:
"The Future of Survey Sampling"
Public Opinion Quarterly 75(5):872-888, December 2011. DOI: 10.2307/41345915
J. Michael Brick
https://www.researchgate.net/publication/261967521_The_Future_of_Survey_Sampling
--------
"Beyond traditional survey taking: adapting to a changing world
Explorations in Non-Probability Sampling Using the Web"
Proceedings of Statistics Canada Symposium 2014
J. Michael Brick
http://www.statcan.gc.ca/sites/default/files/media/14252-eng.pdf
-------
"Summary Report of the AAPOR Task Force on Non-probability Sampling"
Journal of Survey Statistics and Methodology, October 2013. DOI: 10.1093/jssam/smt008
https://www.researchgate.net/publication/273561892_Summary_Report_of_the_AAPOR_Task_Force_on_Non-probability_Sampling
And for establishment surveys in particular:
https://www.researchgate.net/publication/303496276_When_and_How_to_Use_Cutoff_Sampling_with_Prediction
Dear Prof. Ruiz Espejo,
Thank you for sharing. Sociology is not my field and thus I cannot give a professional answer. Nevertheless, this interesting thread offers me a good point from which to start learning.
Regards
Dear Prof. Mariano Ruiz Espejo,
Thank you for sharing this question, but I must say sorry, because I am not well versed in this issue. On the other hand, I can learn about sociological research by following the answers and links that RG members have shared.
Regards,
Mehdi
“The purely random sample is the only kind that can be examined with confidence by means of statistical theory, but there is one thing wrong with it. It is so difficult and expensive to obtain for many uses that sheer cost eliminates it. A more economical substitute, which is almost universally used in such fields as opinion polling and market research, is called stratified random sampling.”
― Darrell Huff, How to Lie with Statistics
Thanks, James, for the very interesting and diverse references, and thanks to Mariano for providing the discussion thread. I agree with the shared opinion that sampling, like any other research tool, is justified first by the purpose and epistemological stance that guide the research. I also strongly believe that practical matters can change our research plan, so we have to go back and forth until every piece fits into a coherent framework. As with many other issues in scientific research, there is no single best method or tool; rather, some methods or tools are more suitable for the problem we are dealing with.
Identified units make it possible to reproduce in practice, in the real world, what is supposed in sampling theory. Random sampling (without controlled probabilities) would be a rough approximation to the theory, but without a real theoretical basis.
Identified units are of interest both in themselves and for security. Nonprobability sampling is based on haphazard chance, not on preserved population proportions or on a controlled selection of units.
Ah. Please note all that a word in the question was just recently changed from "probability" to "nonprobability," which clarifies what this question was actually meant to be.
Previous answers were based on the original question.
Note that with regard to nonprobability sampling, I provided some information from Mike Brick and others. But to me, a critical difference regarding rigor for inference from such sampling is whether regressor data can be used for prediction. I would imagine that is less feasible for social research. I think, however, that what Brick and others were showing in general is what to consider when random sampling may not be feasible either.
Nonprobability sampling, without prediction (regression) for estimation, is not my area. But if you want to know something about that for social research, you might find that report of the American Association for Public Opinion Research (AAPOR), which I passed on, to be of interest to you:
"Summary Report of the AAPOR Task Force on Non-probability Sampling"
Journal of Survey Statistics and Methodology, October 2013. DOI: 10.1093/jssam/smt008
https://www.researchgate.net/publication/273561892_Summary_Report_of_the_AAPOR_Task_Force_on_Non-probability_Sampling
----------
Here is the AAPOR website:
https://www.aapor.org
----------
There were several people involved in the writing of that report, but Mike Brick gave a presentation to perhaps a couple of hundred statisticians in the Washington DC area a few years ago, based, he noted, largely on that report. I heard that presentation, and it was excellent, so if you found his slides online, they might be useful. It was the WSS Presidential invited presentation for that year. (Perhaps 2013.) (The WSS is a large chapter of the American Statistical Association.)
I think that Jim K. gave a number of very good references to work from and Paul L. gave a good summary of the differences.
I always like to think that a probability sample allows you to control the sample selection by stratification and oversampling or undersampling some portions of the population of interest based on the purpose of the study (for example oversampling smaller populations and undersampling the larger populations). The important thing is to know the probability of selection so you can compute and use weights (the inverse of the probability of selection). Unweighted estimates can be biased. The issue about whether to use a probability design and weights depends on the cost of making a wrong decision. If the cost of a wrong decision is high, then I suggest that you expend sufficient effort to ensure a sufficiently rigorous sample design.
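Paul's point about weights (the inverse of the probability of selection) can be sketched in a few lines. A minimal illustration with hypothetical numbers of my own, not Paul's data: each respondent's weight is 1/p, and the weighted mean corrects the bias an unweighted mean has when strata are over- or undersampled.

```python
# Sketch of design weights: weight = 1 / probability of selection,
# then a weighted (Hajek-style) mean. Unweighted estimates can be
# biased when selection probabilities differ across strata.
def weighted_mean(values, selection_probs):
    weights = [1.0 / p for p in selection_probs]   # inverse-probability weights
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)

# Stratum A (small, oversampled at p = 0.5) vs stratum B (large,
# undersampled at p = 0.1); the unweighted mean overrepresents A.
values = [10, 12, 20, 22, 24]
probs  = [0.5, 0.5, 0.1, 0.1, 0.1]
unweighted = sum(values) / len(values)             # 17.6
weighted   = weighted_mean(values, probs)          # about 20.7
```

The weighted estimate pulls toward stratum B, whose members each stand for ten population units, which is exactly why knowing the probability of selection matters.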
Yes, Brenda. Non-probability sampling is widely used, but without statistical and mathematical guarantees for an objective study.
Can sociology be based on nonprobability samples exclusively?
No. In probability sampling, the model behind the data is congruent with reality because of the procedure followed by the sampler in selecting the sample. A probability sample may be more or less difficult to obtain, but this method can always yield justified results. A non-probability, haphazard sample in the best situation requires that Nature did the randomization for us. But if it did: with what probability was each unit selected? To analyze it, we have to impose one model or another that would allow us to reach some conclusion, but without any justification. In the best case, the researcher may have verified that there is no evidence of lack of fit of the imposed model. But is that enough to warrant a good conclusion?
When, in a probability sample, some surveyed persons fail to collaborate, this can be corrected and taken into account in our study. Studying the problems behind this situation can be a source of very rich information, which would be lost if care in the selection of the sample were relaxed.
Is it necessary to have a frame of identified units before selection of sample?
It is necessary to have a frame of the larger units, e.g., blocks, houses, phone numbers; but for the smallest units (surveyed persons within a home) it is sometimes reasonable to replace probability selection with simpler methods, such as asking who will be the next to have a birthday. In this case the selection of the last unit would be non-probabilistic, but the small size of the household lets us verify that our pollsters introduced no haphazard bias, and in most studies there is no relation between birthdays and the matter under study. (For example, it would not be good enough for studying habits of birthday celebration.)
Dear Brenda
Any of the non-probability methods mentioned in your quote about "Choosing a sampling method" may be very good for qualitative studies. They may produce good qualitative information. The same occurs when a detective studies the evidence at a crime scene. He does not select a probability sample of evidence; he makes an intensive and quick pass around the place, and when he finds any hint congruent with his intuitive guess about the crime, he follows it where it points. This procedure is very similar to the snowball method you quoted. But many social studies are quantitative, or quantitative and qualitative at the same time. I think that Mariano is asking about those studies, and that in them non-probability samples are very poor for estimating parameters of social populations and for testing quantitative hypotheses about them.
Identified units are an important measure in situations such as political participation, property rights, treatment in a hospital, etc. Why are identified persons not also considered an important basis of research in sociology or the social sciences? I think the answer is convenience, to the detriment of good science.
There are researchers who use probability theory to analyze data that were collected without probabilistic selection. I think this is a theoretical analysis without secure practical implications.
Estimation from survey data has an interesting history, as I noted earlier, linking Ken Brewer's Waksberg Award article.
The following is with regard to theory and application of which I have found many people on ResearchGate are unaware, so I'll mention it with references below, FYI:
Prediction is a method of estimation backed by a great deal of theory. (See Valliant, R., Dorfman, A.H., and Royall, R.M. (2000), Finite Population Sampling and Inference, A Predictive Approach, John Wiley & Sons.) It's just a different theory, and it has been argued that the theory of estimation based on a sampling design is irrelevant. But many leading statisticians recognize the value of both approaches, and Ken Brewer combined the two, as I noted with that link to his Waksberg Award paper earlier. I think stratification is often essential in either approach. Both approaches need to be used in an informed manner, however. Remember "Basu's Elephant Fable," in a foundations essay where the "circus statistician" lost his job from his application of the Horvitz-Thompson estimator (which is the basis for estimation from design-based sampling - though he could have done better)?
Ken Brewer did much to show both approaches, and how to combine them: Brewer, KRW (2002), Combined survey sampling inference: Weighing Basu's elephants, Arnold: London and Oxford University Press.
Also, see Särndal, C.E., Swensson, B. and Wretman, J. (1992), Model Assisted Survey Sampling, Springer-Verlag.
A great deal of other work using modeling has been done by many other world-renowned statisticians. I believe Yves Tille has work available on ResearchGate.
Also please consider, An Introduction to Model-Based Survey Sampling with Applications, Ray Chambers and Robert Clark, 2012, Oxford Statistical Science Series.
One should keep an open mind.
It's not about whether non-probability or probability sampling is more appropriate, but rather: what is the best way of MIXING the two to provide better measurements for study variables in social research?
Basing sociological research exclusively on non-probability samples implies not doing realistic inference; not being able to construct confidence intervals for your estimates; not being able to measure the validity and reliability of constructs for your study variables; and, generally, not being able to quantify errors such as sampling error and the standard errors of estimation.
In summary, sociology research should be based on MIXED METHODS and mode of mixing must be informed by the problem under investigation.
Wesonga
The problem with identified units is that, if we want a frame of a finite population of persons, the register may be in the hands of official institutions, and these do not release the information to anyone.
It is possible to sample a finite population with a probabilistic design, but this requires an expert in the selection of sample units from the finite population. Curiously, in many studies such an expert is not engaged to do the probabilistic selection, so the resulting study has deficiencies in the design for drawing scientific inferences.
Dear Marcel
If you do not have a random sample behind a poll in sociology, you will probably get a very biased inference. The willingness of people to join a poll is a real issue even in a probabilistic sample, but the expert can correct for it with adequate weights. The bias derived from a lack of randomization, however, can never be corrected. We may try to build a good model explaining our results in terms of independent variables measured on the people in the sample, but any good model requires that we can check its accuracy against an independent source, and we do not have one without a probabilistic sample against which to test the model's lack of fit.
Non-identifiability and anonymous answers cannot be controlled by a sampling design. Both serve the interests of those who dislike honesty and control, and these are not a good foundation for science.
In social studies it is important to preserve the anonymity of the persons who give the answers, because if they feel ashamed or worried about the matter they may not respond, or may give a biased answer; but it is possible, and necessary, to do this without compromising the procedures of a good probability sample.
For mathematics-based inference, probabilistic sampling of identified units is better in social research; for other, non-scientific purposes (for example, description), nonprobability sampling could be used, but without stating confidence intervals, which have no scientific basis in that case.
To obtain a probabilistic sample it is not strictly necessary to identify each unit; what you need is to be able to compute the probability of each unit entering the sample. I will try to illustrate this with a real example. Suppose a railway company asked me to design a probabilistic sample of its users to measure their degree of satisfaction with its service. The population of interest was reduced to those passengers who travel between 6:00 and 22:00. The company knew how many tickets were sold per day at each railway station within this interval.
Suppose we have the means and personnel to sample passengers during one week (Sunday to Saturday) in half of the train stations. Each station has only two points of entry, one for each direction of travel (I will call them direction 1 and direction 2).
First, I select, with probability Pi proportional to the number of tickets sold at each station, a sample of half of the stations. Here the station is a primary unit, or cluster.
Then, for each combination of selected station, day of week, and direction of travel, I design a probabilistic systematic sample (1 in K) with a random beginning.
For each day-station-direction cell I compute an integer value Kijk as near as possible to Nijk/nijk, i.e., Kijk = round(Nijk/nijk), where Nijk is the estimated minimum number of users on the ith day, at the jth station, in the kth direction, within the hours of observation, and nijk is the desired sample size for that day, station, and direction.
Finally, for each selected station-day-direction we draw a random number Aijk between 1 and Kijk, each value with probability 1/Kijk.
In the field, our staff go to their assigned station before 6 o'clock, knowing Aijk and Kijk. At each entry point there are at least two of our personnel: while one asks the questions, the other keeps counting to find the next unit to participate in the research. Ideally they alternate these roles throughout the day.
They begin at 6:00 to count each passenger who arrives at the entry point, and when the Aijk-th passenger arrives they ask him or her to answer some questions to improve the quality of the service. After the first, they ask the same of every Kijk-th passenger until 22:00, when that day's sample is complete.
After a day of sampling you get an unbiased (or almost unbiased) estimate of each of the parameters in the instrument, yet none of the respondents is individually identified.
The probability of participating in the research is Pi * 1/Kijk, and the population studied is that of the passengers in that week of observation.
This probabilistic design does not provide an unbiased estimator of its variance, but as an approximation we may use the variance formula for a two-stage cluster design with probability of selection proportional to cluster size.
Alternatively, you may obtain an unbiased variance estimator by using, for each station-day-direction, an m-in-K probabilistic systematic sample instead of a 1-in-K one.
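The counting rule for one station-day-direction cell can be sketched in code. This is my own illustrative sketch (the variable names follow the post; the numbers are invented): with an expected N passengers and desired sample size n, take K = round(N/n), draw a random start A in 1..K, then interview passengers number A, A+K, A+2K, and so on.

```python
import random

# Systematic 1-in-K sampling with a random beginning, as in the
# railway example: every passenger in the cell has inclusion
# probability 1/K, and nobody needs to be identified in advance.
def systematic_positions(N, n, rng=random):
    K = max(1, round(N / n))            # K_ijk = round(N_ijk / n_ijk)
    A = rng.randint(1, K)               # random beginning, uniform on 1..K
    return K, A, list(range(A, N + 1, K))

rng = random.Random(0)
K, A, picks = systematic_positions(N=400, n=40, rng=rng)
# With N = 400 and n = 40: K = 10, and exactly 40 passenger positions
# are selected, spaced K apart.
```

The field workers never see the list `picks`; they simply count arrivals, which is the whole point: the inclusion probabilities are known without a frame of identified persons.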
Without identified units, how is it possible to select a probabilistic sample? (I am not talking about artificial samples.) The answer is that it is impossible.
"Purposive selection" or "selection at hand" are the possibilities without identified units, and under such hypotheses the statistical theory does not apply exactly. It would be an approximation to the theory, not a clean theory and practice together.
Nonprobability sampling is for relaxed researchers who do not care about the work of sampling. Probabilistic sampling of identified units is for qualified statistical professionals who follow strict norms of control and security in their research.
I think that nonprobability sampling is mere appearance, because it cannot support a rigorous scientific thesis.
This is the permanent tension between a good achievement in sampling science and a relative, but not good, achievement with sampling.
Dear Brenda,
In Spanish, "random" has a wider sense than "probabilistic," because it can include other kinds of random sampling, such as purposive sampling, which are not probabilistic.
Identified units do not imply non-anonymous sampling. Both things, identified units and anonymous statistics, are possible. This is a good way to proceed in health studies.
In Spanish: random = "aleatorio" (a synonym of "al azar"); purposive = "a propósito" (a synonym of "intencional").
The proven results are for probabilistic sampling of identified units; other procedures introduce additional errors beyond the sampling error.
Dear Mariano,
On 1/9/2016 I gave you an example of a probabilistic sample with non-individualized sampling units; on another occasion I did the same with a silo example. The results of probabilistic sampling theory are proved for the case where we know the selection probability of the units that fall in the sample. It does not require having the whole population identified: that would be a sufficient but not necessary condition, if we manage to compute those probabilities in some other way.
Regards
Dear Guillermo:
But in selecting the sample of seeds in a silo, you implicitly assume that a seed weighing twice as much as another has the same probability of being selected as the latter. It is as if, in drawing the winning numbers of a lottery, the ball for one number weighed twice as much as the balls for the other numbers. If I knew that, I would buy that first number in the lottery before any other, because even though the balls look alike, one weighs more and may therefore be selected before the others.
No, Mariano. The seeds are of approximately the same weight: larger ones would have been retained by the filter screens, and empty or lighter ones would have been blown out of the harvesting machine. There is still a slight heterogeneity in seed size, but for practical purposes it is negligible. Cochran and others analyzed these situations; one can see that the small error that might arise from treating very similar seeds as equal in size causes no bias and does not increase the variances.
Dear Guillermo:
Then it would be like the selection of numbers in the national lottery: each seed would be equivalent in weight and shape to the other seeds. In that case it would suffice to weigh one seed and multiply by the number of seeds to know the total production by weight.
In any case, even if the example was not to your liking, my point is that for a probabilistic sample to exist, what has always been required is the possibility of computing the inclusion probabilities of the elements observed in the sample.
This is easier to show when the elements are perfectly individualized, but it can also be done in many other cases where, even though they are not, it is possible to calculate the inclusion probability of the sampling units that contain them, or there are good reasons to compute, with good approximation, the inclusion probabilities of the sampling units at the various stages of sampling. On this basis, probabilistic samples of households, dwellings, enterprises, persons, waters, soils, etc., are routinely carried out all over the world.
On the other hand, when the sampling units and their elements are completely individualized, there are often surprises due to missing data, nonresponse, and measurement errors. These problems are again solved by appealing to approximate models of how the data are missing, why people do not respond, or what the magnitude of the measurement error is, but without departing from the probabilistic sample.
My professor of sampling theory used to say that "one button is sample enough" when the units are very similar, for example within a very homogeneous stratum. With a single button you have the information of all the identical buttons in the drawer.
Yes, but quality evaluation has to do with other things:
presence of grains with excess moisture,
weevil (insect) attacks,
fungal attacks (e.g., aflatoxins),
mixtures of varieties,
impurities, etc.
Probabilistic sampling of identified units requires a careful selection of the sample. This is not required for nonprobability sampling.
The statistical theory of inference does not work for nonprobability sampling.
Dear Mariano
I think that statistical inference theory is not supported by a nonprobability sample, or by a non-random experimental design, without the aid of several assumptions that we cannot independently test.
If those assumptions are badly flawed, we are guessing without noticing that we have no statistical support at all, instead of doing good statistical inference. But if the assumptions are not flawed (because the very nature of the problem might produce data like that supposed by the model behind the assumptions), we may be doing good statistical inference.
The problem is that we will never know for sure whether the assumptions are flawed or not, and in most situations we could have avoided all these assumptions by taking a good probability sample or using a good random experimental design.
Dear Guillermo,
In many cases of statistical inference, the results hold under certain conditions, for example five conditions needed to prove the mathematical thesis. In practice the problem is that the majority of researchers suppose that, say, three of these conditions are sufficient to fulfill the thesis, without an alternative mathematical proof under that assumption. This is a risk-taking attitude that a good scientist cannot allow in scientific work.
From a correct scientific viewpoint, probabilistic sampling of identified units is the way; other possibilities may be easier to use, but I believe they are not scientific, because they do not respect the laws of science.
Without identified units the sampling would be relativistic and not credible necessarily.
One cannot speak of bias, variance or mean squared error without probability in the selection of units. Of what statistics could we then speak without such concepts?
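As a small illustration of why these concepts rest on probability: the following Python sketch (with a hypothetical population and sample sizes chosen only for the example) approximates by simulation the bias, variance and mean squared error of the sample mean under simple random sampling versus a convenience-style selection that can only ever reach part of the population.

```python
import random
import statistics

random.seed(42)

# Hypothetical population of N = 1000 identified units with a measured value.
population = [random.gauss(50, 10) for _ in range(1000)]
true_mean = statistics.mean(population)

n, reps = 100, 2000

# Design 1: simple random sampling (every unit has equal selection probability).
srs_means = [statistics.mean(random.sample(population, n)) for _ in range(reps)]

# Design 2: a "convenience" selection that over-represents high values:
# only the upper half of the population is ever reachable.
reachable = sorted(population)[500:]
conv_means = [statistics.mean(random.sample(reachable, n)) for _ in range(reps)]

def bias_var_mse(estimates, target):
    """Monte Carlo bias, variance and MSE of an estimator around a target."""
    bias = statistics.mean(estimates) - target
    var = statistics.variance(estimates)
    return bias, var, bias**2 + var

print("SRS         bias/var/MSE:", bias_var_mse(srs_means, true_mean))
print("Convenience bias/var/MSE:", bias_var_mse(conv_means, true_mean))
```

Under SRS the simulated bias is near zero and the MSE is essentially the sampling variance; under the convenience design the bias term dominates, and no amount of extra data removes it.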
Without identified units there is no genuinely random selection in the sample.
Nonprobability sampling may be testimonial, but it is not scientific for measuring sampling biases or errors.
However, dear Nabeel, testimonies can contain part or all of the truth.
If you want to experiment on or observe whomever you like, nonprobability sampling is the way. But if you want to give sound scientific conclusions, probability sampling of identified units is the way.
Dear Brenda,
I treat these matters in my book Sampling Science (in Spanish), in Chapter 9. Thank you for your participation in my question. The book will be available on the website
https://www.bubok.es/autores/MarianoRuiz
Nonidentifiability does not allow selection according to a sampling design and, as a consequence, does not allow objective statistical inference.
Without probabilistic sampling the researcher is in a bad position, because he selects his own sample, and one cannot make scientific inferences from self-selected data.
For Spanish readers, I recommend the book Ciencia del Muestreo, published by Bubok, Madrid.
Without probability there is no known, neutral distribution of the sample, which is necessary for statistical inference.
Sociology that uses mathematics without verifying the hypotheses in practice I would not call science.
I think the biggest problem in this area is the use of sampling techniques without having justified them mathematically and/or without being honest in their application.
Truth-based social research is very good, but how much truth is there in our research?
Without a frame of identified units one cannot say what the population of reference is; that is, one cannot state correctly which population one is studying.
I think the difficulty in appreciating why identified units are a solution for social research may lie in the scarce mathematical training in sociology and social science programs. If one knows the statistical proofs and the reasoning well, one can see that identifying the units of the population is the only route to correct statistical inference.
Dear Mariano
I subscribe to everything you said about probability sampling, and I know that with a frame of identified units it is possible to perform a very good probability sample; that is the ideal situation.
But I also know that many times it is possible to perform probability sampling without having that ideal situation before building the sample.
A probability sample is one where you can compute the probability of selecting each sampling unit in the sample. If you have the frame before selecting the sample, this is easiest.
A correct methodology is not always the easiest way to act in social research. It is easier to follow our imagination and forget the scientific principles of research, but this is risky behaviour, because those principles must always be present in all research.
If the proved theorem needs n hypotheses to affirm a scientific result, I think that omitting any of these hypotheses is not scientific but a degradation of science to unscientific states.
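A minimal sketch of that definition, assuming a small hypothetical frame and simple random sampling without replacement: every unit's inclusion probability is n/N and is computable before selection, and the Horvitz-Thompson estimator then weights each observed value by the inverse of its inclusion probability.

```python
import random

random.seed(1)

# Hypothetical frame of N identified units, each with a study variable y.
frame = {f"unit_{i}": y
         for i, y in enumerate([12, 7, 9, 15, 11, 8, 14, 10, 13, 6])}
N, n = len(frame), 4

# Under simple random sampling without replacement, every unit has the
# same inclusion probability pi = n / N, known before any unit is drawn.
pi = n / N

sample_ids = random.sample(sorted(frame), n)

# Horvitz-Thompson estimator of the population total: sum of y_k / pi_k.
ht_total = sum(frame[k] / pi for k in sample_ids)

print("inclusion probability:", pi)  # 0.4
print("HT estimate of the total:", ht_total)
```

The point is that the frame makes pi computable in advance; with a self-selected sample there is no pi to plug into the estimator at all.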
Without identification of persons or subjects there is no individual responsibility.
Without personal identification there could be more irresponsibility.
If correct statistical inferences were possible with nonprobability sampling, many social studies would be justified; but this is not the case, because it is the prefixed probabilities that guarantee the properties of the estimators. Replacing those probabilities with some other selection mechanism brings its own uncertainties.
The citizens of a society require respect for their will to collaborate or not in social research; that is, they must give their consent to the study of their persons. This is another right of the surveyed in a social study.
The problem with having a population frame is the laborious effort needed to obtain it. But I think it is the only method for inferring scientifically and correctly.
With nonidentified units the social study would be intractable, and the inference would rest on ideas without mathematical proof.
The impact of nonidentified units is not knowing which population one is referring to.
Probability is a measure of the representativity of the unit(s) in the sample, and without it statistical inference is not possible.
Without probability one can report what one finds in one's routine work, but to make statistical inferences seriously one needs identified units and probability in the selection of the sample. Such a sample is not only random but also probabilistic.
For the purity of science, probabilistic sampling of identified units is superior to nonprobability sampling.
The controlled use of probability makes possible a known distribution over the selection of the samples and the units. Without probability this control is not possible; one has instead an uncontrolled supposition or an approximate prognosis.
Without identification of units there would be no population, only uncertain masses.
Exclusive reliance on nonprobability samples is a limitation. Probabilistic samples are the basis for more scientific studies.
Probability is a mathematical concept related to the idea of a theoretical frequency that the observed frequencies approximate. This concept can be implemented with computers and then used in the experimental observation of a social research study. This is possible, but it is not easy to implement, since it requires substantial work to build the frame of the society or population and to select the sample to be observed from that frame.
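That idea of observed frequencies approximating a theoretical frequency is easy to check by computer. In this hypothetical sketch, equal-probability draws from a five-unit frame produce observed selection frequencies close to the theoretical 1/5:

```python
import random

random.seed(7)

# A tiny frame of identified units; under equal-probability selection the
# theoretical frequency of drawing any given unit is 1 / 5 = 0.2.
frame = ["A", "B", "C", "D", "E"]
draws = 100_000

counts = {u: 0 for u in frame}
for _ in range(draws):
    counts[random.choice(frame)] += 1

# Observed relative frequencies approximate the theoretical probability.
observed = {u: c / draws for u, c in counts.items()}
print(observed)  # each value close to the theoretical 0.2
```

The hard part in practice is not this computation but, as noted above, building the frame from which the computer can select.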
It is true that sociology with probabilistic samples is very limited in usual practice. It requires a great effort to create population frames.
If one wants empirical data only, one can choose nonprobabilistic sampling.
Nonprobability sampling or probabilistic sampling of identified units: which is the more appropriate for social research?
To generalize our research findings to the entire population, we need probability / random sampling. Even when we try to apply random sampling in social science research for generalization, via questionnaires filled out by randomly selected respondents, whether the respondents are willing to fill out the questionnaire already affects how random the sample is.
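That nonresponse effect can be illustrated with a simulation (all numbers hypothetical): units are drawn at random, but willingness to respond depends on the value being measured, so the responding subsample is no longer a probability sample and its mean is systematically biased.

```python
import random
import statistics

random.seed(3)

# Hypothetical population where the study variable also drives willingness
# to answer: higher values respond less often (a common nonresponse pattern).
population = [random.gauss(50, 10) for _ in range(5000)]
true_mean = statistics.mean(population)

def survey_mean(sample_size):
    """Random draw, then self-selection into responding, then naive mean."""
    drawn = random.sample(population, sample_size)
    # A unit responds with probability 0.9 if below the mean, 0.5 otherwise.
    respondents = [y for y in drawn
                   if random.random() < (0.9 if y < true_mean else 0.5)]
    return statistics.mean(respondents)

estimates = [survey_mean(500) for _ in range(500)]
print("true mean:", round(true_mean, 2))
print("average survey estimate:", round(statistics.mean(estimates), 2))
# The estimates fall systematically below the true mean: the draw was
# probabilistic, but self-selection into responding was not.
```

Taking a larger sample does not fix this; the gap comes from the response mechanism, not from sampling variability.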