The Literary Digest's prediction of the 1936 US presidential election is a lasting reminder of what can happen with non-probability samples, no matter how large.
A large sample size is not a quality criterion in itself; size only helps when accompanied by other techniques that ensure the sample is representative.
No. All of inferential statistics depends on probability models; that is why probability samples are so important. You cannot really trust an inference made without a probability sample.
Large samples can produce statistically significant results purely by virtue of their size. Conversely, small samples often lead to underpowered and unreliable conclusions. That's why it's best to do a power analysis before the main experiment to determine an appropriate sample size, which may be 30, 90, or even 1000! I hope this helps :-)
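For illustration, here is a minimal sketch of such an a priori power analysis in Python, assuming a two-sample t-test design and using statsmodels; the effect size, alpha, and power values are illustrative assumptions, not taken from the question.

```python
# A minimal power-analysis sketch (assumed two-sample t-test design).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Sample size per group needed to detect a "medium" standardized effect
# (Cohen's d = 0.5) at alpha = 0.05 with 80% power (illustrative values).
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8,
                                   ratio=1.0, alternative='two-sided')
print(round(n_per_group))  # about 64 per group under these assumptions
```

Change the assumed effect size or power and the required n changes substantially, which is exactly why the answer can be 30, 90, or 1000.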
In general, for what most people on ResearchGate would likely encounter, I agree with David, but in areas where one has good auxiliary (regressor) data on the population, there are exceptions, often with highly skewed data. See https://www.researchgate.net/publication/303496276_When_and_How_to_Use_Cutoff_Sampling_with_Prediction.
I implemented regression-based methodology along those lines, with proven results; it has been used for thousands of tables of official energy statistics since about 1990.
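As a rough illustration of the prediction (model-based) idea with auxiliary data, here is a minimal sketch of the classical ratio estimator of a population total under the model y_i = beta*x_i + e_i with error variance proportional to x_i, applied to a cutoff-style sample of the largest units; this is a toy example with simulated data, not the methodology or data behind the official statistics mentioned above.

```python
import numpy as np

def ratio_prediction_total(y_sample, x_sample, x_total):
    """Predict the population total of y using the ratio model
    y_i = beta * x_i + e_i with Var(e_i) proportional to x_i.
    x_total is the known population total of the auxiliary variable x."""
    b = y_sample.sum() / x_sample.sum()   # WLS slope under Var(e_i) proportional to x_i
    # Observed total plus the model's prediction for the unobserved units.
    return y_sample.sum() + b * (x_total - x_sample.sum())

# Toy population where y is roughly proportional to skewed auxiliary data x.
rng = np.random.default_rng(0)
x_pop = rng.gamma(shape=2.0, scale=50.0, size=1000)
y_pop = 3.0 * x_pop + rng.normal(0.0, 5.0 * np.sqrt(x_pop))
cut = np.argsort(x_pop)[-100:]            # cutoff-style sample: 100 largest x's
est = ratio_prediction_total(y_pop[cut], x_pop[cut], x_pop.sum())
print(est, y_pop.sum())                   # estimated vs. actual total
```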
An interesting quick look at probability-of-selection-based methods, prediction-based methods, and a mixture of the two is given in Ken Brewer's Waksberg Award article: Brewer, K.R.W. (2014), "Three controversies in the history of survey sampling," Survey Methodology (December 2013/January 2014), Vol. 39, No. 2, pp. 249-262, Statistics Canada, Catalogue No. 12-001-X.
Chambers, R., and Clark, R. (2012), An Introduction to Model-Based Survey Sampling with Applications, Oxford Statistical Science Series.
Valliant, R., Dorfman, A.H., and Royall, R.M. (2000), Finite Population Sampling and Inference: A Prediction Approach, Wiley Series in Probability and Statistics.
So, in general, you will need randomization, but there are exceptions when modeling with regressor data. I doubt that this is the case for you.
At any rate, stratification is hugely important.
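For example, here is a minimal sketch of stratified simple random sampling with roughly proportional allocation; it assumes the sampling frame is a pandas DataFrame with a stratum column, and the function and column names are hypothetical.

```python
import pandas as pd

def proportional_stratified_sample(frame: pd.DataFrame, stratum_col: str,
                                   n_total: int, seed: int = 0) -> pd.DataFrame:
    """Draw a simple random sample within each stratum, with the sample
    allocated roughly in proportion to stratum size."""
    parts = []
    for _, stratum in frame.groupby(stratum_col):
        n_h = max(1, round(n_total * len(stratum) / len(frame)))
        parts.append(stratum.sample(n=min(n_h, len(stratum)), random_state=seed))
    return pd.concat(parts)

# Usage (hypothetical frame): proportional_stratified_sample(frame, "region", 200)
```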
In general, for applications I've seen on ResearchGate, even a very large sample from a relatively small population should still be randomized, with stratified random sampling often being best. But if the sample size n is close to the population size N, then you might be able to treat this as a census with nonresponse. In that case, to reduce bias from your lack of randomization (without a model either, I presume), you could look into "response propensity" groups - basically a kind of poststratification to turn "nonignorable nonresponse" into somewhat more "ignorable nonresponse." (Ignorable nonresponse does not really mean you ignore it; it just means you can treat the data more as if you had a random sample when you really didn't.)
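A minimal sketch of that kind of response-propensity (weighting-class) adjustment, assuming a 0/1 response indicator and one auxiliary covariate known for the whole frame; the quantile-based classes and the column names are illustrative assumptions only.

```python
import numpy as np
import pandas as pd

def weighting_class_adjustment(frame: pd.DataFrame, covariate: str,
                               responded: str, n_classes: int = 5) -> pd.Series:
    """Group units into classes by an auxiliary covariate and weight each
    respondent by the inverse of its class response rate (nonrespondents
    get weight 0). A simple poststratification-style adjustment."""
    classes = pd.qcut(frame[covariate], q=n_classes, duplicates='drop')
    class_rate = frame.groupby(classes)[responded].transform('mean')
    weights = np.where(frame[responded] == 1, 1.0 / class_rate, 0.0)
    return pd.Series(weights, index=frame.index)
```

The hope is that, within a class, response behaves more like it is random, which is what makes the nonresponse closer to "ignorable."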
So a larger sample size may help somewhat under the conditions above, but if you do not have a randomized sample design (and no model either), then you would generally still have biased results for which you cannot even estimate the uncertainty, for a 'quantitative' study. You would only know that your results may not be very good, and would have little idea of how bad they are, until possibly later, from other evidence, when it is too late.