Basically, the 1- and 2-sigma ranges are the natural consequence of treating a measurement (such as determining the 14C/12C ratios of the sample and the standard) as an interval defined by an average (position index) plus and minus the measurement uncertainty (dispersion index; mostly statistical in the case of radiocarbon dating). Defined probabilities can then be associated with a given interval by assuming that the measurements follow normal distributions. Measuring (indirectly) the 14C age of a sample schematically means: (I) determining the isotopic ratios of the standards and the unknown; (II) correcting them for the machine and pretreatment background; (III) correcting them for isotope fractionation by measuring their 13C ratios; (IV) obtaining their RC age by applying a logarithmic function to the ratio of the corrected (II and III) ratios of the sample and the standard.
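Step (IV), the logarithmic function, can be sketched as follows. This is the standard conventional-age formula using the Libby mean life of 8033 years; the function name and example value are just illustrations:

```python
import math

def rc_age(f14c: float) -> float:
    """Conventional radiocarbon age (years BP) from F14C.

    F14C is the background- and fractionation-corrected 14C/12C
    ratio of the sample divided by that of the standard.
    Uses the Libby mean life of 8033 yr (half-life 5568 yr).
    """
    return -8033.0 * math.log(f14c)

# A sample retaining half the standard's normalized activity:
print(rc_age(0.5))  # 8033 * ln(2), about 5568 BP
```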
The use of the term RC age (as distinct from true age) indicates that a bias remains even after this data-handling pathway. This is mostly due to the failure of the main assumption of RC dating: a constant atmospheric radiocarbon concentration over time. To correct for the non-constancy of atmospheric 14C activity over Earth's history, a calibration dataset must be used. Since these datasets are not monotone (they often show local wiggles), the measured RC age (still a Gaussian distribution) has to be projected onto the true-age axis by means of the calibration dataset. This implies that distortions (i.e. more than one 1- and 2-sigma interval) may sometimes be introduced in the final true-age distribution.
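A minimal sketch of this projection, using a made-up wiggly calibration curve rather than a real dataset such as IntCal (all names and numbers here are illustrative assumptions):

```python
import math

def cal_curve(cal_age: float) -> float:
    """Hypothetical calibration curve: calendar age -> RC age.

    The sine term adds local wiggles, so the curve is not monotone,
    mimicking the wiggles of real calibration datasets.
    """
    return cal_age + 60.0 * math.sin(cal_age / 50.0)

def project(rc_mean: float, rc_sigma: float, cal_grid):
    """Project a Gaussian RC age onto the calendar-age axis."""
    dens = [math.exp(-0.5 * ((cal_curve(t) - rc_mean) / rc_sigma) ** 2)
            for t in cal_grid]
    total = sum(dens)
    return [d / total for d in dens]  # normalized density on cal_grid

grid = range(2800, 3201)
post = project(3000.0, 20.0, grid)
# Because of the wiggles, the resulting calendar-age density can have
# several modes, hence more than one 1- or 2-sigma interval.
```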
Now, if we want to compare two regular statistical distributions (hypothesis test), the use of 1 sigma as a dispersion index may be restrictive (for a Gaussian distribution, only about 68% of the cases fall within the average plus and minus 1 sigma). That is why a comparison at the 2-sigma level is more conservative: it leaves only about 5% of the experimental cases out of the test.
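The coverage probabilities behind these statements follow directly from the normal distribution and can be checked with a couple of lines:

```python
import math

def coverage(k: float) -> float:
    """Probability that a normal variate lies within mean +/- k*sigma."""
    return math.erf(k / math.sqrt(2.0))

print(round(coverage(1.0), 4))  # 0.6827 -> about 68% within 1 sigma
print(round(coverage(2.0), 4))  # 0.9545 -> about 95%, ~5% left out
```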
Things get worse when you want to compare true-age distributions coming from the calibration of RC ages!
One approach to comparing two RC measurements is to apply 2-sigma statistics (e.g. a t-test) to the rawest data available, i.e. F14C (the ratio of the corrected 14C/12C ratios of the sample and the standard) in the case of a 14C measurement. But if you need to compare a 14C-derived dating with an independent chronological constraint, you will be obliged to work on the true-age axis and possibly have to deal with multiple 2-sigma intervals.
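A simple version of such a comparison on the F14C axis is a two-sample test at the 2-sigma level; the function and the measurement values below are hypothetical illustrations, not prescribed values:

```python
import math

def two_sigma_compatible(f1: float, s1: float,
                         f2: float, s2: float) -> bool:
    """Compare two F14C measurements at the 2-sigma (~95%) level.

    Returns True when the difference between the two values is within
    2 combined standard deviations, i.e. the measurements are
    statistically indistinguishable at that level.
    """
    z = abs(f1 - f2) / math.sqrt(s1 ** 2 + s2 ** 2)
    return z <= 2.0

# Hypothetical measurements: F14C = 0.8100 +/- 0.0030 vs 0.8165 +/- 0.0030
print(two_sigma_compatible(0.8100, 0.0030, 0.8165, 0.0030))  # True
```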
I hope this answers your enquiry correctly.
Usually you report radiocarbon data with a 1-sigma error. That means that if you analyze the same sample many times, about 68% of the values will fall in the range average − 1 sigma to average + 1 sigma. If you use 2 sigma, approximately 95% (95.4%) of your measurements will fall in that range. So, overall, the more confident you want to be, the larger the quoted uncertainty will be. For more information, search for the keyword "uncertainty".
This all started, I believe, in the 1980s, because the Oxford geochronologists wanted to show how good they were and so quoted their calculated age errors at 2 sigma. Others followed suit, and so now we always have to say which we are using: analytical data is usually given with 1-sigma errors (standard error of the mean), as it always has been, but age calculations at 2 sigma. It is true that 2 sigma provides a better way of assessing the difference between two dates. Cheers, Dave
I think we should stick to probability-based confidence intervals such as the commonly used 95% CI. This is what I have done in my paper on radiocarbon dating, which I believe is available on ResearchGate; if not, it may also be found at