Which test should I use if I'm comparing a non-normally distributed numeric variable with a normally distributed numeric variable?

Ariel Linden

Hi Vedran,

I am not clear on what you are measuring. Are you trying to see whether there is a correlation between two continuous variables, or are you running a t-test? Please clarify what the samples are, what you are trying to estimate, and what your hypothesis is. That will help determine which test you should use and whether a transformation is warranted.

Ariel

Vedran Lenz

Hi Ariel,

Thank you for engaging with this question. I am trying to find out whether there is a statistically significant difference in age between two groups. One group has 128 participants whose age is non-normally distributed according to a K-S test (median 70 years); the other has 15 participants whose age is normally distributed, also according to a K-S test (mean 59.6 years). I also compared the variances of the two groups, and they are unequal according to a two-sample F-test, so I ran a two-sample t-test assuming unequal variances. Is this a viable solution, given that the t-test compares two means and, as I mentioned above, the group of 128 is non-normally distributed and is therefore summarized by its median?

Can you recommend the most accurate test for my problem?

Thank you in advance.
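
For readers who want to reproduce the checks described in this post, here is a minimal Python sketch using SciPy. The ages are simulated placeholders, not the poster's data, and the helper names `ks_normal` and `f_test_variances` are made up for illustration. Note that a K-S test against a normal distribution whose mean and SD are estimated from the same sample tends to give p-values that are too large, so it should be read with caution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated placeholder ages (NOT the poster's data): a left-skewed group of
# 128 and a roughly normal group of 15, matching only the group sizes.
group_a = 90 - rng.gamma(shape=2.0, scale=8.0, size=128)
group_b = rng.normal(loc=60, scale=6, size=15)


def ks_normal(x):
    """K-S test against a normal with mean and SD estimated from x.

    Estimating the parameters from the data makes this test conservative.
    """
    x = np.asarray(x, dtype=float)
    return stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1)))


def f_test_variances(x, y):
    """Two-sided F-test for equality of two variances (assumes normality)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    f = x.var(ddof=1) / y.var(ddof=1)
    dfx, dfy = x.size - 1, y.size - 1
    # Take the tail on the side of the observed ratio, then double it.
    tail = stats.f.sf(f, dfx, dfy) if f > 1 else stats.f.cdf(f, dfx, dfy)
    return f, min(1.0, 2 * tail)


ks_a = ks_normal(group_a)
ks_b = ks_normal(group_b)
f_stat, f_p = f_test_variances(group_a, group_b)
print(f"K-S group A: D={ks_a.statistic:.3f}, p={ks_a.pvalue:.3f}")
print(f"K-S group B: D={ks_b.statistic:.3f}, p={ks_b.pvalue:.3f}")
print(f"F-test: F={f_stat:.3f}, p={f_p:.3f}")
```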

I suggest that you stick with a non-parametric solution such as a rank-sum test. An even more robust non-parametric option to consider is the Hodges-Lehmann aligned-ranks test. All software packages implement the rank-sum test; fewer offer the aligned-ranks method. I have written a Stata program that performs this analysis, called alignedranks (from within Stata, type: ssc install alignedranks). There is also a program in R that does this analysis.
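
As an illustration of the rank-sum route only, the following Python sketch runs the Wilcoxon-Mann-Whitney rank-sum test with SciPy and adds the Hodges-Lehmann shift estimate (the median of all pairwise differences). It does not implement the aligned-ranks test mentioned above, and the function name `rank_sum_with_shift` is a hypothetical helper.

```python
import numpy as np
from scipy import stats


def rank_sum_with_shift(x, y):
    """Rank-sum (Mann-Whitney U) test plus Hodges-Lehmann shift estimate.

    Returns the two-sided p-value and the median of all pairwise
    differences x_i - y_j, a robust estimate of how far x is shifted from y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = stats.mannwhitneyu(x, y, alternative="two-sided")
    # All len(x) * len(y) pairwise differences, then their median.
    shift = float(np.median(np.subtract.outer(x, y)))
    return result.pvalue, shift
```

Called as `p, shift = rank_sum_with_shift(ages_group_1, ages_group_2)`, where the two arguments stand for the age values of each group, the shift is reported in the same units as the data (years).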

I hope this helps

Enrique Quintero-Torres

Hi Vedran.

In case you want to use a parametric alternative, I recommend the Welch t' test. Montilla and Kromrey (2010) found that "the t'-Welch test is more robust than the t-Student test and this one, more than the Yuen test under the conditions of normality, absence of normality and heteroscedasticity".

The reference can be found at the following link:

http://150.185.138.105/ojs/index.php/cienciaeingenieria/article/view/1125
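
A minimal Python sketch of the Welch test, assuming SciPy is available: `ttest_ind` with `equal_var=False` performs the unequal-variance t-test, and the Welch-Satterthwaite degrees of freedom are computed by hand only to make them visible. The function name `welch_t` is a hypothetical helper.

```python
import numpy as np
from scipy import stats


def welch_t(x, y):
    """Welch's unequal-variance t-test for a difference in means."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = stats.ttest_ind(x, y, equal_var=False)
    # Squared standard errors of the two sample means.
    vx = x.var(ddof=1) / x.size
    vy = y.var(ddof=1) / y.size
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx**2 / (x.size - 1) + vy**2 / (y.size - 1))
    return float(result.statistic), float(result.pvalue), float(df)
```

Unlike the pooled-variance t-test, this version does not need a preliminary F-test on the variances.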

Thank you Enrique.

The parametric alternative seems more accurate to me in this case, and I would give it a slight advantage over the non-parametric solutions.

Jochen Wilhelm

1) Are there no other covariates?

2) How do you conclude that the distribution in the small group is normal?

3) Is the deviation from a normal distribution in the large group *relevant* to your problem?

And besides all these questions a general note:

You seem to be unable to formulate a precise statistical hypothesis, and you also seem to be unable to specify a precise alternative hypothesis, so there is little basis for doing hypothesis tests (parametric or not); the results won't help you reach any sensible conclusion. Assuming you are interested in the expected difference in age, you should test this (rather than testing something else just because you know a test for something else!). If there is no test that fits the assumptions you can or cannot make, you can still bootstrap confidence intervals and p-values.

My suggestion is that you plot the data and try to draw your conclusions from what you see. If you are dying to get a confidence interval or p-value for the relevant statistic (the expected difference, for instance), then bootstrap it. But don't use a test for some arbitrary hypothesis only because it is the only thing you seem to be able to do, and do not base your conclusions simply on a p-value (no matter how it was calculated and whether or not the assumptions are met).
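
One way to bootstrap the expected difference is sketched below in Python with NumPy. It is a plain percentile bootstrap for the difference in means, one of several possible bootstrap schemes; the function name `bootstrap_mean_diff` and its defaults (10,000 resamples, 95% interval, fixed seed) are assumptions chosen for illustration.

```python
import numpy as np


def bootstrap_mean_diff(x, y, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(x) - mean(y).

    Each group is resampled with replacement, independently of the other,
    and the difference in means is recomputed for every resample.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = x.mean() - y.mean()
    boot_x = rng.choice(x, size=(n_boot, x.size), replace=True).mean(axis=1)
    boot_y = rng.choice(y, size=(n_boot, y.size), replace=True).mean(axis=1)
    diffs = boot_x - boot_y
    lower, upper = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return float(observed), float(lower), float(upper)
```

With only 15 observations in the smaller group, the interval will be wide and the percentile method can be somewhat inaccurate, so it is best read alongside a plot of the data.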

Basilio de Braganca Pereira

If the models are separate, see:

PEREIRA, B. de B. Separate Families of Hypotheses. In: Peter Armitage; Theodore Colton (Eds.). Encyclopedia of Biostatistics. 2nd ed. London: Wiley, 2005, v. 7, p. 4881-4886.

PEREIRA, B. de B. Tests for discriminating separate or non-nested models. In: Miodrag Lovric (Ed.). International Encyclopedia of Statistical Sciences. 1st ed. New York: Springer, 2010, v. 1, p. 1592-15956.

Both frequentist and Bayesian procedures are described.

Pavel Kovanic

Unlike you, I would prefer a non-parametric approach: comparing robust kernel estimates of the probability distributions of both data samples. This would eliminate reliance on a priori assumptions about statistical data models and could give you more information on the differences between the variables. You may not be used to this type of analysis. In that case, you could send me your data sets (in Excel format), and I would try to show you "what the data say for themselves".
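
For a rough idea of what a kernel-based comparison looks like, here is a Python sketch using SciPy's ordinary Gaussian kernel density estimate. This is a stand-in, not the robust kernel estimator the post refers to, and the function name `kde_compare` and the overlap summary are assumptions added for illustration.

```python
import numpy as np
from scipy import stats


def kde_compare(x, y, n_points=200):
    """Gaussian KDEs of two samples on a shared grid, plus their overlap.

    The overlap is the approximate area under the pointwise minimum of the
    two density curves over the observed data range: near 1 for very
    similar distributions, near 0 for well-separated ones.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    grid = np.linspace(min(x.min(), y.min()), max(x.max(), y.max()), n_points)
    dens_x = stats.gaussian_kde(x)(grid)
    dens_y = stats.gaussian_kde(y)(grid)
    step = grid[1] - grid[0]
    # Simple Riemann sum; tails outside the data range are ignored.
    overlap = float(np.minimum(dens_x, dens_y).sum() * step)
    return grid, dens_x, dens_y, overlap
```

Plotting `dens_x` and `dens_y` against `grid` shows where the two age distributions differ, which a single test statistic cannot do.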