What is the most reliable procedure for examining variance in the following dataset, where different variables are compared across four groups from the same survey sample (n = 1850)?

e.g. Group 1 (n = 700): Score = 95

Group 2 (n = 450): Score = 48

Group 3 (n = 350): Score = 16

Group 4 (n = 380): Score = 35

The above example approximates the actual data (attached) for the variable with the most responses (n = 195). The variables of potential interest range from n = 195 down to n = 20, but the cut-off would likely need to be higher to be meaningful (n = 50 or n = 100? This would leave 13 or 8 variables for further examination, instead of up to 20).

In the initial analysis I have examined ten variables (the cut-off would then become n = 55). Some variables overlap in the discussion, so 8-12 variables seems reasonable overall. Still, I would like to report up to 20 in the table.

When the above data are recalculated as percentages:

Group 1 = 5.61%; Group 2 = 5.72%; Group 3 = 6.98%; Group 4 = 4.90%

I have been carrying out the preliminary analysis with the above relative frequencies (%), which is fine initially for highlighting differences, for example between Groups 3 and 4 above. However, I am now looking to consolidate the data analysis by reporting the correct procedure(s) to support the qualitative findings.
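
To make the question concrete, here is a minimal sketch of one candidate procedure: a Pearson chi-square test of homogeneity on the illustrative counts from the example. It is a starting point for discussion, not a settled method. It assumes that each "Score" is the number of respondents in that group who gave the response, and that each respondent contributes at most one response per variable. The function names are hypothetical, and the p-value formula is the closed form for 3 degrees of freedom only (four groups).

```python
import math

# Illustrative counts from the example (not the actual attached data).
group_sizes = [700, 450, 350, 380]
responses = [95, 48, 16, 35]


def chi_square_homogeneity(hits, sizes):
    """Pearson chi-square statistic for a 2 x k table (response vs. no response)."""
    pooled = sum(hits) / sum(sizes)  # overall response rate under the null
    stat = 0.0
    for h, n in zip(hits, sizes):
        exp_hit = n * pooled
        exp_miss = n * (1 - pooled)
        stat += (h - exp_hit) ** 2 / exp_hit
        stat += ((n - h) - exp_miss) ** 2 / exp_miss
    df = len(hits) - 1
    return stat, df


def chi2_sf_df3(x):
    """Upper-tail probability of the chi-square distribution; valid for df = 3 only."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


stat, df = chi_square_homogeneity(responses, group_sizes)
p_value = chi2_sf_df3(stat)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p_value:.2g}")
```

If up to 20 variables are tested this way, the p-values would presumably need a multiple-comparison adjustment, and variables with small counts (near n = 20) may give expected cell counts too low for the chi-square approximation.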

Thanks in advance for any clarification on the above example and guidance on the correct methods.
