I investigated the relationship between A and B in the diabetes subgroup, but I am not clear about the purpose of doing a similar analysis in the control group and in the whole group. Could anyone give me a detailed explanation?
If what you want to know is whether the relationship between A and B is different for diabetics, you HAVE to run the same analysis on the other group. Otherwise you have no idea if what you are seeing is a normal relationship or something that only applies to diabetics.
It does not, however, gain you anything to run the same analysis for the whole group, since it will just give you an averaged relationship, which isn't your question.
If, on the other hand, that is not your question, then the answer to your analysis problem hinges on what question you are trying to answer.
If the question is the comparison of the relationship between the two groups, a whole group analysis with the group as a covariate and an interaction term could lead to better results than a separate analysis in each subgroup.
Mine is the first case. I will run the analysis in the control group to learn the normal relationship, but what is the whole group analysis for? It combines the effects from the control group and the diabetic group, so I don't think it would help confirm the exact relationship among patients.
Thanks for your explanation of the whole group analysis. However, I am not sure how to set the group (diabetic or control) as a covariate. What interaction term do you mean here? Does it refer to the interaction between all the potential A-related variables and B? Would it be possible for you to give an example or more specific details?
The method for testing an interaction depends on what kind of measure (categorical or continuous) your variables are. If your outcome (B?) and your primary predictor (A?) are both continuous, you would need to do the following:
1) recode your identification of whether a patient is diabetic or not into an effects-coded variable (so, for example, diabetic coded 1 and control coded -1).
2) compute an interaction measure by multiplying your continuous primary predictor with the effects-coded diabetes indicator.
3) run a regression in which your continuous outcome is the dependent variable, and your primary predictor, your effects-coded diabetes indicator, and the interaction variable are independent variables.
You are checking whether that interaction variable is significant after controlling for the main effects.
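To make the three steps concrete, here is a minimal sketch in Python using only NumPy. Everything in it is an assumption for illustration: the data are simulated (slope 1.0 in controls, 2.0 in diabetics), and `fit_interaction`, `is_diabetic`, `a` and `b` are hypothetical names, not anything from your dataset.

```python
import numpy as np

def fit_interaction(a, b, is_diabetic):
    """Ordinary least squares of b on a, an effects-coded group, and their product.

    Returns (coefficients, standard errors, t statistics), each ordered as
    [intercept, a, group, a_x_group].
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Step 1: effects coding, diabetic = +1 and control = -1.
    group = np.where(np.asarray(is_diabetic, dtype=bool), 1.0, -1.0)
    # Step 2: interaction variable = primary predictor times group code.
    a_x_group = a * group
    # Step 3: regress the outcome on both main effects plus the interaction.
    X = np.column_stack([np.ones_like(a), a, group, a_x_group])
    coef = np.linalg.lstsq(X, b, rcond=None)[0]
    resid = b - X @ coef
    dof = len(b) - X.shape[1]              # residual degrees of freedom
    sigma2 = resid @ resid / dof           # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef, se, coef / se

# Simulated data (assumption): the A-B slope is 1.0 in controls, 2.0 in diabetics.
rng = np.random.default_rng(0)
n = 200
is_diabetic = np.arange(n) % 2 == 0
a = rng.normal(size=n)
b = np.where(is_diabetic, 2.0, 1.0) * a + rng.normal(scale=0.5, size=n)

coef, se, t = fit_interaction(a, b, is_diabetic)
print("interaction coefficient:", coef[3])
print("interaction t statistic:", t[3])
```

The last entry of `t` is the test statistic for the interaction after controlling for the two main effects. A statistics package would report the matching p-value directly. This sketch stops at the t statistic so that it needs nothing beyond NumPy.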
You would run the same analysis if your outcome was categorical and your predictor was continuous, except you would have to run a logistic or multinomial regression.
If your predictor is categorical and your outcome is continuous, you can test the interaction directly with a two-way ANOVA (a univariate GLM with two fixed-effect factors). In a full-factorial model, the interaction between the two categorical variables is estimated automatically.
And to answer the question you asked me: when you run parallel analyses with your two groups, you know exactly what the relationship looks like for each group, but you have no information about the significance of the interaction (i.e., whether the difference in relationship between the two groups is statistically significant). That would be the only reason to run the analysis on the full group: to test the interaction term.
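A small sketch of that point, again on simulated data with hypothetical variable names (assumptions, not your data). With groups coded +1 and -1, the whole-group model with an interaction reproduces the two per-group slopes exactly. What it adds is a single coefficient, half the difference between the slopes, that can be tested. A whole-group fit without the group terms only gives one blended slope.

```python
import numpy as np

# Simulated data (assumption): slope 1.0 in controls, 2.0 in diabetics.
rng = np.random.default_rng(1)
n = 200
is_diabetic = np.arange(n) % 2 == 0
a = rng.normal(size=n)
b = np.where(is_diabetic, 2.0, 1.0) * a + rng.normal(scale=0.5, size=n)

# Parallel analyses: one simple regression per group.
slope_diabetic = np.polyfit(a[is_diabetic], b[is_diabetic], 1)[0]
slope_control = np.polyfit(a[~is_diabetic], b[~is_diabetic], 1)[0]

# Whole-group model with effects-coded group (+1 / -1) and interaction.
group = np.where(is_diabetic, 1.0, -1.0)
X = np.column_stack([np.ones(n), a, group, a * group])
coef = np.linalg.lstsq(X, b, rcond=None)[0]   # [intercept, a, group, a*group]

# Whole-group model without any group terms: a single blended slope.
pooled_slope = np.polyfit(a, b, 1)[0]

print("per-group slopes:      ", slope_diabetic, slope_control)
print("from interaction model:", coef[1] + coef[3], coef[1] - coef[3])
print("slope difference:      ", slope_diabetic - slope_control, "=", 2 * coef[3])
print("pooled slope:          ", pooled_slope)
```

So the separate fits and the interaction model describe the same two lines. The interaction model is simply the form in which the difference between them gets a standard error and a significance test.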
@Yingyun Gong: you may want to read a basic statistics textbook or take an introductory course or, better still, consult a local statistician. That would be much more efficient and better suited to your specific case.