Hypothetical:
I conduct a survey and ask the respondents to answer a yes/no question: "Do you consider yourself to be an alcoholic?" Their responses correlate with high/low averages on another measure, "Qualities of Alcoholism" (QoA; e.g., yes = high, no = low). Most of the responses are in the "no" camp. My questions are:
- Should I include everyone in the sample (both the "yes" and the "no" respondents) if I'm comparing QoA to another measure (e.g. an "Effects of Drinking" scale), or should I analyze only the "yes" cases?
- What would the effects be on regression analyses if I took either approach?
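
To make the two approaches concrete, here is a minimal sketch on simulated data. Nothing in it comes from a real survey: the variable names (`qoa`, `eod`, `says_yes`), the slope of 0.5, the noise levels, and the cutoff of 1.2 are all assumptions chosen only so that "yes" is the minority answer and tracks QoA imperfectly.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 5000

# Hypothetical data-generating process (assumed, not taken from a real survey).
qoa = rng.normal(0.0, 1.0, n)              # "Qualities of Alcoholism" score
eod = 0.5 * qoa + rng.normal(0.0, 1.0, n)  # "Effects of Drinking" score
# Self-identification follows QoA with noise; the cutoff makes "yes" the minority.
says_yes = (qoa + rng.normal(0.0, 0.5, n)) > 1.2


def fit(x, y):
    """Return (sample size, OLS slope of y on x, Pearson r)."""
    slope, _intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return len(x), slope, r


# Approach 1: everyone. Approach 2: only the "yes" respondents.
n_all, slope_all, r_all = fit(qoa, eod)
n_yes, slope_yes, r_yes = fit(qoa[says_yes], eod[says_yes])

print(f"everyone: n={n_all}, slope={slope_all:.2f}, r={r_all:.2f}")
print(f"yes only: n={n_yes}, slope={slope_yes:.2f}, r={r_yes:.2f}")
```

In this toy setup, keeping only the "yes" cases shrinks the sample and narrows the range of QoA, which tends to lower the correlation with the other scale. Whether real survey data behaves the same way depends on how the yes/no answer actually relates to both measures.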