Hi there,
Please see the attached table for context. Note that each year a new cohort of freshmen was sampled (i.e., this was not a within-subjects longitudinal design).
The authors of a book I read were assessing the efficacy of a health campaign at their university designed to change perceived norms around alcohol consumption on campus. They provided the attached table and claimed the campaign was successful based on, among other things, the percentage changes they reported.
In each case, I gather that they used the formula below simply to compare the year the campaign began (1999; V1) with the most recent data (2003; V2).
Percentage change = (V2 - V1) / |V1| * 100
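For example (the numbers here are made up, since I'm not reproducing the table), the calculation amounts to:

```python
# Hypothetical numbers only: suppose 62% of freshmen endorsed the perceived
# norm in 1999 (V1) and 48% did in 2003 (V2).
v1 = 62.0  # 1999 value (campaign start)
v2 = 48.0  # 2003 value (most recent year reported)

pct_change = (v2 - v1) / abs(v1) * 100
print(f"Percentage change: {pct_change:.1f}%")  # about -22.6%
```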
However, the authors did not conduct any inferential analyses, nor did they provide the basic descriptive statistics I would need for any kind of follow-up analysis, not even the sample size; they simply said that “the undergraduate students took this survey each spring.” I could look up enrollment for those years, but I doubt every student took the survey.
When I plot these percent changes on a chart (also attached), they trend downward, as expected. Since in both cases the x-axis is interval-level and the y-axis is ratio-level, I can fit a regression line to each chart and obtain R² and r (i.e., the correlation). However, I don’t think this is very helpful.
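To be concrete about what I mean, the fit is nothing more than the following (the percentages are placeholders standing in for the attached table, not the authors’ actual figures):

```python
# Sketch of the regression I fit; the y-values are made-up placeholders
# standing in for the attached table, not the authors' actual numbers.
from scipy.stats import linregress

years = [1999, 2000, 2001, 2003]      # 2002 is missing from the table
percents = [62.0, 58.0, 55.0, 48.0]   # hypothetical survey percentages

fit = linregress(years, percents)
print(f"slope = {fit.slope:.2f} percentage points per year")
print(f"r = {fit.rvalue:.3f}, R^2 = {fit.rvalue ** 2:.3f}, p = {fit.pvalue:.3f}")
```

With only four points, the p-value from such a fit seems close to meaningless, which is part of why I don’t find this route helpful.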
So, my question is:
With the data I have available, is there any way to examine whether this change over time could simply reflect natural response variability from year to year, rather than a statistically significant decrease in the percentages?
With only four time points,¹ comparing the first round of data collection with the very last seems too simplistic a basis for concluding that it was the campaign itself, and not just random variation in students’ responses from year to year (or perhaps another factor), that accounted for this decrease.
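For what it’s worth, the only check I could think of on my own is a crude simulation: guess a sample size, assume the true proportion never changed, and see how big a first-to-last drop sampling new freshmen each year could produce by chance alone. Both the n and the baseline percentage below are pure guesses, since the authors report neither.

```python
# Crude simulation of year-to-year sampling variability under NO real change.
# n_per_year and true_p are guesses (the authors report neither); this only
# gives a feel for how much the observed percentage can move by chance.
import numpy as np

rng = np.random.default_rng(0)
n_per_year = 300   # guessed number of freshmen surveyed each spring
true_p = 0.60      # guessed constant "true" percentage
n_sims = 10_000

# Four independent survey waves per run; record the first-to-last drop
# (in percentage points) that occurs purely by chance.
samples = rng.binomial(n_per_year, true_p, size=(n_sims, 4)) / n_per_year
chance_drops = (samples[:, 0] - samples[:, -1]) * 100

print(f"95th percentile of a purely chance first-to-last drop: "
      f"{np.percentile(chance_drops, 95):.1f} percentage points")
```

Without even a rough n, though, I can’t tell whether the drop in the table clears that bar, which is why I’m asking whether there is a more principled way to approach this.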
¹ The authors don’t even explain why the data from 2002 are missing.