Hello,

For a research paper I am writing, I am analysing the effect of on-screen vs. off-screen visual attention to commercials, under natural viewing conditions, on a range of brand outcomes (i.e. brand choice and recall). I would appreciate some guidance on appropriate practice for entering these variables in my regression model.

As an avid reader of these forums, I know many of you request more information when responding to help queries, so here it is:

In this study, viewer attention was measured on a scale of 0-100 where 0 indicates no attention was registered (i.e. the person may have been out of the room), while a score of 100 indicates that attention was fully on-screen. In consultation with my supervisors, we have defined off-screen attention as anywhere between 1-50, while on-screen attention reflects scores within a range of 75-100.

An attention score was computed for each second the commercial was displayed on-screen, and mean and mode scores were also computed across the entire view. This gives me an average attention value for each participant across the commercial length, and by using SPSS’ ‘Count Values within Cases’ function, I can also count how many seconds each viewer spent either on-screen or off-screen. These two kinds of attention variable (I believe) allow me to analyse whether any attention has an impact on brand outcomes, as well as whether this potentially differs between types of attention.
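To make the derived variables concrete, here is a minimal Python sketch of the counting step (I actually did this in SPSS). The function name and the example scores are made up for illustration; only the thresholds come from the definitions above.

```python
# Hypothetical helper: counts seconds per attention band for one viewer.
# Thresholds follow the definitions above (off-screen 1-50, on-screen 75-100).
def count_attention_seconds(scores):
    """`scores` holds one attention score (0-100) per second of the commercial."""
    counts = {"no_attention": 0, "off_screen": 0, "on_screen": 0, "unclassified": 0}
    for s in scores:
        if s == 0:
            counts["no_attention"] += 1
        elif 1 <= s <= 50:
            counts["off_screen"] += 1
        elif 75 <= s <= 100:
            counts["on_screen"] += 1
        else:
            # Scores above 50 and below 75 fall in neither band.
            counts["unclassified"] += 1
    return counts

# Made-up per-second scores for a 10-second commercial.
viewer = [0, 0, 30, 45, 60, 80, 90, 100, 75, 50]
counts = count_attention_seconds(viewer)
mean_score = sum(viewer) / len(viewer)
print(counts, mean_score)
```

Note that under these definitions a score between 51 and 74 is counted as neither on-screen nor off-screen, so the two counts need not add up to the commercial length.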

All outcome variables (i.e. choice and recall) were dummy-coded as binary variables, with 0 meaning no choice/recall for the brand of interest and 1 indicating choice/recall for the brand of interest. The attention predictors I intend to analyse, by contrast, are numeric: average attention is continuous, while on-/off-screen attention seconds are discrete counts. I can also examine the role of categorical predictors, such as gender, age and purchase frequency. Hence, I have chosen binary logistic regression to explore the impact of attention on the binary choice/recall items.

The problem

I am uncertain what the ‘correct’ practice for regression modelling is, given my analysis goals. Initially, I was told to run each attention predictor on the choice/recall variables individually, which produced some significant and some non-significant findings. However, in playing with the data, I found that when I filtered by variables such as gender, certain findings became significant when they weren’t previously. As a result, I decided to enter all predictors (both attention and demographic) as main effects, together with the interactions between them. Again, many of the attention variables that were significant in the single-predictor models were no longer so.
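To show what I mean by the full model, here is a sketch with just one attention predictor and one demographic. The symbols are my own shorthand (A for centered average attention, G for a 0/1 gender dummy), not SPSS output.

```latex
\[
\operatorname{logit} P(Y_i = 1)
  = \ln\frac{P(Y_i = 1)}{1 - P(Y_i = 1)}
  = \beta_0 + \beta_1 A_i + \beta_2 G_i + \beta_3 \,(A_i \times G_i)
\]
```

As I understand it, once the interaction term is included, \beta_1 is the attention slope for the reference group (G = 0) rather than an overall attention effect, and \beta_1 + \beta_3 is the slope for the other group. That may be part of why the single-predictor results do not carry over directly.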

This leads me to doubt my practices (or at the very least, seek confirmation). In terms of the assumptions of the test, I believe I meet all of them: the DV is binary, observations are independent, the sample size is large, multicollinearity has been addressed (see next sentence), and the continuous IVs are linearly related to the log odds. So far, I have tried centering all non-categorical predictors, as well as running attention-only models (i.e. without demographics and interactions). However, the demographics may add unique insights to the data, so I’d like to retain them in a model. If their inclusion overrides any small effect attention may have exerted, then so be it.
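For clarity, this is what I mean by centering, written as a small Python sketch rather than my SPSS syntax. The variable names and values are made up for illustration.

```python
from statistics import mean

def center(values):
    """Mean-center a numeric predictor: subtract the sample mean from each value."""
    m = mean(values)
    return [v - m for v in values]

# Made-up data: average attention per viewer and a 0/1 gender dummy.
avg_attention = [20.0, 55.0, 80.0, 95.0]
gender = [0, 1, 0, 1]

attention_c = center(avg_attention)

# The interaction term is built from the centered predictor.
interaction = [a * g for a, g in zip(attention_c, gender)]
print(attention_c, interaction)
```

My understanding is that centering does not change the overall fit of the model. It mainly reduces the correlation between a predictor and the interaction terms built from it, and it changes what the lower-order coefficients mean (the effect at the mean of the other variable rather than at zero).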

Is there anything else I have missed, overlooked, or could benefit from knowing?

Looking forward to your responses.

Thank you.
