Hello,
I am reaching out to this research community for perspective and advice on a statistical task I am working on in R. The objective is simple and straightforward, but I am taken aback by how complex the process may actually be.
The objective is as follows: I have three boxes in a single room. One is blue, one is green, and one is red. In each box is a temperature reader that records a single reading per day over a 30-day period, so for each box I have 30 daily temperature readings.
To my understanding, this would be time-series data. Now I want to address the question: Is there a significant difference in temperature between the three boxes? Phrased differently: Does the choice of color dictate the overall temperature in the box?
Of course I can just take the mean temperature over the 30-day period for each box and compare those, but this doesn't seem complete. Since I am working with a categorical variable (color of box) and a continuous variable (temperature), I was thinking I need to perform an ANOVA. I would then run the Tukey HSD post-hoc test for the pairwise comparisons, so that in addition to comparing all the boxes together, I would compare blue to red, blue to green, and green to red. This, however, would only look at the effect of color on temperature and would ignore the time-series component entirely. How can I add the time component to this?
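To make the naive approach concrete, here is a minimal sketch of what I have in mind. The data are simulated stand-ins, not my real readings, and the names `temps`, `room`, `fit`, and `tukey` are just placeholders I made up for illustration.

```r
# Simulated stand-in data: these numbers are invented, not real readings.
set.seed(42)
room <- 20 + cumsum(rnorm(30, sd = 0.3))  # shared day-to-day drift of the room

temps <- data.frame(
  day   = rep(1:30, times = 3),
  color = factor(rep(c("blue", "green", "red"), each = 30)),
  temp  = c(room, room + 0.3, room + 0.8) + rnorm(90, sd = 0.2)
)

# One-way ANOVA: does mean temperature differ by box color?
# Note that this treats all 90 readings as independent and ignores the day.
fit <- aov(temp ~ color, data = temps)
print(summary(fit))

# Tukey HSD for the three pairwise comparisons between colors
tukey <- TukeyHSD(fit, "color")
print(tukey)
```

This runs, but it uses the day column nowhere, which is exactly the part I am unsure about.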
I know R has the time-series function ts(), so would it work to make the data set for each box a time-series object, and then run the ANOVA and Tukey HSD post-hoc test on these time-series objects? How should I best proceed with my objective? I know there are factors such as seasonality and autocorrelation here, but I am not sure how to incorporate these considerations. Is there a simple way to do all of this? Could you perhaps provide some R code examples?
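One idea I am considering, and I may well be wrong about it, is to treat day as a blocking factor, since all three boxes sit in the same room and are read on the same days. The sketch below is only my guess at how that might look, again on simulated data with made-up names (`fit_block`, `tukey_block`, `res`); it then checks the residuals of one box for leftover autocorrelation with acf().

```r
# Simulated stand-in data: invented numbers, not real readings.
set.seed(42)
room <- 20 + cumsum(rnorm(30, sd = 0.3))  # shared day-to-day drift of the room

temps <- data.frame(
  day   = factor(rep(1:30, times = 3)),   # day as a factor, used as a block
  color = factor(rep(c("blue", "green", "red"), each = 30)),
  temp  = c(room, room + 0.3, room + 0.8) + rnorm(90, sd = 0.2)
)

# Two-way additive ANOVA: the day term absorbs fluctuation common to all
# three boxes on a given day before the colors are compared.
fit_block <- aov(temp ~ color + day, data = temps)
print(summary(fit_block))

# Pairwise color comparisons, adjusted for day
tukey_block <- TukeyHSD(fit_block, "color")
print(tukey_block)

# Residuals per box, in day order, to look for remaining autocorrelation
res <- split(residuals(fit_block), temps$color)
acf_blue <- acf(res$blue, plot = FALSE)
print(acf_blue)
```

If the residual autocorrelation still looks large, I assume this model would not be enough and something that models the correlation explicitly would be needed, but I do not know what the standard choice is.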
Thank you so much!