I am doing an exploratory analysis using 40 biomarkers to predict treatment success or failure after 8 weeks. I have measurements of all 40 markers at baseline and again 8 weeks later. What is the usual process for this kind of exploration? I imagine using logistic regression to see whether baseline measures predict treatment success, then looking at the week 8 measures, then doing a paired t-test to look at the change between baseline and week 8. Other thoughts:
- If the biomarker results are not normally distributed and I log-transform them for the baseline analysis, should I do the paired t-test on the original or the transformed values?
- A colleague suggested using the Kolmogorov-Smirnov distance test. What is the advantage of that?
- Is it appropriate to throw all the biomarkers into one regression model? What if they are highly correlated? What if they are on the same causal pathway?
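To make the first two bullets concrete, here is a minimal sketch on made-up numbers for one hypothetical marker in six patients (none of this is real data). It computes the paired t statistic on the raw and on the log-transformed values, and a two-sample Kolmogorov-Smirnov distance between baseline values of responders and non-responders. That use of the KS test is an assumption about what the colleague meant; it could equally have been intended as a normality check. The function names and the `success` labels are illustrative only.

```python
import bisect
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for a paired t-test on after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


def ks_distance(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a + b)):
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up values for ONE hypothetical marker (assumption, not real data).
baseline = [12.0, 15.5, 9.8, 30.2, 22.1, 11.4]
week8 = [10.1, 14.0, 9.9, 21.7, 18.3, 10.2]
success = [1, 0, 1, 0, 0, 1]  # hypothetical treatment outcome per patient

# Paired t statistic on the original scale (absolute change).
t_raw, df = paired_t_statistic(baseline, week8)

# Paired t statistic on the log scale (relative / fold change).
t_log, _ = paired_t_statistic(
    [math.log(v) for v in baseline], [math.log(v) for v in week8]
)

# KS distance between baseline values of responders and non-responders.
responders = [v for v, s in zip(baseline, success) if s == 1]
non_responders = [v for v, s in zip(baseline, success) if s == 0]
d_ks = ks_distance(responders, non_responders)

print(f"t (raw) = {t_raw:.2f}, t (log) = {t_log:.2f}, df = {df}")
print(f"KS distance = {d_ks:.2f}")
```

The two t statistics answer slightly different questions: the raw-scale version tests the mean absolute change, while the log-scale version tests the mean log ratio, i.e. a relative change. With 40 markers, any such per-marker tests would also need some multiple-comparison handling.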
Thanks for your help!!