I am doing an exploratory analysis using 40 biomarkers to predict treatment success or failure after 8 weeks. I have measurements of all 40 markers at baseline and again at week 8. What is the usual process for this kind of exploration? My plan is to use logistic regression to see whether the baseline measures predict treatment success, then repeat that with the week 8 measures, and finally use a paired t-test to look at the change between baseline and week 8. Some further questions:

- If the biomarker results are not normally distributed and I log transform them for the baseline analysis, should I do the paired t-test on the original or transformed values?

- A colleague suggested using the Kolmogorov-Smirnov test instead. What is the advantage of that?

- Is it appropriate to put all 40 biomarkers into one regression model? What if they are highly correlated with each other? What if they are on the same causal pathway?
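
To make the question about original versus log-transformed values concrete, here is a minimal sketch of the paired t-statistic computed both ways for a single marker. The numbers are made up for illustration and `paired_t_statistic` is a hypothetical helper, not part of any library. On the raw scale the test looks at the mean absolute change; on the log scale it looks at the mean log ratio, i.e. the relative change.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, df) for a paired t-test on the differences after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)  # sample SD, n - 1 denominator
    return mean_diff / std_err, n - 1


# Made-up values for one hypothetical marker in six patients.
baseline = [12.0, 30.5, 8.2, 55.0, 19.4, 101.0]
week8 = [9.1, 22.0, 7.9, 31.5, 15.0, 60.2]

# Raw scale: tests whether the mean absolute change is zero.
t_raw, df = paired_t_statistic(baseline, week8)

# Log scale: tests whether the mean log ratio is zero (relative change).
log_baseline = [math.log(x) for x in baseline]
log_week8 = [math.log(x) for x in week8]
t_log, _ = paired_t_statistic(log_baseline, log_week8)

# Back-transforming the mean log difference gives the geometric mean ratio.
log_ratios = [math.log(a / b) for b, a in zip(baseline, week8)]
geo_mean_ratio = math.exp(statistics.mean(log_ratios))

print(f"raw scale: t = {t_raw:.3f}, df = {df}")
print(f"log scale: t = {t_log:.3f}, df = {df}")
print(f"geometric mean ratio (week 8 / baseline) = {geo_mean_ratio:.3f}")
```

The sketch stops at the t-statistic and degrees of freedom; in practice a routine such as `scipy.stats.ttest_rel` would also return the p-value.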

Thanks for your help!!
