09 September 2017

Hi all,

I am working on a C-SAT (customer satisfaction) model. The outcome variable is binary (0 or 1), i.e. either Dis-Sat or C-Sat.

I have more than 100 predictor variables from the business.

What are the best ways to exclude some of these variables? I have a process in mind, which I am listing below, but I would like suggestions from the larger audience:

1) Dropping variables that are multicollinear

2) Using Pearson correlation or ANOVA, depending on the data type

3) Dropping variables with a high share of missing values

4) It would be too early to use regression-based selection methods such as forward, backward or stepwise selection

5) Can we also look at PCA?
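
To make steps 1 to 3 concrete, here is roughly what I have in mind in Python with pandas. This is only a minimal sketch: the function names, the thresholds (0.4 for missing share, 0.9 for correlation) and the synthetic data are my own assumptions for illustration, not the real business data, and it only handles numeric predictors.

```python
import numpy as np
import pandas as pd


def drop_high_missing(df, max_missing=0.4):
    """Keep columns whose share of missing values is at most max_missing."""
    missing_share = df.isna().mean()
    return df.loc[:, missing_share <= max_missing]


def drop_collinear(df, threshold=0.9):
    """Greedily keep a column only if its |Pearson r| with every
    already-kept column is at or below threshold."""
    corr = df.corr().abs()
    kept = []
    for col in corr.columns:
        if all(corr.loc[col, k] <= threshold for k in kept):
            kept.append(col)
    return df[kept]


def rank_by_target_correlation(df, y):
    """Rank numeric predictors by |Pearson r| with the 0/1 outcome
    (for a binary outcome this is the point-biserial correlation)."""
    return df.corrwith(y).abs().sort_values(ascending=False)


# Synthetic demonstration data (made up, assumption for illustration only).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
X = pd.DataFrame({
    "x1": x1,
    "x1_copy": x1 + rng.normal(scale=0.01, size=n),        # near duplicate of x1
    "noise": rng.normal(size=n),                           # unrelated to outcome
    "sparse": np.where(rng.random(n) < 0.8, np.nan, 1.0),  # about 80% missing
})
y = pd.Series((x1 + rng.normal(scale=0.5, size=n) > 0).astype(int))

X_reduced = drop_collinear(drop_high_missing(X))
ranking = rank_by_target_correlation(X_reduced, y)
print(list(X_reduced.columns))
print(ranking)
```

The order matters here: the missing-value filter runs first so that mostly empty columns do not distort the correlation matrix. The collinearity filter is greedy, so which member of a correlated pair survives depends on column order.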

All of the steps listed above would take a lot of time. Could you please advise on any other approach that helps to put only significant predictors in the model?

Thank you. Shivi
