More specifically, is there a clear correlation between the distribution of a feature's values across the training samples and the weight a linear SVM assigns to that feature?

Say I have a feature f1 that appears in both the positive and negative classes of the training samples and, for simplicity, let's assume its values are binary. If f1=1 appears more frequently among the positive training instances than among the negative ones, would a linear SVM learn a positive weight for f1?

And does dimensionality influence this correlation? (I suspect higher dimensionality makes the relationship harder to see, since there could be redundant features.)
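
To make the question concrete, here is a minimal sketch of the kind of check I have in mind. Everything in it is an assumption for illustration: the toy dataset (f1=1 in 8 of 10 positives and 2 of 10 negatives), the hypothetical helper `train_linear_svm`, and its hyperparameters. It trains a soft-margin linear SVM by plain full-batch subgradient descent on the L2-regularized hinge loss, not with any particular library's solver, once with f1 alone and once with an exact duplicate of f1 added as a redundant second feature.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=2000):
    """Hypothetical helper: full-batch subgradient descent on
    lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge subgradient
                for j in range(d):
                    grad_w[j] -= yi * xi[j]
                grad_b -= yi
        # Regularize the weights only; the bias is left unregularized.
        w = [wj - lr * (lam * wj + gj / n) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data (assumed): f1=1 in 8/10 positives and 2/10 negatives.
f1 = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
y = [1] * 10 + [-1] * 10

X_single = [[v] for v in f1]   # f1 alone
X_dup = [[v, v] for v in f1]   # f1 plus an exact copy (a redundant feature)

w_single, b_single = train_linear_svm(X_single, y)
w_dup, b_dup = train_linear_svm(X_dup, y)

print("f1 alone:      w =", w_single, "b =", b_single)
print("f1 duplicated: w =", w_dup, "b =", b_dup)
```

In this toy setup the weight on f1 should come out positive, and with the duplicated column the two copies should share the weight equally, so each one is smaller than in the single-feature run. That only covers perfectly redundant features, though; it does not tell me what happens with features that are merely correlated, which is the case I am really asking about.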

Many thanks in advance!
