In feature fusion, should we extract features from the same (seen) data used to train the base models, or from unseen data? Since these features are later used to train a final classifier, I’m concerned that using seen data may introduce bias, overfitting, or even data leakage. What is the best practice to ensure generalization and fairness in this scenario?
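
To make the "unseen data" option concrete, here is a minimal sketch of out-of-fold feature extraction: each sample's features come from a base model that was fitted without that sample. It assumes a toy base model (per-class means on a single input value) standing in for the real base models. The names `fit_base_model`, `extract_features`, and `out_of_fold_features` are hypothetical and only for illustration; this is not a definitive implementation of any particular fusion pipeline.

```python
from statistics import mean

def fit_base_model(xs, ys):
    """Toy stand-in for a base model: the mean of x for each class label."""
    return {label: mean(x for x, y in zip(xs, ys) if y == label)
            for label in set(ys)}

def extract_features(model, x):
    """Toy feature vector: distance from x to each class mean, in label order."""
    return [abs(x - model[label]) for label in sorted(model)]

def out_of_fold_features(xs, ys, n_folds=3):
    """Return one feature vector per sample, from a model that never saw it."""
    n = len(xs)
    features = [None] * n
    for fold in range(n_folds):
        # Samples in this fold are held out; the base model is fitted on the rest.
        held_out = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        model = fit_base_model([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            features[i] = extract_features(model, xs[i])
    return features

# Illustrative data only (assumed, not taken from any real experiment).
xs = [0.0, 0.2, 0.1, 1.0, 1.2, 1.1]
ys = [0, 0, 0, 1, 1, 1]
oof = out_of_fold_features(xs, ys, n_folds=3)
```

The final classifier would then be trained on `oof` and `ys`, while a separate test set that neither the base models nor the final classifier has seen is kept for the final evaluation.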
