Suppose a binary classification dataset has: a) 10 features, b) 100 instances (rows).
As we know, a model needs inputs of the same dimensionality for training and testing.
Suppose we use PCA to reduce the dataset from 10 dimensions to 3, so the dataset now has 3 features for training and testing.
After getting a good result with PCA = 3 features, we want to deploy the project in real life, but in real life an instance could arrive with more than 3 features, even 10, to predict the class.
- After training a model on PCA = 3 dimensional data, how can we test the same model on instances with 10 or more dimensions?
- Or, how can we use PCA in a real-life project?
- Suppose that, in a clinic, a single patient wants to know whether he has lung cancer by providing 10 features (integer, float).
If this is not possible, why is PCA used in research, especially on Excel data?
In Python, after fitting PCA with 3 components on the training data (n×10), call PCA's transform function to project the test data (m×10) onto the same 3 features.
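A minimal sketch of that fit-then-transform step, assuming scikit-learn's `PCA` is the implementation meant here. The random arrays and the names `X_train` and `X_test` are hypothetical stand-ins for real data:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 100 training rows and 20 test rows, 10 features each.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 10))
X_test = rng.normal(size=(20, 10))

# Learn the 3 principal directions from the training data only.
pca = PCA(n_components=3)
X_train_3 = pca.fit_transform(X_train)

# Reuse the same fitted projection on the test data (no refitting).
X_test_3 = pca.transform(X_test)

print(X_train_3.shape, X_test_3.shape)  # (100, 3) (20, 3)
```

The test rows still arrive with 10 features; the fitted PCA object is what maps them down to 3.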
@Chris Tsai I wanted to ask about just one thing: how could we apply this to only 1 patient, who has (1×10) features, not (m×10) features? Just imagine a patient comes to learn his lung cancer probability from your system: do we consider m or just 1?
N.B. I am not asking about training or testing the model. Imagine that is done perfectly and you want to deploy the trained model in a real-life hospital.
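For the single-patient case, a hedged sketch assuming scikit-learn: a single instance can be treated as the m = 1 case by reshaping it to (1×10) before passing it to the fitted objects. The data, the `patient` vector, and the choice of `LogisticRegression` are hypothetical. Bundling PCA and the classifier in one pipeline is one possible design, not the only one.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training set: 100 patients, 10 features, binary labels.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 10))
y_train = np.array([0, 1] * 50)

# The pipeline stores the fitted PCA together with the classifier,
# so deployment always receives raw 10-feature inputs.
model = make_pipeline(PCA(n_components=3), LogisticRegression())
model.fit(X_train, y_train)

# One new patient: a vector of 10 features, reshaped to 1 row x 10 columns.
patient = rng.normal(size=10)
proba = model.predict_proba(patient.reshape(1, -1))

print(proba.shape)  # (1, 2): one row, one probability per class
```

In deployment, the fitted pipeline would be saved once and reused for each incoming patient, so PCA is never refitted on a single instance.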