Better formatted question here (https://stats.stackexchange.com/questions/626845/variances-explained-by-each-feature-on-pc-in-pca)
I came across this article (https://medium.com/@gaurav_bio/creating-visualizations-to-better-understand-your-data-and-models-part-1-a51e7e5af9c0), which is associated with this Python codebase (https://github.com/gaurav-kaushik/Data-Visualizations-Medium/blob/master/code/Interactive_PCA_and_Feature_Correlation.ipynb).
In brief, there is a section, "Understanding How Features Contribute to PCs", where the author writes:
"One method for understanding which features are ‘important’ is to examine how each feature contributes to each principal component. To do this, we can take the dot product of our original data and our principal components.**Assuming our data is rescaled, the relative magnitudes of its dot product with the principal components will indicate the co-linearity or correlation of individual features and PCs.** In other words, if a feature is nearly co-linear with a PC, the magnitude of the dot product will be relatively large."
Found at line 156 of pca_feature_correlation.py (https://github.com/gaurav-kaushik/Data-Visualizations-Medium/blob/master/code/pca_feature_correlation.py)
Surely the resulting PC-feature matrix is not a correlation matrix? Perhaps a covariance matrix? Or am I missing something?
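To spell out my doubt: if $X$ is the column-standardised $n \times p$ data matrix and $S = XW$ are the PC scores, then, as far as I can tell,

$$\frac{1}{n-1}\, X^\top S = \operatorname{cov}(X, S),$$

so the raw dot product is (up to the factor $n-1$) a feature-PC *covariance* matrix. It would only become a correlation matrix after dividing column $k$ by $\sigma_{\mathrm{PC}_k}$, since the standardised features already have unit variance.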
Anyway, they then go on:
"I decided to re-normalize the heatmap with the explained variance of each PC. I essentially took the **dot product** of our new matrix (the one above, but scaled/z-normalised) and the explained variance as a vector. This would immediately reveal not only how each feature correlates with each PC, but how they contribute to the variance in the dataset."
Found at line 192 of pca_feature_correlation.py (https://github.com/gaurav-kaushik/Data-Visualizations-Medium/blob/master/code/pca_feature_correlation.py)
I think there is a typo there: a dot product of the PC-feature matrix with the explained-variance vector would collapse the matrix down to a single column, so what they actually compute is an ordinary element-wise multiplication (each PC column scaled by its explained variance). They also apparently take the absolute value of the z-normalised PC-feature matrix before multiplying by the variances.
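To illustrate the distinction in R (a toy sketch; `m` and `varExp` are stand-ins for the z-normalised PC-feature matrix and the explained-variance vector):

m <- matrix(1:6, nrow = 2) #Pretend 2 features x 3 PCs
varExp <- c(0.6, 0.3, 0.1) #Pretend explained-variance proportions
m %*% varExp #Dot product: collapses to a 2x1 vector, so no heatmap is possible
sweep(m, 2, varExp, "*") #Column-wise multiplication: keeps the 2x3 shape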
But anyway... implementing this in R:
library(tidyverse)

#Get Data (NB: the original dataset reference was lost in formatting;
#iris is used here as a stand-in, any all-numeric data frame will do)
data <- iris %>%
  select(where(is.numeric))

#PCA
pcaFit <- data %>%
  prcomp(scale = TRUE)

#Transpose of original (scaled) dataset
tData <- data %>%
  scale() %>%
  as.matrix() %>%
  t()

#Get PC data (assuming the scores matrix pcaFit$x, which makes the dimensions conform)
PCA <- pcaFit$x %>%
  as.matrix()

#Dot product of transposed dataset with PCA data
corrMatrix <- (tData %*% PCA) %>%
  scale() %>% #And then scaling
  abs() #Remove negative numbers

#Multiply by variance (the right-hand side of this line was also lost;
#a column-wise multiplication by the explained-variance proportions matches the text)
varExplained <- pcaFit$sdev^2 / sum(pcaFit$sdev^2)
frankensteinMatrix <- sweep(corrMatrix, 2, varExplained, "*")
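And for a quick look at the result (a minimal sketch with base R's heatmap; frankensteinMatrix has features as rows and PCs as columns):

#Visual check, without row scaling or dendrogram reordering
heatmap(frankensteinMatrix, Rowv = NA, Colv = NA, scale = "none")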