I want to understand the difference between PCA (principal component analysis) and NMF (non-negative matrix factorization) in terms of explaining variability.
When we apply PCA to high-dimensional data (say 100 dimensions), the first PC captures the direction of greatest variance in the data, the second PC captures the direction of second-greatest variance (orthogonal to the first), and so on. So if we keep the first 10 or 20 PCs in the final analysis, they capture as much of the variability as any 10 or 20 linear components can.
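To make the PCA side concrete, here is a minimal sketch of what I mean. It assumes synthetic data (100 features generated from 10 latent directions plus noise) and uses plain NumPy; the sizes and variable names are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 500 samples, 100 features,
# built from 10 latent directions plus a small amount of noise.
n_samples, n_features, n_latent = 500, 100, 10
latent = rng.normal(size=(n_samples, n_latent))
mixing = rng.normal(size=(n_latent, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
singular_values = np.linalg.svd(Xc, compute_uv=False)

# The variance along each PC is proportional to its squared singular value,
# and the singular values come back sorted from largest to smallest.
explained_ratio = singular_values**2 / np.sum(singular_values**2)
cumulative = np.cumsum(explained_ratio)

for k in (1, 5, 10, 20):
    print(f"first {k:2d} PCs explain {cumulative[k - 1]:.3f} of the variance")
```

The cumulative curve is non-decreasing and reaches 1 once all components are kept, which is the property I am trying to find an analogue of for NMF.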
Now, in the case of NMF, when we increase the number of factors, what do the factors actually learn from the data? Do they learn the axes of variability, or do they learn the key components (parts) of the data?
In the case of NMF, when can we say that a given number of factors has captured almost all of the variability in the data?
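One thing I could imagine checking is how the reconstruction error changes with the number of factors. Below is a rough sketch under several assumptions: the data is synthetic and non-negative, the NMF is a bare-bones multiplicative-update version written by hand (the `nmf` helper is hypothetical, not a library function), and "variability captured" is approximated by the relative squared reconstruction error of the uncentered data, which is not the same quantity as PCA's explained variance.

```python
import numpy as np

def nmf(X, k, n_iter=300, seed=0, eps=1e-10):
    """Hypothetical helper: multiplicative-update NMF for the Frobenius loss.

    Factorizes a non-negative X (n x m) as W (n x k) times H (k x m),
    keeping W and H non-negative throughout.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates preserve non-negativity of W and H.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(0)

# Synthetic non-negative data (an assumption): 5 underlying parts plus noise.
true_W = rng.random((200, 5))
true_H = rng.random((5, 40))
X = true_W @ true_H + 0.01 * rng.random((200, 40))

total = np.sum(X**2)
errors = {}
for k in (1, 2, 5, 10):
    W, H = nmf(X, k)
    # Fraction of the squared norm of X that the factorization does NOT capture.
    errors[k] = np.sum((X - W @ H) ** 2) / total
    print(f"k={k:2d}  relative reconstruction error = {errors[k]:.5f}")
```

Unlike PCA, the factors here are not ordered and not nested: the factors found for one value of k are generally not a subset of those found for a larger k, so I am not sure whether a flattening error curve like this is the right criterion.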
Thanks.