I want to use KPCA (kernel PCA) on a large data set for dimensionality reduction. As far as I know, a large data set leads to a large kernel matrix K, and storing K may become a problem. One way to deal with this is to perform clustering on the data set and populate the kernel with the means of those clusters.

So, how can I calculate the kernel by performing clustering on the data set? How do I populate the kernel with the means of those clusters?
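Here is a rough sketch of what I think this might look like, assuming scikit-learn's KMeans and KernelPCA (the shapes and parameter values are just for illustration); I am not sure this is the right way to do it:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import KernelPCA

# Toy stand-in for the large data set (shapes are made up for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(20_000, 30))

# Step 1: cluster the data and keep only the m cluster means (m << n),
# so the kernel matrix built later is m x m instead of n x n.
m = 300
centers = KMeans(n_clusters=m, n_init=10, random_state=0).fit(X).cluster_centers_

# Step 2: fit kernel PCA on the cluster means only -> only an m x m kernel is stored.
kpca = KernelPCA(n_components=10, kernel="rbf", gamma=1.0 / X.shape[1])
kpca.fit(centers)

# Step 3: project the full data set; this only needs kernel values
# between the points of X and the m cluster means.
X_reduced = kpca.transform(X)
print(X_reduced.shape)  # (20000, 10)
```

Is this the correct way to "populate the kernel with the cluster means", or does the kernel have to be built differently?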
