For the question in the title, I guess the answer is too long to fit here: Support Vector Machines, kernel density estimation, kernel PCA, kernel clustering, kernel ridge regression, kernel adaptive filters, kernel hypothesis testing... I could probably go on for hours. I would recommend the paper "Kernel Methods in Machine Learning" by Hofmann et al. for a wonderful survey.
Much kernel usage, as far as I'm aware, is related to classification. By using the kernel trick to implicitly map the input data into a higher-dimensional "feature space" (yeah, inflating the dimensionality is rather unintuitive), differences between groups can be magnified. In the reproducing kernel Hilbert space, it can thus be easier to separate overlapping groups.
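A minimal sketch of that idea, using only numpy: the degree-2 polynomial kernel k(x, y) = (x·y)² equals an inner product under an explicit feature map φ, and in that feature space two concentric "rings" (not linearly separable in 2-D) become separable by a plane. The specific data points and threshold are illustrative, not from any particular paper.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for 2-D inputs: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly_kernel(x, y):
    """Degree-2 polynomial kernel: evaluates <phi(x), phi(y)> without forming phi."""
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

# The "trick": the kernel computes the feature-space inner product directly.
assert np.isclose(np.dot(phi(x), phi(y)), poly_kernel(x, y))

# Two concentric rings are not linearly separable in 2-D, but in feature space
# the first and third coordinates of phi sum to ||x||^2, so a plane separates them.
inner = np.array([[0.5, 0.0], [0.0, -0.5]])   # points at radius 0.5
outer = np.array([[2.0, 0.0], [0.0, 2.0]])    # points at radius 2.0
radius_sq = lambda p: phi(p)[0] + phi(p)[2]
assert all(radius_sq(p) < 1.0 for p in inner)
assert all(radius_sq(p) > 1.0 for p in outer)
```

This is why algorithms written purely in terms of inner products (SVMs, kernel PCA, etc.) can work in the feature space without ever computing φ explicitly.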
The problem of classifying an observation into one of several different categories, or patterns, is considered. The observation consists of a sample function of a continuous-time parameter stochastic process observed over a finite-time interval. When only two categories are involved, the general pattern recognition problem reduces to the signal detection problem. The methods used are based upon results from the theory of reproducing kernel Hilbert spaces. This theory has been developed within the last few years, and the application of these results to problems of statistical inference for stochastic processes has taken place only recently. Therefore, a reasonably self-contained exposition of the results required from the theory of reproducing kernel Hilbert spaces is presented. It is pointed out that the decision rule employed by the optimum pattern recognition system is based on the likelihood ratio. This quantity exists if, and only if, the probability measures are equivalent, i.e., mutually absolutely continuous with respect to each other. In the present work only Gaussian processes are considered, in which case it is well known that the probability measures can only be either equivalent or perpendicular, i.e., mutually singular. It is shown that the reproducing kernel Hilbert space provides a natural tool for investigating the equivalence of Gaussian measures. In addition, this approach provides a convenient means for actually evaluating the likelihood ratio. The results are applied to two pattern recognition problems. The first problem involves processes which have the same covariance function but different mean-value functions, and the second problem concerns processes with different covariance functions and zero mean-value functions.
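To make the likelihood-ratio decision rule concrete, here is a hedged finite-dimensional sketch (not the paper's RKHS construction): for discrete samples of a Gaussian process with known covariance matrix K, deciding between mean m (signal present) and mean 0 (signal absent) reduces to the statistic mᵀK⁻¹x − ½ mᵀK⁻¹m, which is positive on average under the signal hypothesis and negative under noise alone. The signal shape and noise level below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
m = np.sin(np.linspace(0, np.pi, n))   # illustrative mean (signal) function
K = 0.5 * np.eye(n)                    # covariance; white noise for simplicity
Kinv_m = np.linalg.solve(K, m)         # K^{-1} m, precomputed

def log_likelihood_ratio(x):
    """Log-likelihood ratio for mean m vs. mean 0 under Gaussian noise N(0, K)."""
    return Kinv_m @ x - 0.5 * m @ Kinv_m

# Monte Carlo check: the statistic separates the two hypotheses on average.
trials = 200
noise = rng.normal(0.0, np.sqrt(0.5), size=(trials, n))
h1_mean = np.mean([log_likelihood_ratio(m + e) for e in noise])  # signal present
h0_mean = np.mean([log_likelihood_ratio(e) for e in noise])      # signal absent
assert h1_mean > 0 > h0_mean
```

This corresponds to the abstract's first problem (same covariance, different means); the RKHS machinery in the paper is what makes the analogous construction rigorous in continuous time, where no density with respect to Lebesgue measure exists.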