For two-class classification problems, any type of neural network (RBF or otherwise) with a sigmoid output (between 0 and 1) is a function g(x, w), where x is the vector of features and w the vector of parameters (weights). If the two types of classification errors are equally costly, the decision threshold is taken equal to 0.5, hence the equation of the boundary is g(x, w) = 0.5. If the output is a tanh (between -1 and +1), and the two types of classification errors are equally costly, the decision threshold is taken equal to zero, hence the equation of the boundary is g(x, w) = 0. If the two types of errors are not equally costly, then you must use a threshold value t, different from 0 or 0.5, hence the equation of the boundary is g(x, w) = t.
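A minimal sketch of the thresholding rule described above (the network itself is not trained here; `sigmoid`, `classify`, and the sample pre-activation values are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def classify(output, threshold):
    """Assign class 1 if the network output exceeds the threshold, else class 0."""
    return 1 if output > threshold else 0

# Sigmoid output in (0, 1): equal error costs -> threshold 0.5
print(classify(sigmoid(0.3), 0.5))   # 1, since sigmoid(0.3) ≈ 0.574 > 0.5

# Tanh output in (-1, +1): equal error costs -> threshold 0
print(classify(np.tanh(-0.2), 0.0))  # 0, since tanh(-0.2) < 0
```

With unequal error costs, only the `threshold` argument changes; the trained function g(x, w) stays the same.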
Gérard Dreyfus, thanks. My understanding is that the Perceptron in dual space has no w. The dual-space weights alpha and the kernel do not offer a means to evaluate the prediction at a new point x, as they do not depend on it in any way. In what way, then, does one plug the values into the logistic function?
Asher Klatchko Sorry, in my previous answer, I got confused by your terminology "rbf Perceptron". Actually, it seems that your question can be reformulated as: "How to infer class boundaries in the primal feature space from the solution found in the dual space, for an SVM with Gaussian (or RBF) kernel". This is a notoriously difficult, ill-posed problem, known as the "pre-image problem": how to find the inverse of a mapping defined by a kernel? The following reference might be useful for you: P. Honeine and C. Richard, "Preimage Problem in Kernel-Based Machine Learning", IEEE Signal Processing Magazine 28(2): 77 - 88, April 2011.
Gérard Dreyfus, only that it isn't an SVM classifier but a nonlinear Perceptron. In principle the question could be generalized to: given a set of alpha parameters in the dual space, inferred from a training set {x, y}, how does one make predictions for a test point x′ to obtain the sought y′?
The prediction I have in mind is the sum y(x) = Σ_i α_i y_i K(x, x_i). Here, x_i, y_i are the training-set features and their classification values; x is a new test point to be classified, and K(x, x_i) is the kernel evaluated between x and each training point x_i. This seems to generate pretty good boundaries for the two-spirals data set: http://www.ccs.neu.edu/home/vip/teach/MLcourse/data/TwoSpirals/twoSpirals.txt
Please comment: it appears to me that plugging an affinity Gaussian kernel (RBF) into the sum above is akin to a weighted clustering procedure, where one clusters a test point with an already-classified point from the training set, with the alpha coefficients as weights. The logistic function is then the means by which one determines the probability of the resulting prediction.
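A minimal sketch of this dual-space prediction rule, assuming a Gaussian (RBF) kernel with width `gamma`, labels y_i in {-1, +1}, and dual coefficients `alpha` already obtained from training (the values below are made up for illustration, not trained):

```python
import numpy as np

def rbf_kernel(x, x_i, gamma=1.0):
    """Gaussian (RBF) affinity between a test point x and a training point x_i."""
    return np.exp(-gamma * np.sum((x - x_i) ** 2))

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(x, X_train, y_train, alpha, gamma=1.0):
    """Weighted vote of training labels y_i, with weights alpha_i * K(x, x_i),
    squashed by the logistic function into a probability for the +1 class."""
    s = sum(a * y_i * rbf_kernel(x, x_i, gamma)
            for a, x_i, y_i in zip(alpha, X_train, y_train))
    return logistic(s)

# Toy example with hypothetical dual coefficients
X_train = np.array([[0.0, 0.0], [1.0, 1.0]])
y_train = np.array([-1, +1])
alpha = np.array([1.0, 1.0])

x_new = np.array([0.9, 0.9])   # close to the +1 training point
p = predict_proba(x_new, X_train, y_train, alpha, gamma=2.0)
print(p > 0.5)                 # True: the test point "clusters" with the nearby +1 point
```

This matches the weighted-clustering reading: the kernel weights each training label by its affinity to x, alpha scales each point's influence, and the logistic turns the signed sum into a probability (0.5 at the boundary, where the sum is zero).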