Yes, you can. A support vector machine is not a probabilistic model; i.e., it does not postulate a probability distribution and thus does not assume any randomness. It merely tries to draw a simple line (or a plane or hyperplane in higher dimensions) that separates the data points into two parts. That's all. Note that the dataset contains labeled data.
One difficulty is that the classifier (the separating 'line' or 'hyperplane') often cannot be defined linearly, meaning no straight line or plane actually separates the two sets; what is needed is a wavy curve or surface. So what do we do? We lift the feature space into a higher, possibly infinite-dimensional space in which a linear classifier becomes possible. This is called the kernel trick, and it is what the support vector machine does.
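As a rough illustration of the kernel trick (a minimal sketch using scikit-learn, my own choice of library; the dataset and parameters are just placeholders):

```python
# Minimal sketch: a linear SVM fails on data that is not linearly separable,
# while an RBF kernel implicitly lifts the features into a higher-dimensional
# space where a separating hyperplane does exist.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: no straight line can separate the classes.
X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf").fit(X_train, y_train)  # kernel trick

print("linear kernel accuracy:", linear_svm.score(X_test, y_test))  # roughly chance level
print("RBF kernel accuracy:   ", rbf_svm.score(X_test, y_test))     # close to 1.0
```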
Now apply this to a regression problem. Linear regression can be described as an attempt to draw a line (or, similarly, a plane or hyperplane in higher dimensions) that minimizes the error (the loss function). Therefore, if we choose a different loss function, the regression line (or plane, or hyperplane) changes; support vector regression corresponds to choosing the ε-insensitive loss, which ignores errors smaller than a margin ε. When the feature space seemingly isn't best served by a simple line or plane but rather calls for something wavy, as in the classification problem, then instead of approximating the wavy object directly we again use the kernel trick to lift the feature space into a higher dimension.
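Again only as a sketch (scikit-learn, with placeholder data), this is what SVR with the ε-insensitive loss and an RBF kernel looks like on a "wavy" target that a plain linear regression cannot capture:

```python
# Minimal sketch: fit a noisy sine wave with linear regression vs. RBF-kernel SVR.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 6, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)  # noisy sine wave

linear = LinearRegression().fit(X, y)
svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)   # kernel trick + eps-insensitive loss

print("linear regression R^2:", linear.score(X, y))  # poor fit to the wave
print("RBF-kernel SVR R^2:   ", svr.score(X, y))     # much closer to 1.0
```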
This is the best and simplest explanation I can come up with :). To understand somewhat loose terms like "kernel trick" or "lifting the feature space" with any rigor, we need more math, especially functional analysis. If you want more detail, I suggest the excellent book The Elements of Statistical Learning.
Originally, SVM was designed for binary classification with numeric multivariate data. I did not know whether SVM could be used for regression models, but Prof. Turki Haj Mohamad mentioned SVR. It is nice.
A lot has been said about the theoretical properties of SVM and SVR in the previous answers. What I can add are a few words on the practical application of SVR. In my experience this method has proven very effective for datasets with continuous variables and high dimensionality. In particular, I have seen good predictions on multi-spectral data with multidimensional Gaussian distributions, and my team has obtained good accuracies on electronic-signal data. In those scenarios I have seen SVR outperform traditional regression methods.
For a recent example, please check the article "Assessing Wheat Traits by Spectral Reflectance: Do We Really..."
Yes, it is SVR. But it will be a "black-box" model, and you really cannot analyse the influence of individual parameters - that is possible only with a linear kernel. For classification tasks I often use SVM, but in my view, for regression it is better to use direct (white-box) regression algorithms - e.g. fitlm in Matlab.
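To make the linear-kernel exception concrete, here is a minimal sketch in scikit-learn (rather than the MATLAB fitlm mentioned above; data and parameters are placeholders): with a linear kernel the fitted SVR exposes one weight per feature, so each input's influence can be read off, while the RBF model has no such per-feature coefficients.

```python
# Minimal sketch: per-feature weights are only available for a linear kernel.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)  # feature 2 is irrelevant

linear_svr = SVR(kernel="linear").fit(X, y)
print("per-feature weights:", linear_svr.coef_)  # approximately [[ 2.0, -0.5, 0.0 ]]

rbf_svr = SVR(kernel="rbf").fit(X, y)
# Accessing rbf_svr.coef_ raises an AttributeError: in this sense the RBF model is a black box.
```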
In general there is no best regression algorithm for all possible regression problems. What we have are theoretical properties that make us expect certain algorithms to perform better in certain scenarios. The experimental approach, on the other hand, is simply to benchmark many regression algorithms and select the one with the best performance. SVR has become popular because there is a decent range of applications in which it has shown outstanding performance; Random Forest and, more recently, Deep Learning have as well.
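The benchmarking approach can be as simple as the following sketch (scikit-learn, my own choice of library; the dataset and candidate models are placeholders): cross-validate several regressors on the same data and keep the best one.

```python
# Minimal sketch: compare a few regressors by cross-validated R^2 on one dataset.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = load_diabetes(return_X_y=True)

candidates = {
    "linear regression": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "SVR (RBF)": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
}

for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name:18s} mean R^2 = {scores.mean():.3f}")
```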
The regression version of the SVM classifier is known as SVR. SVR is an improvement over some learning algorithms; however, in the world of deep learning, using SVR may not be worth your time. If your dataset is a transactional dataset, I would recommend the H2O deep neural network.