In most of my studies I found that epsilon = 0 gives better results than other epsilon values. Since epsilon = 0 means all vectors become slack variables that need to be minimized (SVR), why not use epsilon = 0?
Parameter ε controls the width of the ε-insensitive zone used to fit the training data. The value of ε can affect the number of support vectors used to construct the regression function: the larger ε is, the fewer support vectors are selected. On the other hand, larger ε-values result in flatter estimates.
"The value of epsilon determines the level of accuracy of the approximated function. It relies entirely on the target values in the training set. If epsilon is larger than the range of the target values we cannot expect a good result. If epsilon is zero, we can expect overfitting. Epsilon must therefore be chosen to reflect the data in some way. Choosing epsilon to be a certain accuracy does of course only guarantee that accuracy on the training set; often to achieve a certain accuracy overall, we need to choose a slightly smaller epsilon."
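The effect described above can be checked empirically. The sketch below uses scikit-learn's `SVR` on a synthetic toy dataset (the data and parameter values are illustrative assumptions, not from the discussion) to show how the support-vector count shrinks as epsilon grows:

```python
# Illustrative sketch: larger epsilon -> wider insensitive tube -> fewer
# support vectors. Dataset and hyperparameters are arbitrary choices.
import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.randn(80)

for eps in [0.0, 0.1, 0.5]:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # points outside (or on the boundary of) the tube become support vectors
    print(f"epsilon={eps}: {len(model.support_)} support vectors")
```

With epsilon = 0 essentially every training point lies outside the (zero-width) tube and becomes a support vector, which is why the quoted passage associates that setting with overfitting.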
eps = 0 leads to median estimation, see http://arxiv.org/pdf/1102.2101v1
eps > 0 also leads, in most cases, to median estimation, and usually to fewer support vectors, see http://papers.nips.cc/paper/3466-sparsity-of-svms-that-use-the-epsilon-insensitive-loss
Over- and underfitting are controlled by the regularization parameter "C" (or lambda), as well as by the kernel width if a Gaussian RBF kernel is used.
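In practice this means C, the RBF width (gamma in scikit-learn), and epsilon are usually tuned jointly rather than fixing epsilon = 0 up front. A minimal sketch, assuming scikit-learn and an arbitrary illustrative grid:

```python
# Illustrative sketch: cross-validated grid search over C, gamma and epsilon.
# The synthetic data and grid values are placeholders, not recommendations.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, (120, 1))
y = np.sinc(X).ravel() + 0.1 * rng.randn(120)

search = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid={
        "C": [0.1, 1, 10],          # regularization strength
        "gamma": [0.1, 1, 10],      # RBF kernel width
        "epsilon": [0.0, 0.05, 0.1],
    },
    cv=5,
)
search.fit(X, y)
print(search.best_params_)
```

If epsilon = 0 keeps winning under cross-validation, that is at least evidence against overfitting on this particular dataset, since model selection is based on held-out folds.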
Hamed, if the best fit is achieved with epsilon = 0, that most likely means you need to change the kernel, since the SVM probably cannot find a linear relationship in the feature space you are testing.