What kernel are you using? I believe a linear kernel would be the fastest. In any case, you have a very low dimensionality (two) so I don't see why it should be slow.
I don't know which SVM program you are using, but the LibSVM toolbox trains quickly on this kind of two-dimensional data set. If it does not, you may try some of the other methods described below.
An SVM with a linear kernel may be fast, but it may not give good classification results if the data set is not linearly separable. In that case a nonlinear kernel can be used together with other tricks, such as reduced kernel techniques. In this method, 10-15% of the training data are selected at random to form the basis of the kernel matrix. Because the kernel matrix is smaller, it takes less time to compute and hence less time to train. It has been shown statistically that the reduced kernel retains almost the same information as the full kernel matrix, and the classification results are reported to be nearly identical.
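As a rough illustration of the reduced kernel idea, the sketch below picks a random 10% of the points as the basis and builds the rectangular Gaussian kernel matrix between all points and that basis. The function names (`gaussian_kernel`, `reduced_kernel_matrix`), the toy data and the value of `gamma` are assumptions made for this example and do not come from any particular SVM package.

```python
import math
import random

def gaussian_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def reduced_kernel_matrix(X, fraction=0.1, gamma=1.0, seed=0):
    """Build the n x m kernel matrix between all n points and a random
    subset of m = fraction * n points (the reduced basis)."""
    rng = random.Random(seed)
    n = len(X)
    m = max(1, int(round(fraction * n)))
    basis_idx = rng.sample(range(n), m)  # indices of the basis points
    K = [[gaussian_kernel(x, X[j], gamma) for j in basis_idx] for x in X]
    return K, basis_idx

# Toy 2-D data (synthetic, for illustration only)
rng = random.Random(42)
X = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]

K, basis_idx = reduced_kernel_matrix(X, fraction=0.1, gamma=0.5)
print(len(K), len(K[0]))  # 200 rows, 20 columns instead of 200 x 200
```

The saving comes from the number of columns: with 30000 training points and a 10% basis, the matrix has 30000 x 3000 entries rather than 30000 x 30000.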
Whether a linear kernel has a chance of working in your case can easily be seen from a 2-D plot: plot the classes in different colors, and if it looks as though most of the data can be separated by a straight line, a linear kernel is good for you.
If not, you should use a nonlinear kernel such as the Gaussian. Here you need to optimize two parameters of the SVM (the regularization parameter and the kernel width), which is usually time-consuming. A fast implementation which does that optimization for you is here
To train the SVM quickly, you can try the following:
Use Linear SVM
Use Primal SVM form
Use scaled data
Use optimum parameter values.
Explanation:
1. Use a linear SVM (linear kernel), for example with the LIBLINEAR library. The conditions for using a linear SVM are: (a) The data should be (at least approximately) linearly separable, otherwise test accuracy could be very low. You can check whether the data is linearly separable with the method mentioned by Sir Ingo Steinwart. (b) Training time is more important than test accuracy, because on data that is not linearly separable a linear SVM will generally be less accurate than a non-linear SVM.
2. Use the primal form of the SVM for your problem. The number of features (2) is very small compared with the number of training instances (30000), so solving the primal should be much faster than solving the dual.
3. Use scaled data as mentioned in the paper "A Practical Guide to Support Vector Classification". This can be helpful, as mentioned by Minh-Tien Nguyen.
4. Use optimum parameter values to get the best result. You can find them with a grid search, as described in the paper mentioned above.
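Points 2 and 3 can be sketched together in a few lines. The example below scales each feature to [-1, 1] using the minimum and maximum of the training data only, then trains a linear SVM directly in the primal with stochastic subgradient steps on the regularized hinge loss (in the style of Pegasos). This is only an illustrative sketch: it is not the solver LIBLINEAR uses, all function names and the synthetic data are made up for the example, and the bias is treated as a regularized constant feature for simplicity.

```python
import random

def fit_scaler(X):
    """Per-feature min and max, computed on the training data only."""
    dims = range(len(X[0]))
    lo = [min(x[d] for x in X) for d in dims]
    hi = [max(x[d] for x in X) for d in dims]
    return lo, hi

def scale(X, lo, hi):
    """Map each feature linearly to [-1, 1] using the training min/max."""
    return [
        tuple(2.0 * (v - l) / (h - l) - 1.0 if h > l else 0.0
              for v, l, h in zip(x, lo, hi))
        for x in X
    ]

def train_primal_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Minimise lam/2 * ||(w, b)||^2 + mean hinge loss by stochastic
    subgradient steps with step size 1 / (lam * t)."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # Shrink step from the regulariser (bias included, a simplification)
            w = [(1.0 - eta * lam) * wj for wj in w]
            b = (1.0 - eta * lam) * b
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def make_data(n, rng):
    """Synthetic data: two well-separated clusters; feature 2 has a 100x larger range."""
    X, y = [], []
    for _ in range(n):
        label = rng.choice((-1, 1))
        X.append((rng.gauss(3.0 * label, 0.5), 100.0 * rng.gauss(3.0 * label, 0.5)))
        y.append(label)
    return X, y

rng = random.Random(1)
X_train, y_train = make_data(400, rng)
X_test, y_test = make_data(200, rng)

lo, hi = fit_scaler(X_train)
X_train_s = scale(X_train, lo, hi)
X_test_s = scale(X_test, lo, hi)  # reuse the training min/max on the test data

w, b = train_primal_linear_svm(X_train_s, y_train)
accuracy = sum(predict(w, b, x) == t for x, t in zip(X_test_s, y_test)) / len(y_test)
print("test accuracy:", accuracy)
```

Each update costs time proportional to the number of features, which is why the primal is attractive when there are only two features and tens of thousands of training instances. For real work a tested library such as LIBLINEAR is the better choice.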
Thanks
Vinod
Article A Practical Guide to Support Vector Classification Chih-Wei ...