What criteria make a dataset better suited to an L1 SVM than to an L2 SVM? I know that the L1 SVM is favored for high-dimensional features because of its implicit feature selection, but in my case, even with as many as 20000 dimensions, the L1 SVM performs worse than the L2 SVM. What would you suggest to improve the results of the L1 SVM?
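
To be concrete about the comparison, here is a sketch of the two objectives, assuming "L1" and "L2" refer to the norm of the regularizer (which is what gives the implicit feature selection) and assuming a linear model with hinge loss. The symbols $w$, $b$, $C$, $x_i$, $y_i$ and $n$ are my notation for the weights, bias, regularization trade-off, training samples, labels in $\{-1, +1\}$ and number of samples; actual solvers may use a different loss, such as the squared hinge.

$$
\text{L1 SVM:}\quad \min_{w,\,b}\ \|w\|_1 + C \sum_{i=1}^{n} \max\bigl(0,\ 1 - y_i (w^\top x_i + b)\bigr)
$$

$$
\text{L2 SVM:}\quad \min_{w,\,b}\ \tfrac{1}{2}\|w\|_2^2 + C \sum_{i=1}^{n} \max\bigl(0,\ 1 - y_i (w^\top x_i + b)\bigr)
$$

The only difference is the penalty on $w$: the L1 norm drives many weights exactly to zero, while the L2 norm shrinks all weights but keeps them nonzero.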