Neural networks and SVMs are "model-free" methods, in the sense that they work without any assumption about the data-generating process. Many clustering techniques (such as k-means and fuzzy c-means, FCM) can be customized with different distance functions, so as to adapt their behavior to non-normal data.
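As a minimal sketch of that last point (my own illustration, on synthetic skewed data), here is a clustering run where the distance function is swapped out for a non-Euclidean one, using SciPy's hierarchical clustering:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=(100, 4))    # skewed, non-normal toy data

# Pairwise distances with a metric of your choice ('cityblock', 'cosine', ...)
D = pdist(X, metric="cityblock")

# Average-linkage hierarchical clustering on that distance matrix
Z = linkage(D, method="average")
labels = fcluster(Z, t=3, criterion="maxclust")  # cut the tree into 3 clusters
print(labels[:10])
```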
If your data has labels (meaning you have examples where you know which group/class they belong to), then you should focus on classification models to learn the relationship between your features and labels. As Corrado mentioned above, random forest, softmax regression and SVC are good places to start. I'd leave neural networks alone until you're convinced your problem requires all of their bells and whistles (unless your problem is image, speech or natural language classification, in which case they're the clear option).
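A hedged sketch of those three starting points, assuming a labelled dataset (the data here is synthetic, and the parameters are just illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Stand-in for your labelled data (X = features, y = class labels)
X, y = make_classification(n_samples=500, n_features=10, n_classes=3,
                           n_informative=5, random_state=0)

models = {
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    # multinomial (softmax) behaviour for multi-class with the default lbfgs solver
    "softmax regression": LogisticRegression(max_iter=1000),
    "SVC": SVC(kernel="rbf"),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```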
If you don't have examples with labels, I'd recommend k-means or hierarchical clustering as a first stab at the problem; as Corrado pointed out, these need appropriate distance functions in order to define which data points are closely grouped and which are not. If you're not satisfied, from there you may also consider GMMs or DBSCAN, which excel at certain classes of problems but require more delicate handling (tuning their hyperparameters can prove tricky).
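A minimal, hypothetical comparison of those unlabelled options (two well-separated synthetic blobs; the DBSCAN eps/min_samples values are just placeholders you would normally tune):

```python
import numpy as np
from sklearn.cluster import KMeans, DBSCAN
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])

km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
gmm_labels = GaussianMixture(n_components=2, random_state=0).fit_predict(X)
db_labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(X)   # -1 marks noise points

print(len(set(km_labels)), len(set(gmm_labels)), len(set(db_labels)))
```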
In general, I would recommend learning about the assumptions behind each model and thinking through whether they make sense for your data. Wikipedia and YouTube are great resources for learning about all the things I've listed.
One other thing you might want to consider is transforming your (supposedly skewed?) data prior to analysis. You could use e.g. a log- or Box-Cox transformation, and afterwards a simple histogram or a chi-squared test to see whether that helps you achieve normality (see the sketch below).
Once your data is approximately normally distributed, most analyses become much easier.
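A small sketch of that suggestion, on a synthetic right-skewed sample (strictly positive, as Box-Cox requires); SciPy's `normaltest` is used here as a stand-in for the chi-squared check, since its D'Agostino-Pearson statistic is chi-squared based:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=0.8, size=500)   # right-skewed toy data

x_log = np.log(x)                                  # log transform
x_bc, lam = stats.boxcox(x)                        # Box-Cox, lambda fitted from data

for name, sample in [("raw", x), ("log", x_log), ("box-cox", x_bc)]:
    stat, p = stats.normaltest(sample)
    print(f"{name}: p-value = {p:.3f}")            # larger p = less evidence against normality
```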
As Corrado said, NNs and SVMs are good models for classification problems. But for better accuracy and recognition of the outputs, we need to preprocess the data (e.g. data transformation, data normalization, etc.). Besides, the activation function that we choose is important too.
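For example (a rough sketch of my own, on synthetic data with arbitrary layer sizes), scaling can be chained in front of a small neural network, and the activation function is exposed as a parameter:

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=8, random_state=1)

clf = make_pipeline(
    StandardScaler(),                                # data normalization step
    MLPClassifier(hidden_layer_sizes=(32,),
                  activation="relu",                 # try "tanh" or "logistic" too
                  max_iter=2000, random_state=1),
)
print(cross_val_score(clf, X, y, cv=5).mean())
```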
There is a spectrum of non-parametric classification algorithms, such as SVM, NN, k-nearest-neighbor and Parzen windows, that can deal with non-normal data. Choosing the best classifier from this spectrum then depends on many factors, such as the amount of data, the dimension of the feature space, the available computational power and time, and the difficulty of the task (the separability of the classes in the feature space).
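As a rough illustration (my own construction, not part of the original answer), here are two points on that spectrum side by side: k-nearest-neighbor, and a simple Parzen-window classifier built from one kernel-density estimate per class (the bandwidth is an arbitrary choice and class priors are assumed equal):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier, KernelDensity

X, y = make_classification(n_samples=300, n_features=5, random_state=2)

# k-NN: a single main hyperparameter, the number of neighbours
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# Parzen windows: fit one KDE per class, predict the class with the highest density
kdes = {c: KernelDensity(bandwidth=0.5).fit(X[y == c]) for c in np.unique(y)}
classes = np.array(sorted(kdes))
scores = np.column_stack([kdes[c].score_samples(X) for c in classes])
parzen_pred = classes[np.argmax(scores, axis=1)]

print((knn.predict(X) == y).mean(), (parzen_pred == y).mean())
```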
Abhishek Verma First you need to decide whether to use a classification or a clustering method, because these two are different from each other.
If you are in doubt or cannot decide, and for handling non-normally distributed data, check the link below; it should be helpful: https://www.researchgate.net/post/How_to_deal_with_non_normal_data