Over the past decade I have used and compared most current SDM algorithms, including GARP, GAM, GLM and MaxEnt. Although most can produce a good result locally with a lucky or considered choice of environmental predictors, I can only recommend the latter two (GLM and/or MaxEnt) if you plan to transfer your model to a disjunct area. I have a slight preference for MaxEnt for three reasons. It has been shown to be the more robust algorithm with respect to spatial resolution, incidental species presence data and the number of environmental variables (see the attached Majella articles). Further, I like MaxEnt's embedded response graphs for both categorical and continuous variables, which help make ecological sense of the model output. Finally, I like MaxEnt for its fundamental concept, maximum entropy, and therefore its apparent applicability across the social and natural sciences. However, I have not yet tested the sensitivity of MaxEnt to high numbers of species presence points (gorilla article attached). I would be grateful for a reference to a presence-point thinning study.
Having said the above, the choice of SDM algorithm should currently not be the main issue. It is easy enough to apply several SDMs in parallel. The real challenge is the environmental predictors fed into the model. I am critical of the use of environmental data layers of untested spatial accuracy, particularly climate (e.g. WorldClim) and land cover, as if these represented observations (attached).
Article Sharing natural resources: Mountain gorillas and people in t...
Article Fine resolution distribution modelling of endemics in Majell...
Article Where the bears roam in Majella National Park, Italy
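On the presence-point thinning mentioned above: a common approach is simple grid-based thinning, keeping at most one record per cell of a coarser grid. Below is a minimal sketch of that idea using dismo::gridSample(); the object 'occ' (a data.frame of occurrence coordinates with columns 'lon' and 'lat') and the 0.05-degree cell size are assumptions for illustration, not taken from the attached articles.

# Grid-based thinning of presence records (a sketch, assuming 'occ' holds lon/lat coordinates)
library(dismo)    # gridSample()
library(raster)   # extent(), raster()

# A thinning grid of roughly 5 km (0.05 degrees); pick a resolution matching your predictors
grid <- raster(ext = extent(min(occ$lon), max(occ$lon), min(occ$lat), max(occ$lat)),
               resolution = 0.05)

# Keep at most one presence record per grid cell
occ_thinned <- gridSample(occ, grid, n = 1)
nrow(occ); nrow(occ_thinned)    # compare sample sizes before and after thinning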
MaxEnt is probably the most widely used SDM algorithm in 2015. According to Web of Science (http://apps.webofknowledge.com/), 177 papers involving MaxEnt modelling have been published by this time of the year.
I support an ensemble modelling approach for robust modelling. You can use the BiodiversityR package in R, which provides a graphical user interface and utility functions for ensemble modelling. It includes 19 different SDM algorithms, covering the most frequently used ones such as GAM, GLM and MaxEnt, and many more.
Regards
Article Ensemble forecast of climate suitability for the Trans-Himal...
Article Separation of the bioclimatic spaces of Himalayan tree rhodo...
Hi, it is not easy to define "the best" algorithm because performance is strictly connected with the quality of the data. Many recent works have used Random Forest, GAM or MaxEnt as a single algorithm, while others have used an ensemble prediction as a weighted mean of different models. My opinion is always to use comparative methods. My suggestion is to compare GLM, GAM, MARS, MaxEnt and Random Forest; you can then use the best of them, or a weighted mean based on the accuracy of prediction. Try to have a look at this link.
In addition, I can say that it also depends on your computer skills: if you are more confident with ArcGIS you can try SDMTools, while if you are an R user you can try biomod2.
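To illustrate the weighted mean based on prediction accuracy suggested above, here is a minimal sketch of an AUC-weighted consensus of a GLM, a GAM and a Random Forest. It is only a sketch under assumptions: 'train' and 'test' are presumed data.frames with a 0/1 presence column 'pa' and two example predictors 'bio1' and 'bio12'; none of this comes from SDMTools or biomod2 themselves.

library(mgcv)          # gam()
library(randomForest)  # randomForest()
library(dismo)         # evaluate() for AUC

# Fit three single-algorithm models on the calibration data
m_glm <- glm(pa ~ bio1 + bio12, family = binomial, data = train)
m_gam <- gam(pa ~ s(bio1) + s(bio12), family = binomial, data = train)
m_rf  <- randomForest(factor(pa) ~ bio1 + bio12, data = train)

# Predictions of each model on the evaluation data
preds <- data.frame(
  glm = predict(m_glm, test, type = "response"),
  gam = predict(m_gam, test, type = "response"),
  rf  = predict(m_rf,  test, type = "prob")[, "1"]
)

# AUC of each model on the evaluation split
auc <- sapply(preds, function(p)
  evaluate(p = p[test$pa == 1], a = p[test$pa == 0])@auc)

# Consensus prediction: AUC-weighted mean of the single-model predictions
w <- auc / sum(auc)
consensus <- as.matrix(preds) %*% w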
I'm not sure we can say that there is a single best algorithm in general. The choice depends on several aspects (data type and quality, objective of the study, etc.). I've been using ensemble modelling with the R package biomod.
Beyond combining several algorithms, I think this approach has the advantage of repeating several analytical choices as many times as desired. For example, generating the set of pseudo-absences and the data partition (traditionally 70-30% of the data for calibration and evaluation, respectively) repeatedly means that, unlike when the decision is taken only once, we are not affected by possible biases in that single random choice.
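As a small illustration of the repetition idea just described (not the biomod internals themselves), here is a sketch that redraws the pseudo-absences and the 70-30 partition at every repetition; 'predictors' (a RasterStack) and 'occ' (a two-column matrix of presence coordinates) are assumed inputs, and the GLM stands in for whichever algorithm you actually use.

library(dismo)    # randomPoints(), evaluate(); attaches raster for extract()

n_rep <- 10
aucs  <- numeric(n_rep)

for (i in seq_len(n_rep)) {
  # A fresh set of pseudo-absences at each repetition
  bg  <- randomPoints(predictors, n = 1000)
  pa  <- c(rep(1, nrow(occ)), rep(0, nrow(bg)))
  env <- as.data.frame(extract(predictors, rbind(as.matrix(occ), bg)))
  dat <- cbind(pa = pa, env)

  # A fresh 70-30 calibration/evaluation partition at each repetition
  cal <- sample(nrow(dat), size = round(0.7 * nrow(dat)))
  m   <- glm(pa ~ ., family = binomial, data = dat[cal, ])

  p <- predict(m, dat[-cal, ], type = "response")
  aucs[i] <- evaluate(p = p[dat$pa[-cal] == 1], a = p[dat$pa[-cal] == 0])@auc
}

mean(aucs)   # performance averaged over the repeated random choices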
It is not wise to choose an SDM method based only on its popularity in 2015. The performance of a classical regression model can be as good as any sophisticated recent machine-learning method (e.g. boosted regression trees). As others have mentioned, much depends on the data (quality and quantity) and on the empirical relationships between species distribution and environmental predictors. Therefore, my suggestion is first to look into your data and see whether you can use simple and well-known methods (e.g. logistic regression), following the classical literature on the topic (e.g. Analyzing Ecological Data (2007) by Zuur, Ieno & Smith). I also attach my paper comparing different SDM methods, which supports what I wrote above.
Article Empirical modelling of benthic species distribution, abundan...
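For the simple, well-known route suggested above, a binomial GLM (logistic regression) with linear and quadratic terms is the classical starting point, since the quadratic terms allow unimodal species responses. A minimal sketch, assuming a data.frame 'dat' with a 0/1 column 'pa' and predictors 'temp' and 'prec' (illustrative names, not from the attached paper):

# Classical logistic-regression SDM with quadratic terms
m <- glm(pa ~ temp + I(temp^2) + prec + I(prec^2),
         family = binomial(link = "logit"), data = dat)

summary(m)                   # inspect coefficients and their significance
m2 <- step(m, trace = FALSE) # simple AIC-based term selection

# Predicted probability of presence for new environmental conditions
predict(m2, data.frame(temp = 12.5, prec = 800), type = "response")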
Unfortunately, there is no single best method to model species distributions. Any a priori choice is nothing more than an unrealistic preference of the user. To address this problem, people have used an ensemble of many (ideally all possible) methods (see the pioneering paper by Araújo and New 2007 in TREE). However, there are so many methods based on regression techniques (GLM, GAM, regression trees, ...) that if you use every available method, your ensemble is driven much more by regression techniques than by the other approaches. So I recommend you choose one method from each class of methods. See a relaxed, but robust, classification of methods in Rangel and Loyola (2012), Natureza & Conservação.
Hi, as often happens, the best answer is that it is impossible to define THE BEST method, one that performs correctly in all circumstances. Many researchers work with a single algorithm (for example Generalized Linear Models, Maximum Entropy, Generalized Additive Models, Neural Networks, Random Forest, etc.), while others prefer to work with ensemble projections and consensus models (i.e. the mean or weighted mean of more than one algorithm). Some examples of single-model and ensemble-model studies:
http://dx.doi.org/10.5424/fs/2016253-09476
http://dx.doi.org/10.1111/gcb.12604
http://dx.doi.org/10.1007/s13595-014-0439-4
http://dx.doi.org/10.1111/gcb.12476
I don't know whether you would like to work with R or other tools but, just to begin and to read something, in addition to the papers above I can suggest two R packages, dismo and biomod2:
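As a starting point with dismo (one of the two packages just mentioned), here is a minimal MaxEnt sketch. It assumes 'predictors' is a RasterStack of environmental layers and 'occ' a two-column matrix of presence coordinates; dismo's maxent() additionally requires rJava and the maxent.jar file placed in the dismo/java folder.

library(dismo)

# Background points drawn from the study area
bg <- randomPoints(predictors, n = 1000)

# Fit MaxEnt and project it to a habitat-suitability map
me  <- maxent(x = predictors, p = occ, a = bg)
map <- predict(me, predictors)

# Quick evaluation against the background sample
e <- evaluate(p = occ, a = bg, model = me, x = predictors)
e@auc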
I think the question about the best model is the wrong one. You should choose the model according to your data (with expert opinion), or test your data with more than one model. But I would say that in recent years MaxEnt for wild animals, and Random Forest and Generalized Additive Models for plant species, have been seen as the methods giving the best results.
I wrote a paper and a tool for this exact issue. This is common for most machine-learning applications: you should always experiment with many different methods for the specific problem/dataset. You can have a look at the paper here: https://joss.theoj.org/papers/b0166a4b4c9cfa39761ec2a2fa71ff1c
I have now also added a survey of modern open-source software available for SDM work: Preprint Review of species distribution modeling open-source software
"fully-grown" classification/decision-tree models must be pruned in order to identify the most accurate formulation, please read article https://odajournal.com/2019/01/31/optimizing-suboptimal-classification-trees-s-plus-propensity-score-model-for-adjusted-comparison-of-hospitalized-vs-ambulatory-patients-with-community-acquired-pneumonia https://odajournal.com/2019/02/12/more-on-optimizing-suboptimal-classification-trees-s-plus-propensity-score-model-for-adjusted-comparison-of-hospitalized-vs-ambulatory-patients-with-community-acquired-pneumonia For overfitting https://odajournal.com/2019/03/21/some-machine-learning-algorithms-find-relationships-between-variables-when-none-exist-cta-doesnt
I would recommend those algorithms that include monotonicity constraints, such as Boosted Regression Trees or AdaBoost, in order to overcome the overfitting generated by more relaxed fitting such as that of MaxEnt and Maxlike.
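A brief sketch of how such a monotonicity constraint looks in practice, using the gbm package's var.monotone argument (1 = increasing, 0 = unconstrained, -1 = decreasing); the data.frame 'dat' with columns 'pa', 'temp', 'prec' and 'slope' is an assumed example, and the constraint pattern is purely illustrative.

library(gbm)

m <- gbm(pa ~ temp + prec + slope,
         data = dat,
         distribution = "bernoulli",
         var.monotone = c(1, 0, -1),   # increasing in temp, free in prec, decreasing in slope
         n.trees = 2000,
         interaction.depth = 3,
         shrinkage = 0.01,
         cv.folds = 5)

best_iter <- gbm.perf(m, method = "cv")                      # number of trees chosen by CV
p <- predict(m, dat, n.trees = best_iter, type = "response") # constrained suitability predictions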