Dear all,

I have a question about selecting species occurrences as input data for
a biomod run. Specifically, I have a number of beaver (/Castor fiber/)
observations which I want to use to model future beaver range expansion.
The data amount to about 1,800 occurrences, representing 72 territories.
The number of occurrences per territory varies between 1 and 65
(mean: 25). There are two obvious choices I can make:
1) use all 1,800 occurrences, increasing sample size -- but running the
risk that the results will be (too?) strongly influenced by the
territories with a large number of occurrences;
2) use only 1 occurrence per territory, giving each territory an 'equal
weight' -- but reducing sample size (i.e. reducing how well each
territory is 'sampled').

An alternative would be to do multiple model runs in which I randomly
select 25 occurrences (the mean number) from each territory with more
than 25 observations, while using all available occurrences for the
other territories. Another way would be to weight or scale the
occurrence data -- for example, downweighting the influence of
occurrences belonging to a territory with a large number of occurrences.
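
To make these two alternatives concrete, here is a minimal base-R sketch
of what I have in mind. The toy data frame `occ`, its `territory` column,
and the helper `cap_per_territory()` are made up for illustration (my
real data have 72 territories), and the 1/n weighting is just one
possible scheme. It only covers the standalone subsampling and weighting
step, nothing biomod-specific.

```r
set.seed(42)  # reproducible draws; any seed will do

# Hypothetical toy data: one row per occurrence, with a territory ID.
occ <- data.frame(
  territory = rep(c("A", "B", "C"), times = c(1, 25, 65)),
  x = runif(91),
  y = runif(91)
)

cap <- 25  # cap per territory (the mean number of occurrences)

# Alternative 1: keep at most `cap` randomly chosen occurrences per
# territory; territories at or below the cap keep all their occurrences.
cap_per_territory <- function(d, cap) {
  rows_by_territory <- split(seq_len(nrow(d)), d$territory)
  keep <- unlist(lapply(rows_by_territory, function(idx) {
    if (length(idx) > cap) idx[sample.int(length(idx), cap)] else idx
  }), use.names = FALSE)
  d[sort(keep), , drop = FALSE]
}
occ_capped <- cap_per_territory(occ, cap)

# Repeating the draw gives one subsample per model run.
n_runs <- 10
subsamples <- replicate(n_runs, cap_per_territory(occ, cap),
                        simplify = FALSE)

# Alternative 2: keep every occurrence but weight it by
# 1 / (number of occurrences in its territory), so that the weights
# within each territory sum to 1.
n_per_territory <- table(occ$territory)
occ$weight <- 1 / as.numeric(n_per_territory[as.character(occ$territory)])
```

With the first alternative each run sees a different subsample from the
large territories; with the second, all data are kept and every
territory contributes the same total weight.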

While I can write standalone R scripts to subsample or downweight the
data, I am not sure how to feed the result into the biomod workflow. Any
suggestions on how to tackle this are much appreciated!

Best wishes and thanks in advance,

Diederik

--
Dr. Diederik Strubbe
Evolutionary Ecology Group
Department of Biology

