Can cases in a Random Forest be spatially or temporally correlated?

Manuel Villar-Argaiz @Manuel_Villar-Argaiz

05 May 2017

Our research campaign involved sampling 4 rivers, each at three stations of different altitude. At every station we collected three samples along a 100 m longitudinal transect of the river, taking special care to cover the full heterogeneity of substrata, and analyzed them for benthic macroinvertebrates.

At the same time we evaluated numerous catchment variables in order to test the influence of land use and catchment properties on the macroinvertebrate community.

Therefore, we ended up with 4 rivers * 3 stations * 3 samples per station = 36 samples.

My question is whether all samples can reasonably be included as cases in a Random Forest model (n = 36), or whether I should instead average the macroinvertebrate samples per station to avoid pseudoreplication (n = 12).
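To make the two options concrete, here is a minimal Python sketch of the sampling design. The river and station labels, the `richness` response, and its random values are hypothetical placeholders, not data from the study. It only shows how the 36 sample-level cases collapse to 12 station-level cases when averaged.

```python
from itertools import product
from statistics import mean
import random

random.seed(0)  # reproducible placeholder values only

rivers = ["R1", "R2", "R3", "R4"]   # hypothetical river labels
stations = ["low", "mid", "high"]   # hypothetical altitude stations
replicates = [1, 2, 3]              # three samples per 100 m transect

# Option A: every sample is a case (n = 36).
samples = [
    {"river": r, "station": s, "replicate": k,
     "richness": random.randint(5, 30)}  # placeholder response variable
    for r, s, k in product(rivers, stations, replicates)
]

# Option B: average the three samples of each station (n = 12).
by_station = {}
for row in samples:
    by_station.setdefault((row["river"], row["station"]), []).append(row["richness"])

station_means = [
    {"river": r, "station": s, "richness": mean(values)}
    for (r, s), values in by_station.items()
]

print(len(samples), len(station_means))
```

If the catchment variables are recorded once per station, the three rows of a station in option A would share identical predictor values and differ only in the response, which is the dependence the question is concerned about.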

I would greatly appreciate any help and advice on this issue.

Cheers, and thanks

Manuel

Andrew Paul McKenzie Pegman

Your actual replicates are the rivers, i.e. they are your 'sites' or 'areas'. But they must be sufficiently separated in space to be true replicates. 'Altitude' is then a subsample of the replicate. So when you construct your ANOVA table, you will have the factors 'river' (fixed), 'altitude' (fixed, hopefully), and then your catchment variables. The ANOVA table will work out the degrees of freedom (i.e. n–1) automatically, so don't worry about calculating n in advance. Do NOT average your macroinvertebrate samples: put ALL the data into one big ANOVA/MANOVA/GLM table, with each sample entered against its station. If you average your data before entering it in the table, the software has nothing to analyse lol. Pseudoreplication is only a problem if you sampled one river, but you have 4 of them lol :)
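As a rough illustration of the degrees of freedom the software would report, here is a small Python sketch. It assumes that 'river' and 'altitude' are treated as crossed fixed factors with three samples per cell, and it ignores the catchment covariates. That layout is an assumption for illustration, not necessarily the model intended above.

```python
# Design sizes taken from the question: 4 rivers x 3 stations x 3 samples.
n_rivers, n_stations, n_reps = 4, 3, 3
n_total = n_rivers * n_stations * n_reps  # 36 samples

# Degrees of freedom for a balanced two-factor crossed design (assumed layout).
df = {
    "river": n_rivers - 1,
    "altitude": n_stations - 1,
    "river x altitude": (n_rivers - 1) * (n_stations - 1),
    "residual": n_rivers * n_stations * (n_reps - 1),
}
df_total = n_total - 1

# The partition must add up to the total degrees of freedom.
assert sum(df.values()) == df_total

for term, value in df.items():
    print(f"{term}: {value}")
print(f"total: {df_total}")
```

Under these assumptions the residual degrees of freedom come entirely from the three samples within each station. Any catchment covariate added to the model would take its degrees of freedom from the terms listed here.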
