Spatial PCA: How to interpret the amount of variance explained by each principal component?

17 September 2020

Hey everyone, I hope someone can help me. Please!

I've carried out a spatial PCA (sPCA) using the adegenet package in R, following Dr. Jombart's tutorial (i.e. NAs in the data replaced with the mean allele frequency, etc.). My problem is with the interpretation of the variance explained by each component. Obviously it's not like a regular PCA, where you need all the components together in order to explain 100% of the variance in the data.

Here in sPCA it's easy to see (in the screeplot for example) that combining just a couple of principal components exceeds 100% of the variance.

Showing the summary for the sPCA:

Call: spca.genind(obj = mi_genind, xy = mi_genind$other$xy, cn = data.graph, scannf = FALSE, nfposi = 2, nfnega = 0)

Scores from the centred PCA

          var        cum        ratio        moran
Axis 1    1.184406   1.184406   0.07550004   0.3562353
Axis 2    1.022800   2.207206   0.14069851   0.1799373

sPCA eigenvalues decomposition:

          eig          var         moran
Axis 1    0.15675044   1.0088656   0.6214918
Axis 2    0.08220275   0.7455009   0.4410605

###################################################

So I want to have some sort of idea whether this analysis is meaningful to explain the pattern in variability. As Jombart says in the tutorial: "The maximum attainable variance by a linear combination of alleles is the one from an ordinary PCA, indicated by the vertical dashed line on the right [of the screeplot]". I could take that value as my 100% variance and calculate the percentage explained by my Axis 1 on the sPCA... but I'm still confused because doing this to just a couple of principal components and then combining them would exceed 100% of variance explained.

Thanks for any help you can give me!
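The normalisation proposed in the question can be sketched with plain arithmetic on the printed summary. This is a minimal sketch, not the tutorial's method: the variable names are made up for illustration, and it assumes that the dashed line in the screeplot is the Axis 1 variance of the ordinary PCA (1.184406).

```python
# Values copied from the "var" columns of the summary in the question.
# Assumption: the dashed screeplot line equals the ordinary PCA Axis 1 variance.
pca_axis1_var = 1.184406
spca_var = [1.0088656, 0.7455009]  # sPCA Axis 1 and Axis 2

# Share of the single-axis maximum reached by each sPCA axis.
shares = [v / pca_axis1_var for v in spca_var]
for axis, share in enumerate(shares, start=1):
    print(f"sPCA axis {axis}: {share:.1%} of the maximum single-axis variance")

# The reference value is a ceiling for ONE axis, not the total variance,
# so the shares are not additive and their sum can pass 100%.
print(f"sum of shares: {sum(shares):.1%}")
```

Each share stays below 100% on its own (roughly 85% and 63% here), but the sum is about 148%. That is the effect described in the question: the dashed line bounds any single axis, so it works as a per-axis yardstick, not as a total to be divided among the axes.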

Ette Etuk

Each principal component (PC) corresponds to an eigenvalue of the variance-covariance matrix or the correlation matrix. The first PC corresponds to the largest eigenvalue. The sum of all the eigenvalues is the total variance. The proportion of the variance accounted for by the first PC is the ratio of the largest eigenvalue to the sum of the eigenvalues, and so on for the remaining PCs.
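As a worked example of that ratio, the "Scores from the centred PCA" rows of the summary can be used. This is a sketch under one assumption: that the "ratio" column is the cumulative variance divided by the total variance. The numbers are consistent with that reading, but the printed output does not state it.

```python
# "var" and "ratio" columns of the "Scores from the centred PCA" table.
pca_var = [1.184406, 1.022800]
pca_cum_ratio = [0.07550004, 0.14069851]

# Cumulative variance after each axis.
cum = []
running = 0.0
for v in pca_var:
    running += v
    cum.append(running)

# Assumption: ratio = cumulative variance / total variance.
# If that holds, every row implies the same total variance.
implied_totals = [c / r for c, r in zip(cum, pca_cum_ratio)]

# Per-axis proportion = eigenvalue / sum of all eigenvalues.
total = implied_totals[0]
proportions = [v / total for v in pca_var]

print("implied total variance per row:", implied_totals)
print("proportion explained per axis:", proportions)
```

Both rows imply a total variance of about 15.69, so the first two ordinary PCA axes explain roughly 7.6% and 6.5% of the variance, which is common for multilocus genetic data.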

Isa Baba Koki

The principal components (PCs) with the largest eigenvalues are the most important and explain the largest share of the variation in the data set. The eigenvalue measures the importance of each PC. The first and second PCs often account for a large share of the variance in the data set, although not necessarily all of it. You can also examine the eigenvectors (variable loadings) of each PC to evaluate how much each variable contributes.

Please see the attached article for reference.

Thaer Dawood Salman

I agree with previous answers.

Alberto Fameli

Thank you Ette Etuk and Isa Baba Koki for your kind contributions!

I understand those concepts in a normal PCA, but I've read papers more closely related to my particular case (spatial PCA) where the authors take the "var" value directly from the table I've shown (for my first axis, "1.0088656", as in the original post) and treat it as the proportion of variance explained. The thing is, read that way, my first axis would explain more than 100% of the variance, which doesn't make any sense. Eigenvalues in spatial PCA are not the same as in regular PCA, since each one reflects the product of a variance and a spatial autocorrelation (Moran's I).
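One way to check how the "var" column behaves is to put it on the same scale as the ordinary PCA rows. The sketch below uses only numbers from the printed summary. It assumes that the PCA "ratio" column is variance divided by total variance, so that the total can be recovered from Axis 1. The dictionary layout and key names are illustrative and not part of adegenet.

```python
# Assumption: in the PCA table, ratio = var / total variance for Axis 1.
total_variance = 1.184406 / 0.07550004

# Rows of the "sPCA eigenvalues decomposition" table.
spca = {
    "Axis 1": {"eig": 0.15675044, "var": 1.0088656, "moran": 0.6214918},
    "Axis 2": {"eig": 0.08220275, "var": 0.7455009, "moran": 0.4410605},
}

for name, row in spca.items():
    # "var" read as an absolute variance, expressed as a share of the total.
    row["share"] = row["var"] / total_variance
    # How the eigenvalue relates to the product variance * Moran's I.
    row["scale"] = row["eig"] / (row["var"] * row["moran"])
    print(f"{name}: share of total variance = {row['share']:.2%}, "
          f"eig / (var * moran) = {row['scale']:.4f}")
```

Under that assumption, "var" is an absolute variance and not a proportion: the two sPCA axes carry about 6.4% and 4.8% of the total variance, slightly less than the corresponding ordinary PCA axes. The ratio eig / (var * moran) is the same for both axes (about 0.25), which is consistent with the eigenvalue being proportional to variance times Moran's I. The origin of that constant cannot be determined from the printed output alone.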
