Multivariate analysis: Can I perform multivariate analysis such as PCA, PLS-DA, and clustering methods using the significant variables as the input?

Asked by Roberta R. Martins-Chaves

Guillermo Quintas

Hi, in my opinion you can build a PLS model using 'discriminant' features, but the outcomes from that model (predictive performance, etc.) will be overly optimistic and will not generalize, as you might get a similar result using random labels. To avoid overfitting, the feature selection should be applied only to the training set (so you would need a test set). I would suggest instead using a model that allows adding a regularization term to the cost function. This way you can control the effect of uninformative features by shrinking their weights.
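To illustrate the point about random labels, here is a minimal sketch on pure-noise data, assuming the dimensions mentioned in this thread (20 samples, about 4,000 variables, 72 selected). The nearest-centroid classifier, the t-statistic filter, and all function names are illustrative choices, not part of the original analysis.

```python
import numpy as np

def t_scores(X, y):
    """Absolute Welch t-statistic per column for two classes coded 0/1."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se

def nearest_centroid_predict(X_train, y_train, x_new):
    """Assign x_new to the class whose training centroid is closer."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    return int(np.linalg.norm(x_new - c1) < np.linalg.norm(x_new - c0))

def loo_accuracy(X, y, k, select_inside_fold):
    """Leave-one-out accuracy using the k variables with the largest t-scores."""
    n = len(y)
    global_top = np.argsort(t_scores(X, y))[-k:]  # selection that saw every sample
    correct = 0
    for i in range(n):
        train = np.arange(n) != i
        if select_inside_fold:
            # Honest: the held-out sample plays no part in choosing variables.
            top = np.argsort(t_scores(X[train], y[train]))[-k:]
        else:
            # Biased: variables were chosen with the held-out sample included.
            top = global_top
        pred = nearest_centroid_predict(X[train][:, top], y[train], X[i, top])
        correct += int(pred == y[i])
    return correct / n

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4000))      # pure noise: no real group difference
y = np.array([0] * 10 + [1] * 10)    # labels unrelated to X

biased = loo_accuracy(X, y, k=72, select_inside_fold=False)
honest = loo_accuracy(X, y, k=72, select_inside_fold=True)
print(f"selection on all samples: {biased:.2f}")
print(f"selection inside each fold: {honest:.2f}")
```

Even though the data carry no signal, choosing the variables before splitting makes the model look accurate; repeating the selection inside each fold removes that illusion.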

Catherine M G C Renard

Hi, you are actually asking two different questions here, as PCA is only a data reduction method, while PLS and clustering aim to interpret the data.

You have 2 groups and only 20 samples in all, so of course there are far too many compounds and plenty of room for confusion. I would not rely exclusively on "significantly different" between the 2 groups to select variables: with 4k variables, it is more likely than not that you have false positives (depending on how stringently you ran the statistical test).
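As a rough sketch of why uncorrected tests on about 4,000 variables produce false positives, the snippet below draws p-values for a pure-noise case (uniform p-values, which is an assumption standing in for real test results) and compares a raw 0.05 cut-off with a Benjamini-Hochberg correction. The function name is illustrative.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of p-values rejected while controlling the FDR at alpha."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m   # rank-dependent cut-offs
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()            # largest rank meeting its cut-off
        reject[order[:k + 1]] = True               # reject everything up to that rank
    return reject

rng = np.random.default_rng(1)
null_pvals = rng.uniform(size=4000)   # assumption: no variable truly differs

n_raw = int((null_pvals < 0.05).sum())
n_bh = int(benjamini_hochberg(null_pvals).sum())
print(f"'significant' at raw p < 0.05: {n_raw}")
print(f"still significant after Benjamini-Hochberg: {n_bh}")
```

With no real effect at all, roughly 5% of the variables still pass the raw threshold, which is why the stringency of the test matters before any variable is carried forward.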

My approach would be to run the PCA and look: is there some spontaneous separation of the 2 groups on one of the early principal components? Do your data "cluster", i.e. do you have groups of highly correlated variables? If so, you can probably reduce each such group to a single variable.
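A minimal sketch of that check, using PCA via SVD on synthetic data. The planted group shift in the first 200 variables is an assumption made only so that there is something to see; on real data you would load your own matrix and group labels.

```python
import numpy as np

rng = np.random.default_rng(2)
n_per_group, n_vars = 10, 4000
y = np.repeat([0, 1], n_per_group)
X = rng.normal(size=(2 * n_per_group, n_vars))
X[y == 1, :200] += 2.0        # planted group effect (assumption, for illustration)

Xc = X - X.mean(axis=0)       # column-centre; autoscaling is another common choice
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * S                # sample coordinates on each principal component
explained = S**2 / np.sum(S**2)

# For the first few PCs, compare the two groups' scores.
for pc in range(3):
    g0, g1 = scores[y == 0, pc], scores[y == 1, pc]
    pooled_sd = np.sqrt((g0.var(ddof=1) + g1.var(ddof=1)) / 2)
    gap = abs(g0.mean() - g1.mean()) / pooled_sd
    print(f"PC{pc + 1}: {explained[pc]:.1%} of variance, group gap = {gap:.1f} pooled SD")

# Complete separation on PC1 means the two score ranges do not overlap.
g0, g1 = scores[y == 0, 0], scores[y == 1, 0]
pc1_separated = bool(g0.max() < g1.min() or g1.max() < g0.min())
print("groups fully separated on PC1:", pc1_separated)
```

Because PCA never sees the group labels, a separation that shows up here is "spontaneous" in the sense described above, unlike one obtained after selecting variables by group difference.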

You have only 20 samples, so where there are significant differences (72 variables) it might be worthwhile to actually plot each one and look: is the difference driven by a few samples, or is it consistently repeated within one group?
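Alongside plotting, one simple numerical screen is to drop each sample in turn and see whether the group difference survives. This is only a sketch with made-up values; the function name and the example numbers are hypothetical.

```python
import numpy as np

def min_loo_difference(group_a, group_b):
    """Smallest absolute mean difference seen after dropping any single sample."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    diffs = []
    for i in range(len(a)):
        diffs.append(abs(np.delete(a, i).mean() - b.mean()))
    for i in range(len(b)):
        diffs.append(abs(a.mean() - np.delete(b, i).mean()))
    return min(diffs)

# Difference carried by one extreme sample: it vanishes when that sample is dropped.
outlier_driven = min_loo_difference([1, 1, 1, 1, 10], [1, 1, 1, 1, 1])

# Difference shared by every sample in the group: it barely changes.
consistent = min_loo_difference([3.0, 3.2, 2.9, 3.1, 3.0], [1.0, 1.1, 0.9, 1.0, 1.2])

print(f"outlier-driven variable: {outlier_driven:.2f}")
print(f"consistent variable: {consistent:.2f}")
```

A variable whose difference collapses when one sample is removed deserves a closer look before it is used as input to any model.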

Any way you look at it, you are in trouble: too many variables for too few samples, and this is going to require a lot of brain exercise. This is where statistical tools, however good they are, must give way to knowledge and human intelligence. Do you have hypotheses about the underlying biological mechanisms to help you sift through the data? The predictive value, as Guillermo Quintas said, will stay poor. Tentatively, I'd say it might be a tool to help the practitioner, but not one that is totally reliable as a diagnostic.

Fawzan Sigma Aurum

Hi Roberta,

I guess you want to find the discriminative or important variables from your dataset, don't you?

Normally, for this kind of "fat" dataset (variables >> samples), we can start with PCA. It is useful for understanding the natural separation in your dataset. You can also check the PCA loadings to see which variables drive that separation most strongly.

For further analysis, PLS-DA may help a lot in finding the VIPs (variables important in projection) for each sample class. However, considering your 4k variables, it might be a good option to compare several machine learning feature selection algorithms, both to find a good model and to determine your important features more reliably.

Good luck!
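For reference, the VIP idea mentioned above can be sketched with a hand-written single-response PLS (NIPALS) on a 0/1 class label. This is an illustrative implementation on synthetic data with a planted effect in the first 200 variables (an assumption), not a specific library's API, and with only 20 samples the ranking should be treated as exploratory.

```python
import numpy as np

def pls1_vip(X, y, n_components=2):
    """VIP scores from a single-response PLS (NIPALS) fit on centred data."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))
    ss = np.zeros(n_components)        # y sum of squares explained per component
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)         # unit-norm weight vector
        t = Xr @ w                     # sample scores
        tt = t @ t
        p = Xr.T @ t / tt              # X loadings
        q = (yr @ t) / tt              # y loading
        Xr = Xr - np.outer(t, p)       # deflate X
        yr = yr - q * t                # deflate y
        W[:, a] = w
        ss[a] = q**2 * tt
    # VIP_j = sqrt(n_vars * sum_a(ss_a * w_ja^2) / sum_a(ss_a))
    return np.sqrt(n_vars * (W**2 @ ss) / ss.sum())

rng = np.random.default_rng(3)
y = np.repeat([0.0, 1.0], 10)          # two classes of 10 samples
X = rng.normal(size=(20, 4000))
X[y == 1, :200] += 2.0                 # planted class effect (assumption)

vip = pls1_vip(X, y, n_components=2)
top50 = np.argsort(vip)[-50:]          # indices of the 50 highest-VIP variables
n_planted = int((top50 < 200).sum())
print(f"variables with VIP > 1: {int((vip > 1).sum())}")
print(f"planted variables among the top 50: {n_planted}")
```

The VIP > 1 cut-off is only a common rule of thumb; any variables picked this way still need the train/test discipline discussed earlier in the thread before they are used for prediction.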