Can a statistically significant predictor variable mask the predictive effects of other variables on the outcome variable?

Andrew Messing Popular answer

A couple of (I hope) at least somewhat relevant, if not helpful, notes:

1) It is vitally important to distinguish between statistical significance (in the sense of p values and whatnot) and predictive power, or at least to understand how these relate and what they mean. For example, I can probably perfectly predict that, given a person who doesn't drink water, that person won't be an alcoholic. I can likewise show that drinking water and alcoholism are (statistically significantly) correlated. However, I cannot build a model that predicts alcoholism given that a person drinks water. This is why null hypothesis significance testing (the "if p

Imam Salehudin

I see that you have checked the collinearity between predictors. If no multicollinearity exists (a VIF above 10 is the usual warning sign that it does), I don't think any such masking would occur. If you still think masking is present and don't want to eliminate other predictors, I suggest running a factor analysis before the regression and using the factors as predictors, since factor analysis should produce (nearly) uncorrelated factors.
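For readers who want to see how the VIF check works, here is a minimal sketch using only the Python standard library. It computes VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The simulated predictors `x1`, `x2`, `x3` and the helper names are assumptions made for illustration, not the original poster's data or method.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(y, columns):
    """R^2 of an OLS fit of y on the given columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def vif(columns):
    """Variance inflation factor of each predictor against all the others."""
    result = []
    for j, col in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        result.append(1.0 / (1.0 - r_squared(col, others)))
    return result

# Simulated (hypothetical) predictors: x2 is nearly a copy of x1, x3 is unrelated.
random.seed(0)
n = 500
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + random.gauss(0, 0.2) for a in x1]
x3 = [random.gauss(0, 1) for _ in range(n)]

vifs = vif([x1, x2, x3])
print([round(v, 2) for v in vifs])
```

In this simulated setup the two near-duplicate predictors should show VIFs well above 10, while the unrelated one stays close to 1.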

Bruce Weaver

That sounds like "negative confounding". Usually, people think of confounding as masking (or attenuating) the true association. But in some cases, a confounder can have the opposite effect. Try Googling the term. Also take a look at this article by MacKinnon et al.

http://link.springer.com/article/10.1023/A:1026595011371

HTH.

James R Knaub

Aakash, I would be hesitant to transform continuous variables into categorical ones, as it seems to me that you would be throwing away some of your best information.

Looking at the interesting comments above, I have a few observations. I'm not big on VIFs or significance because they are rather nebulous. (A p-value is a function of sample size, so changing your sample size can change your conclusions.) Scatterplots are very informative. Negative confounding does sound like an interesting topic to research.

Finally, I think you may want to look at the impact your proposed model changes have on the variance of the prediction error. Graphing those results may be very informative.

Thanks,
Jim
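To illustrate the remark that a p-value is a function of sample size, here is a small sketch that holds the effect size fixed and changes only n. It uses the Fisher z approximation for testing a Pearson correlation, which is an assumption chosen for simplicity; the value r = 0.10 and the sample sizes are hypothetical.

```python
import math

def approx_p_value(r, n):
    """Two-sided p-value for a Pearson correlation r at sample size n,
    using the Fisher z approximation (normal reference distribution)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

r = 0.10  # hypothetical effect size, identical in every row
p_values = {n: approx_p_value(r, n) for n in (50, 500, 5000)}

for n, p in p_values.items():
    print(f"n = {n:5d}  r = {r:.2f}  p = {p:.3g}")
```

The same weak correlation is "not significant" at n = 50 and clearly "significant" at n = 5000, even though the strength of the association has not changed.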

Jochen Wilhelm

@Andrew Messing: I would give 7 up-votes :)

One more piece of advice: in models with continuous predictors, it is recommended to use centered (or even standardized) values, especially when interactions are part of the model.

And one last piece of advice: consult a statistician.

As Rafael indicated, it would be interesting to hear something about sample sizes used here.
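The centering advice above can be sketched in a few lines. With uncentered predictors, the interaction term x * z tends to be strongly correlated with x itself; after centering, that correlation largely disappears. The variables `x` and `z` below are simulated and purely hypothetical.

```python
import math
import random

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def center(u):
    """Subtract the sample mean from every value."""
    m = sum(u) / len(u)
    return [a - m for a in u]

# Two independent, simulated predictors with means far from zero.
random.seed(1)
n = 1000
x = [random.gauss(50, 10) for _ in range(n)]
z = [random.gauss(20, 5) for _ in range(n)]

# Correlation between x and the raw interaction term x * z.
r_raw = pearson(x, [a * b for a, b in zip(x, z)])

# Same correlation after centering both predictors first.
xc, zc = center(x), center(z)
r_centered = pearson(xc, [a * b for a, b in zip(xc, zc)])

print(f"corr(x, x*z) raw:      {r_raw:.2f}")
print(f"corr(x, x*z) centered: {r_centered:.2f}")
```

Centering does not change the fit of the model; it mainly makes the lower-order coefficients easier to interpret and reduces the collinearity between main effects and the product term.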

Jayadevan Sreedharan

Dear Aakash,

You did not mention the sample size. For any study, the size of the sample is very important. As we know, many studies fail to consider the fallacy of small sample size. If your sample is very small, then once it is distributed across the different cells, the cell counts will be very small. I think you have to consider both issues: one is multicollinearity and the other is sample size.

Roshini Sooriyarachchi

Yes. A confounding variable is a variable that is a risk factor for the disease (response) and is associated with the risk factor under study, but is not a consequence of that risk factor. When the confounding variable is not adjusted for, the association between the risk factor and the disease can be exaggerated or masked.

Robert C Elston

One more point. One can have multiple association without multicollinearity. One must check for linearity (or any other assumptions) in the final model that is produced. And, as others have stated, prediction is not the same as causation. The current issue of the American Statistician has a series of articles on Simpson's paradox that are well worth reading.

Piotr Skórka

Why don't you try multimodel inference based on, for example, the Akaike information criterion? With a set of competing models you could estimate the relative importance of each variable, model-averaged slope estimates, and confidence intervals, and you would not need to calculate P values. Moreover, you would be able to see how the variables behave in different models.

One more question: how did you check multicollinearity between categorical variables? What I would have done first is build simple logistic regression models between the continuous covariates and the independent categorical variables, to see whether they respond in a similar way to your original dependent variable (an indication of redundancy in the dataset).
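As a rough sketch of the multimodel-inference idea, the snippet below turns a set of AIC values into Akaike weights, w_i = exp(-Δ_i / 2) / Σ_k exp(-Δ_k / 2), and sums the weights of the models containing each variable to get its relative importance. The model names, predictor labels (A, B, C), and AIC values are made up for illustration; in practice they would come from your own fitted models.

```python
import math

def akaike_weights(aic_by_model):
    """Convert AIC values into Akaike weights that sum to 1."""
    best = min(aic_by_model.values())
    rel = {name: math.exp(-0.5 * (aic - best))
           for name, aic in aic_by_model.items()}
    total = sum(rel.values())
    return {name: value / total for name, value in rel.items()}

def variable_importance(weights, terms_by_model):
    """Sum the weights of all models in which each variable appears."""
    importance = {}
    for name, w in weights.items():
        for term in terms_by_model[name]:
            importance[term] = importance.get(term, 0.0) + w
    return importance

# Hypothetical AIC values for four candidate models (not real results).
aic_by_model = {"A": 210.0, "A+B": 204.0, "A+B+C": 205.5, "B": 212.0}
terms_by_model = {
    "A": ["A"],
    "A+B": ["A", "B"],
    "A+B+C": ["A", "B", "C"],
    "B": ["B"],
}

weights = akaike_weights(aic_by_model)
importance = variable_importance(weights, terms_by_model)

for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name:6s} weight = {w:.3f}")
for term, imp in sorted(importance.items()):
    print(f"importance of {term}: {imp:.3f}")
```

Summed weights are only comparable when each variable appears in a similar number of candidate models, so the candidate set deserves some thought before reading much into the numbers.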

Brendan J. Morse

You may want to look into the effects of "suppressor variables" here. I have attached an excerpt from an excellent multivariate statistics book by Tabachnick and Fidell that discusses the role of suppression in multiple regression models.
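A small simulation may help make the suppression idea concrete. In this sketch the outcome depends on two positively correlated predictors with opposite signs, so the simple (unadjusted) slope for `x1` looks weak, while its slope adjusted for `x2` is close to the value used to generate the data. All variables and coefficients are simulated assumptions, not taken from the book excerpt or from the original poster's data.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

# Simulated data: x1 and x2 correlate at about 0.8 and y = x1 - x2 + noise.
random.seed(42)
n = 2000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + 0.6 * random.gauss(0, 1) for a in x1]
y = [a - b + 0.5 * random.gauss(0, 1) for a, b in zip(x1, x2)]

# Slope of y on x1 alone (simple regression).
simple_slope = cov(x1, y) / cov(x1, x1)

# Slope of x1 when x2 is also in the model (closed form for two predictors).
det = cov(x1, x1) * cov(x2, x2) - cov(x1, x2) ** 2
adjusted_slope = (cov(x1, y) * cov(x2, x2) - cov(x2, y) * cov(x1, x2)) / det

print(f"x1 slope, x2 omitted:  {simple_slope:.2f}")
print(f"x1 slope, x2 included: {adjusted_slope:.2f}")
```

Here the omitted variable hides most of the effect of `x1`; once `x2` enters the model, the effect reappears. That is one way a predictor can mask another.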