Honestly, since you are asking for "the best criterion": expertise. Just expertise. No selection rule based solely on statistics is generally scientifically sound. It can maximize some long-run fit measure or minimize some long-run risk rate, but that does not necessarily coincide with scientifically good models. Those can only be justified by expert judgement.
But when we have many variables, it is useful to employ automatic statistical selection methods as a first approximation. They will show that some variables are not useful, so we can reduce the scope and then consider each of the remaining variables in detail, applying common sense and expertise. Cp (Mallows' Cp) and Stepdisc are good automatic methods, and I always use both.
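To make the "first approximation" step concrete, here is a minimal sketch of screening candidate subsets with Mallows' Cp, using plain NumPy on toy data. The function names (`sse`, `mallows_cp`) and the simulated data are my own illustration, not from the answer above; the rule of thumb is that a subset fits about as well as the full model when its Cp is close to its parameter count p.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares of an OLS fit with an intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X]) if X.size else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return r @ r

def mallows_cp(X_full, y, cols):
    """Cp = SSE_p / s^2 - n + 2p, with s^2 estimated from the full model.

    p counts the intercept plus the selected columns; a good subset
    has Cp roughly equal to p.
    """
    n, k = X_full.shape
    s2 = sse(X_full, y) / (n - k - 1)
    p = len(cols) + 1
    return sse(X_full[:, list(cols)], y) / s2 - n + 2 * p

# Toy data: only the first two of four predictors actually matter.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(size=200)

cp_good = mallows_cp(X, y, [0, 1])  # should land near p = 3
cp_bad = mallows_cp(X, y, [2, 3])   # omits the real predictors, so very large
```

Subsets flagged with a small Cp are the ones worth examining in detail with subject-matter expertise, which is exactly the two-stage workflow described above.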
You can use a method called best subsets (all-subsets) selection. It is similar in spirit to a stepwise method, but instead of adding or dropping one variable at a time, it evaluates every candidate subset of predictors.
If you have lots of covariates to test, you need to watch out for correlation among them. A common rule of thumb is to remove variables with a variance inflation factor (VIF) over 10. You also need to think about interactions among the terms in your model.
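A quick sketch of the VIF check mentioned above, using only NumPy (statsmodels has a ready-made `variance_inflation_factor`, but this shows the definition: VIF_j = 1/(1 - R_j²), where R_j² comes from regressing column j on the other columns). The `vif` function and the toy data are illustrative, not part of the original answer.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (no intercept column)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # regress col j on the rest
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        r = y - A @ beta
        r2 = 1.0 - (r @ r) / (((y - y.mean()) ** 2).sum())
        out[j] = 1.0 / (1.0 - r2)
    return out

# Toy data: x3 is nearly a copy of x1, so both should blow past the VIF > 10 cutoff.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.05 * rng.normal(size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
```

Here you would drop either x1 or x3 (VIF far above 10) before running the subset search, while x2 (VIF near 1) stays.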
Sometimes you can use your intuition to find the appropriate terms. The best subsets method will give you multiple models that fit your data. If most of those models flag the same terms as significant, and those terms are not highly correlated with each other, I would go with them. I always listen to my data rather than tell it what to do.
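A tiny best-subsets search can be sketched with `itertools` and NumPy: enumerate every subset of predictors, score each with adjusted R², and look at the top few models to see which terms keep appearing. The function names (`adj_r2`, `best_subsets`) and the simulated data are my own assumptions for illustration; real best-subsets tools (e.g. the `leaps` package in R) do this with clever branch-and-bound rather than brute force.

```python
import itertools
import numpy as np

def adj_r2(X, y):
    """Adjusted R^2 of an OLS fit with an intercept."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    ss_res = r @ r
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))

def best_subsets(X, y, top=3):
    """Score every non-empty subset of columns; return the `top` best."""
    p = X.shape[1]
    scored = []
    for k in range(1, p + 1):
        for cols in itertools.combinations(range(p), k):
            scored.append((adj_r2(X[:, list(cols)], y), cols))
    scored.sort(reverse=True)
    return scored[:top]

# Toy data: the true model uses columns 0 and 2 only.
rng = np.random.default_rng(7)
X = rng.normal(size=(150, 4))
y = 1.5 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=150)

ranking = best_subsets(X, y)
```

If columns 0 and 2 show up in all of the top-ranked subsets, and their VIFs are low, that agreement across models is the signal described above.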