Do all variables (dependent and independent) need to be normally distributed to run a regression analysis (sample size is more than 100)?

Bita Mashayekhi

By the law of large numbers and the central limit theorem, the ordinary least squares (OLS) estimators in linear regression will still be approximately normally distributed around the true parameter values in a large sample, which implies that the estimated parameters and their confidence intervals remain robust. Hence, in a large sample, linear regression remains valid even if the dependent variable violates the “normality assumption”.

Ref: Article Are Linear Regression Techniques Appropriate for Analysis Wh...
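The large-sample argument above can be checked with a small simulation. This is only a sketch under stated assumptions: the data are simulated, the errors are drawn from a shifted exponential distribution (strongly skewed, mean zero), and the names `ols_slope`, `skewness`, and `simulate_slopes` are made up for illustration. It uses n = 100 to match the sample size in the question.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope of the simple least-squares line of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def skewness(values):
    """Sample skewness (third standardized moment); about 0 for a normal shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def simulate_slopes(n=100, reps=2000, true_slope=2.0, seed=0):
    """Refit the regression on many simulated samples with skewed errors."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.uniform(0, 10) for _ in range(n)]
        # Assumed error model: exponential shifted to mean zero (right-skewed).
        y = [1.0 + true_slope * a + (rng.expovariate(1.0) - 1.0) for a in x]
        slopes.append(ols_slope(x, y))
    return slopes

slopes = simulate_slopes()

# Draw a separate batch of errors just to show how skewed they are.
rng = random.Random(1)
errors = [rng.expovariate(1.0) - 1.0 for _ in range(20000)]

print("mean of slope estimates:", round(statistics.fmean(slopes), 3))
print("skewness of the errors:", round(skewness(errors), 2))
print("skewness of the slope estimates:", round(skewness(slopes), 2))
```

If the argument holds, the slope estimates should centre on the true slope and look far more symmetric than the errors that generated them.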

Mohamed A. Omran

I see. Thank you

James R Knaub

Mohamed -

The data themselves have no 'normality' requirement for regression. It is the estimated residuals (or better, the random factors of the estimated residuals in weighted least squares regression) that would ideally be close enough to normally distributed for the central limit theorem to help when estimating prediction intervals, and even that is usually a rather weak requirement.

What is important are the estimated residuals, which are distributed vertically in a residual analysis graph.

Normality can help, but it is not a big requirement, and is for the estimated residuals, not the dependent and independent variables.

Cheers - Jim

FYI: In Applied Regression Analysis and Generalized Linear Models, 2nd ed, 2008, John Fox, Sage Publications, Inc, on page 196, in a footnote, he states that if you add the assumption of normality to those of the Gauss-Markov Theorem, which address the "errors," then the least-squares estimator can be shown not only to be the best linear unbiased estimator, BLUE, but also the best of all unbiased estimators. Bonus!

For that, Fox references, as an example, page 319 in C.R. Rao (1973), Linear statistical inference and its applications, 2nd ed., New York, Wiley.

Thank you.

Graphical residual analyses are your best tool. Here are two links for that from Pennsylvania State University:

https://newonlinecourses.science.psu.edu/stat501/node/277/

and

https://newonlinecourses.science.psu.edu/stat501/node/279/
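As a rough illustration of a residuals-versus-fitted check, here is a minimal sketch. The data are simulated with a spread that grows with x (an assumption made only so that the pattern is visible), and `fit_line`, `residual_table`, and `plot_residuals` are hypothetical helper names. The plotting helper assumes matplotlib is installed and is only defined here, not called.

```python
import random
import statistics

def fit_line(x, y):
    """Return (intercept, slope) of the simple least-squares line."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def residual_table(x, y):
    """Fitted values and estimated residuals (observed minus fitted)."""
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * a for a in x]
    residuals = [b - f for b, f in zip(y, fitted)]
    return fitted, residuals

def plot_residuals(fitted, residuals):
    """Scatter of residuals against fitted values (needs matplotlib)."""
    import matplotlib.pyplot as plt
    plt.scatter(fitted, residuals, s=12)
    plt.axhline(0.0, color="gray", linewidth=1)
    plt.xlabel("Fitted value")
    plt.ylabel("Estimated residual")
    plt.show()

# Simulated example: the error standard deviation is proportional to x.
rng = random.Random(42)
x = [rng.uniform(1, 10) for _ in range(200)]
y = [3.0 + 1.5 * a + rng.gauss(0, 0.4 * a) for a in x]
fitted, residuals = residual_table(x, y)

# Numeric counterpart of the plot: residual spread for low vs high fitted values.
pairs = sorted(zip(fitted, residuals))
half = len(pairs) // 2
low_spread = statistics.pstdev([r for _, r in pairs[:half]])
high_spread = statistics.pstdev([r for _, r in pairs[half:]])
print("residual spread, lower half of fitted values:", round(low_spread, 2))
print("residual spread, upper half of fitted values:", round(high_spread, 2))
```

Calling `plot_residuals(fitted, residuals)` would draw the graph. A fan shape, with the vertical spread widening as the fitted value grows, is the kind of pattern that points toward weighted least squares rather than toward any problem with the distribution of the variables themselves.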

It may also be good, if you have enough data, to save some data not used for model selection or estimating regressor coefficients, and then see how well you would have 'predicted' them. This may help you to avoid overfitting your model to a specific data set.
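A minimal sketch of that hold-out idea, assuming simulated data, an 80/20 split, and a simple straight-line model (all choices made here for illustration; `fit_line` and `rmse` are hypothetical helper names):

```python
import math
import random
import statistics

def fit_line(x, y):
    """Return (intercept, slope) of the simple least-squares line."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def rmse(points, intercept, slope):
    """Root mean squared prediction error over (x, y) pairs."""
    squared = [(b - (intercept + slope * a)) ** 2 for a, b in points]
    return math.sqrt(statistics.fmean(squared))

# Simulated data set of 100 observations (assumed for this example).
rng = random.Random(7)
data = []
for _ in range(100):
    a = rng.uniform(0, 10)
    data.append((a, 2.0 + 0.8 * a + rng.gauss(0, 1.0)))

# Set aside 20 observations that play no part in fitting the model.
rng.shuffle(data)
holdout, train = data[:20], data[20:]

intercept, slope = fit_line([a for a, _ in train], [b for _, b in train])
train_rmse = rmse(train, intercept, slope)
holdout_rmse = rmse(holdout, intercept, slope)
print("RMSE on data used for fitting:", round(train_rmse, 2))
print("RMSE on held-out data:", round(holdout_rmse, 2))
```

A held-out error far larger than the error on the fitting data would suggest the model has been fitted too closely to that particular data set.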

[Also, you do not want to model data together which actually fall under separate models. Does one model really apply to everything, or are there separate categories to be considered? Dummy variables? Should multilevel modeling be considered?]

Thanks for your clarification and valuable information.

Regards,

Mohamed