Linear Regression For Missing Value Imputation?

Miranda Mortlock

Imputation is a big area, and in general I do not impute values but leave them missing. It sounds like you have many variables, so why not use the actual data? Then there is no doubt about any bias introduced by imputation.

When you do a multivariate regression, you are assuming the variables are normally distributed. The variables may also be of different magnitudes or units. Using normalized values addresses both points: z-scores put all variables on an equal footing, so to speak, with a mean of zero and a variance of 1, and any extreme values are easy to spot. Multicollinearity is a separate issue, and for this you can look at the correlation matrix of the variables.
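To make the two checks above concrete, here is a minimal sketch in plain Python that standardizes each variable to z-scores and builds a correlation matrix. The variable names and numbers are made-up toy data, not taken from the question, and the helper functions `z_scores`, `pearson`, and `correlation_matrix` are illustrative names only.

```python
import statistics

def z_scores(values):
    """Standardize a list of numbers to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def pearson(x, y):
    """Pearson correlation: average product of z-scores (n - 1 in the denominator)."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical toy data: three variables on very different scales.
data = {
    "income": [32000, 45000, 51000, 60000, 75000, 98000],
    "age": [23, 31, 35, 42, 50, 61],
    "score": [7.1, 6.4, 6.9, 5.8, 6.1, 5.2],
}

# After standardizing, every column has mean 0 and variance 1.
standardized = {name: z_scores(col) for name, col in data.items()}

# Off-diagonal entries close to +1 or -1 flag possible multicollinearity.
corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In this toy data, income and age are almost perfectly correlated, which is the kind of pair the correlation matrix is meant to expose.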

Harold Chike

I agree with Mortlock on both issues raised. However, if the available data are scanty, you may not wish to ignore the data on the missing values. I do not see why the substitute input should be the most correlated variable. It could just as well be the least correlated. You will end up using the mean value. As for the issue of using z values, it is normal practice to transform the different units of research variables into one unitless measure. The z values have no units. That is the only way you can add, subtract, divide, or multiply them and obtain a result that is unitless and consistent.

Thom Baguley

Using linear regression to impute missing data isn't a great idea. If you do it, I would use all variables to predict the missing items, not just the most correlated predictors.

A better approach is to use multiple imputation or full information maximum likelihood to handle the missing items, as these procedures allow you to incorporate information from multiple variables and also take into account the error inherent in imputation (ignoring that error leads you to underestimate noise in the data and inappropriately inflates the sample size).

Chapter Dealing with missing data. Online Supplement 2 to Serious st...
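The point about underestimated noise can be illustrated with a small simulation. The sketch below is a simplified illustration on simulated data, not the procedure from the chapter above: it compares deterministic regression imputation with repeated stochastic imputations that add residual noise. Proper multiple imputation also draws the regression parameters themselves and pools standard errors, which dedicated software (for example the `mice` package in R) handles. The function names `fit_line` and `impute` are made up for this example.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    resid_sd = (sum(r * r for r in residuals) / (len(x) - 2)) ** 0.5
    return a, b, resid_sd

def impute(x, y, rng=None):
    """Fill None entries of y using x. If rng is given, add residual noise."""
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    a, b, resid_sd = fit_line([p[0] for p in observed], [p[1] for p in observed])
    filled = []
    for xi, yi in zip(x, y):
        if yi is None:
            noise = rng.gauss(0, resid_sd) if rng else 0.0
            yi = a + b * xi + noise
        filled.append(yi)
    return filled

# Simulated data (an assumption for illustration): y depends on x plus noise,
# and about 40% of the y values are deleted completely at random.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
y_full = [2 + 0.5 * xi + rng.gauss(0, 1) for xi in x]
y_miss = [None if rng.random() < 0.4 else yi for yi in y_full]

# Deterministic regression imputation: imputed points lie exactly on the line.
deterministic = impute(x, y_miss)

# 20 stochastic imputations, each analysed separately, then averaged.
imputations = [impute(x, y_miss, rng) for _ in range(20)]
pooled_mean = statistics.fmean([statistics.fmean(imp) for imp in imputations])
var_det = statistics.variance(deterministic)
var_stoch = statistics.fmean([statistics.variance(imp) for imp in imputations])

print("variance, complete data:           ", round(statistics.variance(y_full), 3))
print("variance, deterministic imputation:", round(var_det, 3))
print("variance, stochastic imputations:  ", round(var_stoch, 3))
```

Because the deterministic fill-ins carry no residual scatter, the completed variable looks less noisy than it really is, while the analysis still treats all 500 rows as if they were observed.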

Giovanni Di Franco

Hi Shivi,

Missing data are a typical problem in social research, both when the unit of analysis is the individual and the data are collected through surveys, and when the unit is territorial and the data are drawn from national and/or international statistical sources.

In addition to the atomistic assumption, data matrices also require completeness. To meet this assumption, information must be available for all the cases in the matrix on all the properties/variables for which information was originally collected. Missing data may depend on a variety of factors, which are either random or systematic in nature. In the case of missing data due to random factors, we can identify two instances: (a) completely random, and (b) partially random.

I suggest you read my article on the treatment of missing data, which I have attached.

Greetings

Shivi Bhatia

Thank you all for the excellent and very helpful answers above. I really appreciate it.