Is there a way to compare the multiple imputation methods used in R software to impute missing values?

Dan E Kelley

Could you define "impute"? You use that word in many of your questions, and the meaning seems to differ from question to question. You may get more answers if people understand what you want to know. It is a good idea to be as specific as possible; in this question, for example, I haven't the foggiest idea what you may mean, no matter what possible synonym I try for "impute".

Farideh Bagherzadeh Khiabani

I mean replacing the missing values with proper substitutions!

For example, you can use the mean of a variable in place of the missing values in that variable. Many other methods are available through R functions such as irmi() and kNN(). I need a way to compare these methods to see which one is the best for my data.

As one way to compare these methods, I extracted the complete records from my data. I created missing values randomly in this data and then imputed them with several methods. Then, for each variable, I subtracted the real values from the imputed ones and drew a box plot of the differences to get an idea of which method gives better approximations of my data!
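The procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the built-in `airquality` data stands in for the real data, only one variable (`Ozone`) is masked, and the three imputers (`impute_mean`, `impute_median`, `impute_lm`) are hypothetical base-R stand-ins rather than the methods actually used in the thread. Functions such as kNN() or irmi() could be wrapped in the same way.

```r
set.seed(1)

# Complete records only (airquality is a stand-in for the real data)
complete <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# Remove 20% of the Ozone values completely at random
target   <- "Ozone"
n_miss   <- round(0.2 * nrow(complete))
miss_idx <- sample(nrow(complete), n_miss)
holed    <- complete
holed[miss_idx, target] <- NA

# Hypothetical stand-in imputers: each takes a data frame and
# returns it with the missing target values filled in
impute_mean <- function(d) {
  na <- is.na(d[[target]])
  d[[target]][na] <- mean(d[[target]], na.rm = TRUE)
  d
}
impute_median <- function(d) {
  na <- is.na(d[[target]])
  d[[target]][na] <- median(d[[target]], na.rm = TRUE)
  d
}
impute_lm <- function(d) {
  # lm() drops the rows with a missing response by default
  fit <- lm(Ozone ~ Solar.R + Wind + Temp, data = d)
  na  <- is.na(d[[target]])
  d[[target]][na] <- predict(fit, newdata = d[na, ])
  d
}

methods <- list(mean = impute_mean, median = impute_median,
                regression = impute_lm)

# Imputed minus true value at the positions that were masked;
# the result is a matrix with one column per method
errors <- sapply(methods, function(f) {
  f(holed)[miss_idx, target] - complete[miss_idx, target]
})

# Box plot of the differences, one box per method
boxplot(errors, ylab = "imputed - true",
        main = "Imputation error by method")
abline(h = 0, lty = 2)

# A single-number summary to go with the plot
rmse <- sqrt(colMeans(errors^2))
print(rmse)
```

Repeating the masking step with several seeds, and for each variable in turn, would show whether the ranking of methods is stable or depends on which values happened to be removed.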

There is a problem with this approach, though: the missing values I created here are completely at random, which might not be the case in my real data!
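One way to probe that concern is to remove values with a probability that depends on another observed variable, instead of uniformly at random. The sketch below is a minimal illustration, assuming the built-in `airquality` data as a stand-in and an arbitrary logistic rule (the coefficients `-1.5` and `1.5` are made up for the example) under which `Ozone` is more likely to be missing on warmer days.

```r
set.seed(2)

# Complete records only (airquality is a stand-in for the real data)
complete <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# Assumed mechanism: the probability that Ozone is missing
# increases with the (standardised) temperature
z_temp  <- as.numeric(scale(complete$Temp))
p_miss  <- plogis(-1.5 + 1.5 * z_temp)
is_miss <- runif(nrow(complete)) < p_miss

holed <- complete
holed$Ozone[is_miss] <- NA

# Share of masked values, and mean temperature of the
# masked versus unmasked rows
print(mean(is_miss))
print(tapply(complete$Temp, is_miss, mean))
```

The resulting `holed` data frame can be imputed with each candidate method and compared against the true values in `complete`, which shows whether a method that does well under purely random masking still does well when the missingness depends on the data.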

I want to see if anyone can come up with a better idea so that I can figure out which method of imputation is the best for my data!

Thanks for the clarification. You now probably have enough information for someone to help you. Good luck!

John Christie

This thread will turn into a mess if it gets too long because tracking that first second will be problematic. So, it would be best if you could edit or delete the question and re-ask.

The imputed values are not completely at random. There has to be noise in your data. That part is random, and you have to include it in the imputed values. But the random value will be constrained by features of the model you apply, which comes back to your question. Those different imputation methods follow different models of the data. They're not just different ways to get numbers; they express various things about what you think the data mean and how they can be modelled or represented more simply. Therefore, it's hard to quantify "best" in terms of measurement.

Perhaps you really need to ask, "Given research design X, missing values Y, and model M, what would be the best way to impute missing values?"

I'm not sure if I'm right. But I did achieve what I was looking for with these charts.