How does one carry out multiple imputation of a categorized continuous variable? Recategorize after imputing the continuous values, or impute directly into categories?

25 August 2016

I am trying to impute a continuous variable before running a logistic regression and am having trouble with it. Multiple imputation of the continuous variable works perfectly. However, when I categorize it after imputation but before adding it to the logistic model, my results change. There appears to be a significant loss of power, such that my highly significant variable falls out of significance. For instance, when I impute blood sugar values and then run a logistic model against a binary outcome, the parameter estimates are very comparable for all covariates, including blood sugar, with or without imputed data. However, when I categorize blood sugar into 4 categories in the imputed dataset and then run the model, the variable falls out of significance. If I run the logistic model with blood sugar categories but excluding patients with missing values (pre-imputation), I get very plausible, logical results (comparable to results from an entirely different dataset). Is it wrong to categorize the imputed variable in the imputed dataset?
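For readers following the workflow in the question, one thing worth checking is that the categorization and the model fit are done separately within each imputed dataset, and that the resulting coefficients are then combined with Rubin's rules rather than fitted once on a stacked dataset. The post does not say how the results were pooled, so this is only a minimal sketch of that pooling step. The cut-points, the coefficients and the standard errors below are hypothetical illustration values, not figures from the question.

```python
from bisect import bisect_right
from math import sqrt
from statistics import mean, variance


def categorize(value, cutpoints):
    """Return the 0-based category index of value for ascending cut-points."""
    return bisect_right(cutpoints, value)


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    w = mean(variances)            # within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b        # total variance
    return q_bar, sqrt(t)


# Hypothetical cut-points giving 4 categories (assumed, not from the post).
CUTPOINTS = [100, 126, 200]
imputed_values = [92.0, 126.0, 310.5]
categories = [categorize(v, CUTPOINTS) for v in imputed_values]

# Hypothetical log-odds and standard errors for one category dummy,
# one pair per imputed dataset (m = 5).
estimates = [0.50, 0.70, 0.60, 0.40, 0.80]
std_errors = [0.20, 0.20, 0.20, 0.20, 0.20]
pooled_est, pooled_se = pool_rubin(estimates, [se ** 2 for se in std_errors])
print(categories, round(pooled_est, 3), round(pooled_se, 3))
```

In this made-up example the pooled standard error is larger than any single-dataset standard error, because the between-imputation spread is added to the within-imputation variance. If the category coefficients vary a lot from one imputed dataset to the next, that alone can widen the pooled interval.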

James R Knaub

Shveta -

Imputing within categories would help make nonignorable nonresponse more "ignorable"; that is, you would account better for the mechanism of nonresponse. If you are having a "significance" 'problem,' please note that p-values are sample size dependent, and your smaller sample sizes by category, if I understand your problem correctly, are what is giving you bigger p-values. That is why you need to pay attention to effect size, or a type II error analysis, and not set the same "significance" threshold for every case. (Note that confidence intervals and prediction intervals are less open to serious misinterpretation.)
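The point about p-values depending on sample size can be illustrated with a small sketch. It assumes a simple Wald-type z-test in which the standard error shrinks as 1/sqrt(n). The effect size, the scale constant and the two sample sizes are hypothetical and chosen only to show the pattern; they are not taken from the question.

```python
from math import erfc, sqrt


def two_sided_p(estimate, std_error):
    """Two-sided p-value for a z statistic under a standard normal reference."""
    z = abs(estimate) / std_error
    return erfc(z / sqrt(2))


def p_for_sample_size(effect, scale, n):
    """p-value for a fixed effect when the standard error is scale / sqrt(n)."""
    return two_sided_p(effect, scale / sqrt(n))


EFFECT = 0.5   # hypothetical coefficient, identical in both scenarios
SCALE = 4.0    # hypothetical per-observation scale (assumed)

p_large = p_for_sample_size(EFFECT, SCALE, 400)  # e.g. all patients in one group
p_small = p_for_sample_size(EFFECT, SCALE, 100)  # e.g. one of four categories
print(round(p_large, 4), round(p_small, 4))
```

The estimated effect is the same in both calls; only n changes, yet the p-value crosses the usual 0.05 threshold. This is why the effect size and its interval are more informative than the significance label alone.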

Please see the following:

Press release for the American Statistical Association:

http://www.amstat.org/newsroom/pressreleases/P-ValueStatement.pdf

My letter in The American Statistician:

https://www.researchgate.net/publication/262971440_Practical_Interpretation_of_Hypothesis_Tests_-_letter_to_the_editor_-_TAS

Hope I did not misunderstand your question.

Cheers - Jim

PS - Actually, however, even though you are using multiple imputation to avoid artificially reducing variance, it still depends on the original data, so the standard errors you estimate for the above should not include imputed data, as that would still artificially increase the sample size. Breaking up by categories is good. That helps with "representativeness," but too much missing data is still a problem. If you are imputing for multiple independent variables whose missing cases differ in number and/or identity, and doing multiple imputation on each, so that for a given "i" you sometimes have one regressor but not another, you might look at the prediction interval with and without imputed regressors to get bounds on the 'true' prediction interval for each category. That is what I think I would do with ordinary or weighted least squares regression. I suppose the same would apply analogously to logistic regression, though I have not worked with it.

