24 May 2021

I have proteomics data with missing values. What is the best way to impute them if my data are not normally distributed?

I have 4 treatment groups. In some scenarios one treatment group has proteins that no other group has, and in others one group is completely missing proteins that the other 3 groups have.

I also have scenarios where, for a given protein, 7/10 of the samples in one group (let's call it sample A) have values, while 7/10 of the samples in another group (sample B) have missing values. How do you impute this?

Is it valid to say that sample B has no protein detection since 7/10 samples are NaN and the other 3 samples were random hits?

If so, is there a way to impute the missing values in sample A using the average or a normal distribution of ONLY that group (rather than the whole data set), and to set all the values in sample B to 0?
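To make the rule above concrete, here is a minimal Python sketch using pandas and NumPy. It is an illustration of the question, not a recommended method: the function name `impute_within_groups`, the `min_observed` cutoff of 0.7, the `use_normal` switch, and the layout (proteins as rows, samples as columns) are all assumptions. Drawing from a normal distribution also assumes the values within a group are roughly normal, for example after a log transform, which may not hold for this data.

```python
import numpy as np
import pandas as pd


def impute_within_groups(data, groups, min_observed=0.7, use_normal=True, seed=0):
    """Impute each protein separately within each treatment group.

    data:   DataFrame, proteins as rows, samples as columns, NaN = missing.
    groups: dict mapping each sample (column name) to its group label.
    """
    rng = np.random.default_rng(seed)
    out = data.astype(float)          # astype returns a copy, input is untouched
    group_of = pd.Series(groups)

    for label in group_of.unique():
        cols = group_of.index[group_of == label]
        block = data[cols].astype(float)
        observed_frac = block.notna().mean(axis=1)

        for protein in block.index:
            row = block.loc[protein]
            missing = row.isna().to_numpy()

            if observed_frac.loc[protein] >= min_observed:
                # Mostly detected: fill gaps from this group's own values only.
                if missing.any():
                    mu = row.mean()
                    sd = row.std(ddof=1)
                    if not use_normal or not np.isfinite(sd):
                        sd = 0.0      # fall back to plain group-mean imputation
                    draws = rng.normal(mu, sd, size=int(missing.sum()))
                    out.loc[protein, cols[missing]] = draws
            else:
                # Mostly missing: treat as not detected (assumption, see above).
                # This also overwrites the few observed "random hits" with 0.
                out.loc[protein, cols] = 0.0
    return out
```

Calling `impute_within_groups(df, groups)` on a table with 10 samples per group would fill the 3 gaps in sample A from sample A's own mean and standard deviation and zero out sample B. Whether zero is a defensible stand-in for "not detected", and whether the cutoff should be 0.7, is exactly what the question above is asking.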

Any suggestions/links/tutorials would be greatly appreciated!
