Using Back transformation

Derek Ouyang

Hi,

I usually don't back-transform, because the log transformation at the beginning shrinks the differences among your observations and therefore shrinks the SD as well, so when you transform back you may not get what you expected. I would just interpret the result on the log scale, i.e. "the log of x is such-and-such". But I guess different people have different ways of dealing with it.

Sergiy Prykhodko

Dear Timothy,

We use normalizing transformations and the back transformations (the transformations are inverse to the normalizing transformations) for solving some problems, for example, anomaly detection in non-Gaussian data. I hope the link below will help to solve your problem.

https://www.researchgate.net/publication/282329599_Statistical_anomaly_detection_techniques_based_on_normalizing_transformations_for_non-Gaussian_data

In addition, as the normalizing transformation, we recommend using the Johnson transformation (the Johnson translation system), which we have applied to many problems.

Mehmet Guven Gunver

@Yongdong,

You are right that different people have different ways to deal with transformation.

How about a new approach that provides new descriptive statistics instead of a transformation?

Article TO DETERMINE SKEWNESS, MEAN AND DEVIATION WITH A NEW APPROAC...

Robert L. Vadas

I've not had problems doing back-transformations, for the purpose of data presentation. The main issue has been getting the formulas (to do so) right.

-Bob

Timothy A Ebert

@Bob, I used a log transformation. What is the right formula to back-transform? My simple attempt seemed to fail. The question could be asked in a different way. I am going out to the field to measure fruit weight. The only help I have is from a single paper that reported back-transformed (log (fruit weight)) values. I gather my fruit weights, find the average and discover that the published value and my value are not very similar. Is the published value wrong?

If you used a log10(x) transformation, then the back-transformation is 10**(x). But if you used ln(x), then go w/ e**(x).

Re: the published back-transformed values, such mean values won't match the untransformed mean unless the data are perfectly Gaussian ('normal'), b/c you transform the data before taking the average. Hence, the back-transformed means should be less biased than if no transformation was used. Did I answer your question?

-Bob Vadas, Jr.

Oops, I need to be more exact in my symbology. Call the transformed value y. So if you used a log10(x) transformation, then the back-transformation is 10**y. But if you used ln(x), then go w/ e**y.

It gets trickier if you used a log10(x+1) transformation, b/c the back-transformation is (10**y) - 1. Likewise, if you used ln(x+1), then go w/ (e**y) - 1.
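The four back-transformations above can be checked with a quick round trip. This is a minimal sketch in Python, not anything from the thread itself: the helper names `back_log10` and `back_ln` and the value 42.0 are made up for illustration.

```python
import math

def back_log10(y, offset=0.0):
    """Invert y = log10(x + offset): x = 10**y - offset."""
    return 10 ** y - offset

def back_ln(y, offset=0.0):
    """Invert y = ln(x + offset): x = e**y - offset."""
    return math.exp(y) - offset

x = 42.0  # hypothetical measurement, for illustration only

# Transform, then back-transform, for each of the four cases in the post.
round_trips = [
    back_log10(math.log10(x)),         # log10(x)     -> 10**y
    back_ln(math.log(x)),              # ln(x)        -> e**y
    back_log10(math.log10(x + 1), 1),  # log10(x + 1) -> (10**y) - 1
    back_ln(math.log(x + 1), 1),       # ln(x + 1)    -> (e**y) - 1
]

for value in round_trips:
    print(value)  # each should recover x up to floating-point error
```

The offset has to be subtracted after exponentiating, not before. That ordering is the usual place where the formula goes wrong.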

Could you elaborate on the bias that the back transformed values have less of?

Could you please elaborate on the nature of the bias that you refer to?

If a parameter isn't normally distributed, then the untransformed ave. will be biased, b/c that violates the Gaussian assumption when you calculate an arithmetic mean. Let's say that there's 1 high outlier, then such a mean will overestimate the central tendency. But if you take a log transformation, then the back-transformed mean will be lower in value & not overly influenced by the high outlier. Does that make sense? Feel free to try this out on data sets at home!
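For anyone who wants to try this at home, here is a small Python sketch using only the standard library. The fruit weights are invented for illustration, with one deliberately high value. Note that the back-transformed mean of the logs is the geometric mean, which for positive data is never larger than the arithmetic mean. That is one reason a published back-transformed mean and a plain average of field measurements can disagree without either being wrong.

```python
import math
import statistics

# Hypothetical fruit weights with one high outlier (illustration only).
weights = [10.0, 12.0, 11.0, 13.0, 9.0, 100.0]

# Plain arithmetic mean on the original scale.
arithmetic_mean = statistics.mean(weights)

# Mean on the log scale, then back-transformed with e**y.
log_mean = statistics.mean([math.log(w) for w in weights])
back_transformed_mean = math.exp(log_mean)

# The same quantity computed directly as a geometric mean.
geometric_mean = statistics.geometric_mean(weights)

# The median is shown only as another outlier-resistant reference point.
median = statistics.median(weights)

print("arithmetic mean:      ", arithmetic_mean)
print("back-transformed mean:", back_transformed_mean)
print("geometric mean:       ", geometric_mean)
print("median:               ", median)
```

The back-transformed mean lands well below the arithmetic mean and closer to the bulk of the data, which is the behaviour described above.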

With 1 high outlier I have a few options. It is a mistake and should be deleted. It is a real value and my sample size is insufficient to properly estimate the variability in the data. In this case the data may be Gaussian but I have a poor estimate of the true mean because of small sample size. Possibly I have a large sample size and an arithmetic mean is a poor predictor of the central tendency of the data. I am having trouble with RG, so I will need to return to this in a day or two when I hope the problems will have been solved. I'll have a data set at that point.

Well, you could do a Q-test to objectively determine if the data pt. is a real outlier. But if it isn't, then log transformation reduces statistical bias and makes use of parametric methods reasonable. If there's a general skewing of the data, then more than one data pt. may be outlying & the Q-test won't find a significant outlier. Then the data just aren’t Gaussian in distribution.
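As a rough sketch of the Q-test idea, the statistic in Dixon's Q test is the gap between the suspect value and its nearest neighbour, divided by the full range of the data. The function name `dixon_q` and the data below are hypothetical. The critical value for a given sample size and confidence level is not reproduced here and would have to come from a published table.

```python
def dixon_q(values):
    """Return Dixon's Q (gap / range) for the more extreme end of the data."""
    data = sorted(values)
    if len(data) < 3:
        raise ValueError("need at least 3 observations")
    data_range = data[-1] - data[0]
    if data_range == 0:
        raise ValueError("all values are identical")
    q_low = (data[1] - data[0]) / data_range     # suspect value is the minimum
    q_high = (data[-1] - data[-2]) / data_range  # suspect value is the maximum
    return max(q_low, q_high)

# Hypothetical data with one high value (illustration only).
sample = [10.0, 12.0, 11.0, 13.0, 9.0, 100.0]
q = dixon_q(sample)
print("Q =", q)  # compare against a tabulated critical value for n = 6
```

The test is designed for a single suspect point. With general skew, several large values sit close together, the gap stays small, and Q does not flag anything, which matches the caveat above.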

Mthunzi Mndela

I came across the same problem when I transformed different data sets using ln and log10. When I back-transformed the means using e^x (where e = 2.718...) and 10^x, the mean values tended to be lower than the means of the original data. In this case I used the back-transformed means, although they do not usually mirror the original data. The main reason is that in most cases the standard errors decline after transformation compared with those of the original data, and even after back-transformation they remain smaller than those of the original data, indicating well-dispersed data.
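One caution on back-transforming standard errors: an SE computed on the log scale is not in the original units, so exponentiating it on its own gives a multiplicative factor rather than a plus-or-minus value. A common alternative, sketched below in Python under that assumption, is to form mean ± SE on the log scale and back-transform the endpoints. The data values are invented for illustration.

```python
import math
import statistics

# Hypothetical positive measurements (illustration only).
values = [12.0, 15.0, 9.0, 30.0, 18.0]

# Work on the natural-log scale.
logs = [math.log(v) for v in values]
n = len(logs)
log_mean = statistics.mean(logs)
log_se = statistics.stdev(logs) / math.sqrt(n)  # SE of the mean, log scale

# Back-transform the centre and the endpoints, not the SE itself.
center = math.exp(log_mean)
lower = math.exp(log_mean - log_se)
upper = math.exp(log_mean + log_se)

print("back-transformed mean:", center)
print("interval:", lower, "to", upper)
print("distance below:", center - lower, "distance above:", upper - center)
```

The resulting interval is asymmetric around the back-transformed mean, wider above than below, so it cannot be summarised honestly as a single ± number on the original scale.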