If a log transformation fails to normalize my data, which is better: applying an inverse (reciprocal) transformation, or applying a log transformation again to the already-transformed data?
The right transformation depends on the type, severity, and direction of the skewness, so each case should be treated differently. Common statistical transformations are listed below (with the source).
However, the best thing a researcher can do about non-normal data is first to ask why the data are not normally distributed. Is it the sample size, the sampling technique, or errors in data collection or entry? Outliers and extreme values are a common cause of skewness, and there are many techniques for handling such issues before resorting to a transformation.
Resolving the problem
Some transformation options are offered below. Before using any of these transformations, determine which transformations, if any, are commonly used in your field of research. These transformations are what you should first use.
Check the data for extreme outliers. Double-check that these outliers have been coded correctly. Extreme outliers may be the result of incorrect data entry (or computation). If you find outliers that were created by incorrect data entry, correct them. You will then want to re-test the normality assumption before considering transformations.
The primary attribute for deciding upon a transformation is whether the data is positively skewed (skewed to right, skew > 0) or negatively skewed (skewed to left, skew < 0).
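As a rough illustration of checking the sign of the skew outside SPSS, here is a minimal Python sketch using only the standard library. The function name `sample_skewness` and both data lists are hypothetical, and the formula is the plain moment-based coefficient; statistical packages often apply a small-sample correction, so their reported value may differ slightly.

```python
def sample_skewness(values):
    """Moment-based skewness m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data: a long right tail, and its mirror image.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 20]
left_tailed = [-v for v in right_tailed]

print("right-tailed skew:", sample_skewness(right_tailed))  # positive
print("left-tailed skew:", sample_skewness(left_tailed))    # negative
```

A positive result points toward the log, square root, or reciprocal options; a negative result points toward the exponential or power options.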
Positively skewed data may be subject to a "floor," where values cannot drop lower (nearly everybody scores near 0% correct on a test). Negatively skewed data may be subject to a "ceiling," where values cannot rise higher (nearly everybody scores near 100% correct on a test).
Skewness may also be discerned from the variable's characteristics across groups. If group means are positively correlated with group variances (or standard deviations), the data may be positively skewed. If group means are negatively correlated with group variances, the data may be negatively skewed.
The secondary attribute to consider is whether the variable contains negative values or zero. Many transformations cannot be applied to negative or zero values. In these cases, a constant is added to the variable before the transformation is applied, chosen so that every value becomes positive (for example, 1 when the smallest value is 0).
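The shifting step can be sketched in Python as follows. This is only an illustration: `shift_to_positive`, its `margin` parameter, and the sample values are assumptions, not part of the original answer.

```python
import math


def shift_to_positive(values, margin=1.0):
    """Add a constant so that the smallest value is at least `margin` (> 0)."""
    offset = margin - min(values)
    # Leave the data alone when it is already at or above the margin.
    offset = max(offset, 0.0)
    return [v + offset for v in values], offset


raw = [-3.0, 0.0, 2.0, 10.0, 45.0]        # hypothetical data with a negative value
shifted, offset = shift_to_positive(raw)  # every value is now >= 1
logged = [math.log10(v) for v in shifted]

print("constant added:", offset)
print("log10 of shifted data:", logged)
```

Report the constant alongside the results, since it is needed to back-transform values to the original scale.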
Logarithmic transformation - Use if:
1) Data have positive skew.
2) You suspect an exponential component in the data.
3) Data might be best classified by orders-of-magnitude.
4) Cumulative main effects are multiplicative, rather than additive.
This transformation cannot be performed on non-positive data. The base of the logarithm is essentially arbitrary (results differ only by a constant multiplicative factor), though the most common bases are e, 10, and 2.
COMPUTE NEWVAR = LG10(OLDVAR) .
COMPUTE NEWVAR = LG10(OLDVAR+1) .
COMPUTE NEWVAR = LN(OLDVAR) .
COMPUTE NEWVAR = LN(OLDVAR+1) .
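The claim that the base is arbitrary can be checked numerically. The short Python sketch below (with hypothetical values) shows that natural-log and base-10 results differ only by the constant factor ln(10), so the shape of the transformed distribution is the same either way.

```python
import math

values = [1.5, 10.0, 250.0, 4000.0]  # hypothetical positive data

log10_vals = [math.log10(v) for v in values]
ln_vals = [math.log(v) for v in values]

# Each ratio equals ln(10), about 2.302585, regardless of the data value.
ratios = [ln / l10 for ln, l10 in zip(ln_vals, log10_vals)]
print(ratios)
```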
Square Root transformation - Use if:
1) Data have positive skew.
2) Data may be counts or frequencies.
3) Data have many zeros or extremely small values.
4) Data may have a physical (power) component, such as area vs. length.
This transformation cannot be performed on negative data.
COMPUTE NEWVAR = SQRT(OLDVAR) .
Reciprocal transformation - Use if:
1) Data have positive skew.
2) Data may have been originally derived by division, or represent a ratio.
The variable should not have values close to zero. This transformation cannot be performed on non-positive values.
COMPUTE NEWVAR = 1 / OLDVAR .
COMPUTE NEWVAR = 1 / (OLDVAR+1) .
Exponential transformation - Use if:
1) Data have negative skew.
2) You suspect an underlying logarithmic trend (decay, attrition, survival ...) in the data.
This transformation can be performed on negative numbers. Depending on the range of values, it can be the most powerful option for reducing negative skew. The choice of exponential base is not trivial: it can affect the characteristics of the transformed variable.
COMPUTE NEWVAR = EXP(OLDVAR) .
COMPUTE NEWVAR = 2 ** OLDVAR .
Power transformation - Use if:
1) Data have negative skew.
2) Data may have a physical (power) component, such as area vs. length.
Usually, data are raised to the second power (squared). Other, higher powers are also possible. The choice of exponent is not trivial. Try to choose a power that reflects an underlying physical reality. Do not apply even powers to data containing negative values: squaring maps negative and positive values onto the same range, so the original ordering is lost.
COMPUTE NEWVAR = OLDVAR ** 2 .
COMPUTE NEWVAR = OLDVAR ** 3 .
Arcsine transformation - Use if:
1) Data are proportions ranging from 0.0 to 1.0, or percentages from 0 to 100.
2) Most data points are between 0.2 - 0.8 or between 20 and 80 for percentages.
This transformation yields radians (or degrees) whose distribution will be closer to normality.
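No syntax is given above for the arcsine transformation, so here is a minimal Python sketch. It assumes the common arcsine square-root form, asin(sqrt(p)), which the text does not spell out; the function name `arcsine_sqrt` and the sample values are hypothetical.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion in [0, 1]; returns radians."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(p))


proportions = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical proportions
percentages = [20, 50, 80]                 # hypothetical percentages

radians = [arcsine_sqrt(p) for p in proportions]
# Percentages must be rescaled to proportions before transforming.
from_percent = [arcsine_sqrt(pct / 100) for pct in percentages]
# Convert to degrees if that unit is preferred for reporting.
degrees = [math.degrees(r) for r in radians]

print(radians)
print(degrees)
```

The transformed values run from 0 to pi/2 radians (0 to 90 degrees).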
I would prefer the log-log transformation. However, the other methods should also be tried until the needed result is achieved. Sometimes the situation calls for a different methodology altogether, rather than concentrating on normalizing the data.
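One way to "try the methods" systematically is to compute the skewness after each candidate transformation and compare. The Python sketch below is only an illustration under stated assumptions: the helper names and data are hypothetical, skewness is the uncorrected moment-based coefficient, and the log-log option requires every value to be greater than 1 so that the second logarithm is defined.

```python
import math


def skewness(values):
    """Moment-based skewness m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def candidates(values):
    """Yield (name, transformed data); assumes every value is > 1."""
    yield "raw", list(values)
    yield "log", [math.log(v) for v in values]
    yield "log-log", [math.log(math.log(v)) for v in values]
    yield "reciprocal", [1.0 / v for v in values]


# Hypothetical, strongly right-skewed data with all values > 1.
data = [2, 3, 3, 4, 5, 6, 8, 12, 30, 90, 400, 2500]

results = {name: skewness(t) for name, t in candidates(data)}
best = min(results, key=lambda name: abs(results[name]))

for name, g1 in results.items():
    print(f"{name:>10}: skewness = {g1:+.3f}")
print("closest to symmetric:", best)
```

Skewness near zero does not by itself establish normality, so the chosen transformation should still be checked with a normality test or a Q-Q plot.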