What are the statistical ethics in scientific research regarding outliers? Should we remove or include outlier data, for example when calculating ionic currents?
Also, what other methods are there to refine data in electrophysiology?
I agree with Dr. Refik Kanjhan that outliers must be treated with extreme care. If it can be determined that an outlying point is in fact erroneous, then the outlying value should be deleted from the analysis (or corrected if possible).
In some cases, it may not be possible to determine if an outlying point is bad data. Outliers may be due to random variation or may indicate something scientifically interesting. In any event, we typically do not want to simply delete the outlying observation. However, if the data contains significant outliers, we may need to consider the use of robust statistical techniques.
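As a minimal illustration of why robust summaries are often preferred when an outlier cannot be ruled out as bad data, compare the mean and the median on a small sample. The current-density values below are made up for the example; the point is only that the median barely moves while the mean is dragged toward the extreme value:

```python
import statistics

# Hypothetical peak current densities (pA/pF); the last value is a suspected outlier.
currents = [48.2, 51.7, 49.9, 50.4, 52.1, 47.8, 50.9, 49.5, 51.2, 5.0]

mean = statistics.mean(currents)
median = statistics.median(currents)  # robust to a single extreme value

print(f"mean:   {mean:.2f}")    # 45.67 - pulled strongly toward the outlier
print(f"median: {median:.2f}")  # 50.15 - barely affected
```

The same logic applies to robust alternatives for dispersion (median absolute deviation instead of the standard deviation) and for regression (rank-based or M-estimators instead of ordinary least squares).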
It first depends on what makes a value an "outlier":
if you know that the observation is faulty, the value must be discarded (even if it looks OK).
if the value is logically/physically/biologically/physiologically very implausible or even impossible, the value must be discarded.
if the value is a bit off from the rest of the values, but neither of the former options applies, it will further depend on the impact of the value on the analysis, and this in turn depends largely on the sample size. If the impact is large, it might be advisable to report and compare the analyses with and without this suspicious value.
After all, "outliers" (values with no hint of a faulty observation, no hint of implausibility, but still far away from the rest) should be rare. If they are not rare, then there might be a fundamental problem with the experiment, the data handling, or the interpretation (e.g. the distribution model of the variables is not as expected; log-normal and Cauchy-distributed variables "seem" to have many outliers when you expect something like a normal distribution). In this case you should not think about what to do with the "outliers" - you should rather think about the experiment, the data handling, and the interpretation of the variables.
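The point about heavy-tailed distributions can be checked with a quick simulation. This sketch uses Tukey's 1.5×IQR fences as a crude outlier rule (the seed, sample size, and quartile approximation are arbitrary choices): a Cauchy sample flags far more "outliers" than a normal sample of the same size, even though every point is a perfectly legitimate draw from its distribution.

```python
import math
import random

random.seed(0)

def n_tukey_outliers(xs):
    """Count points outside Tukey's 1.5*IQR fences (rough quartiles by index)."""
    xs = sorted(xs)
    q1 = xs[len(xs) // 4]
    q3 = xs[3 * len(xs) // 4]
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return sum(x < lo or x > hi for x in xs)

n = 10_000
normal = [random.gauss(0, 1) for _ in range(n)]
# Cauchy draws via the inverse CDF: tan(pi * (u - 0.5))
cauchy = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]

n_out_normal = n_tukey_outliers(normal)
n_out_cauchy = n_tukey_outliers(cauchy)
print(f"flagged in normal sample: {n_out_normal}")  # a fraction of a percent
print(f"flagged in Cauchy sample: {n_out_cauchy}")  # a much larger fraction
```

If your data routinely look like the second case, the right response is to question the assumed distribution (or the experiment), not to delete points.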
Rare outliers are not a problem statistically. It happens that one gets "outliers"; this is almost inevitable if one only looks at enough data (makes many experiments, many measurements...). Given that the distributional model of the variable is correct, the appropriate statistical methods do consider the existence of such outliers. So there is no need to remove them. It is actually wrong to exclude them, because - statistically - this would lead to an underestimation of the variance, standard errors, confidence intervals, and p-values. This may only be recognizable (and irritating) in small data sets. In large data sets, however, rare outliers won't have any considerable impact anyway.
The irritating fact in small data sets is that the existence of "outliers" often seems to "disturb" the message. People are inclined to remove them because the standard errors are then smaller, everything looks nicer, the effects are clearer. However, given the correctness of the assumed distributional model, having large standard errors in this experiment is the price one has to pay to not under-estimate the standard errors generally. If one excluded such values, this particular experiment would look better, but the procedure ("let's have a look and throw out values that look like outliers") will bring about standard errors that are too small on average.
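That last point can be demonstrated with a small simulation: repeatedly draw small normal samples, routinely discard the point that "looks most like an outlier" (the one farthest from the sample mean), and compare the average standard deviation with the full-sample one. The sample size, number of repetitions, and seed are arbitrary choices:

```python
import random
import statistics

random.seed(1)

true_sd = 1.0
n_experiments = 2000
sample_size = 10

sds_full, sds_trimmed = [], []
for _ in range(n_experiments):
    xs = [random.gauss(0, true_sd) for _ in range(sample_size)]
    sds_full.append(statistics.stdev(xs))
    # "Let's throw out the value that looks most like an outlier":
    m = statistics.mean(xs)
    xs_trimmed = sorted(xs, key=lambda x: abs(x - m))[:-1]
    sds_trimmed.append(statistics.stdev(xs_trimmed))

print(f"average SD, all data kept:           {statistics.mean(sds_full):.3f}")
print(f"average SD after routine trimming:   {statistics.mean(sds_trimmed):.3f}")
```

The trimmed estimate is systematically smaller than the true spread: each individual experiment "looks nicer", but the reported precision is too optimistic on average.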
Outliers must be treated with extreme care, as they can range from being very important to being an artefact. First of all, one has to be sure that the outlier(s) are not an artefact for whatever reason. If the outlier is not an artefact, then the sampling (n = ?) becomes very important. For example, by increasing the number of data points you may discover a new population or subpopulation. I believe outliers should not be included in the same population if they change the values significantly, but should rather be mentioned separately in the manuscript, with an explanation of why that data was not included. History is full of examples where one scientist's artefact has become another scientist's fame.
The cells are good, and a current of 5 pA/pF is physically possible. Why should you want to remove this value?
Removing this value would mean that you are willing to throw away 10% of your data. If you believe that 1 out of 10 values is really "bad" (not just accidentally far away from the clumping rest), then I would doubt that your experiment is reliable.
It does NOT look far away from the others when the squared values are used, as you can see from the attached normal-quantile-plots of your data.
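The attached plots and the underlying recordings aren't reproduced here, but a normal-quantile plot of this kind can be sketched in plain Python. The values below are hypothetical stand-ins for the real data: each sorted observation is paired with its theoretical standard-normal quantile, so normal data fall on a straight line and suspicious points show up as departures from it.

```python
import statistics

# Hypothetical current densities (pA/pF); 5.0 stands in for the suspect value.
currents = [48.2, 51.7, 49.9, 50.4, 52.1, 47.8, 50.9, 49.5, 51.2, 5.0]

def normal_quantile_coords(xs):
    """Pair each sorted observation with its theoretical standard-normal quantile."""
    nd = statistics.NormalDist()
    xs = sorted(xs)
    n = len(xs)
    # Plotting positions (i + 0.5) / n mapped through the inverse normal CDF.
    return [(nd.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]

coords = normal_quantile_coords(currents)
for theo, obs in coords:
    # A straight line for normal data; a lone extreme point breaks the pattern.
    print(f"{theo:6.2f}  {obs:6.1f}")
```

Running the same helper on a transformed version of the data (e.g. squared values) is a quick way to check whether a point that looks extreme on one scale is unremarkable on another.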
So the interesting question is: might it make sense that the errors in your observations are related to the squared current? That requires some more understanding of the subject matter. It may really provide a new view or new insights. If this turns out to be interesting, then this poor little value you considered kicking out provided the only valuable information about all that!
Statistics provide tools to make sense of data - and that requires thinking. Statistics is a waste if used to justify removing outliers to perform "standard procedures".
Are there any other hints? Is it due to the size of the cell, or just less Na current for a cell of the same size (pF)? Most likely the resting membrane potential of this cell (neuron?) is very depolarized, close to the level of inactivation of voltage-gated Na channels. One possibility is that this cell/neuron may be a young, developing/immature/newly differentiated neuron/cell. Another possibility is that it is not a Na current but rather a Ca current - do you have a pharmacological block? A 10-fold reduction in Na current for a healthy, normally developed neuron is a lot and very unusual; the amplitude of the action potential will probably be less than 10 mV, which is unlikely to pass any information to the postsynaptic cell. If that is the case, then I would not include this value in the analysis, but I would mention and explain it in the text (if it is not due to a depolarized resting membrane potential).
Actually, I work on cardiac cells. The cells with the low value look fine, and so does the recording. I was just wondering: if there are one or two values that are very small or very large compared to the rest of the data, should we include them or remove them?
Certainly, removing or including those very small or very large values affects the mean and the statistical significance, so I am confused about that data.