It is important that replicate data always be examined for the presence of outliers. Where one is found, the outlier should be removed before determining the number of results (N), the mean (X), the standard deviation (SD), etc.

While working on replicate data recently, I observed that for any data set with N = 3, provided that two of the values/results are exactly the same, the third value/result will by default be flagged as an outlier, regardless of how far above or below the other two values it lies.

Let's say we have the following replicates from a given set of analyses/measurements:

A) 3.0, 3.0, 6.0

B) 3.0, 3.0, 1.0

C) 3.0, 3.0, 2.9

Now, in data set A, it is obvious by mere inspection that 6.0 is an outlier, and similarly in B, 1.0 is an outlier. This can also be confirmed using Dean and Dixon's test (Q-test).

In replicate C, 2.9 may not seem to be an outlier because it is so close in value to 3.0. In other words, the difference between 3.0 and 2.9 does not look significant, and thus 2.9 might be considered not an outlier. However, checking 2.9 with the Q-test flags it as an outlier as well.
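To make the observation concrete, here is a minimal sketch of the Q-test applied to the three data sets. It assumes the usual form of the statistic, Q = gap / range, where the gap is the distance from the suspect value to its nearest neighbour. The function name `dixon_q` is my own, and the critical values for N = 3 are taken from commonly published tables, so please treat them as an assumption and check them against your own reference. When two of the three values are identical, the gap equals the range, so Q is exactly 1 no matter what the third value is.

```python
# Critical Q values for N = 3 (assumed from commonly published tables;
# verify against the table you normally use).
Q_CRIT_N3 = {0.90: 0.941, 0.95: 0.970, 0.99: 0.994}


def dixon_q(values):
    """Return (suspect, Q) for the most extreme value in a small sample.

    Q = gap / range, where gap is the distance between the suspect
    value and its nearest neighbour in the sorted data.
    """
    data = sorted(values)
    spread = data[-1] - data[0]
    if spread == 0:
        raise ValueError("all values are identical; Q is undefined")
    gap_low = data[1] - data[0]      # gap if the smallest value is the suspect
    gap_high = data[-1] - data[-2]   # gap if the largest value is the suspect
    if gap_low > gap_high:
        return data[0], gap_low / spread
    return data[-1], gap_high / spread


replicates = {
    "A": [3.0, 3.0, 6.0],
    "B": [3.0, 3.0, 1.0],
    "C": [3.0, 3.0, 2.9],
}

for label, reps in replicates.items():
    suspect, q = dixon_q(reps)
    flagged = q > Q_CRIT_N3[0.95]
    print(f"{label}: suspect = {suspect}, Q = {q:.3f}, flagged at 95% = {flagged}")
```

In all three cases Q comes out as 1.000, which is above every critical value in the table, so the third value is flagged whether it is 6.0, 1.0 or 2.9.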

Now, my questions are:

  • Is visual inspection of data enough to spot an outlier?
  • What is the best practice in a scenario like this, given that regardless of how big or small the third value is, the Q-test will flag it as an outlier?
  • Is there any exception to this peculiar case?