The term “spurious correlation” refers to a high correlation between two variables that is actually due to some third factor. Consider a statistical dataset in which both the input factors and the output parameter are binary. Suppose that in this dataset:

· factor A takes the value 0 in M0 records, and in N0 of those records the output parameter takes the value 1

· factor A takes the value 1 in M1 records, and in N1 of those records the output parameter takes the value 1.

In this case, the Risk Ratio for factor A is defined as RR = (N1/M1)/(N0/M0). Suppose that RR >> 1. How can one investigate whether the output parameter really depends on factor A, or whether the high RR is only due to a spurious correlation?
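To make the definition concrete, here is a minimal Python sketch that computes RR from the four counts and then repeats the calculation separately within each level of a third binary factor. The third factor (called B here), the function name `risk_ratio`, and all of the counts are assumptions invented for illustration; they are not taken from any real dataset. Stratifying by a suspected third factor is one common way to probe for a spurious association, not the only one, and it can only check factors that were actually recorded.

```python
def risk_ratio(n1, m1, n0, m0):
    """Return RR = (N1/M1) / (N0/M0) for a binary factor and binary output."""
    if m1 <= 0 or m0 <= 0:
        raise ValueError("M0 and M1 must be positive")
    if n0 == 0:
        raise ValueError("N0 is zero, so RR is undefined")
    return (n1 / m1) / (n0 / m0)


# Hypothetical counts (invented for illustration), split by a hypothetical
# third binary factor B. Each tuple is (N1, M1, N0, M0) within that stratum.
strata = {
    "B=0": (1, 10, 9, 90),
    "B=1": (45, 90, 5, 10),
}

# Crude (pooled) RR: add the counts over the strata, ignoring B.
pooled = [sum(counts[i] for counts in strata.values()) for i in range(4)]
crude_rr = risk_ratio(*pooled)

# Stratum-specific RR: the same formula applied within each level of B.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

print("crude RR:", round(crude_rr, 3))
for name, rr in stratum_rr.items():
    print(name, "RR:", round(rr, 3))
```

With these made-up counts the pooled RR is well above 1, while the RR within each level of B is 1: the output depends on B, and A merely occurs more often when B = 1. If the stratum-specific RRs had stayed close to the crude RR, B would not explain the association.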
