I have very little background in statistics, so I need some advice. I know that the Jaccard coefficient is used to compare the similarity of data sets based on the elements they share. But can I also use it (or something similar; any suggestions?) to measure the overlap of, for example, two symptoms (binary, asymmetric data) in a large sample? Should I then treat the patients as the elements in the symptom sets, so that I count the yes/yes, yes/no, and no/yes cases, while no/no is considered irrelevant? For example, I have 801 patients, and I want to know the overlap of two (or more) symptoms (for example, those who have nausea and also have vomiting). How would I calculate it? I used the phi coefficient (would it be better to stick with it in this case?), but I had doubts: the symptoms are not necessarily correlated, but may simply co-exist. So I thought that I just want to show the co-existence of symptoms. Should I simply show the percentages? Gratefully,

Liidia
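
To make the counting in the question concrete, here is a minimal Python sketch of the three options mentioned (Jaccard, phi, and plain percentages) computed from a 2x2 table of two symptoms. Only the total of 801 patients comes from the question; the four cell counts and the function names are hypothetical, chosen purely for illustration.

```python
from math import sqrt

def jaccard(both, first_only, second_only):
    """Jaccard coefficient: yes/yes divided by all patients with at least
    one of the two symptoms. The no/no cell is ignored."""
    union = both + first_only + second_only
    return both / union if union else float("nan")

def phi(both, first_only, second_only, neither):
    """Phi coefficient for a 2x2 table. Unlike Jaccard, it uses the no/no cell."""
    denom = sqrt(
        (both + first_only) * (second_only + neither)
        * (both + second_only) * (first_only + neither)
    )
    return (both * neither - first_only * second_only) / denom if denom else float("nan")

# Hypothetical counts (NOT real data); they only need to sum to 801.
both = 120           # nausea yes, vomiting yes
nausea_only = 80     # nausea yes, vomiting no
vomiting_only = 40   # nausea no,  vomiting yes
neither = 561        # nausea no,  vomiting no
total = both + nausea_only + vomiting_only + neither

j = jaccard(both, nausea_only, vomiting_only)
p = phi(both, nausea_only, vomiting_only, neither)

# Plain percentages: share of all patients with both symptoms, and
# share of nausea patients who also report vomiting.
pct_both = 100 * both / total
pct_vomiting_given_nausea = 100 * both / (both + nausea_only)

print(f"Jaccard: {j:.2f}")
print(f"Phi: {p:.2f}")
print(f"Both symptoms: {pct_both:.1f}% of all patients")
print(f"Vomiting among nausea patients: {pct_vomiting_given_nausea:.1f}%")
```

With these made-up counts, Jaccard depends only on the 240 patients who have at least one symptom, while phi also changes if the number of symptom-free patients changes. That difference is the practical reason to prefer Jaccard or percentages when no/no is considered irrelevant.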
