28 September 2021

Hello,

I need some help in figuring out the best way to analyse a set of data for my research project.

The project aims to investigate the consistency of distance travelled by children who go missing repeatedly. My IV is the number of missing incidents in a series (e.g. 2, 3 or 5 incidents). I have this both as intervals (e.g. 4-10 incidents, 11-20 incidents and so on) and as a string variable holding the number of missing incidents itself rather than an interval. My DVs are similarity scores, computed using Jaccard's coefficient, for six distance intervals. The question I am looking at is whether consistency in distance travelled increases or decreases as the number of incidents in the series increases. One way to think about it: is someone with 3 missing incidents more or less likely to travel within the same distance interval than someone with 15 missing incidents? Or: is there a difference between the mean distance similarity score of someone with 3 missing incidents and that of someone with 15 missing incidents?

For each case in a series I have coded the distance travelled as 1 (yes) or 0 (no) for each distance interval (e.g. if an individual travelled between 0-5 miles during an incident, that interval is coded 1 and all other intervals are coded 0). I then calculated Jaccard's coefficient for that case for each of the distance intervals, as well as a mean Jaccard across all distance intervals. I have also calculated the mean Jaccard's coefficient across all cases in a series for each of the distance intervals.
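To make the coding concrete, here is a minimal sketch of one way the scores could be computed, not necessarily the exact procedure used for this dataset. The interval boundaries beyond 0-5 miles, the half-open bins, the pairwise averaging, and the names `INTERVALS`, `one_hot`, `jaccard` and `series_consistency` are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical interval boundaries in miles; only "0-5 miles" is given in the post.
INTERVALS = [(0, 5), (5, 10), (10, 20), (20, 50), (50, 100), (100, float("inf"))]


def one_hot(distance_miles):
    """Code one incident as 1 for the interval it falls in and 0 elsewhere."""
    return [1 if lo <= distance_miles < hi else 0 for lo, hi in INTERVALS]


def jaccard(a, b):
    """Jaccard coefficient of two binary vectors: |both 1| / |either 1|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def series_consistency(distances):
    """Mean pairwise Jaccard across all incidents in one series (assumed scheme)."""
    if len(distances) < 2:
        raise ValueError("a series needs at least two incidents")
    coded = [one_hot(d) for d in distances]
    pairs = list(combinations(coded, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Three incidents: two in the first interval, one in the 20-50 mile interval.
print(series_consistency([1, 2, 30]))
```

Under this scheme each incident falls in exactly one interval, so every pairwise Jaccard is either 0 or 1, and the series score is simply the proportion of incident pairs that share a distance interval.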

One option for analysis is MANOVA. However, about 4.6% of my data (700 series of missing incidents) are outliers, as shown by the Mahalanobis distance test. I am aware that MANOVA is sensitive to outliers, but I would not want to exclude them from the analysis altogether. Should I run the analysis both with and without the outliers and report both? Or what is the best way to work around this? Is it absolutely necessary that I delete the outliers?
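For readers unfamiliar with the check, here is a rough standard-library sketch of how multivariate outliers can be flagged with the squared Mahalanobis distance. It assumes one row per series holding the six similarity scores, and a cutoff of p < .001 against a chi-square distribution with df equal to the number of DVs; that cutoff is a common convention, not something stated above. All function names are hypothetical, and the chi-square tail uses a closed form that only holds for an even df (six DVs would qualify).

```python
import math


def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, mean):
    """Sample covariance matrix (n - 1 denominator)."""
    n, p = len(rows), len(mean)
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [x - m for x, m in zip(r, mean)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def mahalanobis_sq(row, mean, cov):
    """Squared Mahalanobis distance d' * inv(cov) * d, without forming the inverse."""
    d = [x - m for x, m in zip(row, mean)]
    return sum(di * si for di, si in zip(d, solve(cov, d)))


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df % 2:
        raise ValueError("closed form needs an even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


def flag_outliers(rows, alpha=0.001):
    """True for each row whose distance is significant at the assumed alpha."""
    mean = mean_vector(rows)
    cov = covariance_matrix(rows, mean)
    df = len(mean)
    return [chi2_sf_even_df(mahalanobis_sq(r, mean, cov), df) < alpha for r in rows]
```

Running the planned analysis once on all rows and once on the rows where `flag_outliers` returns `False` would give the two sets of results to compare.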

Also, my data seem to best fit an exponential distribution. How might this affect the results of the MANOVA?

Do people have any suggestions about other tests that may be appropriate for investigating this relationship between number of missing incidents and consistency in distance travelled?

Thanks in advance to anyone who takes the time to read this.
