I have disease case counts stored as an attribute of city polygons. There are about 180 cities (polygons): 2-5 of them recorded more than 300 cases, about 100 recorded 0-2 cases, and the rest recorded 2-20 cases. I want to evaluate the possible correlation between the disease counts and environmental factors such as temperature and precipitation.
However, the distribution of the case counts is severely non-normal (strongly right-skewed, with many near-zero values and a few very large ones), which violates the assumptions of many statistical methods.
Do you have any suggestions for how to analyze data like this?