Soil samples were taken from two locations that differ in the values of several variables (x1, x2, etc.). There were N1 samples from location 1 and N2 samples from location 2, and K species were detected in total across the study. For each species i (i = 1, ..., K), the number of samples in which it was found was recorded at each location: n1(i) and n2(i), respectively. The goal is to evaluate how the probability of a species being found in a sample differs between location 1 and location 2 (i.e. to compare n1(i)/N1 with n2(i)/N2), and how this difference depends on the predictor variables x1, x2, etc. Importantly, this should be done not only for each species separately, but also for groups of species (e.g. a genus containing several species, or a set of species sharing a common property). Is logistic regression appropriate for these problems?
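
To make the setup concrete, here is a minimal sketch of the data layout and of the quantities being compared. All counts, species names, and genus labels are hypothetical placeholders, not data from the study. The helper names (`log_odds_ratio`, `pooled_by_genus`) and the 0.5 continuity correction, added so that zero counts do not break the logarithm, are my own assumptions.

```python
import math

# Hypothetical numbers of samples taken at locations 1 and 2.
N1, N2 = 40, 50

# Hypothetical records: species -> (genus, n1, n2), where n1 and n2 are the
# numbers of samples in which that species was found at each location.
records = {
    "sp_a": ("genus_A", 12, 30),
    "sp_b": ("genus_A", 8, 10),
    "sp_c": ("genus_B", 20, 5),
}


def log_odds_ratio(n1, N1, n2, N2):
    """Log odds ratio of detection, location 2 relative to location 1.

    A 0.5 continuity correction (an assumption) is added to every cell.
    """
    odds1 = (n1 + 0.5) / (N1 - n1 + 0.5)
    odds2 = (n2 + 0.5) / (N2 - n2 + 0.5)
    return math.log(odds2 / odds1)


def pooled_by_genus(records, N1, N2):
    """Pool detections over the species of each genus.

    Returns genus -> (pooled proportion at location 1, at location 2),
    where the denominator is (number of species in genus) * (samples).
    """
    totals = {}
    for genus, n1, n2 in records.values():
        k, s1, s2 = totals.get(genus, (0, 0, 0))
        totals[genus] = (k + 1, s1 + n1, s2 + n2)
    return {g: (s1 / (k * N1), s2 / (k * N2)) for g, (k, s1, s2) in totals.items()}


# Per-species detection proportions n1(i)/N1, n2(i)/N2 and log odds ratio.
per_species = {
    name: (n1 / N1, n2 / N2, log_odds_ratio(n1, N1, n2, N2))
    for name, (_, n1, n2) in records.items()
}
per_genus = pooled_by_genus(records, N1, N2)
```

Without the continuity correction, the per-species log odds ratio is what a logistic regression with a single location indicator would estimate as its slope. Pooling within a genus as above treats every species-by-sample detection as exchangeable, which is itself an assumption I am unsure about.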