I want to see whether wastewater irrigation affects in-household water contamination. My sample size is 60 (30 wastewater-irrigating farming households and 30 non-wastewater farming households). I want to estimate how much wastewater irrigation contributes to in-house drinking water contamination in farming households, after balancing the confounding factors. I collected data on the following variables for all 60 households.
· age (continuous numeric)
· gender (male/female)
· income (continuous numeric)
· educational level (categorical)
· sanitation facility (yes/no)
· hand-washing behaviors
· personal hygiene
· domestic hygiene
· environmental hygiene
· drinking water storage in the house (cover, size)
· drinking water withdrawal from in-house storage
· E. coli count at point of source and at point of use
I am considering multiple regression models and propensity score matching (PSM), but I am a little confused.
This is what I am thinking: the E. coli count is the outcome variable, exposure to wastewater is the treatment variable, and the remaining variables are covariates.
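To make the setup concrete, here is a minimal sketch of how I picture the data being arranged into outcome, treatment, and covariates. The field names (`wastewater`, `ecoli_count`, `has_sanitation`, and so on) are placeholders I made up for illustration, only a few of my covariates are shown, and the two rows are invented values, not my actual measurements.

```python
from dataclasses import dataclass


@dataclass
class Household:
    # All field names are hypothetical placeholders, not real column names.
    wastewater: bool      # treatment: True = wastewater irrigation household
    ecoli_count: int      # outcome: E. coli count at point of use
    age: float            # covariate
    income: float         # covariate
    has_sanitation: bool  # covariate


def split_roles(households):
    """Return (outcome, treatment, covariates) lists in the same row order."""
    y = [h.ecoli_count for h in households]
    t = [1 if h.wastewater else 0 for h in households]
    X = [[h.age, h.income, 1 if h.has_sanitation else 0] for h in households]
    return y, t, X


# Two made-up rows purely to show the layout; these are not real data.
example = [
    Household(True, 12, 45.0, 300.0, False),
    Household(False, 3, 38.0, 420.0, True),
]
y, t, X = split_roles(example)
```

With either approach I am considering, `y` would be the response, `t` the exposure indicator, and `X` the set of variables to adjust for or match on.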
1. Is the sample size (60 in total) large enough to use PSM? I am afraid it is too small.
2. Which of the two approaches (regression or PSM) is more appropriate?
3. If I choose the regression model, should I use the count data as it is and apply a Poisson regression model? Or should I recode it as acceptable/not acceptable based on standards and use multiple logistic regression?
4. Is there any other equivalent method?
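To illustrate the two outcome codings I am asking about in question 3, here is a small sketch. The function name `to_binary` and the `threshold` parameter are hypothetical, the default cutoff of 0 is only an assumed stand-in for whichever standard applies, and the counts are made-up numbers.

```python
def to_binary(counts, threshold=0):
    """Recode counts: 1 = not acceptable (above threshold), 0 = acceptable.

    The threshold is an assumption; replace it with the relevant standard.
    """
    return [1 if c > threshold else 0 for c in counts]


# Made-up E. coli counts, only to show the two codings side by side.
counts = [0, 3, 12, 0, 150]

as_counts = counts             # option A: keep counts (Poisson-type model)
as_binary = to_binary(counts)  # option B: acceptable / not acceptable (logistic model)
```

Option B discards the difference between, say, 3 and 150, which is part of why I am unsure which coding to use.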
Thank you for your assistance.