I have data from households, where two types of data were collected:

  • A validated questionnaire (Household Dietary Diversity Score (HDDS)), which gives a score between 0 and 12 for each household.
  • Anthropometric measurements of all children < 5 years old in each household, which serve as a measure of the children's nutritional status.

I now wish to perform a regression analysis with HDDS as the independent variable and the anthropometric measurements as the dependent variable.

The challenge is that some households had more than one child. This gives more observations of nutritional status than household scores, and the nutritional status observations are not independent, because some of the children come from the same household.

One way to go about it is to take the mean nutritional status of all children in each household, but I would rather not do this, as children of different ages are difficult to compare regarding nutrition.
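
To make the averaging option concrete, here is a minimal Python sketch of the data layout and of collapsing it to one value per household. The household IDs, ages, and values are made up for illustration and are not my real data. The `z_score` column is a hypothetical stand-in for whichever anthropometric measure is used.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows, one per child: (household_id, hdds, age_months, z_score).
# HDDS is repeated for every child in the same household.
children = [
    ("H1", 7, 14, -1.2),
    ("H1", 7, 40, -0.4),
    ("H2", 4, 30, -2.1),
    ("H3", 9, 8, 0.3),
    ("H3", 9, 26, -0.5),
    ("H3", 9, 55, -0.1),
]


def household_means(rows):
    """Collapse child-level rows to one (hdds, mean z_score) pair per household."""
    scores = defaultdict(list)
    hdds = {}
    for household_id, score, _age_months, z_score in rows:
        scores[household_id].append(z_score)
        hdds[household_id] = score
    return {hh: (hdds[hh], mean(zs)) for hh, zs in scores.items()}


collapsed = household_means(children)
for household_id, (score, mean_z) in collapsed.items():
    print(household_id, score, round(mean_z, 2))
```

This leaves one row per household, so there is no longer more than one observation per HDDS score, but the children's ages are discarded in the process.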

So what do I do to make sure I respect the assumption of independent observations in this analysis?

If it is simply not possible to perform a regression on this data, please feel free to suggest another analysis!

Hope there's someone out there who can help.
