06 June 2019

I am trying to aggregate a large data set by two identification variables (id1 and id2). My aggregate command in R drops an entire row if it has a missing value in any column. For instance, even though row #1 has been observed on all but one column (missing on only 1 of 20 columns), it is dropped during aggregation. Is there a way to aggregate (get means or sums) by the two identification variables without dropping an entire row that is missing on only 1 or 2 columns?

Here is my current R code for aggregating:

dfagg1
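The code above is cut off, so the following is only a guess at the situation: a minimal sketch assuming the formula interface of `aggregate()` is being used. That interface defaults to `na.action = na.omit`, which discards any row containing an `NA` before the groups are formed. Passing `na.action = na.pass` keeps those rows, and `na.rm = TRUE` is forwarded to `mean` so that missing values are skipped column by column. The data frame `df` and the columns `x1` and `x2` are made up for illustration; only `id1` and `id2` come from the question.

```r
# Hypothetical toy data: row 2 is missing a value on x2 only.
df <- data.frame(
  id1 = c(1, 1, 2, 2),
  id2 = c("a", "a", "b", "b"),
  x1  = c(10, 20, 30, 40),
  x2  = c(1, NA, 3, 5)
)

# Default behaviour: na.action = na.omit removes row 2 entirely,
# so its x1 value (20) never reaches the group mean.
dropped <- aggregate(. ~ id1 + id2, data = df, FUN = mean)

# Keep incomplete rows (na.pass) and let mean() skip NAs per column (na.rm).
kept <- aggregate(. ~ id1 + id2, data = df, FUN = mean,
                  na.rm = TRUE, na.action = na.pass)

print(dropped)
print(kept)
```

In this toy example the mean of `x1` for the group `id1 = 1, id2 = "a"` is 10 with the default and 15 once the incomplete row is kept. Replacing `mean` with `sum` works the same way. Rows with a missing `id1` or `id2` would still not form a group under this approach.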
