Dear all,
I have a long list of ordered factor level combinations A = {a1...an}, ..., E = {e1...ek} which is non-exhaustive (i.e. not full-factorial), e.g.:
A B C D E
...
a1 b1 c1 d1 e1
a1 b1 c1 d1 e2
a2 b1 c1 d1 e1
...
Now I want to merge all entries that differ in only one factor at a time (e.g. E) into a pattern. In the example this should lead to
A B C D E
...
a1 b1 c1 d1 {e1,e2}
a2 b1 c1 d1 e1
...
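To make the single-factor merge concrete, here is a minimal sketch in Python. The function name merge_on_factor and the representation (rows as tuples, the merged column as a frozenset) are just my assumptions for illustration, not a fixed format. Rows that agree on every column except the chosen one are grouped, and the levels seen in that column are collected into a set:

```python
from collections import defaultdict

def merge_on_factor(rows, idx):
    """Group rows that agree on every column except `idx` and
    collect the levels seen in column `idx` into one set."""
    groups = defaultdict(set)
    for row in rows:
        key = row[:idx] + row[idx + 1:]   # all factors except the merged one
        groups[key].add(row[idx])
    merged = []
    for key, levels in groups.items():
        # Put the set of levels back at position `idx`.
        merged.append(key[:idx] + (frozenset(levels),) + key[idx:])
    return merged

rows = [
    ("a1", "b1", "c1", "d1", "e1"),
    ("a1", "b1", "c1", "d1", "e2"),
    ("a2", "b1", "c1", "d1", "e1"),
]
patterns = merge_on_factor(rows, 4)   # merge over E (column index 4)
for p in patterns:
    print(p)
```

Note that in this sketch an unmerged entry also carries a set in the E column (with a single level), so that all patterns have the same shape.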
Now my problem is to do that for all factors. In the example, keeping A fixed, one cannot simply merge the two entries into:
{a1,a2} b1 c1 d1 {e1,e2}
since this implies that also the combination
a2 b1 c1 d1 e2
was part of the original factor combinations - which it wasn't.
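The condition I am after can be stated as a check: a pattern is only allowed if every combination it implies is in the original list. A small Python sketch of that check (the names expand and is_valid are hypothetical, and a pattern is assumed to be a tuple with one set of levels per factor):

```python
from itertools import product

def expand(pattern):
    """All concrete combinations implied by a pattern
    (the Cartesian product of its per-factor level sets)."""
    return set(product(*pattern))

def is_valid(pattern, observed):
    """A pattern is valid only if it implies no unobserved combination."""
    return expand(pattern) <= observed

observed = {
    ("a1", "b1", "c1", "d1", "e1"),
    ("a1", "b1", "c1", "d1", "e2"),
    ("a2", "b1", "c1", "d1", "e1"),
}

good = ({"a1"}, {"b1"}, {"c1"}, {"d1"}, {"e1", "e2"})
bad = ({"a1", "a2"}, {"b1"}, {"c1"}, {"d1"}, {"e1", "e2"})

print(is_valid(good, observed))
print(is_valid(bad, observed))
print(expand(bad) - observed)   # the combinations that were never observed
```

For the over-merged pattern the difference contains exactly the entry a2 b1 c1 d1 e2, which is why that merge is not allowed.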
My first try was to merge only those entries where all levels of the fixed factor were present and to replace the corresponding pattern with a wildcard, e.g. for E = {e1,e2,e3}
a3 b2 c4 e1
a3 b2 c4 e2
a3 b2 c4 e3
becomes
a3 b2 c4 *
In this case I know that E is not important for the combination of factors A to C. But this approach is unsatisfactory, since it leaves a lot of entries unmerged (e.g. the example at the beginning would not be merged).
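For reference, the wildcard attempt looks roughly like the following Python sketch. The function name wildcard_merge, the "*" marker, and the two extra rows with a1 b1 c1 are assumptions added for illustration only. A group is collapsed only if it contains every level of the merged factor, otherwise its rows are left as they are:

```python
from collections import defaultdict

WILDCARD = "*"

def wildcard_merge(rows, idx, all_levels):
    """Replace column `idx` with a wildcard for every group of rows
    that covers all levels of that factor; leave other rows as they are."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[:idx] + row[idx + 1:]].add(row[idx])
    out = []
    for key, levels in groups.items():
        if levels == all_levels:
            out.append(key[:idx] + (WILDCARD,) + key[idx:])
        else:
            # Incomplete group: emit the original rows unmerged.
            for level in sorted(levels):
                out.append(key[:idx] + (level,) + key[idx:])
    return out

rows = [
    ("a3", "b2", "c4", "e1"),
    ("a3", "b2", "c4", "e2"),
    ("a3", "b2", "c4", "e3"),
    ("a1", "b1", "c1", "e1"),   # hypothetical incomplete group (e3 missing)
    ("a1", "b1", "c1", "e2"),
]
result = wildcard_merge(rows, 3, {"e1", "e2", "e3"})
for r in result:
    print(r)
```

The incomplete group comes back unchanged, which is exactly the shortcoming described above.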
So, could someone point me in a direction where a solution to this problem might be found (e.g. graph/subset reduction; I also thought of bioinformatics methods that treat the factor combinations somehow as strings)?
Any help would be very welcome!
Greetings, David