Dear all,

I have a long list of ordered factor-level combinations over the factors A = {a1...an}, ..., E = {e1...ek}. The list is non-exhaustive (i.e. not full-factorial), e.g.:

A   B  C   D  E

...

a1 b1 c1 d1 e1

a1 b1 c1 d1 e2

a2 b1 c1 d1 e1

...

Now I want to merge all entries that differ in only one factor (e.g. E) into a single pattern. In the example, this should lead to

A   B   C  D   E

...

a1 b1 c1 d1 {e1,e2}

a2 b1 c1 d1 e1

...
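
To make this step concrete, here is a minimal sketch in Python of what I mean by merging along a single factor. The function name `merge_on_factor` and the tuple representation of the rows are only my assumptions for illustration, not an established method:

```python
from collections import defaultdict

def merge_on_factor(rows, idx):
    """Merge rows that agree on every factor except the one at position idx.

    rows: list of tuples of level names, one entry per factor.
    Returns tuples in which position idx holds a frozenset of levels.
    """
    groups = defaultdict(list)
    for row in rows:
        # Key = the row with the merged factor removed.
        key = row[:idx] + row[idx + 1:]
        groups[key].append(row[idx])
    # Re-insert the collected levels at position idx.
    return [key[:idx] + (frozenset(levels),) + key[idx:]
            for key, levels in groups.items()]

rows = [
    ("a1", "b1", "c1", "d1", "e1"),
    ("a1", "b1", "c1", "d1", "e2"),
    ("a2", "b1", "c1", "d1", "e1"),
]
patterns = merge_on_factor(rows, 4)  # merge along E (position 4)
```

Here the first two rows collapse into one pattern with the level set {e1, e2} for E, while the a2 row keeps the single level e1.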

My problem is doing this for all factors. In the example, when merging over A next, one cannot simply merge the two entries into:

{a1,a2} b1 c1 d1 {e1,e2}

since this would imply that the combination

a2 b1 c1 d1 e2

was also part of the original factor combinations, which it wasn't.
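
Put differently, a merged pattern is only acceptable if every combination it expands to occurs in the original list. A small sketch of that check in Python, where `pattern_is_valid` is a hypothetical helper name and a pattern is assumed to be a tuple holding one set of levels per factor:

```python
from itertools import product

def pattern_is_valid(pattern, observed):
    """True if every combination implied by the pattern was observed.

    pattern:  tuple of sets of levels, one set per factor.
    observed: set of tuples holding the original combinations.
    """
    # The pattern stands for the Cartesian product of its level sets.
    return all(combo in observed for combo in product(*pattern))

observed = {
    ("a1", "b1", "c1", "d1", "e1"),
    ("a1", "b1", "c1", "d1", "e2"),
    ("a2", "b1", "c1", "d1", "e1"),
}
bad = ({"a1", "a2"}, {"b1"}, {"c1"}, {"d1"}, {"e1", "e2"})
good = ({"a1"}, {"b1"}, {"c1"}, {"d1"}, {"e1", "e2"})

bad_ok = pattern_is_valid(bad, observed)    # implies a2 b1 c1 d1 e2
good_ok = pattern_is_valid(good, observed)
```

The pattern `bad` fails because it implies a2 b1 c1 d1 e2, whereas `good` passes. Note that the expansion grows with the product of the set sizes, so this brute-force check is only meant as an illustration.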

My first try was to merge only those entries for which all levels of the factor being merged are present, and to replace the corresponding set with a wildcard, e.g. for E = {e1,e2,e3}

a3 b2 c4 e1

a3 b2 c4 e2

a3 b2 c4 e3

becomes

a3 b2 c4 *

In this case I know that E does not matter for this combination of factors A to C. However, this approach is unsatisfactory, since it leaves many entries unmerged (e.g. the example at the beginning would not be merged).
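
For reference, a minimal Python sketch of this wildcard approach. The name `wildcard_merge` and the extra a1 b1 c1 rows are made up for illustration; they only show that incomplete groups stay unmerged:

```python
from collections import defaultdict

def wildcard_merge(rows, idx, all_levels):
    """Replace the factor at position idx by '*' where all its levels occur."""
    groups = defaultdict(set)
    for row in rows:
        # Group by the remaining factors and collect the levels seen at idx.
        groups[row[:idx] + row[idx + 1:]].add(row[idx])
    merged = []
    for key, levels in groups.items():
        if levels == set(all_levels):
            # Every level is present, so the factor is irrelevant here.
            merged.append(key[:idx] + ("*",) + key[idx:])
        else:
            # Incomplete group: keep the original rows unmerged.
            for level in sorted(levels):
                merged.append(key[:idx] + (level,) + key[idx:])
    return merged

rows = [
    ("a3", "b2", "c4", "e1"),
    ("a3", "b2", "c4", "e2"),
    ("a3", "b2", "c4", "e3"),
    ("a1", "b1", "c1", "e1"),  # hypothetical rows: e3 is missing here
    ("a1", "b1", "c1", "e2"),
]
result = wildcard_merge(rows, 3, ["e1", "e2", "e3"])
```

The a3 b2 c4 group becomes a single wildcard entry, while the two a1 b1 c1 rows are left as they are, which is exactly the shortcoming described above.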

So, could someone point me in a direction where a solution to this problem might be found? (I have thought of graph/subset reduction, and also of bioinformatics methods that treat the factor combinations as strings.)

Any help would be very welcome!

Greetings, David
