I need to establish whether there is a link between two columns that come from two different datasets sharing one matching column:
Dataset1 (bipartite): (M, DS)

| M   | DS    |
|-----|-------|
| m23 | ds3   |
| m23 | ds67  |
| m54 | ds325 |
| ... | ...   |

Dataset2 (tripartite): (M, G, DG)

| M   | G   | DG   |
|-----|-----|------|
| m23 | g6  | dg32 |
| m23 | g8  | dg1  |
| m54 | g32 | dg65 |
| ... | ... | ...  |

These two datasets have one column in common (**M**), and the relationships among the elements are shown below:
```
M ----affects----> G
M ----causes-----> DS
DG ----affects----> M
```
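To make the structure concrete, here is a minimal sketch of how I picture the two tables combined into one graph. It uses only the sample rows shown above, and it treats every edge as undirected, which is my own simplifying assumption (the relationships listed are directed). The names `dataset1`, `dataset2` and `build_graph` are made up for illustration.

```python
from collections import defaultdict

# Sample rows copied from the two tables above (the "..." rows are omitted).
dataset1 = [("m23", "ds3"), ("m23", "ds67"), ("m54", "ds325")]  # (M, DS)
dataset2 = [
    ("m23", "g6", "dg32"),
    ("m23", "g8", "dg1"),
    ("m54", "g32", "dg65"),
]  # (M, G, DG)


def build_graph(m_ds_rows, m_g_dg_rows):
    """Return an adjacency map over M, DS, G and DG nodes.

    Assumption: edges are stored as undirected, ignoring the
    "affects"/"causes" directions.
    """
    adj = defaultdict(set)
    for m, ds in m_ds_rows:
        adj[m].add(ds)
        adj[ds].add(m)
    for m, g, dg in m_g_dg_rows:
        for other in (g, dg):
            adj[m].add(other)
            adj[other].add(m)
    return adj


graph = build_graph(dataset1, dataset2)
print(sorted(graph["m23"]))  # neighbours of m23 across both datasets
```

With this representation every DS and every DG hangs off one or more M nodes, so any DS to DG connection has to pass through M.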
Primary goal: to calculate the probability that a link/edge exists between indirectly related columns (e.g. **DG** and **DS**) via the common column (**M**).
So, for a given list of DS entries, how can I find the probability that a link/edge exists between
each selected DS and every DG?
```
DS DG
```
If the DS entries (ds3, ds67) were selected, the output should look like this, with one line per pair:
element1 - element2 - probability or other statistical value signifying the existence of a direct relationship or link.
```
ds3 - dg32 - 100% (common M value)
ds3 - dg1 - 100% (common M value)
ds3 - dg65 - 43.66%
---
ds67 - dg32 - 100% (common M value)
ds67 - dg1 - 100% (common M value)
ds67 - dg65 - 55.12%
```
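The "common M value" rows are the only part I can work out myself. A minimal sketch, assuming the sample rows above and a hypothetical helper `shared_m_pairs`, would be to collect the M values for each DS and each DG and intersect them. It deliberately leaves the score for pairs without a common M (such as ds3 - dg65) open, since that is the part I don't know how to compute.

```python
from collections import defaultdict

# Sample rows copied from the two tables above (the "..." rows are omitted).
dataset1 = [("m23", "ds3"), ("m23", "ds67"), ("m54", "ds325")]  # (M, DS)
dataset2 = [
    ("m23", "g6", "dg32"),
    ("m23", "g8", "dg1"),
    ("m54", "g32", "dg65"),
]  # (M, G, DG)


def shared_m_pairs(selected_ds, m_ds_rows, m_g_dg_rows):
    """For each selected DS and every DG, list the M values they share."""
    ds_to_m = defaultdict(set)
    for m, ds in m_ds_rows:
        ds_to_m[ds].add(m)

    dg_to_m = defaultdict(set)
    for m, _g, dg in m_g_dg_rows:
        dg_to_m[dg].add(m)

    result = []
    for ds in selected_ds:
        for dg, dg_ms in dg_to_m.items():  # dicts keep insertion order
            common = ds_to_m.get(ds, set()) & dg_ms
            result.append((ds, dg, sorted(common)))
    return result


for ds, dg, common in shared_m_pairs(["ds3", "ds67"], dataset1, dataset2):
    # An empty list means no common M: this is the case I need a score for.
    print(ds, "-", dg, "-", common if common else "no common M")
```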
I am trying to code this in Java, but Python-based solutions would work too.
I am sorry, I am not very familiar with graph theory, so a somewhat descriptive solution would be really appreciated.
Thanks.