The main problems are: (i) what is the cause and what is the effect? and (ii) how is the independence assumption of evidential theory affected by the fact that the evidence variables are not strictly independent?
Regarding problem (i): the intake of soft drinks may cause hyperglycemia, but hyperglycemia may also increase thirst, and a person who is unaware of their diabetes may then drink more soft drinks (instead of water). We modeled this in the direction from hypothesis to evidence: if the hypothesis of hyperglycemia is true, what is the likelihood that someone drinks soft drinks to quench their thirst? (We also added whether they were aware of their diabetes or not; the assumption is that a non-procrastinating diabetic does not drink soft drinks.) The alternative, that drinking soft drinks causes hyperglycemia, is difficult to specify, since we would have to specify the likelihood of someone drinking soft drinks during a period of time.
Concerning problem (ii): our current attempt is to treat the evidence as independent (e.g., drinking soft drinks, increased body weight, increased frequency of diuresis, increased food intake). However, these variables are not truly independent, since increased food intake causes increased body weight and drinking more increases the frequency of diuresis.
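To make problem (ii) concrete, here is a minimal sketch of what can go wrong when dependent evidence is combined as if it were independent. It looks at the extreme case where increased body weight is fully determined by increased food intake, so the second observation carries no new information but is still multiplied in. The function name and all numbers are hypothetical and chosen only for illustration; they are not taken from our network.

```python
def posterior(prior, likelihoods_h, likelihoods_not_h):
    """Return P(H | e1..en), assuming the evidence variables are
    conditionally independent given H and given not-H."""
    joint_h = prior
    joint_not_h = 1.0 - prior
    for p_h, p_not_h in zip(likelihoods_h, likelihoods_not_h):
        joint_h *= p_h          # P(e_i | H)
        joint_not_h *= p_not_h  # P(e_i | not H)
    return joint_h / (joint_h + joint_not_h)


# Hypothetical values, for illustration only.
PRIOR = 0.1               # P(hyperglycemia)
P_FOOD_GIVEN_H = 0.6      # P(increased food intake | hyperglycemia)
P_FOOD_GIVEN_NOT_H = 0.2  # P(increased food intake | no hyperglycemia)

# Food intake observed once.
once = posterior(PRIOR, [P_FOOD_GIVEN_H], [P_FOOD_GIVEN_NOT_H])

# Body weight is assumed to be a perfect copy of food intake, yet it is
# counted as a second, independent observation.
twice = posterior(PRIOR, [P_FOOD_GIVEN_H] * 2, [P_FOOD_GIVEN_NOT_H] * 2)

print(f"counted once:  {once:.2f}")   # 0.25
print(f"counted twice: {twice:.2f}")  # 0.50
```

In this extreme case the same information is counted twice and the posterior doubles. Real evidence variables are only partly dependent, so the inflation would be smaller, but it goes in the same direction.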
Our current strategy to mitigate problem (ii) is to set the probabilities as if the evidence variables were independent, and then to see whether we can translate the probability of the hypothesis node into a risk level that agrees with an expert in the field. If we can derive meaningful thresholds for the resulting probabilities, then we conjecture that the evidence variables are sufficiently independent in this case. For example, we might find that a probability below 0.56 means no risk, [0.56, 0.78) means risk, and [0.78, 1.0] means high risk. If this classification reflects the diagnosis of an expert, we would accept it, irrespective of how well the probabilities represent the world outside the scope of the diagnosis.
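The threshold mapping could be sketched as follows. The cut-off values 0.56 and 0.78 are the example values from the paragraph above, not calibrated results, and the function name risk_level is our own choice for this illustration.

```python
def risk_level(p, risk_threshold=0.56, high_risk_threshold=0.78):
    """Map the probability of the hypothesis node to a risk label.

    Intervals follow the example in the text:
    [0, 0.56) -> no risk, [0.56, 0.78) -> risk, [0.78, 1.0] -> high risk.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if p < risk_threshold:
        return "no risk"
    if p < high_risk_threshold:
        return "risk"
    return "high risk"


for p in (0.30, 0.56, 0.70, 0.78, 1.0):
    print(p, "->", risk_level(p))
```

Checking the conjecture would then amount to comparing these labels with the expert's diagnosis on a set of cases and adjusting the two thresholds until they agree, if they can be made to agree at all.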
Our current strategy to mitigate problem (i) is to check whether we can specify probabilities in a meaningful way. We ask whether it is 1 person in 10, 100, 1000, ..., who, if they suffer from hyperglycemia, is, for example, drinking soft drinks, eating more food, increasing their body weight, etc. Then we reason that 1 in 10 is 0.1, 1 in 100 is 0.01, etc.
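As a small sketch of this elicitation step, the "1 in N" answers can be stored as exact fractions and converted to the conditional probabilities P(evidence | hyperglycemia) that the network needs. The helper name one_in and the example answers below are hypothetical placeholders, not values we have obtained from experts.

```python
from fractions import Fraction


def one_in(n):
    """Convert an answer of the form '1 person in n' to a probability."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return Fraction(1, n)


# Hypothetical answers, for illustration only:
# "of people with hyperglycemia, 1 in N shows this evidence".
elicited = {
    "drinks soft drinks": 10,
    "eats more food": 100,
    "increased body weight": 1000,
}

# P(evidence | hyperglycemia) for each evidence variable.
likelihoods = {name: float(one_in(n)) for name, n in elicited.items()}

for name, p in likelihoods.items():
    print(f"P({name} | hyperglycemia) = {p}")
```

Note that this only gives the likelihood under the hypothesis. The same question would have to be asked for people without hyperglycemia to fill in the rest of each conditional probability table.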
We will also look at other methods based on evidential theories, such as Dempster-Shafer. I have added a GeNIe xdsl file of the latest attempt, made mainly by Dr Steinhauer and to some extent by me, that you can have a look at. Note that the probabilities from experts are not yet set correctly.
Questions:
(a) Are there any flaws in this reasoning?
(b) What are your thoughts on cause and effect of evidence vs hypothesis, in particular, in Bayesian Belief Networks?
(c) What are your thoughts on the independence of evidence variables?
(d) What are your thoughts on how to mitigate the problems?