REASONING UNDER
UNCERTAINTY
ITS661: Knowledge-Based Systems
OUTLINE
Probability Theory
Bayesian Theory
Certainty Theory
These slides: http://bit.ly/ITS661-10-1
INTRODUCTION
Human knowledge is often inexact.
Sometimes we are only partially sure about the truth of a statement and still have to make educated guesses to solve problems.
INTRODUCTION
Some concepts or words are inherently inexact.
Sources of uncertainty:
Indefinite answers
Imprecise knowledge
Incomplete knowledge
INTRODUCTION
Several approaches are available to deal with uncertainty, namely:
Bayes' Theorem
Certainty Factors
Dempster-Shafer Theory
Fuzzy Logic
INTRODUCTION
Uncertainty: doubtful, dubious, questionable, or not sure.
Ranges from a mere lack of absolute sureness to such vagueness as to preclude anything more than guesswork.
INTRODUCTION
Uncertainty in AI covers a wide range of situations where the relevant information is deficient in one or more of the following ways:
Information is partial
Information is not fully reliable (e.g. unreliable observation of evidence)
Information comes from multiple sources and is conflicting
Information is approximate
TYPES OF ERROR
Many different types of error can contribute to uncertainty.
Different theories of uncertainty attempt to resolve some or all of these to provide the most reliable inference.
TYPES OF ERROR
Ambiguity: something may be interpreted in more than one way
Incompleteness: some information is missing
Incorrectness: the information is wrong
Measurement: errors of precision and accuracy
TYPES OF ERROR
Unreliability
if the measuring equipment supplying the facts
is unreliable
the data is erratic
Random
error
lead to uncertainty of the mean
Systematic
error
one that is not random
instead is introduced because of some bias
PROBABILITY AND THE BAYESIAN APPROACH
Probability theory is the basis for the Bayesian approach.
It proposes the existence of a number P(E): the probability (likelihood) of some event E occurring in a random experiment.
E.g. rolling a die: assuming a fair die, the probability of producing any given outcome is 1/6.
PROBABILITY AND THE BAYESIAN APPROACH
Sample space, S = {1, 2, 3, 4, 5, 6}
P(E) = W(E) / N
Where:
W(E) denotes the no. of occurrences ("wins") of event E
N denotes the no. of times the experiment is performed
0 ≤ P(E) ≤ 1
P(E) + P(~E) = ?
PROBABILITY AND THE BAYESIAN APPROACH
Sample space, S = {1, 2, 3, 4, 5, 6}
P(E) = W(E) / N
Where:
W(E) denotes the no. of occurrences ("wins") of event E
N denotes the no. of times the experiment is performed
0 ≤ P(E) ≤ 1
P(E) + P(~E) = 1
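The frequency definition P(E) = W(E)/N can be checked with a quick simulation. The sketch below is illustrative; the function name, trial count, and seed are my own choices, not from the slides:

```python
import random

def estimate_probability(event, n_trials=100_000, seed=42):
    """Estimate P(E) = W(E)/N: repeat the experiment N times and
    count the occurrences W(E) of the event."""
    rng = random.Random(seed)
    wins = sum(1 for _ in range(n_trials) if event(rng.randint(1, 6)))
    return wins / n_trials

# Event: rolling a 3 with a fair die; theoretical P(E) = 1/6.
p = estimate_probability(lambda roll: roll == 3)
# p is close to 0.1667, and P(E) + P(~E) = 1 by construction.
```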
PROBABILITY AND THE BAYESIAN APPROACH
Compound probability (for independent events A and B):
P(A ∩ B) = n(A ∩ B) / n(S) = P(A) * P(B)
E.g.:
A = the event of rolling an odd number
B = the event of rolling a number divisible by 3
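For the die example the counting definition and the product rule can be checked directly; this pair of events happens to be independent, which is why the product rule applies (variable names are mine):

```python
# Sample space for one roll of a fair die.
S = {1, 2, 3, 4, 5, 6}
A = {n for n in S if n % 2 == 1}   # odd numbers: {1, 3, 5}
B = {n for n in S if n % 3 == 0}   # divisible by 3: {3, 6}

p_a = len(A) / len(S)              # 1/2
p_b = len(B) / len(S)              # 1/3
p_a_and_b = len(A & B) / len(S)    # n(A ∩ B)/n(S) = |{3}|/6 = 1/6

# A and B are independent here, so P(A ∩ B) = P(A) * P(B).
assert abs(p_a_and_b - p_a * p_b) < 1e-12
```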
PROBABILITY AND THE BAYESIAN APPROACH
Conditional probability permits us to obtain the probability of event A given that event B has occurred; Bayes' Theorem lets us invert this, obtaining P(H|E) from P(E|H):
P(H|E) = P(H) * P(E|H) / P(E)
Where:
P(H|E) = prob. that H is true given evidence E
P(H) = prob. that H is true
P(E|H) = prob. of observing evidence E when H is true
P(E) = prob. of E
PROBABILITY AND THE BAYESIAN APPROACH
P(E) can also be written as follows:
P(E) = P(E|H)*P(H) + P(E|~H)*P(~H)
Where:
P(E|~H) = prob. that E is true when H is false
P(~H) = prob. that H is false
P(E|H) = prob. of observing evidence E when H is true
P(E) = prob. of E
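Bayes' theorem with P(E) expanded by total probability fits in a few lines. This is a sketch; the function name and the sample numbers are mine:

```python
def bayes(p_h, p_e_given_h, p_e_given_not_h):
    """P(H|E) via Bayes' theorem, expanding the denominator by
    total probability: P(E) = P(E|H)P(H) + P(E|~H)P(~H)."""
    p_not_h = 1 - p_h
    p_e = p_e_given_h * p_h + p_e_given_not_h * p_not_h
    return p_e_given_h * p_h / p_e

# Arbitrary illustrative values: P(H)=0.1, P(E|H)=0.9, P(E|~H)=0.05.
p = bayes(0.1, 0.9, 0.05)   # 0.09 / 0.135 = 2/3
```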
PROBABILITY AND THE BAYESIAN APPROACH
We can also find the probability against the hypothesis being true for the same evidence, using the following equation:
P(~H|E) = ?
PROBABILITY AND THE BAYESIAN APPROACH
We can also find the probability against the hypothesis being true for the same evidence, using the following equation:
P(~H|E) = P(~H) * P(E|~H) / P(E)
PROBABILITY AND THE BAYESIAN APPROACH - EXAMPLE 1
P(H|E) = P(H) * P(E|H) / P(E)
P(~H|E) = P(~H) * P(E|~H) / P(E)
Patients with chest pains are often given an electrocardiogram (ECG) test. Test results are classified as either positive (+ECG), suggesting heart disease (+HD), or negative (-ECG), suggesting no heart disease (-HD). Assume now that the patient has produced a +ECG and we want to know how probable it is that he has heart disease, that is:
P(+HD|+ECG)
PROBABILITY AND THE BAYESIAN APPROACH
The following information applies in this case:
10 people out of 100 have heart disease
90 people out of 100 who have HD will produce a +ECG
95 people out of 100 who do not have HD will produce a -ECG
First, obtain the probability values:
P(+HD) = ?
P(-HD) = ?
P(+ECG|+HD) = ?
P(-ECG|-HD) = ?
P(+ECG|-HD) = ?
PROBABILITY AND THE BAYESIAN APPROACH
First, obtain the probability values:
P(+HD) = 10/100 = 0.1
P(-HD) = 1 - P(+HD) = 1 - 0.1 = 0.9
P(+ECG|+HD) = 90/100 = 0.9
P(-ECG|-HD) = 95/100 = 0.95
P(+ECG|-HD) = 1 - P(-ECG|-HD) = 1 - 0.95 = 0.05
PROBABILITY AND THE BAYESIAN APPROACH
Therefore:
P(+HD|+ECG) = ?
PROBABILITY AND THE BAYESIAN APPROACH
Therefore:
P(+HD|+ECG) = P(+HD)*P(+ECG|+HD) / P(+ECG)
            = (0.1 * 0.9) / (0.9*0.1 + 0.05*0.9)
            = 0.09 / 0.135
            = 0.67
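The ECG calculation above can be reproduced in a few lines of Python (variable names are mine):

```python
# Priors and likelihoods from the ECG example.
p_hd = 0.10               # P(+HD)
p_pos_given_hd = 0.90     # P(+ECG|+HD)
p_pos_given_no_hd = 0.05  # P(+ECG|-HD)

# Total probability: P(+ECG) = P(+ECG|+HD)P(+HD) + P(+ECG|-HD)P(-HD)
p_pos = p_pos_given_hd * p_hd + p_pos_given_no_hd * (1 - p_hd)  # 0.135

# Bayes' theorem.
p_hd_given_pos = p_pos_given_hd * p_hd / p_pos
print(round(p_hd_given_pos, 2))  # 0.67
```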
EXERCISE
Given:
P(B|A) = 0.7
P(A) = 0.25
P(B) = 0.41
Find the probability of event A given that event B has occurred.
EXERCISE
Given:
P(B|A) = 0.7
P(A) = 0.25
P(B) = 0.41
Find the probability of event A given that event B has occurred.
P(A|B) = P(A) * P(B|A) / P(B)
       = (0.25 * 0.7) / 0.41
       = 0.43
Answer = 0.43
CERTAINTY THEORY
Bayesian reasoning requires a statistical basis rarely found in the types of problems to which expert systems are applied.
Most of the questions are subjective and require the user to make a judgement.
CERTAINTY THEORY
E.g. "Does the patient have a severe headache?"
Instead of answering YES or NO, a user might give a subjective interpretation as a number (e.g. 0.8).
The number is just an estimate, not subject to the rules of probability theory.
CERTAINTY THEORY
Certainty theory (CT) grew out of the work on MYCIN.
It has special significance in the medical domain because of time constraints:
An infectious blood disease can threaten the patient's life.
Doctors take several days to obtain complete & exact results from tests. They often don't have this time, and therefore need to deal with incomplete information (as well as inexact inference).
CERTAINTY THEORY
Through observation, the MYCIN team found that:
Doctors often analyze the available information using phrases such as "probably", "it is likely that ...", "it is almost certain that ..."
The team later converted these terms into numbers such as 0.6, 0.8, etc.
These numbers represent the doctor's belief in the statement.
CERTAINTY THEORY
Given some evidence, doctors might only partially believe some conclusion. Consider:
IF   A
AND  B
AND  C
THEN D    {CF = 0.8} (almost certain that the conclusion is true)
When the doctor's belief in the available evidence was less than certain, CF(Ei) < 1, belief in the related inference was also decreased.
When the doctor received evidence from multiple sources, he held a higher belief in the conclusion.
CERTAINTY THEORY
Doctors are often confronted with both positive and negative evidence, and must balance their belief in a hypothesis.
They use a net belief, which is the difference between the measure of belief (MB) and the measure of disbelief (MD):
CF(H) = ? - ?
CERTAINTY THEORY
Doctors are often confronted with both positive and negative evidence, and must balance their belief in a hypothesis.
They use a net belief, which is the difference between the measure of belief (MB) and the measure of disbelief (MD):
CF(H) = MB(H) - MD(H)    (net belief)
MB: collects all information that supports the hypothesis H
MD: collects all information that rejects the hypothesis H
RANGE OF CF VALUES
CF values lie on a scale from -1 (definitely false, F) through 0 (unknown) to +1 (definitely true, T):
[-1, 0) is the range of disbelief (e.g. "probably false")
(0, +1] is the range of belief (e.g. "probably true")
PRACTICAL CERTAINTY MODEL
Consider the statement: "It will probably rain today"
CF(E) = CF(It will probably rain today) = 0.6
i.e. the degree to which we believe that it is going to rain
Representation in an ES:
It will rain today {CF 0.6}
PRACTICAL CERTAINTY MODEL
A CF can also be attached to a rule to represent the uncertain relationship between E and H:
IF   There are dark clouds (E)
THEN It will rain (H) {CF 0.8}
CF VALUE INTERPRETATION
CF -1.0         ?
CF -0.8         ?
CF -0.6         ?
CF -0.4         ?
CF -0.2 to 0.2  ?
CF 0.4          ?
CF 0.6          ?
CF 0.8          ?
CF 1.0          ?
CF VALUE INTERPRETATION
CF -1.0         Definitely not
CF -0.8         Almost certainly not
CF -0.6         Probably not
CF -0.4         Maybe not
CF -0.2 to 0.2  Unknown
CF 0.4          Maybe
CF 0.6          Probably
CF 0.8          Almost certainly
CF 1.0          Definitely
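The verbal scale above can be encoded as a lookup function. The slides only give point values, so the band boundaries chosen below are my own reasonable reading, not fixed by the theory:

```python
def interpret_cf(cf):
    """Map a certainty factor in [-1, 1] to the verbal scale.
    Band boundaries are an assumption, interpolated between the
    point values on the slide."""
    if not -1.0 <= cf <= 1.0:
        raise ValueError("CF must lie in [-1, 1]")
    if cf <= -0.9: return "Definitely not"
    if cf <= -0.7: return "Almost certainly not"
    if cf <= -0.5: return "Probably not"
    if cf < -0.2:  return "Maybe not"
    if cf <= 0.2:  return "Unknown"
    if cf < 0.5:   return "Maybe"
    if cf < 0.7:   return "Probably"
    if cf < 0.9:   return "Almost certainly"
    return "Definitely"

print(interpret_cf(0.4))   # Maybe
print(interpret_cf(-0.4))  # Maybe not
```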
CF PROPAGATION
Single Premise Rules (+ve E)
CF(H,E) = CF(E) * CF(RULE)
E.g.:
IF   There are dark clouds   (CF(E) = 0.5)
THEN It will rain            (CF(RULE) = 0.8)
CF(rain, dark_clouds) = ?
Referring to the CF value interpretation:
?
CF PROPAGATION
Single Premise Rules (+ve E)
CF(H,E) = CF(E) * CF(RULE)
E.g.:
IF   There are dark clouds   (CF(E) = 0.5)
THEN It will rain            (CF(RULE) = 0.8)
CF(rain, dark_clouds) = 0.5 * 0.8 = 0.4
Referring to the CF value interpretation:
It may rain.
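Single-premise propagation is a single multiplication; a minimal sketch (function name is mine):

```python
def propagate_single(cf_evidence, cf_rule):
    """Single-premise rule: CF(H, E) = CF(E) * CF(RULE)."""
    return cf_evidence * cf_rule

# Dark clouds believed at CF 0.5; rule "dark clouds -> rain" has CF 0.8.
cf_rain = propagate_single(0.5, 0.8)   # 0.4 -> "maybe" it will rain
```

The same function handles negative evidence: `propagate_single(-0.5, 0.8)` gives -0.4.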
CF PROPAGATION
Single Premise Rules (-ve E)
E.g.:
IF   There are dark clouds   (CF(E) = -0.5)
THEN It will rain            (CF(RULE) = 0.8)
CF(rain, dark_clouds) = ?
Referring to the CF value interpretation:
?
CF PROPAGATION
Single Premise Rules (-ve E)
E.g.:
IF   There are dark clouds   (CF(E) = -0.5)
THEN It will rain            (CF(RULE) = 0.8)
CF(rain, dark_clouds) = -0.5 * 0.8 = -0.4
Referring to the CF value interpretation:
Maybe it won't rain.
CF PROPAGATION
Multiple Premise Rules (conjunctive rules, using operator AND)
CF(H, E1 AND E2 AND ... AND En) = min{CF(Ei)} * CF(RULE)
E.g.:
IF   There are dark clouds            (CF(E1) = 1.0)
AND  The wind is getting stronger     (CF(E2) = 0.7)
THEN It will rain                     (CF(RULE) = 0.8)
CF(rain) = ?
CF PROPAGATION
Multiple Premise Rules (conjunctive rules, using operator AND)
CF(H, E1 AND E2 AND ... AND En) = min{CF(Ei)} * CF(RULE)
E.g.:
IF   There are dark clouds            (CF(E1) = 1.0)
AND  The wind is getting stronger     (CF(E2) = 0.7)
THEN It will rain                     (CF(RULE) = 0.8)
CF(rain) = min{CF(E1), CF(E2)} * CF(RULE)
         = min{1.0, 0.7} * 0.8
         = 0.7 * 0.8
         = 0.56 → probably rain
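The conjunctive rule above can be sketched as (function name is mine):

```python
def propagate_and(cf_premises, cf_rule):
    """Conjunctive rule: CF(H, E1 AND ... AND En) = min{CF(Ei)} * CF(RULE)."""
    return min(cf_premises) * cf_rule

# Dark clouds CF 1.0, stronger wind CF 0.7, rule CF 0.8.
cf_rain = propagate_and([1.0, 0.7], 0.8)   # min{1.0, 0.7} * 0.8 = 0.56
```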
CF PROPAGATION
Multiple Premise Rules (disjunctive rules, using operator OR)
CF(H, E1 OR E2 OR ... OR En) = max{CF(Ei)} * CF(RULE)
E.g.:
IF   There are dark clouds            (CF(E1) = 1.0)
OR   The wind is getting stronger     (CF(E2) = 0.7)
THEN It will rain                     (CF(RULE) = 0.8)
CF(rain) = ?
CF PROPAGATION
Multiple Premise Rules (disjunctive rules, using operator OR)
CF(H, E1 OR E2 OR ... OR En) = max{CF(Ei)} * CF(RULE)
E.g.:
IF   There are dark clouds            (CF(E1) = 1.0)
OR   The wind is getting stronger     (CF(E2) = 0.7)
THEN It will rain                     (CF(RULE) = 0.8)
CF(rain) = max{CF(E1), CF(E2)} * CF(RULE)
         = max{1.0, 0.7} * 0.8
         = 1.0 * 0.8
         = 0.8 → almost certainly will rain
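The disjunctive rule differs from the conjunctive one only in taking the maximum instead of the minimum (function name is mine):

```python
def propagate_or(cf_premises, cf_rule):
    """Disjunctive rule: CF(H, E1 OR ... OR En) = max{CF(Ei)} * CF(RULE)."""
    return max(cf_premises) * cf_rule

# Dark clouds CF 1.0, stronger wind CF 0.7, rule CF 0.8.
cf_rain = propagate_or([1.0, 0.7], 0.8)   # max{1.0, 0.7} * 0.8 = 0.8
```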
INCREMENTALLY ACQUIRED EVIDENCE
A method used to combine belief and disbelief values established by rules concluding the same hypothesis H.
E.g.:
R1: E1 → H
R2: E2 → H
When we obtain supporting evidence for a given H from many different sources, we should feel more confident in that H.
INCREMENTALLY ACQUIRED EVIDENCE
The MB or MD of a newly acquired piece of evidence (say E2) is added proportionally to the value determined from the earlier evidence (say E1).
The new value is used to update the confidence in H.
CF_COMBINE(CF1, CF2) =
  CF1 + CF2*(1 - CF1)                    if both CFs > 0
  (CF1 + CF2) / (1 - min{|CF1|, |CF2|})  if one CF < 0
  CF1 + CF2*(1 + CF1)                    if both CFs < 0
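The three-case combination function can be written directly from the piecewise definition above (function name is mine; I treat CFs of exactly 0 with the non-negative branch, which gives the same result as the mixed-sign formula):

```python
def cf_combine(cf1, cf2):
    """MYCIN-style combination of two CFs concluding the same H."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)             # both non-negative
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)             # both negative
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))  # mixed signs

print(round(cf_combine(0.8, 0.8), 2))    # 0.96: confirming sources reinforce
print(round(cf_combine(0.8, -0.8), 2))   # 0.0: direct contradiction cancels
```

Note that the function is commutative, so the order in which evidence arrives does not matter.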
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Rule 1 (CF 0.8):
IF   Weatherman says it's going to rain (E1)
THEN It's going to rain (H)
Rule 2 (CF 0.8):
IF   Farmer says it's going to rain (E2)
THEN It's going to rain (H)
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 1: Weatherman & Farmer Are Certain of Rain
CF(E1) = CF(E2) = 1.0
Refer to CF propagation for single-premise rules:
CF1(H,E1) = ?
CF2(H,E2) = ?
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 1: Weatherman & Farmer Are Certain of Rain
CF(E1) = CF(E2) = 1.0
Refer to CF propagation for single-premise rules:
CF1(H,E1) = CF(E1) * CF(RULE1) = 1.0 * 0.8 = 0.8
CF2(H,E2) = CF(E2) * CF(RULE2) = 1.0 * 0.8 = 0.8
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 1: Weatherman & Farmer Are Certain of Rain
CF_COMBINE(CF1, CF2) = ?
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 1: Weatherman & Farmer Are Certain of Rain
Refer to the equation for both CFs > 0:
CF_COMBINE(CF1, CF2) = CF1 + CF2*(1 - CF1)
                     = 0.8 + 0.8*(1 - 0.8)
                     = 0.96
The CF of a given H that is supported by more than one rule can be incrementally increased by acquiring supporting evidence from both rules.
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 2: Weatherman Certain of Rain, Farmer Certain of No Rain
CF(E1) = 1.0, CF(E2) = -1.0
Refer to CF propagation for single-premise rules:
CF1(H,E1) = ?
CF2(H,E2) = ?
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 2: Weatherman Certain of Rain, Farmer Certain of No Rain
CF(E1) = 1.0, CF(E2) = -1.0
Refer to CF propagation for single-premise rules:
CF1(H,E1) = CF(E1) * CF(RULE1) = 1.0 * 0.8 = 0.8
CF2(H,E2) = CF(E2) * CF(RULE2) = -1.0 * 0.8 = -0.8
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 2: Weatherman Certain of Rain, Farmer Certain of No Rain
CF_COMBINE(CF1, CF2) = ?
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 2: Weatherman Certain of Rain, Farmer Certain of No Rain
Refer to the equation for one CF < 0:
CF_COMBINE(CF1, CF2) = (CF1 + CF2) / (1 - min{|CF1|, |CF2|})
                     = (0.8 - 0.8) / (1 - min{0.8, 0.8})
                     = 0
The prediction of rain has been set to unknown due to contradictory information.
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 3: Weatherman & Farmer Believe, to Different Degrees, That It's Not Going to Rain
CF(E1) = -0.8, CF(E2) = -0.6
Refer to CF propagation for single-premise rules:
CF1(H,E1) = ?
CF2(H,E2) = ?
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 3: Weatherman & Farmer Believe, to Different Degrees, That It's Not Going to Rain
CF(E1) = -0.8, CF(E2) = -0.6
Refer to CF propagation for single-premise rules:
CF1(H,E1) = CF(E1) * CF(RULE1) = -0.8 * 0.8 = -0.64
CF2(H,E2) = CF(E2) * CF(RULE2) = -0.6 * 0.8 = -0.48
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 3: Weatherman & Farmer Believe, to Different Degrees, That It's Not Going to Rain
Refer to the equation for both CFs < 0:
CF_COMBINE(CF1, CF2) = ?
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 3: Weatherman & Farmer Believe, to Different Degrees, That It's Not Going to Rain
Refer to the equation for both CFs < 0:
CF_COMBINE(CF1, CF2) = CF1 + CF2*(1 + CF1)
                     = -0.64 - 0.48*(1 - 0.64)
                     = -0.81
This demonstrates an incremental decrease in belief in H when more than one source of disconfirming evidence is found.
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 4: Several Sources Predict Rain at the Same Belief Level, but One Source Predicts No Rain
CF_old = CF_COMBINE(CF1, CF2, ...) ≈ 0.999
CF_new = -0.8
CF_COMBINE(CF_old, CF_new) = ?
INCREMENTALLY ACQUIRED EVIDENCE
E.g. Rain Prediction
Case 4: Several Sources Predict Rain at the Same Belief Level, but One Source Predicts No Rain
CF_old = CF_COMBINE(CF1, CF2, ...) ≈ 0.999
CF_new = -0.8
CF_COMBINE(CF_old, CF_new) = (CF_old + CF_new) / (1 - min{|CF_old|, |CF_new|})
                           = (0.999 - 0.8) / (1 - 0.8)
                           = 0.995
This shows that a single piece of disconfirming evidence does not have a major impact when there are many pieces of confirming evidence.
EXAMPLE
IF   E1   (CF 0.7)
AND  E2   (CF 0.5)
OR   E3   (CF 0.8)
AND  E4   (CF 0.2)
THEN H    (rule CF 0.9)
CF(H) = ?
EXAMPLE
IF   E1   (CF 0.7)
AND  E2   (CF 0.5)
OR   E3   (CF 0.8)
AND  E4   (CF 0.2)
THEN H    (rule CF 0.9)
CF(H) = max{min(CF(E1),CF(E2)), min(CF(E3),CF(E4))} * CF(RULE)
      = max{0.5, 0.2} * 0.9
      = 0.5 * 0.9
      = 0.45
Interpretation?
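The mixed AND/OR evaluation above can be written out directly; I read the rule, as the worked answer does, as (E1 AND E2) OR (E3 AND E4), with AND binding tighter than OR (variable names are mine):

```python
# CF values for: IF E1 AND E2 OR E3 AND E4 THEN H (rule CF 0.9).
cf_e1, cf_e2, cf_e3, cf_e4 = 0.7, 0.5, 0.8, 0.2
cf_rule = 0.9

# min over each AND group, max across the OR, then scale by the rule CF.
cf_h = max(min(cf_e1, cf_e2), min(cf_e3, cf_e4)) * cf_rule
print(round(cf_h, 2))  # 0.45
```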
EXERCISE
You are given the following rules:
R1: IF Y OR D THEN Z {cf 0.8}
R2: IF X AND B AND E THEN Y {cf 0.6}
R3: IF A THEN Z {cf 0.5}
EXERCISE
Conduct the following computations:
a) The CF for X is 0.2, for B is 0.4, and for E is 0.3. Find the CF for Y.
b) Consider rules 1 and 3 to be true. The CF for Y is 0.7, for D is 0.6, and for A is -0.5. Find the CF for Z.
EXERCISE
Conduct the following computations:
a) The CF for X is 0.2, for B is 0.4, and for E is 0.3. Find the CF for Y.
Answer: CF(Y) = min{0.2, 0.4, 0.3} * 0.6 = 0.2 * 0.6 = 0.12
b) Consider rules 1 and 3 to be true. The CF for Y is 0.7, for D is 0.6, and for A is -0.5. Find the CF for Z.
Answer: CF(Z) = CF_COMBINE(max{0.7, 0.6}*0.8, -0.5*0.5) = CF_COMBINE(0.56, -0.25) = 0.41
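Applying the propagation and combination formulas from the earlier slides, the exercise can be verified end to end (function and variable names are mine):

```python
def cf_combine(cf1, cf2):
    """MYCIN-style combination, as defined in the slides above."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

# (a) R2: IF X AND B AND E THEN Y {cf 0.6}, conjunctive rule.
cf_y = min(0.2, 0.4, 0.3) * 0.6          # 0.2 * 0.6 = 0.12

# (b) R1 concludes Z from (Y OR D); R3 concludes Z from A.
cf_z_r1 = max(0.7, 0.6) * 0.8            # disjunctive rule: 0.56
cf_z_r3 = -0.5 * 0.5                     # single premise: -0.25
cf_z = cf_combine(cf_z_r1, cf_z_r3)      # (0.56 - 0.25) / (1 - 0.25) ≈ 0.41
print(round(cf_y, 2), round(cf_z, 2))
```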