B. Bayes Rule and Conditional Probability: At-risk Students example
To augment the material on Bayes rule in Stat190, the following (highly artificial)
example was used in 191X. Let Event A1 be that the student drops out of
school and Event A2 that the student does not drop out of school. Let Event B
be that the student has a bad (deficient) home environment. The retrospective
probability P{B|A1}, together with the marginal probabilities P{A1} and P{B}, can be
used via Bayes rule to give us the prospective probability P{A1|B}. This latter
probability tells us about the value of home environment as a predictor of student
drop-out. The examples illustrate the large effect the marginal incidence of drop-outs
P{A1} has on the value of B as a predictor--i.e., P{A1|B} increases in the examples
and Table below as P{A1} increases from .1 to .4.
(* Bayes Rule *)
P{A1|B} := (P{B|A1}*P{A1})/(P{B|A1}*P{A1} + P{B|A2}*P{A2})
P{A2} := 1 - P{A1}
P{B} := P{B|A1}*P{A1} + P{B|A2}*P{A2}
Example 1. P{B|A1} = .5; P{B|A2} = .2; P{A1} = .1;
[P{A1|B}, P{B}] = [0.217391, 0.23]
Example 2. P{B|A1}= .8; P{B|A2}= .2; P{A1}= .35;
[P{A1|B}, P{B}] = [0.682927, 0.41]
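
The two examples above can be checked numerically. What follows is a minimal Python sketch, not part of the original course material; the function name bayes_two_events and its argument names are chosen here purely for illustration.

```python
def bayes_two_events(p_b_given_a1, p_b_given_a2, p_a1):
    """Return (P{A1|B}, P{B}) when A1 and A2 are complementary events."""
    p_a2 = 1.0 - p_a1
    # Law of total probability gives the marginal P{B}.
    p_b = p_b_given_a1 * p_a1 + p_b_given_a2 * p_a2
    # Bayes rule gives the prospective probability P{A1|B}.
    p_a1_given_b = p_b_given_a1 * p_a1 / p_b
    return p_a1_given_b, p_b


example_1 = bayes_two_events(0.5, 0.2, 0.10)
example_2 = bayes_two_events(0.8, 0.2, 0.35)
print([round(x, 6) for x in example_1])  # [0.217391, 0.23]
print([round(x, 6) for x in example_2])  # [0.682927, 0.41]
```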
Create Table--Entries are P{A1|B} (top) and P{B} (below), with P{B|A2} = .2 throughout
Table[{P{A1|B}, P{B}}, {P{A1}, .1,.4, .1}, {P{B|A1}, .5,.9,.2}]
                      P{B|A1}
P{A1}       .5          .7          .9
 .1      0.217391    0.28        0.333333
         0.23        0.25        0.27
 .2      0.384615    0.466667    0.529412
         0.26        0.3         0.34
 .3      0.517241    0.6         0.658537
         0.29        0.35        0.41
 .4      0.625       0.7         0.75
         0.32        0.4         0.48
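
The table can be regenerated with a short Python sketch. This is an illustration only, not the original Table[] command; it assumes P{B|A2} is held at .2 (the value that reproduces the entries above), and the names bayes_two_events, priors, and likelihoods are hypothetical.

```python
def bayes_two_events(p_b_given_a1, p_b_given_a2, p_a1):
    """Return (P{A1|B}, P{B}) when A1 and A2 are complementary events."""
    p_b = p_b_given_a1 * p_a1 + p_b_given_a2 * (1.0 - p_a1)
    return p_b_given_a1 * p_a1 / p_b, p_b


P_B_GIVEN_A2 = 0.2               # assumption: fixed for the whole table
priors = [0.1, 0.2, 0.3, 0.4]    # P{A1}, the row values
likelihoods = [0.5, 0.7, 0.9]    # P{B|A1}, the column values

# Map each (P{A1}, P{B|A1}) pair to its (P{A1|B}, P{B}) entry.
table = {
    (p_a1, p_b_a1): bayes_two_events(p_b_a1, P_B_GIVEN_A2, p_a1)
    for p_a1 in priors
    for p_b_a1 in likelihoods
}

print("P{A1}  " + "".join(f"{col:>10}" for col in likelihoods))
for p_a1 in priors:
    top = "".join(f"{table[(p_a1, col)][0]:>10.6f}" for col in likelihoods)
    bottom = "".join(f"{table[(p_a1, col)][1]:>10.2f}" for col in likelihoods)
    print(f"{p_a1:<7}{top}")      # row of P{A1|B}
    print(f"{'':<7}{bottom}")     # row of P{B}
```

Reading down any column shows P{A1|B} rising with P{A1} even though P{B|A1} is unchanged.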
-----------------------------------------------------------------------------
Problem 1: Child Abuse?
Rather than a yes/no vote on the issue, this header leads us to a
Bayes Theorem calculation along the same lines as the "At-Risk
Students" Example presented in class 10/5 (cf. Course Files
listing). This example is taken from the Larson and Marx text
used in prior Stat190 classes.
A government task force is considering the feasibility of setting
up a national screening program to detect child abuse.
Consultants for the group estimate that:
1. One child in 90 is abused
2. A physician can detect an abused child 90 percent of the time
3. A screening program would incorrectly label 3 percent of all
   nonabused children as abused
Using this information, calculate the probability that a child is
actually abused given that the screening program diagnoses
him/her as such. Comment on the usefulness of such a program.
Repeat your calculation and comment if it were the case that one
child in 9 is abused rather than the stated one child in 90.
------------------------
Problem 1: Child Abuse? (Solution)
First, let's define two events and their complements:
A - child abused; notA - child not abused.
B - child labeled as abused; notB - child labeled not abused.
We can now write the information we got in the question in formal
probability terms:
1. One child in 90 is abused: P(A) = 1/90 ==> P(notA) = 89/90
2. A physician can detect an abused child 90 percent of the time: P(B|A) = .9
3. A screening program would incorrectly label 3 percent of all
   nonabused children as abused: P(B|notA) = .03
Now we are ready to use Bayes Theorem to get the desired probability - P(A|B).
                    P(B|A)P(A)                        .9 x (1/90)
P(A|B) = ------------------------------- = ----------------------------- =
          P(B|A)P(A) + P(B|notA)P(notA)     .9 x (1/90) + .03 x (89/90)

              .01
       = -------------- = .252
          .01 + .02967
That is, only one child out of four diagnosed by the screening
program as abused is actually abused. The screening program will
produce many "false positive" cases.
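
One way to see why false positives dominate is to count children in a concrete cohort. The sketch below is an illustration in Python, not part of the original solution; the cohort size of 9000 is a hypothetical choice made only so that every count is a whole number.

```python
from fractions import Fraction

population = 9000                       # hypothetical cohort size
prevalence = Fraction(1, 90)            # P(A)
sensitivity = Fraction(9, 10)           # P(B|A)
false_positive_rate = Fraction(3, 100)  # P(B|notA)

abused = population * prevalence                    # 100 children
not_abused = population - abused                    # 8900 children
true_positives = abused * sensitivity               # abused and labeled
false_positives = not_abused * false_positive_rate  # not abused but labeled

# Share of labeled children who are actually abused, i.e. P(A|B).
p_a_given_b = true_positives / (true_positives + false_positives)
print(true_positives, false_positives, round(float(p_a_given_b), 3))
# 90 267 0.252
```

Of the 357 children labeled as abused, 267 come from the much larger nonabused group, which is why only about one in four labels is correct.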
When we change the prevalence of abuse P(A) from 1/90 to 1/9 and
repeat the same calculation (change P(A) to 1/9 and P(notA) to 8/9
in the above formula) we get P(A|B) = .1/(.1 + .02667) = .789; i.e.,
about 4 out of every 5 children labeled as abused are actually
abused. The detection rate P(B|A) is unchanged, but at this higher
prevalence a positive label is far more reliable, because the false
positives no longer outnumber the true positives.
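
The effect of prevalence can be compared side by side with a small Python sketch. It is illustrative only; the function name prob_abused_given_labeled and its parameter names are hypothetical, and the defaults are the rates given in the problem. Note that P(notA) = 1 - P(A) changes along with P(A).

```python
def prob_abused_given_labeled(prevalence, sensitivity=0.9, false_positive_rate=0.03):
    """Bayes Theorem: P(A|B) from P(A), P(B|A), and P(B|notA)."""
    true_pos = sensitivity * prevalence                   # P(B|A)P(A)
    false_pos = false_positive_rate * (1.0 - prevalence)  # P(B|notA)P(notA)
    return true_pos / (true_pos + false_pos)


for prevalence in (1 / 90, 1 / 9):
    posterior = prob_abused_given_labeled(prevalence)
    print(f"P(A) = {prevalence:.4f} -> P(A|B) = {posterior:.3f}")
# P(A) = 0.0111 -> P(A|B) = 0.252
# P(A) = 0.1111 -> P(A|B) = 0.789
```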