SOLUTION: Using Bayes Theorem: A certain disease occurs in % of the population. A test for the disease is fairly accurate: it misclassifies people with the disease as healthy % of the time a

Question 238435: Using Bayes Theorem: A certain disease occurs in % of the population. A test for the disease is fairly accurate: it misclassifies people with the disease as healthy % of the time and reports that a healthy person is diseased just % of the time. Suppose that a person tests positive for the disease. Compute the probability that the person does not have the disease. Round your answer to two decimal places.

I do not understand how to use this theory - so much to consider and plug into the formula. I might be overthinking it. HELP

Answer by Theo(13342):
this is the best tutorial on the web I could find that should help you to understand this material.

bayes theorem by vassar

a and b are two events (they are not mutually exclusive; if they were, p(b|a) would always be 0).

p(a) = probability of a occurring.
p(na) = probability of a not occurring.

p(b) = probability of b occurring.
p(nb) = probability of b not occurring.

p(b|a) = probability of b given a occurs.
p(nb|a) = probability of b not occurring given that a occurs.

p(b|na) = probability of b given a does not occur.
p(nb|na) = probability of b not occurring given that a does not occur.

they go on to give a formula for p(b) and p(nb)

p(b) = [p(b|a) * p(a)] + [p(b|na) * p(na)]

this means that the probability of b occurring is equal to:
the probability of b occurring given that a occurs times the probability of a occurring, plus the probability of b occurring given that a does not occur times the probability of a not occurring.

p(nb) = p(nb|a) * p(a) + p(nb|na) * p(na)

this means that the probability of b not occurring is equal to:
the probability of b not occurring given that a occurs times the probability of a occurring, plus the probability of b not occurring given that a does not occur times the probability of a not occurring.
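
Here is a small Python sketch of those two formulas, just to show how the pieces fit together. The function name total_probability is my own, and the inputs are the same illustrative values assumed further down in this answer (p(a) = .2, p(b|a) = .95, p(b|na) = .3), not numbers from the original question.

```python
def total_probability(p_x_given_a, p_x_given_na, p_a):
    """Return p(x) = p(x|a) * p(a) + p(x|na) * p(na) for any event x."""
    p_na = 1 - p_a  # probability of a not occurring
    return p_x_given_a * p_a + p_x_given_na * p_na

# Illustrative values only (assumed, not taken from the question).
p_a = 0.2
p_b_given_a = 0.95
p_b_given_na = 0.3

# p(b): use the conditional probabilities of b.
p_b = total_probability(p_b_given_a, p_b_given_na, p_a)

# p(nb): use the complementary conditionals p(nb|a) and p(nb|na).
p_nb = total_probability(1 - p_b_given_a, 1 - p_b_given_na, p_a)

print(round(p_b, 4), round(p_nb, 4))
```

Since b either occurs or it doesn't, p(b) and p(nb) should add up to 1, which is a quick way to check your inputs.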

they then go on to the finale by giving you the formulas for p(a|b) and p(na|b) and p(na|nb) and p(a|nb)

Those formulas are:

p(a|b) = p(b|a) * p(a) / p(b)

p(na|b) = p(b|na) * p(na) / p(b)

p(na|nb) = p(nb|na) * p(na) / p(nb)

p(a|nb) = p(nb|a) * p(a) / p(nb)
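
If it helps to see all four formulas side by side, the sketch below computes them in one go. The function name posteriors and the dictionary keys are hypothetical names I made up for illustration, and the inputs are the illustrative values assumed later in this answer.

```python
def posteriors(p_a, p_b_given_a, p_b_given_na):
    """Return p(a|b), p(na|b), p(na|nb) and p(a|nb) as a dict."""
    p_na = 1 - p_a
    p_nb_given_a = 1 - p_b_given_a
    p_nb_given_na = 1 - p_b_given_na

    # Total probabilities of b and nb.
    p_b = p_b_given_a * p_a + p_b_given_na * p_na
    p_nb = p_nb_given_a * p_a + p_nb_given_na * p_na

    return {
        "p(a|b)": p_b_given_a * p_a / p_b,
        "p(na|b)": p_b_given_na * p_na / p_b,
        "p(na|nb)": p_nb_given_na * p_na / p_nb,
        "p(a|nb)": p_nb_given_a * p_a / p_nb,
    }

# Illustrative values only (assumed, not taken from the question).
result = posteriors(p_a=0.2, p_b_given_a=0.95, p_b_given_na=0.3)
for name, value in result.items():
    print(name, round(value, 4))
```

Notice that p(a|b) + p(na|b) = 1 and p(a|nb) + p(na|nb) = 1, because once you know the test result, the person either has the disease or doesn't.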

If you define a as having the disease and b as testing positive for the disease, then the formula you are interested in can be found as follows:

a = having the disease.
na = not having the disease.

b = test positive for the disease.
nb = test negative for the disease.

you want to know the probability that the person that tests positive for the disease does not have the disease.

this means you want to know p(na|b)

na means you don't have the disease.
b means you test positive for the disease.

From above, the formula is:

p(na|b) = p(b|na) * p(na) / p(b)

since p(b) = [p(b|a) * p(a)] + [p(b|na) * p(na)]

then your formula becomes:

p(na|b) = p(b|na) * p(na) / ([p(b|a) * p(a)] + [p(b|na) * p(na)])

To put this into numbers, let's assume the following:

p(a) = .2
p(na) = .8

This means that the probability of having the disease is .2 and the probability of not having the disease is .8.

p(b|a) = .95
p(nb|a) = .05

this means that the probability of being tested positive if you have the disease is .95 and the probability of being tested negative if you have the disease is .05.

p(nb|na) = .7
p(b|na) = .3

this means that the probability of being tested negative if you don't have the disease is .7 and the probability of being tested positive if you don't have the disease is .3.

this means the test is pretty good at detecting the disease when it's present, but not so good at giving a negative result when the disease is absent.

your factors are:

p(a) = .2
p(na) = .8

p(b|a) = .95
p(nb|a) = .05

p(b|na) = .3
p(nb|na) = .7

the formula you are going to work with is:

p(na|b) = p(b|na) * p(na) / ([p(b|a) * p(a)] + [p(b|na) * p(na)])

this formula becomes:

p(na|b) = (.3 * .8) / [(.95 * .2) + (.3 * .8)]

this becomes:

p(na|b) = .24 / (.19 + .24)

this becomes:

p(na|b) = .24 / .43

this becomes:

p(na|b) = .558139535

given these numbers, there's about a 56% chance that the person who tested positive is actually healthy.
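
If you want to double-check the arithmetic without any rounding along the way, here is a minimal sketch using Python's standard fractions module. The variable names are my own, and the inputs are the assumed values from this example, not numbers from the original question.

```python
from fractions import Fraction

# Assumed illustrative values, written as exact fractions.
p_a = Fraction(2, 10)             # p(a)    = .2
p_na = 1 - p_a                    # p(na)   = .8
p_b_given_a = Fraction(95, 100)   # p(b|a)  = .95
p_b_given_na = Fraction(3, 10)    # p(b|na) = .3

numerator = p_b_given_na * p_na                   # .24
denominator = p_b_given_a * p_a + numerator       # .19 + .24 = .43
p_na_given_b = numerator / denominator

print(p_na_given_b)                    # 24/43
print(round(float(p_na_given_b), 2))   # 0.56
```

The exact answer is 24/43, which rounds to 0.56 at two decimal places.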

with a more accurate test, this number should go down dramatically.

let's improve the accuracy and see where that takes us.

the numbers we were using were:

p(a) = .2
p(na) = .8

p(b|a) = .95
p(nb|a) = .05

p(b|na) = .3
p(nb|na) = .7

we'll improve the test's accuracy both when the disease is present (fewer false negatives) and when it's not present (fewer false positives).

our new numbers will be:

p(a) = .2
p(na) = .8
p(b|a) = .99
p(nb|a) = .01
p(b|na) = .02
p(nb|na) = .98

our formula is still:
p(na|b) = p(b|na) * p(na) / ([p(b|a) * p(a)] + [p(b|na) * p(na)])
this formula becomes:
p(na|b) = (.02 * .8) / [(.99 * .2) + (.02 * .8)]
this becomes:
p(na|b) = .016 / (.198 + .016)
this becomes:
p(na|b) = .016 / .214
this becomes:
p(na|b) = .074766355

The probability that the person doesn't have the disease, given a positive test, drops to about 7.5%.
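
To compare the two sets of numbers side by side, you could wrap the p(na|b) formula in a small function and call it twice. This is only a sketch: the function name p_healthy_given_positive is hypothetical, and both sets of inputs are the assumed values from this answer, not the ones from your actual problem. Once you have your own percentages, plug them in the same way.

```python
def p_healthy_given_positive(p_a, p_b_given_a, p_b_given_na):
    """p(na|b) = p(b|na) * p(na) / (p(b|a) * p(a) + p(b|na) * p(na))."""
    p_na = 1 - p_a
    false_positive = p_b_given_na * p_na   # healthy and tests positive
    true_positive = p_b_given_a * p_a      # diseased and tests positive
    return false_positive / (true_positive + false_positive)

# First (less accurate) test: p(b|a) = .95, p(b|na) = .3
first = p_healthy_given_positive(0.2, 0.95, 0.3)

# Second (more accurate) test: p(b|a) = .99, p(b|na) = .02
second = p_healthy_given_positive(0.2, 0.99, 0.02)

print(f"first test:  {first:.4f}")    # first test:  0.5581
print(f"second test: {second:.4f}")   # second test: 0.0748
```

Most of the improvement comes from lowering p(b|na) from .3 to .02, because the false positives are what fill up the numerator.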