SOLUTION: An e-mail filter is planned to separate valid e-mails from spam. The word free occurs in 60% of the spam messages and only 4% of the valid messages. Also, 20% of the messages are s
Question 1193750: An e-mail filter is planned to separate valid e-mails from spam. The word free occurs in 60% of the spam messages and only 4% of the valid messages. Also, 20% of the messages are spam. Determine the probabilities:
(a) the message contains free.
(b) the message is spam given that it contains free.
(c) the message is valid given that it does not contain free.
Answer by ikleyn(52787):
~~~~~~~~~~~~~
A couple of notes before we start:
(1) 20% of the messages are spam, so (from the context) the remaining 80%, or 0.8, of the messages are valid.
(2) 4%, or 0.04, of the valid messages contain free, so 1-0.04 = 0.96 of the valid messages DO NOT contain free.
(a) P(a message contains free) = 0.6*P(spam) + 0.04*P(valid) = 0.6*0.2 + 0.04*0.8 = 0.12 + 0.032 = 0.152. ANSWER
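The calculation in part (a) is the law of total probability. A minimal Python sketch of it is below; the variable names are illustrative and are not part of the original solution.

```python
# Given data from the problem statement.
p_spam = 0.20              # P(spam)
p_free_given_spam = 0.60   # P(contains free | spam)
p_free_given_valid = 0.04  # P(contains free | valid)

# Every message is either spam or valid, so P(valid) = 1 - P(spam).
p_valid = 1 - p_spam

# Law of total probability: add the contributions of spam and valid messages.
p_free = p_free_given_spam * p_spam + p_free_given_valid * p_valid

print(round(p_free, 4))  # 0.152
```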
(b) This is a conditional probability:
P(spam | contains free) = P(spam AND contains free) / P(contains free) =
= (0.6*0.2) / 0.152 = 0.12 / 0.152 = 0.7895 (rounded). ANSWER
Note on the calculation: the denominator, P(contains free), is the value 0.152 found in part (a).
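Part (b) is an application of Bayes' rule. The sketch below wraps it in a small function so the same steps can be reused with other percentages; the function name and parameter names are hypothetical, chosen only for illustration.

```python
def posterior_spam_given_free(p_spam, p_free_given_spam, p_free_given_valid):
    """Return P(spam | contains free) using Bayes' rule."""
    p_valid = 1 - p_spam
    # Numerator: P(spam AND contains free).
    joint = p_free_given_spam * p_spam
    # Denominator: P(contains free), by the law of total probability.
    p_free = joint + p_free_given_valid * p_valid
    return joint / p_free


result = posterior_spam_given_free(0.20, 0.60, 0.04)
print(round(result, 4))  # 0.7895
```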
(c) This is again a conditional probability:
P(valid | does not contain free) = P(valid AND does not contain free) / P(does not contain free) =
= ((1-0.04)*0.8) / (1-0.152) = 0.768 / 0.848 = 0.9057 (rounded). ANSWER
Note on the calculation: (1-0.04) comes from (2); (1-0.152) is P(does not contain free), the complement of the result in part (a).
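As a cross-check of part (c), the same quantity can be computed with exact rational arithmetic. This is only a sketch using Python's standard `fractions` module; the variable names are illustrative. It builds P(does not contain free) directly from the two kinds of message rather than as 1-0.152, so it also confirms that both routes agree.

```python
from fractions import Fraction

# Given data, written as exact fractions.
p_spam = Fraction(20, 100)
p_free_given_spam = Fraction(60, 100)
p_free_given_valid = Fraction(4, 100)
p_valid = 1 - p_spam

# P(valid AND does not contain free) = (1 - 0.04) * 0.8
p_valid_and_no_free = (1 - p_free_given_valid) * p_valid

# P(does not contain free), summed over valid and spam messages.
p_no_free = p_valid_and_no_free + (1 - p_free_given_spam) * p_spam

p_valid_given_no_free = p_valid_and_no_free / p_no_free

print(p_valid_given_no_free)                   # 48/53
print(round(float(p_valid_given_no_free), 4))  # 0.9057
```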