An "event" is a subset of basic outcomes, for example "odd" or "four or less".
Because events are sets, we can combine them with set
operations. Set union corresponds to "or" while set intersection
corresponds to "and."
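As a small illustration of events as sets, here is a sketch in Python. It assumes the basic outcomes are the six faces of a fair die, which the notes do not state explicitly; the names OUTCOMES and pr are made up for this example.

```python
from fractions import Fraction

# Assumption: the basic outcomes are the faces of one fair six-sided die.
OUTCOMES = frozenset(range(1, 7))


def pr(event):
    """Probability of an event (a set of outcomes), all outcomes equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))


odd = frozenset({1, 3, 5})
four_or_less = frozenset({1, 2, 3, 4})

odd_or_small = odd | four_or_less    # set union: "odd or four or less"
odd_and_small = odd & four_or_less   # set intersection: "odd and four or less"

print(pr(odd), pr(four_or_less))  # 1/2 2/3
print(pr(odd_or_small))           # 5/6, the event {1, 2, 3, 4, 5}
print(pr(odd_and_small))          # 1/3, the event {1, 3}
```

Exact fractions are used instead of floating point so that later equality checks are not disturbed by rounding.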
By definition, the event A is independent of the event B if and only
if Pr(A & B) = Pr(A) * Pr(B).
Notice that then Pr(A|B) = Pr(A & B) / Pr(B) = (Pr(A)
* Pr(B)) / Pr(B) = Pr(A).
The above assumes that the denominators are non-zero. When
doing probability calculations, you always have to pay attention to the
case of probabilities that are zero.
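The definition of independence and the zero-denominator caveat can be checked with a short sketch. As before, the fair die and the helper names (pr, pr_given, independent) are assumptions made for illustration only.

```python
from fractions import Fraction

# Assumption: one roll of a fair six-sided die.
OUTCOMES = frozenset(range(1, 7))


def pr(event):
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))


def pr_given(a, b):
    """Pr(A|B) = Pr(A & B) / Pr(B), or None when Pr(B) is zero."""
    if pr(b) == 0:
        return None  # the conditional probability is undefined here
    return pr(a & b) / pr(b)


def independent(a, b):
    """The product definition; it needs no division, so zero is not a problem."""
    return pr(a & b) == pr(a) * pr(b)


odd = frozenset({1, 3, 5})
four_or_less = frozenset({1, 2, 3, 4})
three_or_less = frozenset({1, 2, 3})
impossible = frozenset()

print(independent(odd, four_or_less))   # True: 1/3 == 1/2 * 2/3
print(pr_given(odd, four_or_less))      # 1/2, the same as Pr(odd)
print(independent(odd, three_or_less))  # False: 1/3 != 1/2 * 1/2
print(pr_given(odd, impossible))        # None
```

Note that the product form of the definition still gives an answer when Pr(B) is zero, while the conditional form does not.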
The "product rule" says that Pr(A & B) = Pr(A|B)Pr(B).
Note that Pr(A|B) = 0.8 does not mean "whenever B is true then Pr(A) = 0.8" because it may be the case, for example, that Pr(A|B,C) = 0.
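To make the last two points concrete, the sketch below builds a small joint distribution in which Pr(A|B) = 0.8 and yet Pr(A|B,C) = 0, and checks the product rule on it. The probabilities in JOINT are hypothetical numbers chosen only to produce this effect.

```python
from fractions import Fraction

# Hypothetical joint distribution over three true/false facts (A, B, C).
# Combinations that are not listed have probability zero.
JOINT = {
    (True, True, False): Fraction(4, 10),
    (False, True, True): Fraction(1, 10),
    (True, False, False): Fraction(2, 10),
    (False, False, False): Fraction(3, 10),
}


def pr(event):
    """Total probability of the worlds (a, b, c) where event(a, b, c) holds."""
    return sum((p for world, p in JOINT.items() if event(*world)), Fraction(0))


def pr_given(event, condition):
    denominator = pr(condition)
    if denominator == 0:
        raise ValueError("conditioning event has probability zero")
    return pr(lambda a, b, c: event(a, b, c) and condition(a, b, c)) / denominator


def is_a(a, b, c):
    return a


def is_b(a, b, c):
    return b


def is_b_and_c(a, b, c):
    return b and c


def is_a_and_b(a, b, c):
    return a and b


# Product rule: Pr(A & B) = Pr(A|B) Pr(B)
print(pr(is_a_and_b) == pr_given(is_a, is_b) * pr(is_b))  # True
print(pr_given(is_a, is_b))        # 4/5
print(pr_given(is_a, is_b_and_c))  # 0
```

Learning C in addition to B changes the answer completely, which is why Pr(A|B) cannot be read as "the probability of A whenever B is true."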
The statement "A is independent of B" is equivalent to Pr(A|B) =
Pr(A) and also equivalent to Pr(B|A) = Pr(B).
Supposedly Bayes did not publish his discovery because he thought it was hubris for humans to investigate the will of God. He discovered how to compute Pr(Y|X) based on knowledge of Pr(X|Y).
What Bayes discovered is this formula:
Pr(Y|X) = Pr(Y & X)/Pr(X) = (Pr(X|Y)Pr(Y)) / Pr(X)
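A minimal sketch of this formula as a function follows. The function name bayes and the die example (Y = "the roll is 1", X = "the roll is odd") are assumptions introduced for illustration; the result can be confirmed by direct counting, since one of the three odd faces is a 1.

```python
from fractions import Fraction


def bayes(pr_x_given_y, pr_y, pr_x):
    """Pr(Y|X) = Pr(X|Y) Pr(Y) / Pr(X), assuming Pr(X) is non-zero."""
    if pr_x == 0:
        raise ValueError("Pr(X) must be non-zero")
    return pr_x_given_y * pr_y / pr_x


# Fair die (assumed): Y = "the roll is 1", X = "the roll is odd".
pr_x_given_y = Fraction(1)   # a roll of 1 is certainly odd
pr_y = Fraction(1, 6)
pr_x = Fraction(1, 2)

print(bayes(pr_x_given_y, pr_y, pr_x))  # 1/3
```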
Example:
The following article review (author unknown), available at http://www.stat.unipg.it/ncsu/info/jse/v5n3/resource.html
shows how difficult commonsense reasoning with probabilities is.
Review of The Psychology of Good Judgment by Gerd Gigerenzer (1996). Medical Decision Making, 16(3), 273-280.
Gigerenzer argues that physicians and their patients will better understand the chance of a false positive result if we replace the conventional conditional probability analysis by an equivalent frequency method.... a mammography problem: To facilitate early detection of breast cancer, women are encouraged from a particular age on to participate at regular intervals in routine screening, even if they have no obvious symptoms. Imagine you conduct in a certain region such a breast cancer screening using mammography. For symptom-free women aged 40 to 50 who participate in screening using mammography, the following information is available for this region.
Probability format:
The probability that one of these women has breast cancer is 1%. If a woman has breast cancer, the probability is 80% that she will have a positive mammography test. If a woman does not have breast cancer, the probability is 10% that she will still have a positive mammography test. Imagine a woman (aged 40 to 50, no symptoms) who has a positive mammography test in your breast cancer screening. What is the probability that she actually has breast cancer? _____%
Frequency format:
Ten out of every 1,000 women have breast cancer. Of these 10 women with breast cancer, 8 will have a positive mammography test. Of the remaining 990 women without breast cancer, 99 will still have a positive mammography test. Imagine a sample of women (aged 40 to 50, no symptoms) who have positive mammography tests in your breast cancer screening. How many of these women do actually have breast cancer? _____ out of _____
In a classic study by D. M. Eddy (see J. Dowie and A. Elstein (eds.) (1988), Professional Judgment: A Reader in Clinical Decision Making, Cambridge University Press, pp. 45-590), essentially this same question, with just the probability format, was given to 100 physicians. Ninety-five of the physicians gave an answer of approximately 75% instead of the correct answer, which, in this example, is 7.48%.
In the present study, Gigerenzer found that, when the information was presented in the probability format, only 10% reasoned with the Bayes computation
P(breast cancer | positive test) =
(.01)(.80)/[(.01)(.80) + (.99)(.10)] = .0748.
For the group given the frequency format, 46% computed the Bayes probability in the simpler form:
P(breast cancer | positive test) = 8/(8 + 99) = .0748.
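Both computations can be reproduced with a few lines of Python. This is only a sketch using the numbers quoted above; the variable names are made up for the example, and Pr(positive test) is obtained by summing over the two cases "cancer" and "no cancer."

```python
from fractions import Fraction

# Probability format, using the numbers quoted in the problem.
prior = Fraction(1, 100)                 # Pr(cancer)
sensitivity = Fraction(80, 100)          # Pr(positive | cancer)
false_positive_rate = Fraction(10, 100)  # Pr(positive | no cancer)

# Pr(positive) = Pr(positive | cancer) Pr(cancer)
#              + Pr(positive | no cancer) Pr(no cancer)
pr_positive = sensitivity * prior + false_positive_rate * (1 - prior)
posterior = sensitivity * prior / pr_positive

# Frequency format: 8 true positives and 99 false positives per 1,000 women.
true_positives = 8
false_positives = 99
from_counts = Fraction(true_positives, true_positives + false_positives)

print(posterior, from_counts)      # 8/107 8/107
print(round(float(posterior), 4))  # 0.0748
```

The two formats give exactly the same fraction, 8/107; the frequency format simply carries out the multiplication by the base rate ahead of time.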
The article discusses some of the reactions of the physicians to even considering such problems. Here are some quotes:
On such a basis one can't make a diagnosis. Statistical information is one big lie.
Some doctors commented that getting the answer in the frequency form was simple.
I never inform my patients about statistical data. I would tell the patient that mammography is not so exact, and I would in any case perform a biopsy.
Oh, what nonsense. I can't do it. You should test my daughter. She studies medicine.
Statistics is alien to everyday concerns and of little use for judging individual persons.