Conditioning on evidence
Question 1
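Let:
- $S$ be the event that an email is spam.
- $F$ be the event that an email contains “free money.”
Therefore, by Bayes’ rule and LOTP:
$$P(S \mid F) = \frac{P(F \mid S)P(S)}{P(F \mid S)P(S) + P(F \mid S^c)P(S^c)}$$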
Question 2
Let:
- be the event that both children are boys.
- be the event that the twins are identical.
Therefore:
Question 3
Let:
- be the event that a man smokes.
- be the event that a man gets lung cancer.
Therefore:
Thus:
From the equation above we have:
Question 4
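(a) With $K$ the event that Fred knows the answer, $R$ the event that he answers correctly, $P(K) = p$, and $n$ answer choices, Bayes’ rule and LOTP give
$$P(K \mid R) = \frac{P(R \mid K)P(K)}{P(R \mid K)P(K) + P(R \mid K^c)P(K^c)} = \frac{p}{p + \frac{1-p}{n}}$$
(b)
$$P(K \mid R) = \frac{p}{p + \frac{1-p}{n}} \ge p$$
since $p + \frac{1-p}{n} \le p + (1 - p) = 1$. Since knowing the answer guarantees a correct response, while guessing only succeeds with probability $1/n$, observing that Fred answered correctly increases the likelihood that he knew the answer. Therefore, $P(K \mid R) \ge P(K)$. Equality holds only in the degenerate cases where Fred never knows the answer ($p = 0$), always knows the answer ($p = 1$), or when there is only one choice ($n = 1$).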
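The inequality in part (b) can be checked numerically. The sketch below assumes the usual setup for this exercise, which the text above does not spell out: Fred knows the answer with probability p, there are n choices, and a guess is uniformly random. The function name `posterior_knows` and the sample values are illustrative only.

```python
from fractions import Fraction

def posterior_knows(p, n):
    """P(knows | correct) = p / (p + (1 - p)/n), assuming a uniform guess among n choices."""
    p = Fraction(p)
    return p / (p + (1 - p) / n)

# Observing a correct answer never lowers the probability that Fred knew it.
for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    for n in (2, 4, 10):
        assert posterior_knows(p, n) >= p

print(posterior_knows(Fraction(1, 2), 4))  # 4/5
```

Under these assumptions the posterior equals the prior only for p = 0, p = 1, or n = 1, matching the degenerate cases listed above.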
Question 5
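Let
- $A$ be the event that the first card is the Ace of Spades.
- $B$ be the event that the second card is the 8 of Clubs.
- $C$ be the event that the third card is an Ace.
Therefore:
$$P(C \mid A \cap B) = \frac{P(A \cap B \cap C)}{P(A \cap B)} = \frac{\frac{1}{52} \cdot \frac{1}{51} \cdot \frac{3}{50}}{\frac{1}{52} \cdot \frac{1}{51}} = \frac{3}{50}$$
This can also be concluded by symmetry, as all 50 remaining cards are equally likely to be the third card drawn and 3 of them are Aces:
$$P(C \mid A \cap B) = \frac{3}{50}$$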
Question 6
Let
- be the event that the coin lands Heads on all 7 flips.
- be the event that the coin is double-headed.
Therefore:
Question 7
Let
- be the event that the coin lands Heads on all 7 flips.
- be the event that the chosen coin is double-headed.
Therefore: (a)
(b)
Question 8
Let
- , , be the events that the screen is manufactured by company A, B, C respectively.
- be the event that the screen is defective.
Therefore:
Question 9
(a) Since both and imply , they are subsets of , thus:
Since , therefore:
As desired.
(b) Let be “a die is rolled,” and and be “the die shows 1” and “the die shows 2,” respectively. Both and imply , and both have the same prior probability . Therefore, after observing , the posterior probabilities remain equal.
Question 10
Since and are conditionally independent given , therefore:
(a) By definition of conditional probability and LOTP,
From the equations above,
By Bayes’ law: and . So,
Which simplifies to,
Since and ,
For by the equations above,
Since and ,
(b) By LOTP,
Since and ,
Again by LOTP,
Again since , , and ,
Question 11
Since of of the respondents say they voted for , therefore:
By Bayes’ rule,
By LOTP,
Since ,
Since and ,
Solving for ,
Question 12
(a) Let
- be the event that was received.
- be the event that was sent.
Therefore, by Bayes’ law and LOTP:
Since , and , thus:
(b) Let
- be the event that was received.
- be the event that was sent.
Therefore by the equations above:
Since , thus:
Since
Thus:
Question 13
Let
- be the event that the test was successful, meaning it was positive for diseased patients and it was negative for healthy patients.
- be the event that the test was positive.
- be the event that the patient is diseased.
Therefore:
By definition of conditional probability,
(a) From definitions above for company B:
Therefore their success rate is , As for company A:
which is a lower success rate than company B.
(b) Company B’s test gives no information about the diseased population. In contrast, Company A’s test can identify diseased patients ().
(c) For beating Company B with equal sensitivity and specificity:
Therefore the specificity and sensitivity both must be greater than .
For beating Company B with sensitivity :
Therefore the specificity must be greater than
For beating Company B with specificity :
Therefore sensitivity must just be greater than .
Question 14
(a) should be bigger, as the event , his house being burglarized, would prompt Peter to install the alarm as soon as possible to prevent further burglaries.
(b) should be bigger as a burglar would choose houses with less protection.
(c) By definition of conditional probability,
Since , and by rearranging,
Since , divide both sides by
Since and , therefore,
By rearranging,
By LOTP,
Rearranging,
Since , divide both sides by ,
As desired.
(d) The opinion was popular because people reasoned using separate causal narratives for each conditional probability, failing to recognize that Bayes’ rule mathematically links them and makes the two judgments logically incompatible.
Question 15
By definition of conditional probability:
Since the question is essentially about comparing these three values, and they all share the positive numerator, the answer can be found by comparing , and . Since therefore:
Question 16
By Bayes’ rule,
Since , divide both sides by and multiply by ,
Since ,
By Bayes’ rule,
Since ,
Both sides can be divided by , since .
As desired.
If learning that occurred lowers the probability of , then learning that did not occur must raise the probability of , since probabilities must re-balance across and .
Question 17
(a) Since ,
By Bayes’ law,
Since , thus,
Since ,
As desired.
(b) Let and be independent events, thus:
Now let , therefore
Question 18
Since ,
Thus,
Question 19
Holmes’s maxim can be interpreted using conditional probability and the distinction between prior and posterior probabilities. Before any evidence is observed, each possible explanation has a prior probability reflecting how plausible it initially seems, and some explanations may appear very unlikely. When evidence is observed, we condition on that evidence to obtain posterior probabilities. Any hypothesis for which the observed evidence is impossible has conditional probability zero and is therefore eliminated. Once all such impossible hypotheses are excluded, the remaining hypotheses must account for all of the posterior probability, and if only one remains, it must be true with posterior probability one, even if its prior probability was very small. Thus, an explanation that initially seemed improbable can become certain after conditioning on the available evidence.
Question 20
Let
- be the event that th card is a queen
Therefore,
(a)
(b)
Let
- be the event that th card is the Queen of Hearts
Therefore,
(c)
Since (the probability that the first or the second card is the Queen of Hearts, given that both cards are Queens), thus,
Question 21
(a)
(b)
Question 22
Question 23
Suppose
Therefore,
As desired.
Question 24
Imagine two hospitals performing both heart surgery and band-aid surgery:
- Hospital :
- Heart surgery: successful, failed
- Band Aid surgery: successful
- Hospital :
- Heart surgery: successful, failed
- Band Aid surgery: successful, failed.
Let,
- be the event that the surgery is done by hospital .
- be the event that the surgery is done by hospital .
- be the event that the surgery is a heart surgery.
- be the event that the surgery is a band aid surgery.
As can be seen,
But,
Question 25
Let
- be the event that the party is guilty.
- be the event that the party is guilty.
- be the event that the party matches the blood type.
Therefore, (a)
(b)
Question 26
(a)
(b)
(c) Yes. Conditioning first on and then on gives the same result as conditioning once on . This is because conditioning is associative, and because the programs’ outputs are conditionally independent given whether the email is legitimate or spam, so the likelihoods used in the second update are unchanged by conditioning on .
Question 27
Let
- be the event that the suspect, who has blood type , is guilty
- be the event that blood type is found at the scene
Therefore,
Assume , therefore,
This holds only if .
Question 28
Let
- be the event that Fred has the disease
- be the event that the test is positive
- be the prior odds of Fred getting the disease,
- be the sensitivity of the test,
- be the specificity of the test,
Therefore, (a) Let , be the posterior odds
(b)
Since the disease is rare we can assume, , therefore,
As can be seen, for a rare disease, improving specificity dramatically improves the positive predictive value, while increasing sensitivity has relatively little effect.
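The claim about specificity versus sensitivity can be illustrated with a short calculation. This is a minimal sketch: the 1% prevalence and the 95% and 99% accuracy figures are hypothetical values chosen for illustration, not numbers from the exercise, and `ppv` is an illustrative helper name.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value P(disease | positive) via Bayes' rule and LOTP."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical values: 1% prevalence, 95% baseline sensitivity and specificity.
base = ppv(0.01, 0.95, 0.95)
better_sensitivity = ppv(0.01, 0.99, 0.95)  # raise sensitivity only
better_specificity = ppv(0.01, 0.95, 0.99)  # raise specificity only

print(round(base, 3), round(better_sensitivity, 3), round(better_specificity, 3))
```

With these assumed inputs, raising sensitivity barely moves the result, while raising specificity roughly triples it, because false positives from the large healthy population dominate the denominator.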
Question 29
Let
- be the event that the child is a girl.
- be the event that the child has characteristic ,
Therefore,
Independence and conditional independence
Question 30
(a) The events are dependent because knowing that is older than provides information about ’s overall “seniority” in the birth order. If is older than , is restricted from being the youngest child (the or scenarios), which statistically increases the likelihood that is also older than . In simpler terms, the more people we know is older than, the more likely it is that is the oldest overall. Since the probability of being older than shifts from (with no information) to (knowing beat ), the two events influence one another and are therefore not independent.
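The shift in probability described above can be verified by enumerating birth orders. The sketch assumes three children labeled A, B, and C (hypothetical labels) with all six age orders equally likely, and compares the event "A is older than C" with and without the information "A is older than B".

```python
from fractions import Fraction
from itertools import permutations

# Each permutation lists the children from oldest to youngest;
# all 6 orders are assumed equally likely.
orders = list(permutations("ABC"))

def older(order, x, y):
    return order.index(x) < order.index(y)

def prob(event, given=lambda o: True):
    pool = [o for o in orders if given(o)]
    return Fraction(sum(event(o) for o in pool), len(pool))

a_older_than_b = lambda o: older(o, "A", "B")
a_older_than_c = lambda o: older(o, "A", "C")

print(prob(a_older_than_c))                        # 1/2
print(prob(a_older_than_c, given=a_older_than_b))  # 2/3
```

Since the conditional probability differs from the unconditional one, the two events are dependent under these assumptions.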
(b)
Question 31
For an event $A$ to be independent of itself, the following equation must hold,
$$P(A \cap A) = P(A)P(A) \implies P(A) = P(A)^2$$
Thus, for an event to be independent of itself, $P(A)$ must equal either $0$ or $1$. In the case $P(A) = 1$, the occurrence of the event gives no additional information, as it is certain both before and after conditioning.
Question 32
Let denote the event that die shows the value ; similarly, , , and denote the events that dice , , and show the values , , and , respectively. (a)
(b) In order for to be independent of , the following equation must hold,
Calculating and ,
Since they are equal, these events are independent.
And the same goes for and ,
Which are not equal, therefore these events are not independent.
Question 33
(a) There are possible subsets of , all equally likely to occur as ,
(b) For each person, there are four possibilities,
- Bob’s friend, Alice’s friend,
- Bob’s friend, not Alice’s friend,
- Not Bob’s friend, Alice’s friend,
- Not Bob’s friend, not Alice’s friend,
Only the third outcome would contradict , therefore each person must fall under one of the other three outcomes, thus,
(c) As above, only the fourth outcome would contradict , therefore each person must fall under one of the other three outcomes, thus,
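The counting argument in (b) and (c) can be checked by brute force for small groups. The sketch assumes each person is independently a friend of Alice with probability 1/2 and of Bob with probability 1/2, so all friendship patterns are equally likely. It counts the patterns where Alice's friends are a subset of Bob's friends (no one in the third outcome) and where everyone is a friend of at least one of them (no one in the fourth outcome). The function name is illustrative.

```python
from fractions import Fraction
from itertools import product

def subset_probabilities(n):
    """Enumerate all (Alice, Bob) friendship patterns over n people, assumed equally likely."""
    people = range(n)
    patterns = list(product([False, True], repeat=2 * n))  # 4**n outcomes
    a_subset_b = 0
    union_is_all = 0
    for bits in patterns:
        alice, bob = bits[:n], bits[n:]
        # Nobody is Alice's friend without also being Bob's friend.
        a_subset_b += all(bob[i] or not alice[i] for i in people)
        # Nobody is a friend of neither.
        union_is_all += all(alice[i] or bob[i] for i in people)
    total = len(patterns)
    return Fraction(a_subset_b, total), Fraction(union_is_all, total)

for n in (1, 2, 5):
    assert subset_probabilities(n) == (Fraction(3, 4) ** n, Fraction(3, 4) ** n)
```

In each case three of the four per-person outcomes are allowed, which is why both probabilities come out to (3/4)^n in this sketch.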
Question 34
(a) We have supposed that an accident does not change the driver’s skill, and that the driver’s skill remains the same over time. Therefore the occurrence of either or , given , gives no additional information about the other; thus and are conditionally independent given .
(b)
(c)
Question 35
Let
- be the event that you win the game
- be the events that the opponent is either a beginner, an intermediate or a master respectively.
(a) Thus,
(b)
Since the outcomes of the games are independent given the skill level of your opponent, we have , therefore,
Again, by the same fact as above, , therefore,
(c)
- First let’s assume that winning probabilities are unconditionally independent, , we know this is not correct since knowing that has occurred would increase the chance of .
- Now let’s assume that they are conditionally independent, , assuming that won’t affect our or the opponent’s skill. This is correct, because the occurrence of gives no additional information, since already carries all the information we need about the chances of .
Question 36
(a) Because of the “if and only if” condition, all the students would fall under three categories:
- Only good at math
- Only good at baseball
- Good at both
And we can probably assume that the students who are good at both are significantly fewer than the students who are good at only one of them. Therefore, conditioning on being good at baseball eliminates the students who are only good at math, decreasing the probability.
(b) For and to be conditionally independent given , the following equation must hold,
Calculating both sides:
Given , , therefore
Thus, the equation doesn’t hold and and are conditionally dependent given . In addition,
Question 37
(a)
(b)
(c) For and to be conditionally independent given , the following equation must hold,
Computing both sides:
These values can only be equal if , which holds only if , meaning that everyone without the disease would also have the symptoms.
(d) Supposing ,
Therefore, for and to be conditionally independent given , the following equation must hold:
Thus, the assumption is only true if,
which is not possible since , therefore and are conditionally dependent given .
Question 38
Let
- be the event that the new email includes the 23rd, 64th, and 65th words or phrases on the list. ()
Therefore,
For the sake of simplicity, let
Therefore,
Monty Hall
Question 39
(a) Let
- be the event that the car is behind door
- be the event that we win the car, considering we always switch
Let’s assume we chose door , even if we didn’t we could simply relabel the doors,
Since we always switch to one of the three remaining doors, therefore
Thus,
Therefore we should switch because the posterior probability is higher than the prior probability .
(b) Generalizing:
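The general case can be checked by simulation. The sketch below assumes n doors with one car, a contestant who picks a door, Monty opening m goat doors among the others, and the contestant then switching uniformly at random to one of the remaining closed doors. Under those assumptions the exact win probability is (n - 1)/n times 1/(n - m - 1). The values n = 7, m = 3 are an assumption consistent with the "three remaining doors" mentioned in part (a). Function names are illustrative.

```python
import random
from fractions import Fraction

def switch_win_exact(n, m):
    """Assumed closed form: wrong first pick (n-1)/n, then 1 of n-m-1 doors holds the car."""
    return Fraction(n - 1, n) * Fraction(1, n - m - 1)

def switch_win_simulated(n, m, trials=100_000, seed=0):
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(n)
        pick = 0  # by symmetry the contestant's door can be fixed
        goats = [d for d in range(n) if d != pick and d != car]
        opened = set(rng.sample(goats, m))  # Monty opens m goat doors
        options = [d for d in range(n) if d != pick and d not in opened]
        wins += rng.choice(options) == car  # switch to a random closed door
    return wins / trials

for n, m in ((3, 1), (7, 3)):
    print(n, m, float(switch_win_exact(n, m)), switch_win_simulated(n, m))
```

For the classic game (n = 3, m = 1) the formula gives 2/3, and for the assumed 7-door version it gives 2/7, which exceeds the 1/7 chance of staying.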
Question 40
(a) Let
- be the event that the car is behind door
- be the event that Monty opens door
- be the event that we win the car, considering we always switch
Let’s assume we chose door , even if we didn’t we could simply relabel the doors,
Since ,
(b)
(c)
Question 41
Let
- be the event that the car is behind door
- be the event that Monty opens door
- be the event that the coin landed on the secret flip
- be the event that we win the car, considering we always switch
Therefore,
Since (We know that a goat was behind door 2), thus,
And since , therefore,
Using Bayes’ theorem,
Since , therefore,
Conditioning on ,
Since and are independent,
Question 42
Let
- be the event that the car is behind door
- be the event that Monty opens door
- be the event that we win the car, considering we always switch if Monty opens a door.
Let’s assume we chose door , even if we didn’t we could simply relabel the doors, (a)
So, if , Monty only opens a door when the initially chosen door has the car behind it, so switching always lands on a goat; and if , Monty always opens a door revealing a goat, which is the same as the classic Monty Hall setup.
(b)
Question 43
Let
- be the event that the car is behind door
- be the event that the computer is behind door
- be the event that the goat is behind door
- be the event that Monty opens door
- be the event that we win the car, considering we always switch
(a) Let’s assume we chose door , and Monty opened door revealing a goat, even if these didn’t happen, we could simply relabel the doors,
Therefore, staying or switching have equal probability of winning.
(b) Let’s assume we chose door , and Monty opened door revealing a computer, even if these didn’t happen, we could simply relabel the doors,
If
- , you should switch.
- , you should stay.
- , it does not matter.
Question 44
Let
- be the event that the car is behind door
- be the event that a goat is behind door
- be the event that Monty opens door
- be the event that we win the car
(a)
Calculating and ,
- Same as the ,
Thus,
Therefore, if
- contestant stays on the initial choice,
- contestant switches doors,
Assume switching and winning has higher probability,
Therefore, the contestant should only switch if .
(b) Using the notation from above
Using the equations from above,
Therefore, if
- contestant stays on the initial choice,
- contestant switches doors,
Assume switching and winning has higher probability,
Which contradicts , therefore the contestant should stay and not switch.
(c)
Using the equations from above,
Therefore, if
- contestant switches doors,
- contestant stays on the initial choice,
Assume switching and winning has higher probability,
Which is always true (), therefore the contestant should always switch.
(d)
From the equations above,
Therefore, if
- the contestant switches doors,
- the contestant stays on the initial choice,
Assume switching and winning has higher probability,
Which is true (), therefore the contestant should always switch (except for , where it doesn’t matter whether the contestant switches or not).
Question 45
Let
- be the event that the car is behind door
- be the event that Monty opens door
- be the event that we win the car
(a) Assuming the contestant always switches,
Calculating each one:
Thus,
(b)
Question 46
Let
- be the event that the car is behind door
- be the event that the apple is behind door
- be the event that the book is behind door
- be the event that the goat is behind door
- be the event that Monty opens door
- be the event that we win the car
(a) Assuming the contestant always switches to one of the remaining two doors,
Since (The contestant always switches), thus,
And for , since for switching we have two options in which one of them will lead to the car,
(b) Let be the event that Monty reveals the apple
For each term,
- , since the apple is not in Monty’s options
- , since apple is the intermediately preferred item (options: Car, Apple, Goat)
- , since apple is the least preferred item (Options: Car, Book, Apple)
- , since apple is the intermediately preferred item (Options: Book, Apple, Goat)
Thus,
(c)
Since and , thus
Since , thus
Since , thus
Question 47
Let
- be the event of winning given the strategies
- be the event that the car is behind door
- be the event that we switch to door on the first round
- be the event that Monty opens door on the first round
(a)
Since and , thus
(b)
Since in this strategy it’s impossible to remain on door , thus ,
And , since the contestant switches after Monty opens two doors out of the remaining doors, which all lead to a goat, leaving the contestant with only one door to switch to, which is certainly a car for , thus,
(c)
Since in this strategy it’s impossible to remain on door , thus ,
And , since the contestant switches after Monty opens only one door out of the remaining doors, which leads to a goat, leaving the contestant with two doors to switch to, thus,
(d)
Since
- , because of the second switch making the contestant unable to go back to door , and
- , because after the first switch landing on a goat door and Monty opening the second goat door the only remaining option for switch would be the car door
Therefore
Calculating each term, for and ,
-
- Calculating for ,
-
- Calculating for ,
Thus,
(e) The Stay-Switch strategy is the best, since it has the highest probability of winning the car.
First-step analysis and gambler’s ruin
Question 48
(a) Since is the initial value in the sequence, thus , it is certain that this value is seen, and for any , , since it is impossible that any negative value be seen in the sequence.
Let be the event that a die shows after rolling, for () conditioning on the last die roll we have:
Generalizing the sequence we get:
(b)
Since , thus,
(c) Think of the running total as a sequence of landings on a number line. Since the average value of a single die roll is $3.5$, the “jumps” you take have an average length of $3.5$ units. Over a long distance $n$, you will land on approximately one out of every $3.5$ integers. Therefore, the probability $p_n$ of hitting any specific integer $n$ stabilizes to the reciprocal of the mean step size:
$$\lim_{n \to \infty} p_n = \frac{1}{3.5} = \frac{2}{7}$$
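The limiting behavior can be seen by iterating the recursion directly. This is a minimal sketch assuming a fair six-sided die and the recursion in which the probability of hitting n is the average of the probabilities of hitting n - 1 through n - 6, with the starting total 0 hit with certainty and negative totals never hit. The function name is illustrative.

```python
def hit_probabilities(n_max, sides=6):
    """p[n] = probability that the running total of fair die rolls ever equals n."""
    p = [0.0] * (n_max + 1)
    p[0] = 1.0  # the total starts at 0, so 0 is always seen
    for n in range(1, n_max + 1):
        # Condition on the roll that lands exactly on n; totals below 0 contribute nothing.
        p[n] = sum(p[n - k] for k in range(1, sides + 1) if n - k >= 0) / sides
    return p

p = hit_probabilities(100)
print(p[7], p[100], 2 / 7)
```

The early values fluctuate, but by n = 100 the computed probability agrees with 2/7 to many decimal places.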
Question 49
(a) Let
- be the event that the th trial ends in success
Therefore,
Since , thus
(b) Base case:
Which fits
Inductive steps: Let’s assume the formula holds for , therefore,
Now consider , for trials to have even success count,
- the first trials must have an even number of successes and the th trial must be a failure
- the first trials must have an odd number of successes and the th trial must be a success
Thus,
The induction is complete.
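The inductive step can be sanity-checked numerically. The sketch below builds the probability of an even number of successes one trial at a time, using the two cases listed in the inductive step, and compares it with a closed form. The closed form used here, 1/2 + 2^(n-1) b_1 ... b_n with b_i = (1 - p_i) - 1/2, is an assumption about the formula being proved, since the equation itself is not shown above. The success probabilities are hypothetical.

```python
from fractions import Fraction
from math import prod

def even_success_prob(ps):
    """Probability of an even number of successes, built up one trial at a time."""
    even = Fraction(1)  # zero trials: zero successes, which is even
    for p in ps:
        # Even after this trial: (even before, failure) or (odd before, success).
        even = even * (1 - p) + (1 - even) * p
    return even

def closed_form(ps):
    """Assumed closed form: 1/2 + 2^(n-1) * b_1 * ... * b_n with b_i = (1 - p_i) - 1/2."""
    bs = [(1 - p) - Fraction(1, 2) for p in ps]
    return Fraction(1, 2) + 2 ** (len(ps) - 1) * prod(bs)

ps = [Fraction(1, 3), Fraction(1, 4), Fraction(3, 5)]  # hypothetical success probabilities
assert even_success_prob(ps) == closed_form(ps)
print(even_success_prob(ps))  # 29/60
```

The same comparison also holds in the simple cases of part (c): any trial with success probability 1/2 forces the result to 1/2, and all-zero success probabilities give 1.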
(c)
- for any if , therefore , thus
- for all , , thus
- for all , , thus
Question 50
(a) Let
- be the event that Calvin wins the match
- be the event of winning a single game
- be the event of losing a single game
Therefore,
is the same as , since the score is tied, and they are back to effectively starting the match over, therefore
(b) Assume there are five states:
- State 0: Hobbes wins the match (Calvin is “ruined”).
- State 1: Hobbes is up by 1.
- State 2: The match is tied (Starting position).
- State 3: Calvin is up by 1.
- State 4: Calvin wins the match.
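Calvin wins if he reaches State 4 before State 0, starting from State 2. In Gambler’s Ruin, with $p$ the probability of moving up one state, $q = 1 - p$, and $p \neq 1/2$, the probability of reaching target $N$ starting from $i$ is given by:
$$p_i = \frac{1 - (q/p)^i}{1 - (q/p)^N}$$
Therefore, for $i = 2$ and $N = 4$,
$$p_2 = \frac{1 - (q/p)^2}{1 - (q/p)^4} = \frac{1}{1 + (q/p)^2} = \frac{p^2}{p^2 + q^2}$$
For $p = 1/2$ the formula is replaced by $p_i = i/N$, which gives $1/2$ and agrees with the expression above.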
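The two approaches can be compared numerically. The sketch assumes that Calvin wins each game independently with probability p, that the first-step result from part (a) is p^2 / (p^2 + q^2), and that the standard Gambler's Ruin formula applies with start state 2 and target state 4. Function names are illustrative.

```python
from fractions import Fraction

def win_by_two(p):
    """Assumed first-step answer: P(Calvin wins the match) = p^2 / (p^2 + q^2)."""
    q = 1 - p
    return p * p / (p * p + q * q)

def gamblers_ruin(p, i, N):
    """P(reach N before 0 from i) for a walk stepping +1 w.p. p and -1 w.p. 1 - p."""
    q = 1 - p
    if p == q:
        return Fraction(i, N)  # fair case
    r = q / p
    return (1 - r ** i) / (1 - r ** N)

for p in (Fraction(1, 3), Fraction(1, 2), Fraction(7, 10)):
    assert win_by_two(p) == gamblers_ruin(p, 2, 4)

print(win_by_two(Fraction(1, 3)))  # 1/5
```

Exact fractions are used so that the agreement between the two formulas is checked without rounding error.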
Question 51
The gambler starts at and quits when he reaches . The probability that he’ll ever be ahead by dollars is effectively the same as the probability that he reaches dollars before reaching dollars,
For the sake of simplicity let , now let’s assume that ,
Which is impossible, therefore
Question 52
Normalizing and in the classic model by ,
Therefore the probability that wins equals
Finding the limit for
Since , thus , and since , the s in the denominator and numerator can be omitted,
Since , therefore , thus
Question 53
Let the points on the circle have numbers between and , with wolf being on point and the sheep on the other points, therefore the sheep who is opposite the wolf would be on number . For the sheep to be eaten, first sheep or must be eaten, thus by LOTP,
By symmetry , and also , thus,
can be modeled as the gambler’s ruin situation, where the wolf is standing on number , and must reach point before he reaches ,
Question 54
(a) Let be the event that the man moves one to the right from the origin, therefore,
since and (man must move twice now), therefore,
Solving for ,
Generalizing,
(b)
- , therefore for , which can’t be true, thus the other root must be the answer
- , therefore for ,
- , therefore for , , it can’t be the other root since it’s biased towards the left
Simpson’s paradox
Question 55
Summing and ,
Therefore, it’s impossible that
Question 56
(a) Let
- be the event that Blackheart will hurt Stampy
- be the event that Blackheart possesses a large amount of Ivory
- be the event that Blackheart is an Ivory dealer
(b)
- After seeing his supplies , Lisa argues that is more probable,
- Lisa also argues that an ivory dealer is more likely to hurt Stampy,
- Therefore, Lisa claims . (The evidence of possessing more Ivory increases the chances of hurt)
(c) Homer is conditioning on the amount of ivory while ignoring that the ivory itself tells you Blackheart’s profession. It’s like saying a man covered in blood is less likely to be a murderer because he “already has plenty of blood.” Homer here argues that,
Therefore, conditioning on ,
Here, we can assume that and , thus
Which is logically wrong because an ivory dealer has much higher probability to hurt an elephant for its ivory than a non-dealer.
Question 57
(a) Let
- $C_1$ have 80 green gummy bears and 10 red gummy bears ($80/90 \approx 88.9\%$ green)
- $C_2$ have 10 green gummy bears and 0 red gummy bears ($100\%$ green)
- $M_1$ have 40 green gummy bears and 8 red gummy bears ($40/48 \approx 83.3\%$ green)
- $M_2$ have 89 green gummy bears and 1 red gummy bear ($89/90 \approx 98.9\%$ green)
As can be seen, $C_1$ has a bigger percentage than $M_1$ and $C_2$ has a bigger percentage than $M_2$; but $C_1$ and $C_2$ combined have 90 green gummy bears out of 100 ($90\%$), and $M_1$ and $M_2$ combined have 129 green gummy bears out of 138 ($\approx 93.5\%$).
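The reversal can be confirmed by computing the green share of each jar and of each merged pair directly from the counts listed above. In this sketch the jars are named by their position in the list, which is a labeling assumption, and the first two jars form one group while the last two form the other.

```python
from fractions import Fraction

# (green, red) counts copied from the list above; names follow list order.
jars = {"first": (80, 10), "second": (10, 0), "third": (40, 8), "fourth": (89, 1)}

def green_share(*names):
    """Fraction of green gummy bears after merging the named jars."""
    green = sum(jars[name][0] for name in names)
    total = sum(sum(jars[name]) for name in names)
    return Fraction(green, total)

# Jar by jar, the first group has the higher share of green...
assert green_share("first") > green_share("third")
assert green_share("second") > green_share("fourth")
# ...but after merging, the comparison flips.
assert green_share("first", "second") < green_share("third", "fourth")

print(green_share("first", "second"), green_share("third", "fourth"))  # 9/10 43/46
```

The flip happens because the two groups put very different weight on their high-percentage and low-percentage jars.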
(b) It is exactly Simpson’s paradox, since the sub-events have higher probability when compared individually but lower probability when combined. Let
- be the event that a random gummy bear is green
- be the event that jars are from group
- be the event that the jar number is
Therefore,
but
Question 58
(a) Since and are independent, therefore,
Which contradicts the assumption , therefore it’s impossible.
(b) If and are independent, therefore,
Also we have,
since we are given that , the weighted average must lie strictly between these two values,
Thus,
And also by the same logic for ,
Now consider the aggregate . This is the weighted average of the two subgroup rates:
Since both and are strictly less than , their weighted average must also be less than .
Similarly, for , both subgroup rates are strictly greater than :
This results in , which preserves the original inequality direction (). The paradox requires a reversal (), so it is impossible.
(c) If and are independent (), the treatment assignment is not associated with the confounder (this is the goal of Randomized Controlled Trials). This implies:
Let’s expand the aggregate probabilities using the Law of Total Probability:
We are given the subgroup inequalities:
Because the weights and are identical for both equations and non-negative, the inequality is preserved when we sum them up.
The aggregate inequality is strictly <, but the paradox requires >. Thus, it is impossible.
Question 59
Question 60
Let
- be the event that the patient is diseased
- be the event that they tested positive
- and be the events that lab A or B is chosen respectively
(a) The probability that the patient has the disease, given that they tested positive is , using Bayes’ rule and LOTP,
Since and are independent from , therefore and ,
Again since is independent from and , therefore, , thus
(b) The probability that the patient’s blood sample was analyzed by lab A, given that the patient tested positive is , using Bayes’ rule,
Since and , thus,
By LOTP,
Question 61
Let be the event that all tests are positive, therefore, (a) By Bayes’ rule
Using LOTP,
Plugging in the values,
(b) Using the notation from above,
Conditioning over ,
Since and are independent, thus , therefore,
Plugging in the values,
Question 62
Let
- be the event that the mother has the disease
- be the event that the th child has the disease
(a)
Since given that the mother has the disease, her children independently will have it with probability , therefore, , thus,
(b) The children are conditionally independent given M, but they are not unconditionally independent. Learning that one child doesn’t have the disease increases the probability that the mother doesn’t have it, which affects the probability for the other child.
(c) By Bayes’s rule and LOTP,
Plugging in the values,
Question 63
The flaw is treating a guaranteed event (being able to find two matching coins) as if it provides useful conditioning information, when it actually doesn’t restrict the sample space at all.
Question 64
Let
- be the event that green ball is drawn before any blue balls
- be the event that the th drawn ball is green
- be the event that the th drawn ball is red
- be the event that the th drawn ball is green
(a)
If the first drawn ball is red, we are essentially back to the initial condition, therefore , thus
Solving for ,
(b)
Now is not independent of anymore, since there are only two green and blue balls left in the urn. Solving ,
Plugging in,
Which is still the same as before
(c) Let
- be the event that type comes before type
- be the event that type comes in trial
Since , and , thus
Solving for ,
Question 65
(a) Let
- be the event of not drawing “you lose” on the th turn
- be the event that “you lose” is still in the bag in the th turn
The event can be modeled as total possibilities from which “you lose” is not included, thus,
Plugging and in,
Since and , thus
Since is not a function of , it doesn’t matter on which turn you pick your piece of paper.
(b) From above notation,
Since and , therefore,
Solving ,
Solving ,
Plugging in the values,
Solving for
Now let’s suppose that is independent of , thus ,
Which results in,
Since only when , we conclude that in general (when ), the probability depends on position .
- When the “you lose” paper has more weight than the others, it is more likely to be drawn early, so the first turn carries the highest risk and drawing later is better.
- When the “you lose” paper has less weight than the others, it is less likely to be drawn early, so the risk grows with each turn and drawing first is better.
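The dependence on position can be computed exactly for small cases. The sketch below assumes n papers drawn without replacement, with each paper's chance of being drawn proportional to its weight: a hypothetical weight v for "you lose" and w for every other paper. It returns the probability that the i-th draw is "you lose". The function name and sample weights are illustrative.

```python
from fractions import Fraction

def lose_probability(i, n, v, w):
    """P(the i-th draw is "you lose"), assuming draws proportional to weight."""
    survive = Fraction(1)  # probability "you lose" is still in the bag
    for j in range(1, i):
        others = (n - j) * w  # total weight of the ordinary papers at draw j
        survive *= Fraction(others, v + others)
    return survive * Fraction(v, v + (n - i) * w)

n = 5
for v, w in ((1, 1), (3, 1), (1, 3)):
    probs = [lose_probability(i, n, v, w) for i in range(1, n + 1)]
    assert sum(probs) == 1  # "you lose" is drawn on exactly one turn
    print(v, w, [str(p) for p in probs])
```

In this sketch equal weights give 1/n on every turn, a heavier "you lose" paper makes the risk fall from turn to turn, and a lighter one makes it rise.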