The problem is named after Monty Hall, host of the TV game show Let's Make a Deal. It is the kind of problem almost everyone gets wrong the first time, thinks they understand, then gets fooled by again in slight disguise — so the goal is a way of thinking, not just an answer.
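The setup:
- There are three doors. Behind one is a car; behind the other two are goats.
- You pick a door, say door 1.
- Monty then opens one of the other two doors, say door 2, revealing a goat, and offers you the chance to switch to the remaining closed door (door 3).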
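The question: should you switch? The assumptions matter, and several are usually left implicit:
- The car is equally likely to be behind any of the three doors.
- Monty knows where the car is and always opens a goat door, never the door you picked.
- If Monty has a choice of two goat doors (because your door has the car), he opens each with equal probability.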
The last assumption is the one most often omitted. A homework extension explores "lazy Monty Hall," where he prefers opening door 2 with probability $p$ (he doesn't want to walk to door 3); that breaks the symmetry, but the basic problem assumes $p = \tfrac{1}{2}$.
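The naive reasoning: Monty opens door 2, so you choose between door 1 and door 3 with no apparent distinguishing information, and therefore it's 50/50. This is wrong. Under the stated assumptions:
$$P(\text{switching wins}) = \tfrac{2}{3}, \qquad P(\text{staying wins}) = \tfrac{1}{3}$$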
So you should always switch.
The "50/50" answer abuses the naive definition of probability, which assumes equally likely outcomes. The doors were equally likely initially ($\tfrac{1}{3}$ each); that does not mean they stay equally likely conditionally, after observing what Monty does.
The crucial point: condition on all the evidence. The naive approach conditions only on "door 2 has a goat." The real evidence is richer — it is that Monty chose to open door 2. Seeing why that extra fact matters is the whole problem.
The cleanest picture is a two-stage probability tree, conditioned on the contestant having chosen door 1. The first branch is which door has the car; the second is which door Monty opens.
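Reading the branches:
- If the car is behind door 1 (probability $\tfrac{1}{3}$), Monty may open door 2 or door 3 and opens each with probability $\tfrac{1}{2}$.
- If the car is behind door 2 (probability $\tfrac{1}{3}$), Monty must open door 3.
- If the car is behind door 3 (probability $\tfrac{1}{3}$), Monty must open door 2.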
Now condition on the observed event "Monty opens door 2." Only two paths are consistent with it:
| Path | Car door | Monty opens | Path probability |
|---|---|---|---|
| A | door 1 | door 2 | $\tfrac{1}{3} \cdot \tfrac{1}{2} = \tfrac{1}{6}$ |
| B | door 3 | door 2 | $\tfrac{1}{3} \cdot 1 = \tfrac{1}{3}$ |
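The other two paths are deleted, just as in the pebble-world picture: conditioning removes everything inconsistent with what you observe. The survivors ($\tfrac{1}{6}$ and $\tfrac{1}{3}$) don't sum to $1$, so renormalize by dividing by their total $\tfrac{1}{2}$ (equivalently, multiply both by $2$):
$$P(\text{car behind door 1} \mid \text{Monty opens door 2}) = \frac{1/6}{1/2} = \tfrac{1}{3}, \qquad P(\text{car behind door 3} \mid \text{Monty opens door 2}) = \frac{1/3}{1/2} = \tfrac{2}{3}$$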
Conditional on Monty opening door 2, the car is behind door 3 with probability $\tfrac{2}{3}$. Switching to door 3 therefore succeeds with probability $\tfrac{2}{3}$. By the same calculation (circling the other two paths), if Monty opens door 3 the answer is again $\tfrac{2}{3}$.
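The deletion-and-renormalization step can be checked mechanically. The sketch below is one way to do it with exact fractions, assuming the contestant picked door 1 and Monty chooses uniformly when he has a choice; the names `paths` and `posterior` are illustrative and not from the lecture.

```python
from fractions import Fraction

half, third = Fraction(1, 2), Fraction(1, 3)

# Each path: (car door, door Monty opens, path probability),
# all given that the contestant picked door 1.
paths = [
    (1, 2, third * half),  # car behind your door: Monty picks door 2 or 3 at random
    (1, 3, third * half),
    (2, 3, third * 1),     # car behind door 2: Monty is forced to open door 3
    (3, 2, third * 1),     # car behind door 3: Monty is forced to open door 2
]

def posterior(opened):
    """Return P(car door | Monty opens `opened`) as a dict."""
    # Conditioning: keep only the paths consistent with the observation.
    kept = [(car, prob) for car, door, prob in paths if door == opened]
    total = sum(prob for _, prob in kept)  # P(Monty opens `opened`)
    # Renormalize so the surviving paths sum to 1.
    return {car: prob / total for car, prob in kept}

print(posterior(2))  # {1: Fraction(1, 3), 3: Fraction(2, 3)}
print(posterior(3))  # {1: Fraction(1, 3), 2: Fraction(2, 3)}
```

In both cases the door you would switch to carries probability $\tfrac{2}{3}$.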
The tree is a law-of-total-probability calculation in pictures. The key step in any such argument is deciding what to condition on. Use wishful thinking: ask what you wish you knew. Here, obviously, you wish you knew where the car is — so condition on that.
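Let $S$ be the event that you succeed using the switching strategy, and $D_j$ the event that door $j$ has the car. Condition on which door has the car, with prior weights $P(D_j) = \tfrac{1}{3}$. Given that you picked door 1 and will switch, the conditional probabilities are easy: switching loses if the car is behind door 1 and wins otherwise, so
$$P(S) = P(S \mid D_1)\,P(D_1) + P(S \mid D_2)\,P(D_2) + P(S \mid D_3)\,P(D_3) = 0 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} = \tfrac{2}{3}$$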
This is the unconditional probability that switching succeeds. Is the conditional probability given that Monty opened a specific door the same? Here, yes, by symmetry. Doors 2 and 3 are interchangeable until Monty opens one, so:
$$P(S \mid \text{Monty opens door 2}) = P(S \mid \text{Monty opens door 3}) = P(S) = \tfrac{2}{3}$$
Both the conditional and unconditional probabilities equal $\tfrac{2}{3}$. In the "lazy Monty" extension the unconditional probability is still $\tfrac{2}{3}$, but the conditional probabilities change because the symmetry is broken.
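To see the broken symmetry concretely, here is a minimal sketch of the "lazy Monty" variant. It assumes you picked door 1 and that Monty opens door 2 with probability `p` whenever both doors 2 and 3 hide goats; the function names are hypothetical, chosen for illustration.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)

def switch_given_opened(p):
    """Return (P(switch wins | opens door 2), P(switch wins | opens door 3))."""
    # Paths ending in "opens door 2": car at door 1 (Monty chose 2 w.p. p) or car at door 3 (forced).
    p_open2 = THIRD * p + THIRD
    # Paths ending in "opens door 3": car at door 1 (w.p. 1 - p) or car at door 2 (forced).
    p_open3 = THIRD * (1 - p) + THIRD
    # Switching wins on the forced path in each case.
    return THIRD / p_open2, THIRD / p_open3

def switch_unconditional(p):
    """Average the two conditional answers over which door Monty opens."""
    win2, win3 = switch_given_opened(p)
    p_open2 = THIRD * p + THIRD
    p_open3 = THIRD * (1 - p) + THIRD
    return win2 * p_open2 + win3 * p_open3

for p in (Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    print(p, switch_given_opened(p), switch_unconditional(p))
```

With `p = 1/2` both conditional values are $\tfrac{2}{3}$; for other values of `p` they differ from each other, while the unconditional value stays at $\tfrac{2}{3}$.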
Switching wins exactly when your initial guess was wrong, which is $\tfrac{2}{3}$ of the time.
Take the extreme case: replace 3 doors with $1{,}000{,}000$. You pick one; Monty opens $999{,}998$ goat doors, leaving just your door and one other. Almost nobody refuses to switch here — you are almost certain your initial guess (probability $\tfrac{1}{1{,}000{,}000}$) was wrong, and almost certain the single remaining door hides the car.
Conceptually there is no difference between this and the three-door problem; the "50/50" argument would apply identically and is just as wrong. The million-door version makes the absurdity visible; the three-door version hides it.
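The many-door version is easy to try numerically. The sketch below assumes Monty opens every goat door except one, leaving your door and one other closed; `switch_wins` and `estimate` are illustrative names, and the door counts and trial count are arbitrary choices.

```python
import random

def switch_wins(n_doors, rng):
    """Play one game with n_doors doors and report whether switching wins."""
    car = rng.randrange(n_doors)
    pick = rng.randrange(n_doors)
    if pick != car:
        # Monty cannot open the car door, so it is the one he leaves closed.
        other = car
    else:
        # Your door has the car: Monty leaves a random goat door closed.
        other = rng.choice([d for d in range(n_doors) if d != pick])
    return other == car

def estimate(n_doors, trials=20_000, seed=0):
    """Estimate the probability that switching wins by repeated play."""
    rng = random.Random(seed)
    return sum(switch_wins(n_doors, rng) for _ in range(trials)) / trials

for n in (3, 10, 100):
    print(n, estimate(n))
```

The estimates should track $(n-1)/n$, the chance that the initial guess was wrong.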
The controversy erupted when a reader posed the question to Marilyn vos Savant's column in Parade magazine. She gave the correct answer (switch), and thousands wrote in insisting she was wrong — including some with PhDs in mathematics, some quite rudely. The dispute partly reflects genuine ambiguity when assumptions are implicit, and partly how strongly the wrong intuition grips people.
A practical lesson: even without conditional-probability machinery, you can just simulate — with cups and props, or a short program. Run it a thousand times and switching wins about two-thirds of the time. When in doubt, simulate.
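A short program of the kind described might look like the following. It is a sketch under the stated assumptions (you always start with door 1, and Monty picks at random when he has two goat doors to choose from); the function names, seed, and trial count are arbitrary.

```python
import random

def play(switch, rng):
    """Play one three-door game; return True if the contestant wins the car."""
    doors = [1, 2, 3]
    car = rng.choice(doors)
    pick = 1  # the contestant always starts with door 1, as in the text
    # Monty opens a goat door other than the pick, at random if he has two options.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the original pick nor the opened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=10_000, seed=110):
    """Fraction of games won over `trials` independent plays."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

The default here is 10,000 games rather than a thousand, only to make the estimate less noisy.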
The second notorious problem: is it possible for one doctor to have a higher success rate than another at every single type of surgery, yet a lower overall success rate?
It sounds impossible — surely if A beats B in every category, A beats B in the total. Simpson's paradox says no: the direction of an inequality can flip when you aggregate. One thing looks better in every individual case yet worse in the total.
An aside on "paradox": there is no such thing as a true paradox. A genuine contradiction would mean the universe explodes and we wouldn't be here. What we call a paradox is something deeply counterintuitive — it forces you to think harder, and once you do, it makes sense.
Blitzstein uses two doctors from The Simpsons (a mnemonic for "Simpson's"). Dr. Hibbert is the respected, expensive town doctor; Dr. Nick is the cheap infomercial quack who offers any surgery for \$129.99. The numbers are invented to make the paradox stark. Each doctor performs 100 surgeries total — so neither does more volume — split between heart surgery (hard) and band-aid removal (easy).
Dr. Hibbert:
| Surgery | Successes | Failures | Total | Success rate |
|---|---|---|---|---|
| Heart | 70 | 20 | 90 | $70/90 \approx 78\%$ |
| Band-aid | 10 | 0 | 10 | $10/10 = 100\%$ |
| Overall | 80 | 20 | 100 | $80/100 = 80\%$ |

Dr. Nick:
| Surgery | Successes | Failures | Total | Success rate |
|---|---|---|---|---|
| Heart | 2 | 8 | 10 | $2/10 = 20\%$ |
| Band-aid | 81 | 9 | 90 | $81/90 = 90\%$ |
| Overall | 83 | 17 | 100 | $83/100 = 83\%$ |
Within each type, Dr. Hibbert wins: heart $78\%$ vs. $20\%$, band-aid $100\%$ vs. $90\%$. Yet aggregated, Dr. Nick wins, $83\%$ vs. $80\%$ — he can truthfully advertise the higher rate at a fraction of the price.
The direction flipped between the conditional comparison (per surgery type, prefer Hibbert) and the unconditional one (aggregated, Nick looks better). The cause: $90\%$ of Dr. Nick's surgeries are easy band-aid removals, while Dr. Hibbert took mostly hard heart surgeries. The surgery mix differs, and that drives the aggregate.
This is realistic. The world's leading neurosurgeons may have lower headline success rates precisely because they get referred the hardest cases that no one else can handle.
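The reversal can be verified directly from the two tables. This is a minimal sketch using exact fractions; the `records` layout and helper names are illustrative choices, and the counts are the invented ones from the example.

```python
from fractions import Fraction

# (successes, total) per surgery type, copied from the two tables.
records = {
    "Hibbert": {"heart": (70, 90), "band-aid": (10, 10)},
    "Nick":    {"heart": (2, 10),  "band-aid": (81, 90)},
}

def rate(successes, total):
    """Success rate as an exact fraction."""
    return Fraction(successes, total)

def overall(doctor):
    """Aggregate by summing successes and summing surgeries across types."""
    successes = sum(s for s, _ in records[doctor].values())
    total = sum(t for _, t in records[doctor].values())
    return rate(successes, total)

for kind in ("heart", "band-aid"):
    h = rate(*records["Hibbert"][kind])
    n = rate(*records["Nick"][kind])
    print(f"{kind}: Hibbert {float(h):.0%}, Nick {float(n):.0%}, Hibbert better: {h > n}")

print(f"overall: Hibbert {float(overall('Hibbert')):.0%}, Nick {float(overall('Nick')):.0%}")
```

Hibbert is ahead within each surgery type, yet Nick's aggregate rate is higher.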
To add fractions correctly you do not add numerators and denominators:
$$\frac{a}{b} + \frac{c}{d} \neq \frac{a + c}{b + d}$$
But adding numerators and denominators (the "wrong" way) is exactly how aggregation works: you add up successes and add up trials. If fractions actually added that way, each aggregate rate would be the ordinary sum of the per-category rates; sums preserve inequalities, so the per-category ordering would always carry over and the paradox could not occur. Because real addition is not like that, the paradox is possible.
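Plugging in the counts from the two doctors' tables makes this concrete (Hibbert on the left of each comparison, Nick on the right):

$$\frac{70}{90} > \frac{2}{10}, \qquad \frac{10}{10} > \frac{81}{90}, \qquad \text{yet} \qquad \frac{70 + 10}{90 + 10} = \frac{80}{100} < \frac{83}{100} = \frac{2 + 81}{10 + 90}$$

Each aggregate is formed by adding numerators and adding denominators, and the inequality flips.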
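Map the example to events:
- $A$: the surgery is successful.
- $B$: Dr. Nick performs the surgery (so $B^c$: Dr. Hibbert performs it).
- $C$: the surgery is heart surgery (so $C^c$: band-aid removal).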
It is possible for all three of these to hold at once:
$$P(A \mid B, C) < P(A \mid B^c, C)$$
$$P(A \mid B, C^c) < P(A \mid B^c, C^c)$$
$$\text{yet} \qquad P(A \mid B) > P(A \mid B^c)$$
The within-category comparisons favor $B^c$ (Hibbert), but aggregation reverses the inequality. Essentially any instance of the paradox can be written this way.
$C$ is the confounder (the letter also stands for "control") — a variable to control for. The more relevant comparison is the conditional one: surgery type clearly matters, and everyone agrees Dr. Hibbert is better. Failing to condition on $C$ gives a misleading answer, because knowing the doctor ($B$) gives information about the surgery type ($C$), which affects success. Choosing Dr. Nick signals an easy band-aid removal, inflating his apparent rate.
It is tempting to think the aggregate inequality must follow from the per-category ones via the law of total probability. Seeing where that fails is instructive. The conditional form (everything given $B$) is valid, since conditional probabilities are genuine probabilities:
$$P(A \mid B) = P(A \mid B, C)\,P(C \mid B) + P(A \mid B, C^c)\,P(C^c \mid B)$$
The analogous expansion holds with $B$ replaced by $B^c$. We know each Nick term is smaller than the corresponding Hibbert term:
$$P(A \mid B, C) < P(A \mid B^c, C), \qquad P(A \mid B, C^c) < P(A \mid B^c, C^c)$$
You cannot conclude $P(A \mid B) < P(A \mid B^c)$, because the weights differ. Nick's weights are $P(C \mid B)$ and $P(C^c \mid B)$; Hibbert's are $P(C \mid B^c)$ and $P(C^c \mid B^c)$, and there is no way to relate them. Concretely, $P(C \mid B) = \tfrac{10}{100} = 0.1$ (a Nick surgery being a heart surgery) is utterly different from $P(C \mid B^c) = \tfrac{90}{100} = 0.9$. Because the weights change between doctors, the aggregate can flip — that weight difference is exactly what enables Simpson's paradox.
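The role of the weights can be made explicit with a few lines of arithmetic. The sketch below treats each doctor's overall rate as a weighted average of the per-type rates, with $B$ meaning Dr. Nick and $C$ meaning heart surgery as in the text; `aggregate` is an illustrative helper, and the "same mix" loop is a hypothetical what-if, not part of the original example.

```python
from fractions import Fraction

def aggregate(rate_heart, rate_bandaid, weight_heart):
    """Law of total probability: weighted average of the per-type success rates."""
    return rate_heart * weight_heart + rate_bandaid * (1 - weight_heart)

nick = (Fraction(2, 10), Fraction(81, 90))      # P(A | B, C),   P(A | B, C^c)
hibbert = (Fraction(70, 90), Fraction(10, 10))  # P(A | B^c, C), P(A | B^c, C^c)

# Each doctor's own case mix: P(C | B) = 1/10 and P(C | B^c) = 9/10.
print(aggregate(*nick, Fraction(1, 10)))     # 83/100
print(aggregate(*hibbert, Fraction(9, 10)))  # 4/5

# Hypothetical: with the same mix for both doctors, the per-type ordering carries over.
for w in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    print(w, aggregate(*hibbert, w) > aggregate(*nick, w))  # True for each w
```

Only when the two doctors are averaged with different weights can the aggregate comparison reverse.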