The Prisoner’s Dilemma – A Classic Game‑Theory Puzzle
- Setup
Two suspects (A & B) are arrested and interrogated separately. Each can either cooperate with the other by staying silent or defect by betraying the other.
- Payoff Matrix
| | B Cooperates | B Defects |
|---|---|---|
| A Cooperates | (1, 1) | (3, 0) |
| A Defects | (0, 3) | (2, 2) |

Each cell lists (A's years, B's years). The numbers are years of prison time (smaller is better).
(Cooperate, Cooperate): both get 1 year.
(Defect, Defect): both get 2 years.
If one defects while the other cooperates: the defector goes free (0 years), the cooperator gets 3 years.
Note: The actual numbers can vary; what matters is the ordering of outcomes.
- Why It’s a Dilemma
Individual Rationality → Defection
If B cooperates, A should defect (free vs 1 yr).
If B defects, A still prefers defecting (2 yrs vs 3 yrs).
So defecting is a dominant strategy for A and, by symmetry, for B.
Collective Optimality → Cooperation
The pair would be better off if both cooperated: 1 + 1 = 2 years total.
Mutual defection gives 2 + 2 = 4 years total – worse for the group.
Thus, each player faces a conflict between self‑interest (defect) and mutual benefit (cooperate).
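The dominance argument and the group totals can be checked mechanically. The sketch below is only an illustration: the `YEARS` table, the helper names, and the specific sentences (1 year each for mutual silence, 2 years each for mutual betrayal, 0 and 3 years when only one suspect defects) are assumptions chosen to respect the ordering of outcomes, not canonical values.

```python
# Assumed sentences in years (smaller is better), keyed by (A's move, B's move).
# "C" = cooperate (stay silent), "D" = defect (betray).
YEARS = {
    ("C", "C"): (1, 1),  # both stay silent
    ("C", "D"): (3, 0),  # A stays silent, B betrays
    ("D", "C"): (0, 3),  # A betrays, B stays silent
    ("D", "D"): (2, 2),  # both betray
}


def a_best_reply(b_move):
    """Return the move that minimises A's own sentence against b_move."""
    return min("CD", key=lambda a_move: YEARS[(a_move, b_move)][0])


def total_years(a_move, b_move):
    """Return the combined sentence of both suspects."""
    return sum(YEARS[(a_move, b_move)])


# A's best reply to each of B's possible moves.
best_replies = {b_move: a_best_reply(b_move) for b_move in "CD"}
print(best_replies)                                   # {'C': 'D', 'D': 'D'}
print(total_years("C", "C"), total_years("D", "D"))   # 2 4
```

Because A's best reply is "D" against either move, defecting is dominant for A, yet the pair serves twice as many years under mutual defection as under mutual cooperation.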
- Nash Equilibrium
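A Nash equilibrium is a strategy profile in which no player can improve their own outcome by unilaterally changing strategy.
In the Prisoner’s Dilemma, (Defect, Defect) is the unique Nash equilibrium: neither player benefits from switching to cooperate while the other defects, and in every other outcome at least one player gains by switching to defect.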
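One way to verify uniqueness is to test every strategy profile for profitable unilateral deviations. This is a brute-force sketch under assumed sentences (1 year each for mutual cooperation, 2 years each for mutual defection, 0 and 3 years when only one player defects); `YEARS`, `MOVES`, and `is_nash` are hypothetical names introduced for illustration.

```python
from itertools import product

MOVES = ("C", "D")  # "C" = cooperate, "D" = defect

# Assumed sentences in years (smaller is better), keyed by (A's move, B's move).
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (3, 0),
    ("D", "C"): (0, 3),
    ("D", "D"): (2, 2),
}


def is_nash(a_move, b_move):
    """True if neither player can shorten their own sentence by deviating alone."""
    a_years, b_years = YEARS[(a_move, b_move)]
    a_cannot_improve = all(YEARS[(alt, b_move)][0] >= a_years for alt in MOVES)
    b_cannot_improve = all(YEARS[(a_move, alt)][1] >= b_years for alt in MOVES)
    return a_cannot_improve and b_cannot_improve


equilibria = [profile for profile in product(MOVES, repeat=2) if is_nash(*profile)]
print(equilibria)  # [('D', 'D')]
```

Exhaustive checking is practical here only because the game has four profiles; it is meant to mirror the definition, not to suggest a general-purpose solver.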
- Extensions & Variations
| Variation | What it shows |
|---|---|
| Repeated (Iterated) PD | Cooperation can emerge if players interact many times; past behaviour influences future decisions (tit‑for‑tat strategy). |
| Stochastic Payoffs | Introducing uncertainty in outcomes can make cooperation more attractive. |
| Multiple Players | Extends to public goods games, illustrating free‑rider problems. |
| Communication / Commitments | Allowing pre‑play negotiation or binding agreements can alter the equilibrium. |

- Real‑World Analogies
Price Competition: Two companies may both benefit from keeping prices high (cooperate), but each has an incentive to undercut the other (defect).
Climate Change: Nations gain by reducing emissions (cooperate) but have incentives to free‑ride on others’ efforts (defect).
Antitrust Law: Firms might collude for higher profits (cooperate), yet each is tempted to cheat on the agreement (defect), and regulators actively work to break such cooperation up.
- Key Takeaways
Dominant strategy ≠ Pareto optimal – the rational move can be socially suboptimal.
Repeated interactions enable trust & reciprocity, making cooperation viable.
Mechanisms (contracts, institutions, reputations) can shift incentives toward collective benefit.
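The takeaway about repeated interactions can be illustrated with a small simulation of the iterated game. This is a minimal sketch, not a definitive model: the per-round sentences in `YEARS`, the fixed 10-round horizon, and the function names are all assumptions made for the example. Tit-for-tat is taken here to mean cooperating in the first round and then copying the opponent's previous move.

```python
# Assumed per-round sentences in years (smaller is better),
# keyed by (A's move, B's move). "C" = cooperate, "D" = defect.
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (3, 0),
    ("D", "C"): (0, 3),
    ("D", "D"): (2, 2),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect in every round regardless of history."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play the repeated game and return each player's total years."""
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        # Each strategy sees only the opponent's past moves.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        years_a, years_b = YEARS[(move_a, move_b)]
        total_a += years_a
        total_b += years_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat))      # (10, 10)
print(play(always_defect, always_defect))  # (20, 20)
print(play(tit_for_tat, always_defect))    # (21, 18)
```

Under these assumptions, the unconditional defector exploits tit-for-tat only in the first round and still ends up with 18 years, well above the 10 years each reciprocating player accumulates when paired with another reciprocator.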
Bottom line: The Prisoner’s Dilemma illustrates how individual rationality can lead to a worse outcome for all involved—a fundamental insight that underpins much of economics, political science, biology, and even computer science (e.g., distributed systems, security protocols).