Week 4|Math

Probability: Predicting the Unpredictable

Explore experimental vs theoretical probability and simulate the Monty Hall problem in Python.

Materials for this lesson

  • Laptop (charged)
  • Two standard dice
  • A coin (any denomination)

Warm-Up: Probability Intuition Check

Without calculating anything, estimate the probability (as a percentage) of each event. Just go with your gut.

  1. It will rain at some point in the next 30 days.
  2. You flip a coin 10 times and get heads all 10 times.
  3. At least two people in a room of 30 share a birthday.
  4. You roll a standard die and get a 6.
  5. A random person on Earth is left-handed.
  6. You draw a heart from a standard deck of cards.

Probability is one of the areas of math where human intuition is often wrong. Today's lesson is about replacing gut feelings with precise mathematical reasoning, and then using code to check our answers.


Core Lesson: The Mathematics of Chance

What Is Probability?

Probability measures how likely an event is to happen. It's always a number between 0 and 1 (or 0% and 100%):

  • 0 = impossible (rolling a 7 on a standard die)
  • 1 = certain (the sun rising tomorrow)
  • 0.5 = equally likely to happen or not (fair coin flip)

When every outcome is equally likely:

P(event) = (number of favorable outcomes) / (total number of possible outcomes)

Sample Spaces and Counting Outcomes

The sample space is the set of all possible outcomes.

| Experiment | Sample Space | Size |
|---|---|---|
| Flip a coin | {Heads, Tails} | 2 |
| Roll a die | {1, 2, 3, 4, 5, 6} | 6 |
| Roll two dice | {(1,1), (1,2), ..., (6,6)} | 36 |
| Draw a card | {A♠, 2♠, ..., K♠, A♥, ..., K♣} | 52 |

Example: What's the probability of rolling a sum of 7 with two dice?

The favorable outcomes are: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1) — that's 6 outcomes out of 36 total.

P(sum = 7) = 6/36 = 1/6 ≈ 16.7%
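The counting in this example can also be checked by listing the sample space in code. The following is a minimal sketch, assuming all 36 ordered pairs are equally likely; the names `sample_space` and `probability` are made up for this illustration and are not part of the lesson's lab code.

```python
from fractions import Fraction
from itertools import product

# Sample space for two dice: all 36 ordered pairs (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

def probability(event):
    """Return P(event) as an exact fraction, assuming equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

p_seven = probability(lambda roll: sum(roll) == 7)
print(p_seven)                  # 1/6
print(f"{float(p_seven):.1%}")  # 16.7%
```

Passing a different condition to `probability` lets you check other two-dice questions the same way.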

What is the probability of rolling a sum of 2 with two standard dice?

Theoretical vs Experimental Probability

  • Theoretical probability is what math predicts should happen.
  • Experimental probability is what actually happens when you run the experiment.

The Law of Large Numbers says: as you repeat an experiment more and more times, the experimental probability tends to get closer and closer to the theoretical probability.

This is a profound idea. Flip a coin 10 times and you might get 7 heads (70%). Flip it 1,000 times and you'll probably get close to 50%. Flip it 1,000,000 times and you'll be very close to 50%.

💡 Key Concept

The Law of Large Numbers doesn't say each individual flip "balances out." It says the proportion converges. After 999 heads in a row (incredibly unlikely!), the next flip is still exactly 50-50. The coin has no memory.
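One way to test the "no memory" idea is to look only at flips that come right after a streak of heads. The sketch below is an illustration under stated assumptions: the streak length of 3, the 200,000 flips, and the seed value are arbitrary choices, and the variable names are hypothetical.

```python
import random

random.seed(4)  # arbitrary fixed seed so the run is repeatable

# Simulate a long sequence of fair coin flips.
flips = [random.choice("HT") for _ in range(200_000)]

# Collect every flip that immediately follows three heads in a row.
after_streak = [
    flips[i]
    for i in range(3, len(flips))
    if flips[i - 3:i] == ["H", "H", "H"]
]

heads_after_streak = after_streak.count("H") / len(after_streak)
print(f"Flips that followed HHH: {len(after_streak):,}")
print(f"Heads on the next flip:  {heads_after_streak:.1%}")
```

If the coin "owed" us tails after a streak, the second number would sit clearly below 50%. In a simulation of a fair coin it should land close to 50% instead.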

Introduction to Combinations

Sometimes counting outcomes requires more sophisticated methods. How many ways can you choose 2 students from a class of 5?

You could list them: (AB, AC, AD, AE, BC, BD, BE, CD, CE, DE) — that's 10 combinations.

The formula is:

C(n, k) = n! / (k! × (n - k)!)

Where n! (n factorial) means n × (n-1) × (n-2) × ... × 1.

C(5, 2) = 5! / (2! × 3!) = 120 / (2 × 6) = 10
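The same count can be reproduced in Python three ways: by listing the pairs, by writing the factorial formula out, and by calling `math.comb` (available in Python 3.8 and later). This is a small sketch; the student labels A to E simply mirror the hand-written list above.

```python
from itertools import combinations
from math import comb, factorial

students = ["A", "B", "C", "D", "E"]

# List every unordered pair of students.
pairs = ["".join(pair) for pair in combinations(students, 2)]
print(pairs)       # ['AB', 'AC', 'AD', 'AE', 'BC', 'BD', 'BE', 'CD', 'CE', 'DE']
print(len(pairs))  # 10

# The formula n! / (k! * (n - k)!), written out with factorials.
n, k = 5, 2
by_formula = factorial(n) // (factorial(k) * factorial(n - k))
print(by_formula)  # 10

# math.comb computes the same value directly.
print(comb(n, k))  # 10
```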

How many ways can you choose 3 toppings from a menu of 8 options?

Video: Probability Explained (Math Antics)


Hands-On Lab: Simulating Probability with Python

Simulation 1: Coin Flips — Watching the Law of Large Numbers

import random

def simulate_coins(num_flips):
    """Flip a coin num_flips times and return the fraction of heads."""
    heads = 0
    for _ in range(num_flips):
        if random.choice(["heads", "tails"]) == "heads":
            heads += 1
    return heads / num_flips

# Run simulations at different scales
print("=== Coin Flip Simulation ===")
print(f"{'Flips':<15} {'Heads %':<15} {'Off from 50%'}")
print("-" * 45)

for n in [10, 100, 1000, 10000, 100000, 1000000]:
    result = simulate_coins(n)
    off = abs(result - 0.5) * 100
    print(f"{n:<15,} {result*100:<15.3f} {off:.3f}%")

print("\nNotice how the percentage tends to get closer to 50% as we flip more!")

Simulation 2: Rolling Dice — Distribution of Sums

import random

def simulate_dice(num_rolls):
    """Roll two dice num_rolls times and count the sums."""
    counts = {}
    for _ in range(num_rolls):
        roll = random.randint(1, 6) + random.randint(1, 6)
        counts[roll] = counts.get(roll, 0) + 1
    return counts

# Simulate 1,000,000 rolls
num_rolls = 1000000
results = simulate_dice(num_rolls)

print(f"=== Two Dice Rolled {num_rolls:,} Times ===\n")
print(f"{'Sum':<6} {'Count':<12} {'Experimental %':<18} {'Theoretical %':<18} {'Bar'}")
print("-" * 75)

# Theoretical probabilities for two dice
theoretical = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6,
               8: 5, 9: 4, 10: 3, 11: 2, 12: 1}

for s in range(2, 13):
    count = results.get(s, 0)
    exp_pct = count / num_rolls * 100
    theo_pct = theoretical[s] / 36 * 100
    bar = "█" * int(exp_pct * 3)
    print(f"{s:<6} {count:<12,} {exp_pct:<18.2f} {theo_pct:<18.2f} {bar}")
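The hard-coded `theoretical` table in Simulation 2 does not have to be taken on faith. As a rough sketch, the counts can be derived by tallying the sums over all 36 ordered outcomes; the name `ways` is an illustrative choice, not part of the lab code.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 ordered outcomes produce each sum.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    print(f"{total:<4} {ways[total]} of 36   {ways[total] / 36:.2%}")
```

The counts rise from 1 (for a sum of 2) to 6 (for a sum of 7) and fall back to 1 (for a sum of 12), which is the triangle shape the bar chart in the simulation should approximate.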

Simulation 3: The Birthday Problem

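How many people do you need in a room before there's at least a 50% chance that two of them share a birthday? (The simulation assumes 365 equally likely birthdays and ignores leap years.)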

import random

def birthday_experiment(num_people, num_trials=100000):
    """Run the birthday experiment many times and count matches."""
    matches = 0
    for _ in range(num_trials):
        birthdays = set()
        found_match = False
        for _ in range(num_people):
            bday = random.randint(1, 365)
            if bday in birthdays:
                found_match = True
                break
            birthdays.add(bday)
        if found_match:
            matches += 1
    return matches / num_trials

print("=== Birthday Problem Simulation (100,000 trials each) ===\n")
print(f"{'People':<10} {'Match Probability':<20}")
print("-" * 30)

for n in [5, 10, 15, 20, 23, 25, 30, 40, 50, 70]:
    prob = birthday_experiment(n)
    marker = " <-- ~50%!" if 0.48 <= prob <= 0.52 else ""
    print(f"{n:<10} {prob*100:<20.1f}{marker}")

print("\nSurprised? With just 23 people, there's a ~50% chance of a shared birthday!")
print("With 70 people, it's virtually guaranteed (~99.9%).")
Tip

New Python concepts you just learned:

  • random.choice() — picks a random element from a list
  • random.randint(a, b) — random integer from a to b (inclusive)
  • set() — a collection that automatically prevents duplicates (perfect for checking birthday matches!)
  • dict.get(key, default) — safely get a value from a dictionary, returning a default if the key doesn't exist
  • f"{value:<15,}" — left-align in a 15-character field with comma separators
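The simulated birthday percentages can be compared against an exact calculation. The idea is to compute the chance that everyone's birthday is different and subtract it from 1. The sketch below makes the same assumption as the simulation (365 equally likely days, no leap years); the function name `birthday_match_probability` is hypothetical.

```python
def birthday_match_probability(num_people, days=365):
    """Exact chance that at least two of num_people share a birthday."""
    p_all_different = 1.0
    for i in range(num_people):
        # Person i must avoid the i birthdays already taken.
        p_all_different *= (days - i) / days
    return 1 - p_all_different

for n in [10, 23, 30, 70]:
    print(f"{n:<4} {birthday_match_probability(n):.1%}")
```

With 100,000 trials per group size, the simulated values should land within a few tenths of a percentage point of these exact ones.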

Challenge: The Monty Hall Problem

This is one of the most famous (and most argued-about) problems in all of probability.

The Setup

You're on a game show. There are 3 doors:

  • Behind one door is a car (you want this).
  • Behind the other two doors are goats (you don't want these).

You pick a door. The host (Monty Hall), who knows what's behind every door, opens a different door that has a goat behind it. Now you see two remaining doors — yours and one other.

Monty asks: "Do you want to switch to the other door?"

The Question

Should you stay with your original door, switch to the other door, or does it not matter?

Part 1: Prove It with Code

import random

def monty_hall(num_games, switch):
    """Simulate the Monty Hall problem.

    switch=True means the player always switches.
    switch=False means the player always stays.
    """
    wins = 0

    for _ in range(num_games):
        # Set up the doors: car is behind one random door
        doors = [0, 1, 2]
        car = random.choice(doors)

        # Player picks a door
        player_pick = random.choice(doors)

        # Monty opens a door that:
        #   1. Is not the player's pick
        #   2. Does not have the car
        monty_options = [d for d in doors if d != player_pick and d != car]
        monty_opens = random.choice(monty_options)

        if switch:
            # Player switches to the remaining door
            remaining = [d for d in doors if d != player_pick and d != monty_opens]
            player_pick = remaining[0]

        if player_pick == car:
            wins += 1

    return wins / num_games

# Run the simulation
num_games = 1000000

stay_rate = monty_hall(num_games, switch=False)
switch_rate = monty_hall(num_games, switch=True)

print(f"=== Monty Hall Simulation ({num_games:,} games) ===\n")
print(f"Strategy: STAY   → Win rate: {stay_rate*100:.2f}%")
print(f"Strategy: SWITCH → Win rate: {switch_rate*100:.2f}%")
print(f"\nSwitching wins {switch_rate/stay_rate:.1f}x as often as staying!")
print(f"\nTheoretical: Stay = 33.3%, Switch = 66.7%")
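The theoretical values can also be obtained by listing every case instead of sampling. This is a minimal sketch, assuming the car position and the first pick are each uniformly random and independent, as in the simulation; the counters `stay_wins` and `switch_wins` are illustrative names.

```python
from fractions import Fraction
from itertools import product

doors = [0, 1, 2]
stay_wins = 0
switch_wins = 0

# Each (car, first pick) pair is equally likely: 9 cases in total.
cases = list(product(doors, repeat=2))
for car, pick in cases:
    if pick == car:
        # Staying wins only when the first pick was already right.
        stay_wins += 1
    else:
        # Monty must open the other goat door, so the remaining door hides the car.
        switch_wins += 1

print(Fraction(stay_wins, len(cases)))    # 1/3
print(Fraction(switch_wins, len(cases)))  # 2/3
```

Which goat door Monty opens when the first pick is correct does not change the result, so it does not need to be enumerated separately.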

Extra challenge: Modify the simulation for a version with 100 doors. You pick one, Monty opens 98 goat doors, and one remains. Should you switch now? (The answer is even more obvious with 100 doors — switching wins 99% of the time!)

Video: The Monty Hall Problem (Numberphile)


Resources