Introduction

In this project, we’ll analyze historical weather data from Los Angeles and test whether temperature differences between months are statistically significant using a randomization test (also called a permutation test).

We’ll use:

  • Python

  • Open-Meteo weather API

  • Pandas + NumPy

  • Matplotlib

  • Statistical simulation


The Question

Do summer months in Los Angeles actually have significantly higher daily maximum temperatures than nearby months?

For example:

  • Is July hotter than August?

  • Is February cooler than March?

  • Could observed differences happen by random chance?

Instead of relying on the distributional assumptions of a classical test such as the t-test, we’ll use simulation.

The core statistic in this analysis is the difference in means between two months’ daily maximum temperatures. If July temperatures are consistently warmer than August temperatures, the average of the July observations should exceed the average of the August observations. By repeatedly shuffling the month labels and recomputing the difference in means, we can estimate how likely such a difference is to occur purely by chance.
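
To make the shuffling idea concrete, here is a minimal sketch on toy data. The six values are hypothetical (not real Los Angeles temperatures), and because the sample is tiny the sketch enumerates every possible relabeling instead of sampling random shuffles, which is an assumption made purely for illustration.

```python
from itertools import combinations

import numpy as np

# Hypothetical toy data, chosen only to illustrate the idea.
group_a = np.array([30.0, 32.0, 31.0])
group_b = np.array([28.0, 29.0, 27.0])

# Observed statistic: difference in means between the two groups.
observed = group_a.mean() - group_b.mean()

pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)

# Enumerate every way of assigning n_a of the pooled values to "group A".
diffs = []
for idx in combinations(range(len(pooled)), n_a):
    mask = np.zeros(len(pooled), dtype=bool)
    mask[list(idx)] = True
    diffs.append(pooled[mask].mean() - pooled[~mask].mean())
diffs = np.array(diffs)

# Share of relabelings whose difference is at least as large as the observed one
# (a small tolerance guards against floating-point rounding).
exact_p = np.mean(diffs >= observed - 1e-12)

print(len(diffs), observed, exact_p)
```

With only 20 possible relabelings, full enumeration is cheap. For two months of daily data the number of relabelings is astronomically large, which is why the rest of this post samples random shuffles instead.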


Step 1 – Download Historical Weather Data

We’ll pull daily maximum temperatures from the Open-Meteo archive API.

import requests
import pandas as pd

latitude = 34.0522
longitude = -118.2437

url = "https://archive-api.open-meteo.com/v1/archive"

params = {
    "latitude": latitude,
    "longitude": longitude,
    "start_date": "2024-01-01",
    "end_date": "2024-12-31",
    "daily": "temperature_2m_max",
    "timezone": "America/Los_Angeles"
}

response = requests.get(url, params=params, timeout=30)
response.raise_for_status()  # stop early on an HTTP error
data = response.json()

df = pd.DataFrame(data["daily"])

print(df.head())

This gives us (first rows, temperatures in °C):

time        temperature_2m_max
2024-01-01  18.1
2024-01-02  19.4
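
Before relying on the download, it can help to validate the payload. The sketch below is one possible approach, not part of the original notebook: `daily_frame` is a hypothetical helper, the sample payload is hand-written to mimic the `daily` block used above (the third value is invented to show a missing day), and it assumes that an unsuccessful request yields JSON without a `daily` key.

```python
import pandas as pd


def daily_frame(payload, column="temperature_2m_max"):
    """Build a tidy DataFrame from a payload shaped like the one in Step 1."""
    if "daily" not in payload:
        raise ValueError(f"no 'daily' block in payload: {payload}")
    frame = pd.DataFrame(payload["daily"])
    frame["time"] = pd.to_datetime(frame["time"])
    # Drop days with no recorded value (None becomes NaN in the frame).
    return frame.dropna(subset=[column]).reset_index(drop=True)


# Hand-written sample payload; the None entry is a made-up missing day.
sample_payload = {
    "daily": {
        "time": ["2024-01-01", "2024-01-02", "2024-01-03"],
        "temperature_2m_max": [18.1, 19.4, None],
    }
}

sample_df = daily_frame(sample_payload)
print(sample_df)
```

With the real response, the same helper could be called as `daily_frame(data)` in place of `pd.DataFrame(data["daily"])`.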

Step 2 – Prepare Monthly Groups

We convert the dates into month labels so temperatures can be grouped.

df["time"] = pd.to_datetime(df["time"])
df["month"] = df["time"].dt.month

Now we can isolate temperatures for specific months.

month1 = df.loc[df["month"] == 7, "temperature_2m_max"].to_numpy()  # July
month2 = df.loc[df["month"] == 8, "temperature_2m_max"].to_numpy()  # August

Here:

  • 7 = July

  • 8 = August
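
Since the same selection is needed for any pair of months, it could be wrapped in a small function. This is a sketch only: `month_values` is a hypothetical helper name, and the four-row frame is made-up demonstration data with the same column names as above.

```python
import pandas as pd


def month_values(frame, month, column="temperature_2m_max"):
    """Return the values of `column` for one calendar month as a NumPy array."""
    return frame.loc[frame["time"].dt.month == month, column].to_numpy()


# Made-up demonstration data spanning the July/August boundary.
demo = pd.DataFrame({
    "time": pd.to_datetime(
        ["2024-07-30", "2024-07-31", "2024-08-01", "2024-08-02"]
    ),
    "temperature_2m_max": [29.0, 30.5, 28.0, 27.5],
})

july = month_values(demo, 7)
august = month_values(demo, 8)
print(july, august)
```

Comparing February with March then only requires changing the two month numbers.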


Step 3 – Visualize Temperature Distributions

A boxplot helps us understand spread, variability, and outliers.

import matplotlib.pyplot as plt

monthly_temps = [
    df[df["month"] == month]["temperature_2m_max"]
    for month in range(1, 13)
]

plt.figure(figsize=(12, 6))

plt.boxplot(
    monthly_temps,
    tick_labels=[  # Matplotlib 3.9+; older versions use labels=
        "Jan", "Feb", "Mar", "Apr",
        "May", "Jun", "Jul", "Aug",
        "Sep", "Oct", "Nov", "Dec"
    ]
)

plt.xlabel("Month")
plt.ylabel("Maximum Temperature (°C)")
plt.title("Daily Maximum Temperatures by Month")
plt.grid(True)

plt.show()

The visualization immediately reveals:

  • Summer months shift upward

  • Winter months cluster lower

  • Some months have larger variability
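
The same impressions can be checked numerically. The sketch below computes per-month summary statistics with pandas; it runs on synthetic, randomly generated temperatures (an assumption, so that the example is self-contained), so its numbers say nothing about Los Angeles.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: one value per day of 2024.
rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", "2024-12-31", freq="D")
synthetic = pd.DataFrame({
    "time": dates,
    "temperature_2m_max": rng.normal(22.0, 4.0, size=len(dates)),
})
synthetic["month"] = synthetic["time"].dt.month

# count, mean, std, min, quartiles, and max for each month.
summary = synthetic.groupby("month")["temperature_2m_max"].describe()

# The interquartile range matches the height of each box in a boxplot.
summary["iqr"] = summary["75%"] - summary["25%"]

print(summary[["count", "mean", "std", "iqr"]])
```

Applied to the real `df`, the `mean` column shows the seasonal shift and the `std` and `iqr` columns show which months vary most.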


Step 4 – Compute the Observed Difference in Means

We calculate the difference between monthly means.

The observed statistic is:

\[\Delta = \bar{x}_1 - \bar{x}_2\]

where:

  • \(\bar{x}_1\) = mean daily maximum temperature of Month 1 (here, July)

  • \(\bar{x}_2\) = mean daily maximum temperature of Month 2 (here, August)

In Python:

observed_diff = month1.mean() - month2.mean()

print(observed_diff)

Step 5 – The Null Hypothesis

Our null hypothesis says:

The two months come from the same temperature distribution.

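If that’s true, the month labels carry no information, so shuffling them should not systematically change the difference in means.

In terms of the statistic we are testing, this implies: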

\[H_0 : \mu_1 = \mu_2\]

Step 6 – Randomization Test

We combine both groups together and repeatedly shuffle them.

import numpy as np

combined = np.concatenate([month1, month2])

num_simulations = 5000
simulated_diffs = []

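for _ in range(num_simulations):

    # Shuffle the pooled values in place, breaking any link to the month labels
    np.random.shuffle(combined)

    # Re-split into two groups with the original sizes
    sim_mo1 = combined[:len(month1)]
    sim_mo2 = combined[len(month1):]

    # Record the difference in means for this relabeling
    sim_diff = sim_mo1.mean() - sim_mo2.mean()

    simulated_diffs.append(sim_diff)

simulated_diffs = np.array(simulated_diffs)

This produces a simulated null distribution of the difference in means.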


Why This Works

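Under the null hypothesis that both months share the same temperature distribution, which implies

\[H_0 : \mu_1 = \mu_2\]

the month labels are arbitrary and all temperature observations are interchangeable.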

By shuffling labels thousands of times, we estimate what differences would occur purely by chance.
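
For reproducible results, the shuffling can be wrapped in a function that takes a seed. The following is a sketch under stated assumptions, not the notebook's own code: `permutation_diffs` is a hypothetical name, it uses NumPy's `default_rng` generator instead of `np.random.shuffle`, and the two input arrays are synthetic draws from the same distribution.

```python
import numpy as np


def permutation_diffs(a, b, num_simulations=5000, seed=0):
    """Simulate the null distribution of mean(a) - mean(b) by label shuffling."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([a, b])
    n_a = len(a)
    diffs = np.empty(num_simulations)
    for i in range(num_simulations):
        # permutation() returns a shuffled copy, so `pooled` is left untouched.
        shuffled = rng.permutation(pooled)
        diffs[i] = shuffled[:n_a].mean() - shuffled[n_a:].mean()
    return diffs


# Synthetic groups drawn from one distribution, so the null hypothesis holds.
data_rng = np.random.default_rng(1)
fake_a = data_rng.normal(29.0, 2.0, size=31)
fake_b = data_rng.normal(29.0, 2.0, size=31)

null_diffs = permutation_diffs(fake_a, fake_b)
print(null_diffs.mean(), null_diffs.std())
```

Fixing the seed means the p-value does not change between runs, which makes a reported result easier to verify.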


Step 7 – Compute the p-value

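The p-value measures how extreme the observed difference is relative to the simulated distribution.

Formally, a two-sided p-value would be:

\[p = P\left(|\Delta_{\text{sim}}| \geq |\Delta_{\text{obs}}|\right)\]

The implementation below instead computes a one-sided p-value in the direction of the observed difference: \(P(\Delta_{\text{sim}} \geq \Delta_{\text{obs}})\) when \(\Delta_{\text{obs}} > 0\), and \(P(\Delta_{\text{sim}} \leq \Delta_{\text{obs}})\) otherwise.

Python implementation: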

if observed_diff > 0:
    p_value = np.mean(simulated_diffs >= observed_diff)
else:
    p_value = np.mean(simulated_diffs <= observed_diff)

print(p_value)
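
A two-sided variant, which compares absolute differences, can be sketched as follows. `two_sided_p` is a hypothetical helper, the toy array is made up, and adding 1 to both the count and the total is one common convention (it keeps the estimate from being exactly zero), not something the original analysis uses.

```python
import numpy as np


def two_sided_p(simulated, observed):
    """Two-sided Monte Carlo p-value with a +1 correction."""
    extreme = np.sum(np.abs(simulated) >= abs(observed))
    return (extreme + 1) / (len(simulated) + 1)


# Made-up simulated differences, for illustration only.
toy_sim = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, -1.5, 0.2, -0.2])

print(two_sided_p(toy_sim, 1.0))
```

Because the absolute value is used, the sign of the observed difference does not affect the result.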

Step 8 – Visualize the Simulation

bins = np.linspace(
    simulated_diffs.min(),
    simulated_diffs.max(),
    30
)

plt.hist(simulated_diffs, bins=bins)

plt.axvline(
    observed_diff,
    color="red",
    linestyle="dashed",
    label="Observed difference"
)
plt.legend()

plt.xlabel("Simulated Difference in Means")
plt.ylabel("Frequency")
plt.title("Randomization Test Distribution")

plt.show()

The histogram shows:

  • Most shuffled differences cluster near zero

  • Extreme values are rare

  • The observed statistic may sit far in the tail

In our simulations, the observed difference was -0.76 °C (July minus August), and a simulated difference of -0.76 °C or less occurred in 568 of 5000 shuffles (one-sided p-value = 0.1136). Because such a result is not unusual if July and August have the same average maximum temperature, there is not convincing evidence against the null hypothesis. The difference in means is not statistically significant at the 0.05 level.
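
The reported p-value is itself an estimate based on a finite number of shuffles. As a rough check, the sketch below recomputes it from the counts quoted above and attaches a binomial standard error, assuming each shuffle is an independent draw.

```python
import math

count_extreme = 568     # shuffles at least as extreme as observed (from the text)
num_simulations = 5000  # total shuffles (from the text)

# Estimated p-value: the fraction of extreme shuffles.
p_hat = count_extreme / num_simulations

# Binomial standard error of that fraction.
mc_se = math.sqrt(p_hat * (1 - p_hat) / num_simulations)

print(p_hat, mc_se)
```

The standard error is well under 0.01, so simulation noise alone would not move this p-value below 0.05.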


Interpreting Results

Suppose instead that a different pair of months gave:

p_value = 0.0032

This means:

Only 0.32% of shuffled simulations produced a difference at least as extreme as the one in the real data.

Since:

\[p < 0.05\]

we reject the null hypothesis and conclude that the two months’ mean temperatures differ significantly.