Monte Carlo Sampling Methods: Inverse Transform, Rejection, and Composition
Monte Carlo methods rely on random sampling to approximate distributions and integrals. In this post, we’ll explore three major sampling techniques for generating random variables from arbitrary distributions:
- Inverse Transform Sampling
- Acceptance-Rejection Sampling
- Composition Method
We’ll derive these methods mathematically and then show working Python code examples using sympy, numpy, and seaborn for visualization.
1. Inverse Transform Sampling
Given a Cumulative Distribution Function (CDF) \(F(x)\), we can generate samples by inverting it.
Mathematical Foundation
If \(U \sim \text{Uniform}(0,1)\), then:
\[X = F^{-1}(U)\]produces samples \(X\) from the target distribution.
To demonstrate, define:
\[F(x) = \frac{1}{4}(x + 3x^2)\]The corresponding Probability Density Function (PDF) is:
\[f(x) = F'(x) = \frac{1}{4}(1 + 6x)\]We verify that it integrates to 1:
\[\int_0^1 f(x)\,dx = \frac{1}{4} \int_0^1 (1 + 6x)\,dx = \frac{1}{4}(1 + 3) = 1\]Code Implementation
import sympy as sp
from sympy import integrate, Symbol, diff, Eq, solve
import numpy as np
import seaborn as sns
x, u = sp.symbols('x u', real=True, positive=True)
# Define CDF and derive PDF
F_x = sp.Rational(1, 4) * (x + 3*x**2)
f_x = diff(F_x, x)
# Check normalization
integrate(f_x, (x, 0, 1))
# Solve inverse CDF
inverse_eq = Eq(F_x, u)
inverse_solution = solve(inverse_eq, x)
inverse_cdf = inverse_solution[0]
inverse_cdf_func = sp.lambdify(u, inverse_cdf, 'numpy')
To sample:
def generate_rv_inverse_transform(size=10000):
U_samples = np.random.rand(size)
X_samples = inverse_cdf_func(U_samples)
return X_samples
rs = generate_rv_inverse_transform()
sns.histplot(rs);
2. Acceptance–Rejection Sampling
When \(F(x)\) is not invertible analytically, the acceptance–rejection method offers a flexible alternative.
Principle
Suppose we have an easy-to-sample proposal distribution \(g(x)\) and constant \(c\) such that:
\[f(x) \leq c \cdot g(x), \quad \forall x\]Then:
- Generate \(Y \sim g(x)\)
- Generate \(U \sim \text{Uniform}(0,1)\)
- Accept \(Y\) if \(U \leq \frac{f(Y)}{c g(Y)}\)
Example
Here, we take \(g(x) = 1\) for \(x \in [0,1]\) (a uniform proposal).
# Define upper bound c
c = sp.N(max(f_x.subs(x, val) for val in [0, 1] + sp.solve(sp.diff(f_x, x))))
def generate_rv_acceptance_rejection(size=10000, c=c):
samples = np.empty(size)
count = 0
while count < size:
Y = np.random.rand()
U = np.random.rand()
if U <= f_x.subs(x, Y) / c:
samples[count] = Y
count += 1
return samples
rs = generate_rv_acceptance_rejection()
sns.histplot(rs);
This technique is powerful when direct inversion is impossible.
3. Composition Sampling
If a distribution is a mixture of several components, the composition method allows sampling efficiently.
\[f(x) = \sum_{i=1}^k w_i f_i(x), \quad \text{where } \sum_i w_i = 1\]Algorithm:
- Select component \(i\) with probability \(w_i\)
- Generate sample from \(f_i(x)\)
Code Implementation
def generate_rv_composition_general(size=10000, weights=None, inverse_cdfs=None):
if weights is None or inverse_cdfs is None:
raise ValueError("weights and inverse_cdfs must be provided")
weights = np.array(weights)
cum_weights = np.cumsum(weights)
samples = np.empty(size)
U_choice = np.random.rand(size)
U_rv = np.random.rand(size)
for i, (cw, inv_cdf) in enumerate(zip(cum_weights, inverse_cdfs)):
mask = (U_choice <= cw) if i == 0 else (U_choice > cum_weights[i-1]) & (U_choice <= cw)
samples[mask] = inv_cdf(U_rv[mask])
return samples
For example, a mixture of:
- \[F_1(x) = x\]
- \[F_2(x) = x^2\]
with weights \(w_1 = 0.25, w_2 = 0.75\):
weights = [0.25, 0.75]
inverse_cdfs = [
sp.lambdify(u, solve(Eq(x, u), x)[0], 'numpy'),
sp.lambdify(u, solve(Eq(x**2, u), x)[0], 'numpy')
]
rs = generate_rv_composition_general(size=10000, weights=weights, inverse_cdfs=inverse_cdfs)
sns.histplot(rs);
Summary of Methods
| Method | Key Equation | Pros | Cons |
|---|---|---|---|
| Inverse Transform | \(X = F^{-1}(U)\) | Exact, simple | Needs analytic inverse |
| Acceptance-Rejection | \(U \le f(Y)/(cg(Y))\) | Flexible | May reject many samples |
| Composition | \(f(x) = \sum w_i f_i(x)\) | Ideal for mixtures | Needs multiple inverses |
Appendix: Complex Case with Symbolic Derivation
In more complex examples, the inverse CDF cannot be solved analytically. We can still compute it numerically using nsolve:
f_x = sp.Rational(1, 2) + x**3 + sp.Rational(6, 4)*x**5
F_x = integrate(f_x, (x, 0, x))
inverse_F = lambda u_val: float(sp.nsolve(F_x - u_val, x, 0.5))
U = np.random.rand(10000)
rs = np.array([inverse_F(u_val) for u_val in U])
sns.histplot(rs);
Note
The general approach:
\[F^{-1}(u) = \text{solution of } F(x) = u\]is always valid — even if solved numerically. Combining this with rejection or composition allows sampling from virtually any distribution.