Monte Carlo Sampling Methods: Inverse Transform, Rejection, and Composition

Monte Carlo methods rely on random sampling to approximate distributions and integrals. In this post, we’ll explore three major sampling techniques for generating random variables from arbitrary distributions:

Inverse Transform Sampling
Acceptance-Rejection Sampling
Composition Method

We’ll derive these methods mathematically and then show working Python code examples using sympy, numpy, and seaborn for visualization.

1. Inverse Transform Sampling

Given a Cumulative Distribution Function (CDF) \(F(x)\), we can generate samples by inverting it.

Mathematical Foundation

If \(U \sim \text{Uniform}(0,1)\), then:

\[X = F^{-1}(U)\]

produces samples \(X\) from the target distribution.

To demonstrate, define:

\[F(x) = \frac{1}{4}(x + 3x^2)\]

The corresponding Probability Density Function (PDF) is:

\[f(x) = F'(x) = \frac{1}{4}(1 + 6x)\]

We verify that it integrates to 1:

\[\int_0^1 f(x)\,dx = \frac{1}{4} \int_0^1 (1 + 6x)\,dx = \frac{1}{4}(1 + 3) = 1\]

Code Implementation

import sympy as sp
from sympy import integrate, Symbol, diff, Eq, solve
import numpy as np
import seaborn as sns

x, u = sp.symbols('x u', real=True, positive=True)

# Define CDF and derive PDF
F_x = sp.Rational(1, 4) * (x + 3*x**2)
f_x = diff(F_x, x)

# Check normalization
integrate(f_x, (x, 0, 1))

# Solve inverse CDF
inverse_eq = Eq(F_x, u)
inverse_solution = solve(inverse_eq, x)
inverse_cdf = inverse_solution[0]
inverse_cdf_func = sp.lambdify(u, inverse_cdf, 'numpy')

To sample:

def generate_rv_inverse_transform(size=10000):
    U_samples = np.random.rand(size)
    X_samples = inverse_cdf_func(U_samples)
    return X_samples

rs = generate_rv_inverse_transform()
sns.histplot(rs);

2. Acceptance–Rejection Sampling

When \(F(x)\) is not invertible analytically, the acceptance–rejection method offers a flexible alternative.

Principle

Suppose we have an easy-to-sample proposal distribution \(g(x)\) and constant \(c\) such that:

\[f(x) \leq c \cdot g(x), \quad \forall x\]

Then:

Generate \(Y \sim g(x)\)
Generate \(U \sim \text{Uniform}(0,1)\)
Accept \(Y\) if \(U \leq \frac{f(Y)}{c g(Y)}\)

Example

Here, we take \(g(x) = 1\) for \(x \in [0,1]\) (a uniform proposal).

# Define upper bound c
c = sp.N(max(f_x.subs(x, val) for val in [0, 1] + sp.solve(sp.diff(f_x, x))))

def generate_rv_acceptance_rejection(size=10000, c=c):
    samples = np.empty(size)
    count = 0
    while count < size:
        Y = np.random.rand()
        U = np.random.rand()
        if U <= f_x.subs(x, Y) / c:
            samples[count] = Y
            count += 1
    return samples

rs = generate_rv_acceptance_rejection()
sns.histplot(rs);

This technique is powerful when direct inversion is impossible.

3. Composition Sampling

If a distribution is a mixture of several components, the composition method allows sampling efficiently.

\[f(x) = \sum_{i=1}^k w_i f_i(x), \quad \text{where } \sum_i w_i = 1\]

Algorithm:

Select component \(i\) with probability \(w_i\)
Generate sample from \(f_i(x)\)

Code Implementation

def generate_rv_composition_general(size=10000, weights=None, inverse_cdfs=None):
    if weights is None or inverse_cdfs is None:
        raise ValueError("weights and inverse_cdfs must be provided")

    weights = np.array(weights)
    cum_weights = np.cumsum(weights)
    samples = np.empty(size)

    U_choice = np.random.rand(size)
    U_rv = np.random.rand(size)

    for i, (cw, inv_cdf) in enumerate(zip(cum_weights, inverse_cdfs)):
        mask = (U_choice <= cw) if i == 0 else (U_choice > cum_weights[i-1]) & (U_choice <= cw)
        samples[mask] = inv_cdf(U_rv[mask])

    return samples

For example, a mixture of:

\[F_1(x) = x\]
\[F_2(x) = x^2\]

with weights \(w_1 = 0.25, w_2 = 0.75\):

weights = [0.25, 0.75]
inverse_cdfs = [
    sp.lambdify(u, solve(Eq(x, u), x)[0], 'numpy'),
    sp.lambdify(u, solve(Eq(x**2, u), x)[0], 'numpy')
]

rs = generate_rv_composition_general(size=10000, weights=weights, inverse_cdfs=inverse_cdfs)
sns.histplot(rs);

Summary of Methods

Method	Key Equation	Pros	Cons
Inverse Transform	\(X = F^{-1}(U)\)	Exact, simple	Needs analytic inverse
Acceptance-Rejection	\(U \le f(Y)/(cg(Y))\)	Flexible	May reject many samples
Composition	\(f(x) = \sum w_i f_i(x)\)	Ideal for mixtures	Needs multiple inverses

Appendix: Complex Case with Symbolic Derivation

In more complex examples, the inverse CDF cannot be solved analytically. We can still compute it numerically using nsolve:

f_x = sp.Rational(1, 2) + x**3 + sp.Rational(6, 4)*x**5
F_x = integrate(f_x, (x, 0, x))
inverse_F = lambda u_val: float(sp.nsolve(F_x - u_val, x, 0.5))
U = np.random.rand(10000)
rs = np.array([inverse_F(u_val) for u_val in U])
sns.histplot(rs);

Note

The general approach:

\[F^{-1}(u) = \text{solution of } F(x) = u\]

is always valid — even if solved numerically. Combining this with rejection or composition allows sampling from virtually any distribution.