Representative Markov Chain Examples
Introduction
Markov chains (Durrett, 2012) are one of the simplest and most powerful mathematical models for stochastic systems.
They appear everywhere:
- web-page ranking
- reinforcement learning
- queueing systems
- finance
- genetics
- statistical physics
- language models
The central idea is simple:
The future depends only on the present state, not on the full history.
Formally, a Markov chain satisfies
\[P(X_{t+1}=j \mid X_t=i, X_{t-1}, \dots, X_0) = P(X_{t+1}=j \mid X_t=i)\]where:
- \(X_t\) is the state at time \(t\)
- \(P(X_{t+1}=j \mid X_t=i)\) is the transition probability from state \(i\) to state \(j\)
A finite Markov chain is completely described by its transition matrix:
\[P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \\ \end{bmatrix}\]Each row sums to \(1\):
\[\sum_{j=1}^n p_{ij} = 1\]In this post, we’ll walk through several representative examples and explain the most important structural properties of Markov chains.
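Before diving into the examples, the row-sum condition is easy to verify numerically. A minimal NumPy sketch (the 3-state matrix here is just an illustration, not one of the examples below):

```python
import numpy as np

# An illustrative 3-state transition matrix (any row-stochastic matrix works).
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2],
              [0.5, 0.25, 0.25]])

# Each row must be a probability distribution over next states.
assert np.all(P >= 0)
assert np.allclose(P.sum(axis=1), 1.0)

# n-step transition probabilities are matrix powers: P^n[i, j] = P(X_n = j | X_0 = i).
P3 = np.linalg.matrix_power(P, 3)
assert np.allclose(P3.sum(axis=1), 1.0)  # powers of a stochastic matrix stay stochastic
```

The last assertion reflects a useful fact used throughout this post: products of stochastic matrices are stochastic, so every power \(P^n\) is again a valid transition matrix.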
1. Ergodic Chain
Consider:
\[P = \begin{bmatrix} 0.5 & 0.5 \\ 0.2 & 0.8 \\ \end{bmatrix}\]This chain is:
- irreducible
- aperiodic
- ergodic
Irreducibility
A chain is irreducible if every state can eventually reach every other state.
Here:
- state \(0\) can reach state \(1\)
- state \(1\) can reach state \(0\)
So the state space forms a single communicating class.
Aperiodicity
A state has period
\[d(i) = \gcd\{\, n \ge 1 : P^n(i,i) > 0 \,\}\]Because both states have self-loops:
\[P(0 \to 0) = 0.5\]and
\[P(1 \to 1) = 0.8\]the chain can return in one step, giving period \(1\).
Ergodicity
An irreducible + aperiodic finite chain is ergodic.
That means:
\[P^n \to \begin{bmatrix} \pi \\ \pi \\ \vdots \end{bmatrix} \quad \text{as } n \to \infty\]where \(\pi\) is the stationary distribution.
The stationary distribution solves:
\[\pi P = \pi\]with
\[\sum_i \pi_i = 1\]For this chain:
\[\pi = \begin{bmatrix} 0.2857 & 0.7143 \end{bmatrix}\]meaning the chain spends about \(71\%\) of its time in state \(1\) in the long run.
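The stationary distribution can be reproduced numerically as the left eigenvector of \(P\) with eigenvalue \(1\). A minimal sketch (this is not the `classify_markov_chain` routine from the appendix, just one standard way to compute \(\pi\)):

```python
import numpy as np

P = np.array([[0.5, 0.5],
              [0.2, 0.8]])

# pi P = pi means pi is a left eigenvector of P with eigenvalue 1,
# i.e. a (right) eigenvector of P transpose.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
pi = pi / pi.sum()          # normalize so the entries sum to 1

print(np.round(pi, 4))      # [0.2857 0.7143], i.e. (2/7, 5/7)
assert np.allclose(pi @ P, pi)
```

Dividing by the sum both normalizes \(\pi\) and fixes the arbitrary sign that `eig` may attach to the eigenvector.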
2. Periodic Chain
Now consider:
\[P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}\]This chain alternates forever:
\[0 \to 1 \to 0 \to 1 \to \cdots\]The return times to state \(0\) are:
\[2,4,6,8,\dots\]Thus:
\[d(0)=2\]and similarly:
\[d(1)=2\]The chain is irreducible but periodic.
Why Periodicity Matters
Even though a stationary distribution exists:
\[\pi = \begin{bmatrix} 0.5 & 0.5 \end{bmatrix}\]the powers of the transition matrix do not converge:
\[P^{2n} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix}\]while
\[P^{2n+1} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}\]The system oscillates forever.
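The oscillation is easy to confirm numerically, and the experiment also shows what *does* converge for a periodic chain: its time-averaged behavior. A small sketch:

```python
import numpy as np

P = np.array([[0., 1.],
              [1., 0.]])

# Even powers are the identity, odd powers are P itself: no limit exists.
assert np.array_equal(np.linalg.matrix_power(P, 10), np.eye(2))
assert np.array_equal(np.linalg.matrix_power(P, 11), P)

# The Cesaro (running) average of the powers does converge to rows of pi = [0.5, 0.5].
avg = sum(np.linalg.matrix_power(P, k) for k in range(1, 101)) / 100
print(np.round(avg, 3))  # all entries 0.5
```

This is the general picture for irreducible periodic chains: the powers \(P^n\) cycle forever, but their running average converges to the stationary rows.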
3. Absorbing Chain
Consider:
\[P = \begin{bmatrix} 1.0 & 0.0 \\ 0.4 & 0.6 \\ \end{bmatrix}\]State \(0\) is absorbing because:
\[P(0 \to 0)=1\]Once entered, it can never be left.
Long-Term Behavior
State \(1\) eventually flows into state \(0\).
Repeated multiplication shows:
\[P^n \to \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ \end{bmatrix}\]Eventually all probability mass accumulates in the absorbing state.
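A quick numerical check of the absorption, plus the expected time to get absorbed (a sketch; the \(1/0.4\) formula is just the mean of a geometric waiting time):

```python
import numpy as np

P = np.array([[1.0, 0.0],
              [0.4, 0.6]])

# After many steps essentially all mass from state 1 has been absorbed:
# P^n(1, 1) = 0.6^n -> 0 as n grows.
Pn = np.linalg.matrix_power(P, 100)
assert np.allclose(Pn, [[1, 0], [1, 0]])

# Each step from state 1 is absorbed with probability 0.4, so the
# expected number of steps to absorption is geometric with mean 1 / 0.4.
print(1 / 0.4)  # 2.5
```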
4. Multiple Closed Classes
Now consider:
\[P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0.5 & 0 & 0.5 & 0 \\ 0 & 0.5 & 0 & 0.5 \\ \end{bmatrix}\]States \(0\) and \(1\) are absorbing closed classes.
States \(2\) and \(3\) are transient.
Eventually:
- state \(2\) gets trapped in class \(\{0\}\)
- state \(3\) gets trapped in class \(\{1\}\)
This chain is reducible because not all states communicate.
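Which closed class each transient state ends up in can be computed with the standard fundamental-matrix calculation for absorbing chains (a sketch; the partitioning into `Q` and `R` below is the usual canonical form, with transient states 2 and 3):

```python
import numpy as np

P = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0.5, 0, 0.5, 0],
              [0, 0.5, 0, 0.5]], dtype=float)

# Canonical form: Q governs transient -> transient moves,
# R governs transient -> absorbing moves (transient states are 2 and 3).
Q = P[np.ix_([2, 3], [2, 3])]
R = P[np.ix_([2, 3], [0, 1])]

# Fundamental matrix N = (I - Q)^{-1}; absorption probabilities B = N R.
N = np.linalg.inv(np.eye(2) - Q)
B = N @ R
print(B)  # row for state 2, row for state 3
```

Here `B` comes out as the identity: state 2 is absorbed into \(\{0\}\) with probability 1 and state 3 into \(\{1\}\) with probability 1, matching the description above.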
5. Fully Connected Chain
A more “generic” stochastic system:
\[P = \begin{bmatrix} 0.2 & 0.3 & 0.5 \\ 0.4 & 0.4 & 0.2 \\ 0.3 & 0.3 & 0.4 \\ \end{bmatrix}\]Every state communicates with every other state in one step.
The chain is:
- irreducible
- aperiodic
- ergodic
Because all entries are positive:
\[p_{ij} > 0\]the chain mixes rapidly toward equilibrium.
6. Transient to Closed Flow
Consider:
\[P = \begin{bmatrix} 0.2 & 0.8 & 0.0 \\ 0.0 & 0.5 & 0.5 \\ 0.0 & 0.4 & 0.6 \\ \end{bmatrix}\]State \(0\) is transient.
Once the chain enters \(\{1,2\}\), it never returns to \(0\).
The subset:
\[\{1,2\}\]forms a closed irreducible class.
This structure appears frequently in:
- queueing systems
- epidemic models
- reinforcement learning environments
where some states act as temporary entry points before long-run equilibrium behavior emerges.
7. Self-Loops Create Aperiodicity
Consider:
\[P = \begin{bmatrix} 0.1 & 0.9 \\ 0.7 & 0.3 \\ \end{bmatrix}\]Even a tiny self-loop can destroy periodicity.
Because:
\[P(0 \to 0)=0.1\]the chain can return in one step.
Thus:
\[d(0)=1\]and the entire irreducible chain becomes aperiodic.
This is why adding “noise” or “lazy steps” is a common trick in Markov Chain Monte Carlo (MCMC).
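The "lazy step" trick can be sketched directly on the two-state periodic chain from Section 2: mix in a small self-loop probability \(\epsilon\) and the powers start converging (a minimal illustration; the choice \(\epsilon = 0.1\) is arbitrary):

```python
import numpy as np

# The two-state periodic chain from Section 2.
P = np.array([[0., 1.],
              [1., 0.]])

# Lazy version: with probability eps stay put, otherwise take a normal step.
eps = 0.1
P_lazy = eps * np.eye(2) + (1 - eps) * P

# The self-loops give every state period 1, so P_lazy^n now converges
# to rows of the stationary distribution [0.5, 0.5].
Pn = np.linalg.matrix_power(P_lazy, 100)
print(np.round(Pn, 4))  # [[0.5 0.5] [0.5 0.5]]
```

Note that the lazy chain has the same stationary distribution as the original: mixing in the identity preserves \(\pi P = \pi\).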
8. Three-Cycle Periodic Chain
Consider:
\[P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \\ \end{bmatrix}\]The chain cycles deterministically:
\[0 \to 1 \to 2 \to 0\]Returns occur only at times:
\[3,6,9,\dots\]So:
\[d(i)=3\]for every state.
The chain is irreducible but not ergodic.
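The period definition from Section 1 can be turned into a small function: collect the return times \(n\) with \(P^n(i,i) > 0\) and take their gcd. A sketch (truncating at `max_n` steps is a practical simplification; the helper name is my own):

```python
import numpy as np
from math import gcd
from functools import reduce

P = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)

def period(P, i, max_n=20):
    """gcd of all n <= max_n with P^n(i, i) > 0 (a practical truncation)."""
    returns = [n for n in range(1, max_n + 1)
               if np.linalg.matrix_power(P, n)[i, i] > 0]
    return reduce(gcd, returns) if returns else 0

print([period(P, i) for i in range(3)])  # [3, 3, 3]
```

For the three-cycle the return times are \(3, 6, 9, \dots\), so the gcd is \(3\) for every state, as claimed above.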
9. Identity Chain
The identity matrix:
\[P = I\]gives:
\[P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}\]Every state is absorbing.
Nothing ever changes.
This is the most degenerate possible Markov chain.
10. Random Stochastic Matrix
The helper function in the accompanying code generates matrices by:
- sampling random positive entries
- normalizing rows
- rounding entries
- fixing row sums
Mathematically:
\[P_{ij} = \frac{A_{ij}}{\sum_k A_{ik}}\]where \(A_{ij}>0\) are random draws.
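A sketch of such a generator, following the four steps above (the function name and defaults mirror the helper described in the appendix, but the exact implementation details here are an assumption):

```python
import numpy as np

def random_stochastic_matrix(n=3, decimals=1, seed=None):
    """Sketch of the generator: sample, normalize, round, repair row sums."""
    rng = np.random.default_rng(seed)
    A = rng.random((n, n))                    # random positive entries
    P = A / A.sum(axis=1, keepdims=True)      # normalize each row to sum to 1
    P = np.round(P, decimals)                 # round for readability
    P[:, -1] = 1.0 - P[:, :-1].sum(axis=1)    # fix the last entry so rows sum exactly to 1
    return P

P = random_stochastic_matrix(4, decimals=2, seed=0)
print(P)
assert np.allclose(P.sum(axis=1), 1.0)
```

The final repair step matters: after rounding, rows generally sum to something slightly off \(1\), and adjusting the last entry restores exact stochasticity.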
Most randomly generated positive matrices are:
- irreducible
- aperiodic
- ergodic
because positive entries imply strong connectivity.
Communicating Classes
One of the most important structural ideas in Markov chains is communication.
State \(i\) communicates with state \(j\) if:
\[i \leftrightarrow j\]meaning:
\[i \to j \quad \text{and} \quad j \to i\]A communicating class is a maximal set of mutually reachable states.
These classes determine the chain’s long-run structure.
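These classes can be computed mechanically from the reachability relation: take the transitive closure of "one positive-probability step", then group mutually reachable states. A minimal sketch (not the `classify_markov_chain` routine from the appendix):

```python
import numpy as np
from itertools import product

def communicating_classes(P):
    """Group states into maximal sets of mutually reachable states."""
    P = np.asarray(P)
    n = len(P)
    # reach[i, j]: can state i reach state j in zero or more steps?
    reach = (P > 0) | np.eye(n, dtype=bool)
    for k, i, j in product(range(n), repeat=3):   # Warshall transitive closure
        reach[i, j] |= reach[i, k] and reach[k, j]
    classes, assigned = [], set()
    for i in range(n):
        if i not in assigned:
            cls = sorted(j for j in range(n) if reach[i, j] and reach[j, i])
            classes.append(cls)
            assigned.update(cls)
    return classes

# The reducible chain from Section 4: four singleton classes.
P = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0.5, 0, 0.5, 0],
     [0, 0.5, 0, 0.5]]
print(communicating_classes(P))  # [[0], [1], [2], [3]]
```

For the Section 4 chain, state 2 can reach state 0 but not vice versa, so no two states communicate and every class is a singleton.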
Transient vs Recurrent States
A state is:
- recurrent if it is revisited infinitely often
- transient if it can eventually be abandoned forever
For finite chains:
- closed irreducible classes are recurrent
- states outside closed classes are transient
This decomposition is fundamental to Markov chain theory.
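The finite-chain rule above gives a direct test: a state \(i\) is recurrent exactly when every state reachable from \(i\) can also reach \(i\) back, i.e. when its class is closed. A sketch (the helper name is my own):

```python
import numpy as np

def recurrent_states(P):
    """Finite-chain rule: i is recurrent iff every state reachable
    from i can also reach i back (i.e. i's class is closed)."""
    P = np.asarray(P)
    n = len(P)
    reach = (P > 0) | np.eye(n, dtype=bool)
    for k in range(n):                 # Warshall transitive closure
        for i in range(n):
            for j in range(n):
                if reach[i, k] and reach[k, j]:
                    reach[i, j] = True
    return [i for i in range(n)
            if all(reach[j, i] for j in range(n) if reach[i, j])]

# The transient-to-closed example from Section 6:
P = [[0.2, 0.8, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.4, 0.6]]
print(recurrent_states(P))  # [1, 2]
```

As expected, state 0 is transient (it reaches state 1, which never returns), while \(\{1,2\}\) is the recurrent closed class.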
Stationary Distributions
A stationary distribution satisfies:
\[\pi P = \pi\]Interpretation:
If the chain starts distributed as \(\pi\), it remains distributed as \(\pi\) forever.
For ergodic chains:
\[\lim_{n\to\infty} P^n(i,j)=\pi_j\]independent of the starting state.
This is the mathematical basis of:
- PageRank
- MCMC
- Gibbs sampling
- Hidden Markov Models
- stochastic optimization
Final Thoughts
These examples capture the major behaviors finite Markov chains can exhibit:
| Type | Key Feature |
|---|---|
| Ergodic | Converges to equilibrium |
| Periodic | Oscillates forever |
| Absorbing | Gets trapped |
| Reducible | Splits into disconnected regions |
| Fully connected | Rapid mixing |
| Transient flow | Temporary states vanish |
Markov chains are deceptively simple objects.
Yet from the single equation
\[P(X_{t+1}=j \mid X_t=i)\]emerges an enormous theory connecting:
- probability
- linear algebra
- dynamical systems
- statistical mechanics
- machine learning
and modern AI itself.
Appendix: Code Walkthrough
The code begins by defining a helper function, random_stochastic_matrix(n=3, decimals=1), which generates a random stochastic matrix by first sampling a random positive matrix with np.random.rand(n, n), normalizing each row so that the entries sum to \(1\), rounding the entries for readability, and then correcting the final entry in each row to ensure exact row sums after rounding.

A dictionary called examples stores a collection of representative Markov-chain transition matrices, including ergodic chains, periodic chains, absorbing chains, reducible chains, fully connected systems, transient-to-recurrent systems, and randomly generated chains. Each dictionary key is a descriptive label, while each value is a NumPy array representing the transition matrix.

The program then initializes an empty dictionary called results and iterates through every example using for name, P in examples.items():. For each transition matrix \(P\), the code prints the matrix, calls classify_markov_chain(P) to compute structural properties of the chain, stores the returned properties in the results dictionary, and prints a formatted summary of key characteristics including irreducibility, aperiodicity, ergodicity, communicating classes, closed classes, absorbing states, recurrent states, transient states, and periods.

If the classification routine determines that a stationary distribution exists, the code prints the stationary distribution rounded to four decimal places using np.round. Overall, the script acts as a compact framework for generating, analyzing, and comparing representative finite-state Markov chains.
References
- Durrett, R. (2012). Essentials of Stochastic Processes (2nd ed.). Springer. https://doi.org/10.1007/978-1-4614-3615-7