Introduction

Enzyme kinetics is one of the foundational topics in biochemistry and computational biology. While the Michaelis–Menten equation describes the relationship between substrate concentration and reaction velocity, kinetic parameters such as \(K_m\) and \(V_{max}\) have traditionally been extracted by transforming the nonlinear relationship into a linear form.

In this project, we use Python, NumPy, and Matplotlib to analyze enzyme kinetics using five classical linearization methods:

Lineweaver–Burk
Hanes–Woolf
Eadie–Hofstee
Direct Linear Transform
Eadie–Scatchard

The script computes linear regressions, estimates kinetic constants, and visualizes each transformation with annotated plots.

The Michaelis–Menten Equation

The core equation of enzyme kinetics is:

\[v = \frac{V_{max}[S]}{K_m + [S]}\]

where:

\(v\) = reaction velocity
\([S]\) = substrate concentration
\(V_{max}\) = maximum reaction velocity
\(K_m\) = Michaelis constant

Because this equation is nonlinear, several algebraic rearrangements have historically been used to estimate kinetic parameters through linear regression.
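As a quick illustration of the equation itself, the sketch below evaluates it for a pair of made-up parameter values. The function name michaelis_menten and the numbers chosen for Vmax and Km are assumptions for demonstration only, not values taken from the script.

```python
import numpy as np

def michaelis_menten(S, Vmax, Km):
    """Reaction velocity predicted by the Michaelis-Menten equation."""
    S = np.asarray(S, dtype=float)
    return Vmax * S / (Km + S)

# Illustrative (assumed) parameters, not fitted results
Vmax, Km = 80.0, 4.0e-5

v_half = michaelis_menten(Km, Vmax, Km)        # at [S] = Km the velocity is Vmax/2
v_high = michaelis_menten(100 * Km, Vmax, Km)  # at high [S] the velocity approaches Vmax
print(v_half, v_high)
```

The first value shows why \(K_m\) is described as the substrate concentration at half-maximal velocity; the second shows the saturation behavior.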

Dataset

The experimental dataset (Segel, 1976) consists of substrate concentrations and measured reaction velocities.

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5,
              2.50e-5, 3.33e-5, 4.00e-5, 5.00e-5,
              6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])

v = np.array([13.8, 16.0, 19.0, 23.6, 26.7,
              30.8, 36.3, 40.0, 44.4,
              48.0, 53.4, 57.1, 66.7])

These values represent a classic saturation curve where reaction velocity increases with substrate concentration and gradually approaches \(V_{max}\).

Building a Reusable Linear Regression Function

To avoid repetitive code, the script defines a helper function for linear fitting:

def fit_line(x, y):
    m, b = np.polyfit(x, y, 1)
    yhat = m*x + b
    r2 = 1 - np.sum((y-yhat)**2)/np.sum((y-y.mean())**2)
    return m, b, r2

This function computes:

slope (\(m\))
intercept (\(b\))
coefficient of determination (\(R^2\))

The \(R^2\) value helps evaluate how well the transformed data follows a linear relationship.
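A small usage sketch may help here. It repeats fit_line so that it runs on its own and applies it to synthetic points (invented for illustration) that lie exactly on a line, then to the same points with fixed offsets added.

```python
import numpy as np

def fit_line(x, y):
    m, b = np.polyfit(x, y, 1)
    yhat = m*x + b
    r2 = 1 - np.sum((y-yhat)**2)/np.sum((y-y.mean())**2)
    return m, b, r2

# Synthetic points lying exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
m, b, r2 = fit_line(x, y)

# The same points with fixed offsets added; R^2 drops below 1
y_noisy = y + np.array([0.3, -0.2, 0.4, -0.5, 0.1])
m_n, b_n, r2_n = fit_line(x, y_noisy)

print(m, b, r2)
print(m_n, b_n, r2_n)
```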

1. Lineweaver–Burk Plot

The Lineweaver–Burk transformation takes the reciprocal of both sides of the Michaelis–Menten equation.

\[\frac{1}{v} = \frac{K_m}{V_{max}}\frac{1}{[S]} + \frac{1}{V_{max}}\]

This produces a straight line where:

slope = \(\frac{K_m}{V_{max}}\)
intercept = \(\frac{1}{V_{max}}\)

Advantages

Historically important
Easy visual interpretation

Drawbacks

Strongly amplifies experimental error at low substrate concentrations
Reciprocal transformations distort variance

The script calculates:

Vmax = 100 / b
Km = (m * Vmax) / 1e6

and overlays the regression line directly on the scatter plot.
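For comparison, here is a minimal sketch of the same estimate without any axis scaling, so the slope and intercept map directly onto the equation above. It also illustrates the error amplification listed under Drawbacks, using a hypothetical error of 0.5 in \(v\); that error size and the variable names are assumptions.

```python
import numpy as np

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5, 3.33e-5,
              4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])
v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8, 36.3,
              40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

# Fit 1/v against 1/[S] with no axis scaling
slope, intercept = np.polyfit(1.0 / S, 1.0 / v, 1)
Vmax_lb = 1.0 / intercept   # intercept = 1/Vmax
Km_lb = slope * Vmax_lb     # slope = Km/Vmax

# Shift in 1/v caused by the same hypothetical error in v at both ends of the data
dv = 0.5
shift_low = abs(1.0 / (v[0] + dv) - 1.0 / v[0])     # lowest [S]
shift_high = abs(1.0 / (v[-1] + dv) - 1.0 / v[-1])  # highest [S]
amplification = shift_low / shift_high

print(Vmax_lb, Km_lb, amplification)
```

The ratio printed last compares how far the same absolute error moves the lowest-concentration point versus the highest-concentration point on the reciprocal axis.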

2. Hanes–Woolf Plot

The Hanes–Woolf transformation rearranges the equation into:

\[\frac{[S]}{v} = \frac{1}{V_{max}}[S] + \frac{K_m}{V_{max}}\]

Compared with Lineweaver–Burk, this method reduces the weighting problem caused by reciprocal velocity terms.

Advantages

Less sensitive to low-concentration noise
More stable regression behavior

Interpretation

slope = \(\frac{1}{V_{max}}\)
intercept = \(\frac{K_m}{V_{max}}\)

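This transformation is often considered more reliable than Lineweaver–Burk for experimental datasets, because the low-concentration points do not dominate the fit to the same degree.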
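A minimal unscaled sketch of the Hanes–Woolf estimate, assuming the same dataset, could look like this. The variable names are illustrative.

```python
import numpy as np

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5, 3.33e-5,
              4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])
v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8, 36.3,
              40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

# Fit [S]/v against [S] with no axis scaling
slope, intercept = np.polyfit(S, S / v, 1)
Vmax_hw = 1.0 / slope          # slope = 1/Vmax
Km_hw = intercept * Vmax_hw    # intercept = Km/Vmax

print(Vmax_hw, Km_hw)
```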

3. Eadie–Hofstee Plot

The Eadie–Hofstee equation rearranges Michaelis–Menten into:

\[v = -K_m\left(\frac{v}{[S]}\right) + V_{max}\]

Unlike reciprocal methods, this transformation uses velocity on both axes.

Interpretation

slope = \(-K_m\)
intercept = \(V_{max}\)

Advantages

Reduced distortion from reciprocal scaling
Frequently produces visually balanced plots

Drawbacks

Correlated errors because \(v\) appears in both variables
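The sketch below fits the unscaled Eadie–Hofstee form and then shows the correlated-error issue directly: perturbing a single velocity by a hypothetical 0.5 moves that point along both axes at once. The perturbation size is an assumption for illustration.

```python
import numpy as np

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5, 3.33e-5,
              4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])
v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8, 36.3,
              40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

# Fit v against v/[S] with no axis scaling
slope, intercept = np.polyfit(v / S, v, 1)
Km_eh = -slope        # slope = -Km
Vmax_eh = intercept   # intercept = Vmax

# A single error in v shifts both coordinates of the plotted point
dv = 0.5
dx = (v[0] + dv) / S[0] - v[0] / S[0]   # change along the x-axis (v/[S])
dy = (v[0] + dv) - v[0]                 # change along the y-axis (v)

print(Vmax_eh, Km_eh, dx, dy)
```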

4. Direct Linear Transform

The direct linear transform uses:

\[[S] = V_{max}\left(\frac{[S]}{v}\right) - K_m\]

This formulation is the Hanes–Woolf relation with the axes exchanged: \([S]\) is plotted against \([S]/v\), so both forms use the same two variables.

Interpretation

slope = \(V_{max}\)
intercept = \(-K_m\)

This method is less commonly discussed in introductory biochemistry courses but can provide intuitive geometric insight.
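As a sketch under the same dataset, the unscaled fit reads \(V_{max}\) straight off the slope and \(K_m\) off the negated intercept. The variable names are illustrative.

```python
import numpy as np

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5, 3.33e-5,
              4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])
v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8, 36.3,
              40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

# Fit [S] (y) against [S]/v (x) with no axis scaling
slope, intercept = np.polyfit(S / v, S, 1)
Vmax_dl = slope        # slope = Vmax
Km_dl = -intercept     # intercept = -Km

print(Vmax_dl, Km_dl)
```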

5. Eadie–Scatchard Plot

The Eadie–Scatchard transformation is:

\[\frac{v}{[S]} = -\frac{1}{K_m}v + \frac{V_{max}}{K_m}\]

Interpretation

slope = \(-\frac{1}{K_m}\)
intercept = \(\frac{V_{max}}{K_m}\)

This method is mathematically related to the Eadie–Hofstee plot but flips the variable arrangement.
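A minimal unscaled sketch for this form follows. It also computes the x-intercept of the fitted line, which by the equation above is where \(v/[S] = 0\) and therefore equals \(V_{max}\). The variable names are illustrative.

```python
import numpy as np

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5, 3.33e-5,
              4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])
v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8, 36.3,
              40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

# Fit v/[S] (y) against v (x) with no axis scaling
slope, intercept = np.polyfit(v, v / S, 1)
Km_es = -1.0 / slope           # slope = -1/Km
Vmax_es = intercept * Km_es    # intercept = Vmax/Km

# The fitted line crosses the x-axis at v = Vmax
x_intercept = -intercept / slope

print(Km_es, Vmax_es, x_intercept)
```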

Visualization Strategy

The script uses Matplotlib to generate publication-style figures with:

scatter plots for experimental data
regression lines
equation annotations
estimated \(K_m\)
estimated \(V_{max}\)
slope/intercept values
\(R^2\) statistics

Example plotting structure:

plt.scatter(x, y)
plt.plot(x, m*x + b)

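plt.title(
    "Lineweaver-Burk\n"
    rf"$K_m$ = {Km:.3e} M, "
    rf"$V_{{max}}$ = {Vmax:.2f}"
)

The use of raw formatted strings (rf"") allows LaTeX rendering and variable interpolation simultaneously. The first line is an ordinary string because "\n" inside a raw string is kept as a literal backslash followed by n rather than becoming a line break.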

Why Multiple Linearizations Matter

Different transformations emphasize different regions of the experimental data.

Method	Strength	Weakness
Lineweaver–Burk	Simple interpretation	Distorts low-\([S]\) errors
Hanes–Woolf	More stable regression	Still a linearization, so errors are transformed
Eadie–Hofstee	Balanced visualization	Correlated variables
Direct Linear	Geometric intuition	Less commonly used
Eadie–Scatchard	Useful alternative form	Error propagation issues

Modern enzyme kinetics often favors nonlinear regression directly on the Michaelis–Menten equation, but these classical plots remain valuable educational and analytical tools.
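To make the contrast concrete, here is a rough sketch of a direct nonlinear fit using only NumPy. It is not part of the original script: it uses a hand-written Gauss–Newton iteration, seeded with a Hanes–Woolf estimate, as a stand-in for a library routine (SciPy's curve_fit would be a common choice in practice). The function name mm, the iteration cap, and the stopping tolerance are all assumptions.

```python
import numpy as np

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5, 3.33e-5,
              4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])
v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8, 36.3,
              40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

def mm(S, Vmax, Km):
    """Michaelis-Menten model."""
    return Vmax * S / (Km + S)

# Starting guess from a Hanes-Woolf fit ([S]/v against [S])
slope, intercept = np.polyfit(S, S / v, 1)
Vmax, Km = 1.0 / slope, intercept / slope

# Gauss-Newton iterations on the untransformed residuals
for _ in range(20):
    residual = v - mm(S, Vmax, Km)
    # Jacobian of the model with respect to (Vmax, Km)
    J = np.column_stack([S / (Km + S), -Vmax * S / (Km + S) ** 2])
    step, *_ = np.linalg.lstsq(J, residual, rcond=None)
    Vmax, Km = Vmax + step[0], Km + step[1]
    if abs(step[0]) < 1e-8 * abs(Vmax) and abs(step[1]) < 1e-8 * abs(Km):
        break

sse = np.sum((v - mm(S, Vmax, Km)) ** 2)
print(Vmax, Km, sse)
```

Because the residuals are measured in \(v\) itself, no point is reweighted by a reciprocal or ratio transformation.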

Scientific Computing Takeaways

This project demonstrates several important computational techniques:

numerical linear regression with NumPy
statistical goodness-of-fit analysis
scientific plotting with Matplotlib
biochemical parameter estimation
equation visualization using LaTeX formatting

It also highlights how mathematical transformations can drastically change the interpretation and stability of experimental data.

Final Thoughts

Linear transformations of the Michaelis–Menten equation are a classic example of how mathematics and computation intersect with biochemistry. Although nonlinear fitting methods are now standard in research environments, understanding these linear plots provides deep intuition about enzyme behavior, parameter estimation, and error propagation.

With only NumPy and Matplotlib, Python becomes a powerful environment for biochemical data analysis and scientific visualization.

Appendix: Code Walkthrough

This appendix explains what each section of the Python script does and how the values of \(K_m\), \(V_{max}\), and \(R^2\) are obtained.

1. Importing Libraries

import numpy as np
import matplotlib.pyplot as plt

The script uses NumPy for numerical calculations and Matplotlib for plotting.

NumPy handles arrays, transformations, and linear regression through np.polyfit.

Matplotlib creates the scatter plots, regression lines, labels, titles, and annotations.

2. Setting Plot Styles

plt.rcParams.update({
    "font.size": 12,
    "axes.titlesize": 13,
    "axes.labelsize": 12
})

This updates the default appearance of all plots.

The font size is increased so that axis labels, titles, and annotations are easier to read.

3. Entering the Experimental Data

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5,
              3.33e-5, 4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])

v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8,
              36.3, 40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

S stores the substrate concentrations \([S]\) in molar units.

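v stores the initial reaction velocities, which the plot labels later in the script express in \(\mathrm{nmol \cdot L^{-1} \cdot min^{-1}}\).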

Each value in S corresponds to the velocity value at the same array position in v.

For example:

S[0] = 8.33e-6
v[0] = 13.8

This means that when \([S] = 8.33 \times 10^{-6}\) M, the measured velocity is \(13.8\).

4. Defining the Linear Fit Function

def fit_line(x, y):
    m, b = np.polyfit(x, y, 1)
    yhat = m*x + b
    r2 = 1 - np.sum((y-yhat)**2)/np.sum((y-y.mean())**2)
    return m, b, r2

This function performs a linear regression of the form:

\[y = mx + b\]

where \(m\) is the slope and \(b\) is the y-intercept.

The line:

m, b = np.polyfit(x, y, 1)

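fits a first-degree polynomial, which is the straight line \(y = mx + b\), to the data. np.polyfit returns the coefficients from highest degree to lowest, so the slope comes first and the intercept second.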

The next line calculates the predicted values:

yhat = m*x + b

These are the points on the best-fit line.

The script then calculates \(R^2\):

r2 = 1 - np.sum((y-yhat)**2)/np.sum((y-y.mean())**2)

Mathematically, this is:

\[R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}\]

where \(\hat{y}_i\) is the predicted value and \(\bar{y}\) is the mean of the observed values.

A value of \(R^2\) close to \(1\) means the transformed data fits a straight line well.
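One way to sanity-check this calculation: for a straight-line least-squares fit with an intercept, \(R^2\) equals the square of the Pearson correlation coefficient between \(x\) and \(y\). The sketch below compares the two on the Lineweaver–Burk variables; using np.corrcoef here is an addition for illustration and is not part of the script.

```python
import numpy as np

def fit_line(x, y):
    m, b = np.polyfit(x, y, 1)
    yhat = m*x + b
    r2 = 1 - np.sum((y-yhat)**2)/np.sum((y-y.mean())**2)
    return m, b, r2

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5, 3.33e-5,
              4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])
v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8, 36.3,
              40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

x, y = 1.0 / S, 1.0 / v
m, b, r2 = fit_line(x, y)

# Pearson correlation coefficient between x and y
r = np.corrcoef(x, y)[0, 1]

print(r2, r ** 2)   # the two values agree up to floating-point rounding
```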

5. Scaling the Data

S_s = S * 1e5
v_s = v

invS_s = (1/S) * 1e-4
invv_s = (1/v) * 100

v_by_S_s = (v/S) * 1e-6
S_by_v_s = (S/v) * 1e6

These lines create scaled variables for plotting.

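The scaling keeps each relationship linear. It multiplies the fitted slope and intercept by known constants, which the script divides out again when it computes \(K_m\) and \(V_{max}\), and it makes the axis numbers easier to read.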

For example:

S_s = S * 1e5

converts very small molar concentrations into more convenient numbers.

If:

\[[S] = 8.33 \times 10^{-6}\]

then:

\[S_s = [S] \times 10^5 = 0.833\]

The reciprocal substrate concentration is scaled as:

invS_s = (1/S) * 1e-4

The reciprocal velocity is scaled as:

invv_s = (1/v) * 100

The velocity-over-substrate term is scaled as:

v_by_S_s = (v/S) * 1e-6

The substrate-over-velocity term is scaled as:

S_by_v_s = (S/v) * 1e6

These scaled variables are used in the five transformed plots.
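The effect of scaling on a fit can be checked numerically. This sketch, which is an illustration rather than part of the script, fits the Lineweaver–Burk variables with and without scaling and compares the results: the slope changes by the ratio of the \(y\) and \(x\) scale factors, and the intercept by the \(y\) factor alone.

```python
import numpy as np

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5, 3.33e-5,
              4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])
v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8, 36.3,
              40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

x_raw, y_raw = 1.0 / S, 1.0 / v
x_scaled, y_scaled = x_raw * 1e-4, y_raw * 100   # same factors as invS_s and invv_s

m_raw, b_raw = np.polyfit(x_raw, y_raw, 1)
m_scaled, b_scaled = np.polyfit(x_scaled, y_scaled, 1)

slope_ratio = m_scaled / m_raw        # expected: 100 / 1e-4 = 1e6
intercept_ratio = b_scaled / b_raw    # expected: 100

print(slope_ratio, intercept_ratio)
```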

6. Lineweaver–Burk Section

x = invS_s
y = invv_s
m, b, r2 = fit_line(x, y)

For the Lineweaver–Burk plot, the script sets:

\[x = \frac{1}{[S]}\]

and:

\[y = \frac{1}{v}\]

The Lineweaver–Burk equation is:

\[\frac{1}{v} = \frac{K_m}{V_{max}}\frac{1}{[S]} + \frac{1}{V_{max}}\]

After fitting the line, the script extracts \(V_{max}\) and \(K_m\):

Vmax = 100 / b
Km = (m * Vmax) / 1e6

Because the plotted \(y\)-axis is scaled as:

\[y = \frac{1}{v} \times 100\]

the intercept is:

\[b = \frac{100}{V_{max}}\]

Solving for \(V_{max}\) gives:

\[V_{max} = \frac{100}{b}\]

Because the plotted \(x\)-axis is scaled as:

\[x = \frac{1}{[S]} \times 10^{-4}\]

the slope must be corrected for the scaling. The script calculates:

\[K_m = \frac{mV_{max}}{10^6}\]
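The factor of \(10^6\) can be traced by substituting the scaled variables into the Lineweaver–Burk equation. In this short derivation, \(x\) and \(y\) are the scaled plot variables and \(m\) and \(b\) are the fitted slope and intercept. Since \(1/[S] = 10^{4}x\):

\[y = \frac{100}{v} = 100\,\frac{K_m}{V_{max}}\cdot\frac{1}{[S]} + \frac{100}{V_{max}} = \left(10^{6}\,\frac{K_m}{V_{max}}\right)x + \frac{100}{V_{max}}\]

Comparing with \(y = mx + b\) gives \(m = 10^{6}K_m/V_{max}\) and \(b = 100/V_{max}\), which rearrange to the two expressions used in the script.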

The plot is then created using:

plt.scatter(x, y)
plt.plot(x, m*x + b)

The scatter points show the transformed data, and the line shows the linear fit.

7. Hanes–Woolf Section

x = S_s
y = S_by_v_s
m, b, r2 = fit_line(x, y)

For the Hanes–Woolf plot, the script uses:

\[x = [S]\]

and:

\[y = \frac{[S]}{v}\]

The Hanes–Woolf equation is:

\[\frac{[S]}{v} = \frac{1}{V_{max}}[S] + \frac{K_m}{V_{max}}\]

The script calculates:

Vmax = 10 / m
Km = (b * Vmax) / 1e6

Without scaling, the slope would be \(1/V_{max}\). Because the \(y\)-axis is scaled by \(10^6\) and the \(x\)-axis by \(10^5\), the fitted slope is \(m = 10/V_{max}\), which gives \(V_{max} = 10/m\).

The intercept is related to:

\[\frac{K_m}{V_{max}}\]

so multiplying the intercept by \(V_{max}\) gives \(K_m\), and dividing by \(10^6\) removes the \(y\)-axis scaling.

8. Eadie–Hofstee Section

x = v_by_S_s
y = v_s
m, b, r2 = fit_line(x, y)

For the Eadie–Hofstee plot, the script uses:

\[x = \frac{v}{[S]}\]

and:

\[y = v\]

The Eadie–Hofstee equation is:

\[v = -K_m\left(\frac{v}{[S]}\right) + V_{max}\]

The script extracts the parameters using:

Vmax = b
Km = -m / 1e6

The intercept gives \(V_{max}\) directly:

\[V_{max} = b\]

The slope is related to \(-K_m\), but the \(x\)-axis is scaled by \(10^{-6}\), so the script corrects the slope using:

\[K_m = \frac{-m}{10^6}\]

9. Direct Linear Transform Section

x = S_by_v_s
y = S_s
m, b, r2 = fit_line(x, y)

For the direct linear transform, the script uses:

\[x = \frac{[S]}{v}\]

and:

\[y = [S]\]

The transformed equation is:

\[[S] = V_{max}\left(\frac{[S]}{v}\right) - K_m\]

The script calculates:

Vmax = 10 * m
Km = -b / 1e5

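Without scaling, the slope would be \(V_{max}\). Because the \(y\)-axis is scaled by \(10^5\) and the \(x\)-axis by \(10^6\), the fitted slope is \(m = V_{max}/10\), so \(V_{max} = 10m\).

Without scaling, the intercept would equal \(-K_m\):

\[\text{intercept} = -K_m\]

Because the plotted substrate concentration is scaled by \(10^5\), the fitted intercept is \(b = -10^5 K_m\), and the script converts it back using:

\[K_m = \frac{-b}{10^5}\]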

10. Eadie–Scatchard Section

v_M = v * 1e-9

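This converts the velocity from \(\mathrm{nmol \cdot L^{-1} \cdot min^{-1}}\) to \(\mathrm{M \cdot min^{-1}}\).

Then the script rescales the velocity and velocity-over-substrate variables for plotting. Numerically, the results match the earlier v_s and v_by_S_s up to floating-point rounding; only the unit bookkeeping is made explicit: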

v_s = v_M * 1e9
v_by_S_s = (v_M / S) * 1e3

For the Eadie–Scatchard plot, the script uses:

\[x = v\]

and:

\[y = \frac{v}{[S]}\]

The Eadie–Scatchard equation is:

\[\frac{v}{[S]} = -\frac{1}{K_m}v + \frac{V_{max}}{K_m}\]

The regression is performed with:

x = v_s
y = v_by_S_s
m, b, r2 = fit_line(x, y)

The kinetic parameters are calculated using:

Km = -1e-6 / m
Vmax = (b * Km) / 1e-6

The slope is proportional to:

\[-\frac{1}{K_m}\]

After correcting for the scaling used on both axes, the script solves for:

\[K_m = \frac{-10^{-6}}{m}\]

The intercept is proportional to:

\[\frac{V_{max}}{K_m}\]

so rearranging gives:

\[V_{max} = \frac{bK_m}{10^{-6}}\]

11. Plot Labels and Titles

Each plot uses commands like:

plt.xlabel(...)
plt.ylabel(...)
plt.title(...)

For example:

plt.xlabel(r"$v$ ($\mathrm{M\cdot min^{-1}} \times 10^{9}$)")

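The r before the string creates a raw string, in which backslashes are kept literally instead of starting escape sequences. This is useful when writing LaTeX-style math labels, because commands such as \mathrm and \cdot reach Matplotlib unchanged.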

The plot title includes calculated kinetic parameters:

plt.title(
    rf"Lineweaver-Burk"
    "\n"
    rf"$K_m = {Km:.3e}\ \mathrm{{M}},\ "
    rf"V_{{max}} = {Vmax:.2f}\ \mathrm{{nmol \cdot L^{{-1}} \cdot min^{{-1}}}}$"
)

The rf prefix means the string is both raw and formatted.

That allows math notation like:

\[K_m\]

and Python variables like:

{Km:.3e}

to appear in the same title.
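The reason the title above puts "\n" in its own ordinary string can be seen in a small sketch that needs no plotting at all. The values of Km and Vmax here are placeholders chosen for illustration, not fitted results.

```python
Km, Vmax = 4.0e-5, 80.0  # placeholder values for illustration only

# Inside a raw string, \n stays as the two characters backslash and n
raw_only = rf"Lineweaver-Burk\n$K_m$ = {Km:.3e} M"

# An ordinary literal supplies a real newline; adjacent literals are concatenated.
# Doubled braces {{ }} produce literal braces for LaTeX inside a formatted string.
mixed = (
    "Lineweaver-Burk\n"
    rf"$K_m$ = {Km:.3e} M, $V_{{max}}$ = {Vmax:.2f}"
)

print("\n" in raw_only)        # False: no line break was created
print("\n" in mixed)           # True
print(mixed.splitlines()[1])   # $K_m$ = 4.000e-05 M, $V_{max}$ = 80.00
```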

12. Text Annotations

Each plot includes a text box:

plt.text(0.20*max(x), 0.75*max(y),
         "1/v = (Km/Vmax)(1/[S]) + 1/Vmax\n"
         f"slope = {m:.2e}\n"
         f"intercept = {b:.2e}\n"
         f"$R^2 = {r2:.4f}$",
         fontsize=9)

This places explanatory text inside the plot.

The coordinates:

0.20*max(x), 0.75*max(y)

position the text relative to the largest \(x\) and \(y\) data values, so the box lands at 20% of the maximum \(x\) and 75% of the maximum \(y\).

The annotation includes:

the linearized equation
slope
intercept
\(R^2\)

This makes each figure self-contained.
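An alternative worth knowing, though not what the script does, is to place the text in axes-fraction coordinates so that the position does not depend on the data range. The sketch below uses Matplotlib's transform=ax.transAxes for this. The data points are placeholders, and the Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
m, b = np.polyfit(x, y, 1)

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.plot(x, m * x + b)

# (0.05, 0.95) is measured in axes fractions: 5% from the left, 95% from the bottom
label = ax.text(0.05, 0.95,
                f"slope = {m:.2e}\nintercept = {b:.2e}",
                transform=ax.transAxes, va="top", fontsize=9)

position = label.get_position()
plt.close(fig)
```

With data coordinates, as in the script, the box moves when the data range changes; with axes fractions it stays in the same corner of every plot.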

13. Displaying Each Plot

plt.show()

This displays the current figure.

Because the script calls plt.figure() before each plot, each method appears in its own separate window or output cell.

14. Overall Code Flow

The full script follows this pattern five times:

Choose the transformed \(x\) and \(y\) variables.
Fit a straight line with fit_line(x, y).
Convert slope and intercept into \(K_m\) and \(V_{max}\).
Create a scatter plot.
Add the regression line.
Label the axes.
Add a title with the calculated kinetic constants.
Annotate the plot with the equation, slope, intercept, and \(R^2\).
Display the figure.

The repeated structure makes the code easy to compare across the five linearization methods.
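The repeated pattern can also be sketched as a single loop. This is a hypothetical restructuring, not the script's actual layout: the methods list and the extract lambdas are introduced here for illustration, while the scaled variables and conversion formulas are the ones described in the walkthrough. Plotting is omitted to keep the sketch short.

```python
import numpy as np

def fit_line(x, y):
    m, b = np.polyfit(x, y, 1)
    yhat = m*x + b
    r2 = 1 - np.sum((y-yhat)**2)/np.sum((y-y.mean())**2)
    return m, b, r2

S = np.array([8.33e-6, 1.00e-5, 1.25e-5, 1.67e-5, 2.00e-5, 2.50e-5, 3.33e-5,
              4.00e-5, 5.00e-5, 6.00e-5, 8.00e-5, 1.00e-4, 2.00e-4])
v = np.array([13.8, 16.0, 19.0, 23.6, 26.7, 30.8, 36.3,
              40.0, 44.4, 48.0, 53.4, 57.1, 66.7])

# Scaled variables as defined in the walkthrough
S_s, v_s = S * 1e5, v
invS_s, invv_s = (1 / S) * 1e-4, (1 / v) * 100
v_by_S_s, S_by_v_s = (v / S) * 1e-6, (S / v) * 1e6

# Each entry: name, x, y, and a function mapping (m, b) to (Km, Vmax)
methods = [
    ("Lineweaver-Burk", invS_s, invv_s,
     lambda m, b: (m * (100 / b) / 1e6, 100 / b)),
    ("Hanes-Woolf", S_s, S_by_v_s,
     lambda m, b: (b * (10 / m) / 1e6, 10 / m)),
    ("Eadie-Hofstee", v_by_S_s, v_s,
     lambda m, b: (-m / 1e6, b)),
    ("Direct linear", S_by_v_s, S_s,
     lambda m, b: (-b / 1e5, 10 * m)),
    ("Eadie-Scatchard", v_s, v_by_S_s,
     lambda m, b: (-1e-6 / m, b * (-1e-6 / m) / 1e-6)),
]

results = {}
for name, x, y, extract in methods:
    m, b, r2 = fit_line(x, y)       # fit the transformed variables
    Km, Vmax = extract(m, b)        # undo the scaling and solve for the constants
    results[name] = (Km, Vmax, r2)
    print(f"{name:16s} Km = {Km:.3e} M  Vmax = {Vmax:.2f}  R^2 = {r2:.4f}")
```

Keeping the conversion formula next to each pair of variables makes it easier to see that the scaling corrections differ from method to method.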

References

Segel, I. H. (1976). Biochemical Calculations (2nd ed.). John Wiley & Sons.