Part: control Week 14 Published nonlinear.py test_nonlinear.py

Nonlinear Control: Lyapunov Design, Feedback Linearization, and Sliding Modes

When the plant is nonlinear, eigenvalues and Riccati equations describe only a local picture. This chapter builds the tools that replace them: Lyapunov's direct method and LaSalle's invariance principle as global stability certificates, feedback linearization that cancels the nonlinearity by coordinate change, sliding-mode control that enforces a surface in finite time and is robust to matched uncertainty, and input-to-state stability and backstepping as the constructive bridge to robust and adaptive design — with the Lyapunov function read as the control-theoretic cousin of the reinforcement-learning value function.

Where we are. Weeks 11–13 lived in the linear world: a state-space model, the structural tests of controllability and observability, and the linear-quadratic regulator that solved the optimal-control problem in closed form. Every one of those tools is, at bottom, an eigenvalue statement — stability is the spectrum of $\statemat$ , the LQR closed loop is Schur because the Riccati matrix makes it so. But the linear model is only the tangent picture at one operating point. A pendulum, a quadrotor, a robot arm, a power grid: linearize and you get a local certificate that says nothing about the basin of attraction, the swing-up, or behavior far from equilibrium. This chapter asks the Week-14 question: what replaces eigenvalues and Riccati equations when the plant is genuinely nonlinear? The answer is a shift from spectra to energy — from “where are the poles” to “does a scalar certificate decrease along every trajectory.”

Chapter 14 — at a glance

Goal. Build the four load-bearing tools of nonlinear control — Lyapunov’s direct method (+ LaSalle), feedback linearization, sliding-mode control, and input-to-state stability with backstepping — each as a constructive design method, not just an analysis test, and each checked numerically on the inverted pendulum.

Reading time. ~55 minutes; ~95 with the proofs and exercises.

Key insight — the Lyapunov / value-function bridge. A Lyapunov function $\lyap(\statevec)$ is the control-theoretic cousin of the reinforcement-learning value function. Chapter 1’s value decreases in expectation under a good policy (the Bellman operator is a contraction); a Lyapunov certificate decreases along every trajectory, $\dot{\lyap} < 0$ . Chapter 13 made the kinship literal: the optimal cost-to-go $\statevec^\top\riccati\statevec$ is both the value function and a Lyapunov function. Optimal control derives $\lyap$ by solving the whole problem; nonlinear control designs $\lyap$ directly, trading optimality for a sample-free, certified controller that exploits structure. That trade — structure versus sampling — is the chapter’s thesis and the hinge to Part III.

Why the linear tools run out

A nonlinear system $\dot{\statevec} = f(\statevec, u)$ , $f(0,0)=0$ , linearized at the origin gives $\dot{\statevec} \approx \statemat\statevec + \inputmat u$ with $\statemat = \partial f/\partial\statevec$ . Lyapunov’s indirect method says: if $\statemat$ (closed-loop) is Hurwitz, the origin is locally asymptotically stable — and that is all it says. It is silent on how large the basin is, blind to multiple equilibria and limit cycles, and useless when the linearization is marginal (eigenvalues on the imaginary axis), which is exactly when nonlinear terms decide stability. The pendulum makes this concrete: about the hanging equilibrium the linearization is a lightly damped oscillator, but the global behavior — every release angle spiraling to rest — is a nonlinear, energy statement the eigenvalues cannot certify. We need a tool that sees the whole state space at once.
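As a minimal sketch of what the indirect method delivers, the snippet below linearizes a damped pendulum numerically at its hanging and upright equilibria and reads off the eigenvalues. The parameter values, the function names, and the finite-difference Jacobian are illustrative assumptions, not the companion's code.

```python
import cmath
import math

G_OVER_L = 9.81  # g / l, illustrative value
DAMPING = 0.2    # d, illustrative value


def pendulum(state):
    """Damped pendulum, angle from hanging: phi'' = -(g/l) sin(phi) - d phi'."""
    phi, omega = state
    return (omega, -G_OVER_L * math.sin(phi) - DAMPING * omega)


def jacobian(f, x0, eps=1e-6):
    """Central-difference Jacobian; entry [i][j] approximates d f_i / d x_j at x0."""
    n = len(x0)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        hi, lo = list(x0), list(x0)
        hi[j] += eps
        lo[j] -= eps
        f_hi, f_lo = f(hi), f(lo)
        for i in range(n):
            jac[i][j] = (f_hi[i] - f_lo[i]) / (2.0 * eps)
    return jac


def eig2(mat):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = mat[0][0] + mat[1][1]
    det = mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + disc) / 2.0, (tr - disc) / 2.0)


eigs_hanging = eig2(jacobian(pendulum, (0.0, 0.0)))
eigs_upright = eig2(jacobian(pendulum, (math.pi, 0.0)))

print("hanging:", eigs_hanging)  # complex pair with negative real part
print("upright:", eigs_upright)  # one eigenvalue with positive real part
```

Both results are local statements about one equilibrium each; nothing in them says which initial angles end up at rest, which is the gap the rest of the chapter fills.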

Lyapunov’s direct method

The idea is mechanical-energy made abstract. Find a scalar $\lyap(\statevec) \geq 0$ that is zero only at the equilibrium and decreases along trajectories; then trajectories cannot escape its sublevel sets, and if the decrease is strict they slide down to the equilibrium. No solution of the differential equation is required — the certificate is checked by differentiation alone.

Definition 14.1 (Stability of an equilibrium).

The origin of $\dot{\statevec} = f(\statevec)$ , $f(0)=0$ , is stable (in the sense of Lyapunov) if for every $\varepsilon > 0$ there is a $\delta > 0$ such that $\norm{\statevec(0)} < \delta$ implies $\norm{\statevec(t)} < \varepsilon$ for all $t \geq 0$ ; asymptotically stable if it is stable and $\statevec(t) \to 0$ for all $\statevec(0)$ near the origin; and globally asymptotically stable if that holds for every $\statevec(0)$ .

Theorem 14.1 (Lyapunov's direct method).

Let $\lyap : \mathcal{D} \to \mathbb{R}$ be continuously differentiable on a neighborhood $\mathcal{D}$ of the origin, positive definite ($\lyap(0)=0$ and $\lyap(\statevec)>0$ for $\statevec \neq 0$), with derivative along the flow $\dot{\lyap}(\statevec) = \nabla\lyap(\statevec)^\top f(\statevec)$. If $\dot{\lyap}(\statevec) \leq 0$ on $\mathcal{D}$ the origin is stable; if $\dot{\lyap}(\statevec) < 0$ for $\statevec \neq 0$ it is asymptotically stable; and if additionally these conditions hold on $\mathcal{D} = \mathbb{R}^n$ and $\lyap$ is radially unbounded ($\lyap(\statevec)\to\infty$ as $\norm{\statevec}\to\infty$) it is globally asymptotically stable.

Proof.

Fix $\varepsilon>0$ small enough that the closed ball $\{\norm{\statevec}\leq\varepsilon\}$ lies in $\mathcal{D}$. Let $m = \min_{\norm{\statevec}=\varepsilon}\lyap(\statevec) > 0$ by positive definiteness, and choose $\delta \in (0,\varepsilon)$ so that $\lyap(\statevec) < m$ on $\norm{\statevec}<\delta$ (possible by continuity and $\lyap(0)=0$). Since $\dot{\lyap}\leq 0$, $\lyap(\statevec(t))$ is non-increasing, so a trajectory starting inside $\norm{\statevec}<\delta$ has $\lyap(\statevec(t)) < m$ for all $t$ and can never reach the shell $\norm{\statevec}=\varepsilon$: this is stability. If $\dot{\lyap}<0$ strictly, $\lyap(\statevec(t))$ is decreasing and bounded below by $0$, hence convergent to some $\lyap_\infty \geq 0$. If $\lyap_\infty > 0$, the trajectory would stay in a compact set bounded away from the origin on which $\dot{\lyap} \leq -\mu < 0$, forcing $\lyap \to -\infty$, a contradiction; so $\lyap_\infty = 0$ and $\statevec(t)\to 0$, which is asymptotic stability. Radial unboundedness makes every sublevel set bounded, hence compact, so the argument applies from every initial state. $\qquad\blacksquare$
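A short worked example, chosen here for illustration (it is not taken from the companion), shows the direct method deciding a case the linearization cannot. For the scalar system $\dot x = -x^3$ the linearization at the origin is $\dot x = 0$, a marginal case, yet the candidate $\lyap(x) = \tfrac12 x^2$ settles it:

$$
\begin{aligned}
\lyap(x) &= \tfrac12 x^2 > 0 \quad (x \neq 0), \qquad \lyap(x)\to\infty \ \text{as}\ \abs{x}\to\infty,\\
\dot{\lyap}(x) &= \nabla\lyap(x)\, f(x) = x\cdot(-x^3) = -x^4 < 0 \quad (x \neq 0).
\end{aligned}
$$

By Theorem 14.1 the origin is globally asymptotically stable. Flipping the sign to $\dot x = +x^3$ leaves the linearization unchanged but gives $\dot{\lyap} = x^4 > 0$, so the same marginal linearization can also hide an unstable origin.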

The pendulum is the worked example.

Take the energy $\lyap(\phi,\dot\phi) = \tfrac12\dot\phi^2 + (g/\ell)(1-\cos\phi)$, positive definite about the hanging equilibrium. Along the damped dynamics its rate is $\dot{\lyap} = -d\,\dot\phi^2 \leq 0$, which is negative semidefinite rather than definite because it vanishes on the whole line $\dot\phi = 0$, not just at the origin. Theorem 14.1 then gives only stability, not asymptotic stability, even though we know every trajectory decays. The gap is closed by an invariance argument Khalil (2002).

Theorem 14.2 (LaSalle's invariance principle).

Let $\Omega$ be a compact set, positively invariant under $\dot{\statevec}=f(\statevec)$ , and let $\lyap$ be continuously differentiable with $\dot{\lyap}\leq 0$ on $\Omega$ . Then every trajectory starting in $\Omega$ converges to the largest invariant set contained in $\{\statevec \in \Omega : \dot{\lyap}(\statevec)=0\}$ .

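For the damped pendulum, take $\Omega$ to be the component around the origin of a sublevel set $\{\lyap \leq c\}$ with $c < 2g/\ell$: it is compact, positively invariant because $\dot{\lyap}\leq 0$, and excludes the upright equilibrium. On $\Omega$ the set $\{\dot{\lyap}=0\}$ is $\{\dot\phi=0\}$, and the only complete trajectory that stays there is the hanging equilibrium (if $\dot\phi\equiv 0$ then $\ddot\phi\equiv 0$ forces $\sin\phi=0$, and $\phi=0$ is the only such point in $\Omega$). LaSalle therefore upgrades stability to asymptotic stability with a semidefinite $\dot{\lyap}$. The companion confirms it: energy decreases monotonically and the state converges to the origin.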
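The LaSalle conclusion can be checked numerically with a small simulation. The sketch below is independent of the companion: the parameter values, the RK4 helper, and the release angle (chosen so the initial energy sits below the upright level $2g/\ell$) are all assumptions made for illustration.

```python
import math

G_OVER_L = 9.81  # g / l, illustrative value
DAMPING = 0.5    # d > 0, illustrative value


def f(state):
    """Damped hanging pendulum: phi'' = -(g/l) sin(phi) - d phi'."""
    phi, omega = state
    return (omega, -G_OVER_L * math.sin(phi) - DAMPING * omega)


def energy(state):
    """Energy Lyapunov candidate V = 0.5 phi'^2 + (g/l)(1 - cos phi)."""
    phi, omega = state
    return 0.5 * omega ** 2 + G_OVER_L * (1.0 - math.cos(phi))


def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for an autonomous system."""
    k1 = f(x)
    k2 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k1)))
    k3 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k2)))
    k4 = f(tuple(xi + dt * ki for xi, ki in zip(x, k3)))
    return tuple(
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


state = (2.0, 0.0)  # released at rest from 2 rad; energy is below 2 g/l
dt = 1e-3
energies = [energy(state)]
for _ in range(40000):  # 40 seconds of simulated time
    state = rk4_step(f, state, dt)
    energies.append(energy(state))

# Largest step-to-step increase of V; should be no bigger than round-off.
max_increase = max(b - a for a, b in zip(energies, energies[1:]))
print("final state:", state)
print("largest energy increase between steps:", max_increase)
```

The energy never rises beyond floating-point noise even at the turning points where $\dot\phi = 0$ and $\dot{\lyap}$ vanishes momentarily, and the state still ends at the origin, which is what the invariance argument predicts.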

The bridge to Chapters 1 and 13. A Lyapunov function is a value function read backwards. Chapter 1’s $\valuefn_\policy$ satisfies the Bellman equation and decreases in expectation under an improving policy; a Lyapunov $\lyap$ decreases along every deterministic trajectory, $\dot{\lyap}<0$ — the autonomous, worst-case analog of “the value goes down.” Chapter 13 fused the two: there we proved $\statevec^\top\riccati\statevec$ drops by exactly the stage cost each step, so the optimal cost-to-go is a Lyapunov function. The difference is where the certificate comes from. Optimal control solves the whole Hamilton–Jacobi–Bellman problem and gets $\lyap$ as a byproduct; Lyapunov design guesses $\lyap$ (often a physical energy) and only checks a derivative — cheaper, structure-exploiting, and not tied to optimality.
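The claim that the quadratic cost-to-go drops by exactly the stage cost can be illustrated on a scalar discrete-time plant. This is a minimal sketch under assumed numbers: the plant $x_{k+1} = a x_k + b u_k$, the weights `q` and `r`, and the fixed-point Riccati iteration are illustrative choices, not code from Chapter 13 or the companion.

```python
# Scalar discrete-time LQR: x_{k+1} = a x_k + b u_k, stage cost q x^2 + r u^2.
a, b = 1.2, 1.0  # open-loop unstable plant (|a| > 1), illustrative values
q, r = 1.0, 0.5  # state and input weights, illustrative values

# Iterate the scalar Riccati recursion to its fixed point p.
p = q
for _ in range(200):
    p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)

gain = a * b * p / (r + b * b * p)  # optimal feedback u = -gain * x
a_cl = a - b * gain                 # closed-loop multiplier

# Along the closed loop, V(x) = p x^2 should drop by the stage cost each step.
x = 3.0
records = []
for _ in range(20):
    x_next = a_cl * x
    drop = p * x * x - p * x_next * x_next
    stage_cost = (q + r * gain * gain) * x * x
    records.append((drop, stage_cost))
    x = x_next

print("p =", p, "gain =", gain, "closed-loop multiplier =", a_cl)
print("first step: drop =", records[0][0], "stage cost =", records[0][1])
```

Because the stage cost is positive whenever $x \neq 0$, the value $p x^2$ is strictly decreasing along the closed loop, which is the discrete-time form of the Lyapunov condition.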

Feedback linearization

Lyapunov’s method certifies; feedback linearization constructs a controller by canceling the nonlinearity outright. For an input-affine system $\dot{\statevec} = f(\statevec) + g(\statevec)u$ with output $y = h(\statevec)$ , differentiate $y$ until the input appears.

Definition 14.2 (Relative degree).

The system has relative degree $r$ at $\statevec_0$ if $\liederiv_g \liederiv_f^{k}h(\statevec)=0$ for $k < r-1$ near $\statevec_0$ and $\liederiv_g \liederiv_f^{r-1}h(\statevec_0)\neq 0$ , where the Lie derivative $\liederiv_f h = \nabla h^\top f$ is the rate of change of $h$ along $f$ . Equivalently, $r$ is the number of differentiations of $y$ before the input $u$ appears explicitly.
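As a worked instance of the definition, take the pendulum model used in this section's computed-torque discussion, $\ddot\theta = a\sin\theta - d\dot\theta + b\,u$, with state $\statevec = (\theta, \dot\theta)$ and output $y = h(\statevec) = \theta$. The split into $f$ and $g$ below is the standard input-affine reading of that equation, and $b \neq 0$ is assumed.

$$
\begin{aligned}
f(\statevec) &= \begin{pmatrix}\dot\theta \\ a\sin\theta - d\dot\theta\end{pmatrix}, \qquad
g(\statevec) = \begin{pmatrix}0 \\ b\end{pmatrix}, \qquad
\nabla h = \begin{pmatrix}1 \\ 0\end{pmatrix},\\
\liederiv_g h &= \nabla h^\top g = 0, \qquad
\liederiv_f h = \nabla h^\top f = \dot\theta,\\
\liederiv_g \liederiv_f h &= \nabla(\dot\theta)^\top g = b \neq 0, \qquad
\liederiv_f^{2} h = \nabla(\dot\theta)^\top f = a\sin\theta - d\dot\theta .
\end{aligned}
$$

The input first appears in the second derivative of $y$, so $r = 2$. That equals the state dimension, so there are no internal dynamics to worry about.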

When the relative degree equals the state dimension, the change of coordinates $(\,h, \liederiv_f h, \dots, \liederiv_f^{r-1}h\,)$ turns the system into a chain of integrators, and the control

$$
u = \frac{1}{\liederiv_g \liederiv_f^{r-1}h(\statevec)}\big(-\liederiv_f^{r}h(\statevec) + v\big)
$$

makes the input–output map exactly linear, $y^{(r)} = v$ — then place poles with a linear $v$ Sastry (1999) . The pendulum is the textbook case of computed torque: with $\ddot\theta = a\sin\theta - d\dot\theta + b\,u$ , choosing $u = \tfrac1b(-a\sin\theta + d\dot\theta + v)$ cancels gravity and damping, leaving $\ddot\theta = v$ ; then $v = -k_1\theta - k_2\dot\theta$ gives a chosen second-order linear closed loop Slotine & Li (1991) .

Proposition 14.1 (Computed-torque exactness).

Under the computed-torque law above, the nonlinear closed loop equals the linear system $\dot{\statevec} = \big[\begin{smallmatrix}0 & 1\\ -k_1 & -k_2\end{smallmatrix}\big]\statevec$ exactly, with closed-loop poles the roots of $\lambda^2 + k_2\lambda + k_1$ . Stability is by design (choose $k_1,k_2>0$ ), not by linearization.

The companion integrates the true nonlinear plant under this law and the target linear system from the same initial state and finds the trajectories identical to numerical precision: the nonlinearity is gone, not merely small. The method has a price, though. Feedback linearization needs an accurate model (it cancels exact terms), it can demand large control authority, and it is only valid where the relative degree is well-defined and the internal dynamics (the unobserved part when $r < n$) are stable. Cancel a nonlinearity you do not know precisely and the cancellation leaves a residual, which motivates a method that does not depend on exact cancellation.
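The exactness claim of Proposition 14.1 is easy to reproduce in a few lines. The sketch below is not the companion's implementation: the coefficients `A_GRAV`, `DAMP`, `B_IN`, the gains, and the RK4 helper are assumptions chosen for illustration, and the controller is given the exact model.

```python
import math

A_GRAV, DAMP, B_IN = 9.81, 0.3, 2.0  # a, d, b in theta'' = a sin(theta) - d theta' + b u
K1, K2 = 4.0, 4.0                    # target polynomial lambda^2 + 4 lambda + 4 (double pole at -2)


def control(state):
    """Computed-torque law: cancel gravity and damping, then apply the linear v."""
    theta, omega = state
    v = -K1 * theta - K2 * omega
    return (-A_GRAV * math.sin(theta) + DAMP * omega + v) / B_IN


def nonlinear_closed_loop(state):
    """True nonlinear plant driven by the computed-torque law."""
    theta, omega = state
    return (omega, A_GRAV * math.sin(theta) - DAMP * omega + B_IN * control(state))


def linear_target(state):
    """Target linear system theta'' = -K1 theta - K2 theta'."""
    theta, omega = state
    return (omega, -K1 * theta - K2 * omega)


def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for an autonomous system."""
    k1 = f(x)
    k2 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k1)))
    k3 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k2)))
    k4 = f(tuple(xi + dt * ki for xi, ki in zip(x, k3)))
    return tuple(
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


x_nl = (1.0, 0.0)
x_lin = (1.0, 0.0)
dt = 1e-3
max_gap = 0.0
for _ in range(5000):  # 5 seconds of simulated time
    x_nl = rk4_step(nonlinear_closed_loop, x_nl, dt)
    x_lin = rk4_step(linear_target, x_lin, dt)
    max_gap = max(max_gap, abs(x_nl[0] - x_lin[0]), abs(x_nl[1] - x_lin[1]))

print("largest gap between nonlinear and linear trajectories:", max_gap)
```

Replacing `A_GRAV` inside `control` with a slightly wrong value is a quick way to see the residual that imperfect cancellation leaves behind.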

Sliding-mode control

Sliding-mode control trades smoothness for robustness. Instead of canceling the dynamics, it forces the state onto a designer-chosen surface and keeps it there despite model error, as long as the uncertainty enters through the same channel as the control (matched uncertainty).

Proposition 14.2 (Finite-time reaching).

Let $s(\statevec)$ define a sliding surface $s=0$ on which the reduced dynamics are stable, and suppose the control enforces the reaching law $\dot s = -\eta\,\mathrm{sign}(s)$ with $\eta>0$ . Then $\tfrac{d}{dt}\tfrac12 s^2 = -\eta\,\abs{s} \leq -\eta\sqrt{2}\,\big(\tfrac12 s^2\big)^{1/2}$ , so $\abs{s}$ reaches $0$ in finite time bounded by $\abs{s(0)}/\eta$ , after which the trajectory stays on $s=0$ and the reduced dynamics carry it to the origin.

For the pendulum, take $s = \dot\theta + \lambda\theta$ ($\lambda>0$); on $s=0$ the reduced dynamics are $\dot\theta = -\lambda\theta$, which decays. The control $u = \tfrac1b\big(-a\sin\theta + (d-\lambda)\dot\theta - \eta\,\mathrm{sat}(s/\Phi)\big)$ gives $\dot s = -\eta\,\mathrm{sat}(s/\Phi)$ under the exact model; the saturation replaces $\mathrm{sign}$ to avoid chattering, and it drives $s$ into the boundary layer $\abs{s}\leq\Phi$.

The robustness is the selling point: a wrong gravity estimate still drives $s\to 0$, because the gravity term enters through the same input channel as $u$ and is dominated by a large enough $\eta$ Slotine & Li (1991). The companion verifies all three claims: finite-time reaching within the $\abs{s(0)}/\eta$ bound, the surface maintained inside the boundary layer thereafter, and convergence under a deliberately mismatched model.
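A minimal simulation of the boundary-layer law makes the robustness claim concrete. Everything numeric here is an assumption for illustration (coefficients, gains, a controller whose gravity coefficient is 30% too small), and the code is a sketch rather than the companion's test. With a mismatched model the reaching rate is at least $\eta$ minus the size of the mismatch, so the sketch checks the reaching time against $\abs{s(0)}/(\eta - \sup\abs{\delta})$ rather than the exact-model bound $\abs{s(0)}/\eta$.

```python
import math

A_TRUE, DAMP, B_IN = 9.81, 0.3, 2.0  # true plant: theta'' = a sin(theta) - d theta' + b u
A_MODEL = 0.7 * A_TRUE               # controller's deliberately wrong gravity coefficient
LAM, ETA, PHI = 2.0, 8.0, 0.05       # surface slope, reaching gain, boundary-layer width


def sat(z):
    """Saturation: identity on [-1, 1], clipped outside."""
    return max(-1.0, min(1.0, z))


def surface(state):
    """Sliding variable s = theta' + lambda * theta."""
    theta, omega = state
    return omega + LAM * theta


def control(state):
    """Boundary-layer sliding-mode law built from the (wrong) model."""
    theta, omega = state
    s = surface(state)
    return (-A_MODEL * math.sin(theta) + (DAMP - LAM) * omega - ETA * sat(s / PHI)) / B_IN


def closed_loop(state):
    """True plant under the sliding-mode law."""
    theta, omega = state
    return (omega, A_TRUE * math.sin(theta) - DAMP * omega + B_IN * control(state))


def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for an autonomous system."""
    k1 = f(x)
    k2 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k1)))
    k3 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k2)))
    k4 = f(tuple(xi + dt * ki for xi, ki in zip(x, k3)))
    return tuple(
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


state = (1.0, 0.0)
dt = 1e-3
s0 = abs(surface(state))
reach_time = None          # first time |s| enters the boundary layer
max_s_after_reach = 0.0    # largest |s| seen from that time on
for k in range(6000):      # 6 seconds of simulated time
    state = rk4_step(closed_loop, state, dt)
    s_abs = abs(surface(state))
    if reach_time is None and s_abs <= PHI:
        reach_time = (k + 1) * dt
    if reach_time is not None:
        max_s_after_reach = max(max_s_after_reach, s_abs)

mismatch_bound = abs(A_TRUE - A_MODEL)     # bounds |(a - a_model) sin(theta)|
reach_bound = s0 / (ETA - mismatch_bound)  # valid because ETA > mismatch_bound

print("reach time:", reach_time, "bound:", reach_bound)
print("max |s| after reaching:", max_s_after_reach, "final state:", state)
```

The design choice to note is that `ETA` only has to exceed the mismatch bound; the controller never needs to know the true gravity coefficient.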

Lyapunov is doing the work both times. Computed torque imposes a stable linear closed loop, which carries a quadratic Lyapunov function; sliding mode uses $\tfrac12 s^2$ as a Lyapunov function for the surface. Each design is a recipe for a certificate, which is exactly what the next two ideas systematize.

Input-to-state stability and backstepping

Real systems have disturbances. Input-to-state stability (ISS) is the right generalization of asymptotic stability to forced systems, and it is stated with comparison functions: a class- $\classK$ function is continuous, strictly increasing, and zero at zero.

Definition 14.3 (Input-to-state stability).

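The system $\dot{\statevec} = f(\statevec, u)$ is input-to-state stable if there exist a class-$\classK\mathcal{L}$ function $\beta$ (class-$\classK$ in its first argument for each fixed $t$, and decreasing to zero in its second argument as $t\to\infty$) and a class-$\classK$ function $\gamma$ such that for every initial state and every bounded input,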

$$
\norm{\statevec(t)} \;\leq\; \beta\big(\norm{\statevec(0)},\,t\big) \;+\; \gamma\Big(\sup_{0\leq\tau\leq t}\norm{u(\tau)}\Big).
$$

The state is eventually bounded by a gain $\gamma$ of the input size, and decays to zero when the input does.
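The simplest system on which the ISS estimate can be checked by hand is the scalar $\dot x = -x + u$, for which variation of constants gives $\abs{x(t)} \leq \abs{x(0)}e^{-t} + \sup\abs{u}$, so $\beta(r,t) = r e^{-t}$ and $\gamma(r) = r$. The sketch below verifies that bound numerically; the example system, the sinusoidal disturbance, and the switch-off time are illustrative assumptions, not part of the companion.

```python
import math

U_MAX = 0.5   # bound on the disturbance magnitude, illustrative
T_OFF = 10.0  # the input is switched off after this time
X0 = 4.0      # initial state


def u_of_t(t):
    """Bounded disturbance that vanishes after T_OFF."""
    return U_MAX * math.sin(3.0 * t) if t < T_OFF else 0.0


def f(t, x):
    """Scalar ISS example: x' = -x + u(t)."""
    return -x + u_of_t(t)


def rk4_step(f, t, x, dt):
    """One classical Runge-Kutta step for a time-varying scalar system."""
    k1 = f(t, x)
    k2 = f(t + 0.5 * dt, x + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, x + 0.5 * dt * k2)
    k4 = f(t + dt, x + dt * k3)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


dt = 1e-3
x = X0
worst_margin = float("inf")  # smallest value of (ISS bound - |x|) over the run
for k in range(20000):       # 20 seconds of simulated time
    x = rk4_step(f, k * dt, x, dt)
    t = (k + 1) * dt
    bound = abs(X0) * math.exp(-t) + U_MAX  # beta(|x0|, t) + gamma(sup |u|)
    worst_margin = min(worst_margin, bound - abs(x))

final_x = x
print("smallest margin to the ISS bound:", worst_margin)
print("state after the input has been off for 10 s:", final_x)
```

The second printed value illustrates the last clause of the definition: once the input is zero the $\gamma$ term no longer holds the state up, and it decays with $\beta$.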

ISS, due to Sontag, makes “small disturbance, small deviation” precise and composable: a cascade of ISS subsystems is ISS, which is what lets large nonlinear designs be built and certified piece by piece Sontag (1998) . Backstepping is the constructive engine that exploits this. For systems in strict-feedback (cascade) form, it builds the controller and a Lyapunov function recursively: stabilize the first subsystem treating the next state as a virtual control, define the error between that state and its desired value, augment the Lyapunov function with a quadratic in the error, and step inward until the real input appears Kellett & Braun (2023) . The output is a controller and a Lyapunov certificate delivered together — Lyapunov design turned into an algorithm, and the classical counterpart of the “value function as a learnable object” that model-based RL will pursue in Part III.
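The recursion described above can be sketched on a two-state strict-feedback system. The example $\dot x_1 = x_1^2 + x_2$, $\dot x_2 = u$ is a standard textbook-style illustration chosen here, not a system from the companion; the gains and helper names are assumptions. Step one picks the virtual control $\alpha(x_1) = -x_1^2 - k_1 x_1$, step two defines the error $z = x_2 - \alpha(x_1)$ and the certificate $\lyap = \tfrac12 x_1^2 + \tfrac12 z^2$, and the real input is chosen so that $\dot{\lyap} = -k_1 x_1^2 - k_2 z^2$.

```python
K1, K2 = 2.0, 2.0  # backstepping gains, illustrative values


def alpha(x1):
    """Virtual control for the x1 subsystem: makes x1' = -K1 x1 when x2 = alpha."""
    return -x1 * x1 - K1 * x1


def control(state):
    """Backstepping law u = alpha_dot - x1 - K2 z."""
    x1, x2 = state
    z = x2 - alpha(x1)
    alpha_dot = -(2.0 * x1 + K1) * (x1 * x1 + x2)  # d(alpha)/dx1 times x1'
    return alpha_dot - x1 - K2 * z


def closed_loop(state):
    """Strict-feedback plant x1' = x1^2 + x2, x2' = u under the backstepping law."""
    x1, x2 = state
    return (x1 * x1 + x2, control(state))


def lyap(state):
    """Composite certificate V = 0.5 x1^2 + 0.5 z^2."""
    x1, x2 = state
    z = x2 - alpha(x1)
    return 0.5 * x1 * x1 + 0.5 * z * z


def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for an autonomous system."""
    k1 = f(x)
    k2 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k1)))
    k3 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k2)))
    k4 = f(tuple(xi + dt * ki for xi, ki in zip(x, k3)))
    return tuple(
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


state = (1.5, -1.0)
dt = 1e-3
values = [lyap(state)]
for _ in range(10000):  # 10 seconds of simulated time
    state = rk4_step(closed_loop, state, dt)
    values.append(lyap(state))

# Largest step-to-step increase of V; the design predicts V never increases.
max_increase = max(b - a for a, b in zip(values, values[1:]))
print("final state:", state)
print("largest increase of V between steps:", max_increase)
```

In the $(x_1, z)$ coordinates the closed loop is $\dot x_1 = -k_1 x_1 + z$, $\dot z = -x_1 - k_2 z$, so the controller and its certificate arrive together, which is the point of the method.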

Nonlinear control versus deep RL

Lay the chapter beside the reinforcement-learning half of the curriculum. Both seek a feedback policy and a scalar certificate of good behavior; they differ in what they assume and what they pay.

Nonlinear control assumes a model with structure — input-affine form, a known relative degree, matched uncertainty, a physical energy. Given that structure, it returns a controller with a stability proof and no sampling: computed torque, a sliding surface, a backstepping Lyapunov function. The certificate $\lyap$ is designed, not learned.
Deep RL assumes samples, not structure. It learns $\optvaluefn$ and $\policy$ from interaction, paying in data and variance, and buys the ability to handle dynamics no one can write down — contact, friction, pixels — where relative degree and clean cancellations do not exist.

The dividing line is whether the structure is available and trustworthy. When it is, control wins on guarantees and sample cost; when the structure is absent or too hard to obtain, sampling wins on reach. Part III is the synthesis: model predictive control (Week 15) turns a model into a policy by online optimization, and the convergence weeks graft learned value functions onto controllers with Lyapunov-style guarantees — RL’s reach with control’s certificates.

What’s next

Week 15 (model predictive control). Rather than design one feedback law, re-solve a finite-horizon optimal control problem at every step and apply the first move. MPC is online approximate dynamic programming: Chapter 13’s Riccati value function becomes a horizon- $N$ optimization, the Lyapunov certificate of this chapter reappears as a terminal cost, and constraints — which neither LQR nor feedback linearization handle — become first-class.

Exercises

(Derive) For the hanging pendulum with energy $\lyap = \tfrac12\dot\phi^2 + (g/\ell)(1-\cos\phi)$ and dynamics $\ddot\phi = -(g/\ell)\sin\phi - d\dot\phi$ , compute $\dot{\lyap} = \nabla\lyap^\top f$ and show the gravity terms cancel, leaving $\dot{\lyap} = -d\dot\phi^2$ .

Solution
$\nabla\lyap = \big((g/\ell)\sin\phi,\ \dot\phi\big)$ and $f = \big(\dot\phi,\ -(g/\ell)\sin\phi - d\dot\phi\big)$ . Their inner product is $(g/\ell)\sin\phi\,\dot\phi + \dot\phi(-(g/\ell)\sin\phi - d\dot\phi) = -d\dot\phi^2$ . The cross terms cancel; only dissipation remains. For $d>0$ this is negative semidefinite, so Theorem 14.1 gives stability and LaSalle (Theorem 14.2) upgrades it to asymptotic stability.
(Prove) Show that $\dot{\lyap}<0$ for $\statevec\neq 0$ implies the limit of $\lyap(\statevec(t))$ is a value at which $\dot{\lyap}=0$ , and conclude the origin is asymptotically stable.

Solution
$\lyap(\statevec(t))$ is monotonically decreasing and bounded below by $0$, hence convergent to some $\lyap_\infty\geq 0$. If $\lyap_\infty>0$ the trajectory stays in the compact shell $\{\lyap_\infty\leq\lyap\leq\lyap(\statevec(0))\}$, which excludes the origin, so by continuity $\dot{\lyap}\leq -\mu<0$ there for some $\mu>0$. Integrating gives $\lyap(\statevec(t)) \leq \lyap(\statevec(0)) - \mu t \to -\infty$, a contradiction. So $\lyap_\infty=0$, and by positive definiteness $\statevec(t)\to0$. (This is the strict-decrease half of Theorem 14.1; LaSalle handles the semidefinite case.)
(Compute) A system has relative degree $r=2$ : $\dot{x}_1 = x_2$ , $\dot{x}_2 = \sin x_1 + u$ , $y=x_1$ . Find the feedback-linearizing control that imposes $\ddot y = -k_1 y - k_2\dot y$ and give the closed-loop poles.

Solution
$\ddot y = \dot x_2 = \sin x_1 + u$ , so $u = -\sin x_1 + v$ with $v = -k_1 x_1 - k_2 x_2$ yields $\ddot y = -k_1 y - k_2\dot y$ . The Lie-derivative bookkeeping: $\liederiv_f h = x_2$ , $\liederiv_f^2 h = \sin x_1$ , $\liederiv_g\liederiv_f h = 1\neq0$ (relative degree $2$ ). Poles are the roots of $\lambda^2 + k_2\lambda + k_1$ ; pick $k_1,k_2>0$ for a Hurwitz pair.
(Prove) For the reaching law $\dot s = -\eta\,\mathrm{sign}(s)$ , show $\abs{s(t)}$ hits zero by time $\abs{s(0)}/\eta$ . Why does matched uncertainty not change this bound?

Solution
With $W=\tfrac12 s^2$ , $\dot W = s\dot s = -\eta\abs{s} = -\eta\sqrt{2W}$ . Separating, $\sqrt{2W}$ decreases at constant rate $\eta$ , so $\abs{s}=\sqrt{2W}$ reaches $0$ in time $\abs{s(0)}/\eta$ . A matched disturbance $\delta$ enters as $\dot s = -\eta\,\mathrm{sign}(s) + \delta$ ; choosing $\eta > \sup\abs{\delta}$ keeps $s\dot s \leq -(\eta-\sup\abs{\delta})\abs{s} < 0$ , so the surface is still reached — the gain $\eta$ dominates the uncertainty rather than canceling it.
(Implement) In the companion, verify the three sliding-mode claims: finite-time reaching within $\abs{s(0)}/\eta$ , the surface maintained in the boundary layer afterward, and convergence under a mismatched model.

Solution
See experiments/python/week14/test_nonlinear.py: test_sliding_mode_reaches_surface_in_bounded_time checks the reaching bound and boundary-layer maintenance; test_sliding_mode_robust_to_matched_parameter_error builds the controller with a wrong length (so a wrong gravity coefficient) yet still reaches the surface and converges — matched-uncertainty robustness.
(Extend) Argue why the quadratic LQR cost-to-go $\statevec^\top\riccati\statevec$ of Chapter 13 is a Lyapunov function for the optimal closed loop, and use this to connect Lyapunov design to value functions.

Solution
From Chapter 13, along the optimal closed loop $\statevec_k^\top\riccati\statevec_k - \statevec_{k+1}^\top\riccati\statevec_{k+1} = \statevec_k^\top(Q + \lqrgain^\top R\lqrgain)\statevec_k \geq 0$ , so $\lyap=\statevec^\top\riccati\statevec$ is positive definite with $\Delta\lyap\leq 0$ — a Lyapunov function that also equals the optimal value. Lyapunov design chooses such a certificate directly (energy, $\tfrac12 s^2$ , a backstepping sum) instead of solving the optimal-control problem for it — the same object, reached without the full Hamilton–Jacobi–Bellman computation. This is the seam model-based RL works along in Part III.

Companion code

The Week-14 companion lives at experiments/python/week14/ (pure numpy, RK4 integration with the controller evaluated at each stage for a faithful continuous-time closed loop).

nonlinear.py — the pendulum model (hanging and upright conventions); the energy Lyapunov function with its analytic and numeric $\dot{\lyap}$ ; a certificate check that passes for the damped pendulum and fails for the anti-damped one; the computed-torque feedback-linearizing law and its target linear system; and the boundary-layer sliding-mode law with the sliding surface and reaching time.
test_nonlinear.py — mathematical-correctness tests: $\dot{\lyap}=-d\dot\phi^2$ matches the numeric $\nabla\lyap^\top f$ ; the certificate discriminates damped from anti-damped; the damped pendulum’s energy is monotone and the state converges (LaSalle); feedback linearization reproduces the target linear system exactly with the prescribed poles; and sliding mode reaches the surface in bounded time, stays in the boundary layer, and is robust to matched parameter error.

# nonlinear-control algorithms + correctness tests
PYTHONPATH=. pytest experiments/python/week14/test_nonlinear.py -q

# worked Lyapunov / feedback-linearization / sliding-mode demonstrations on the pendulum
PYTHONPATH=. python experiments/python/week14/nonlinear.py