Nonlinear Control: Lyapunov Design, Feedback Linearization, and Sliding Modes
When the plant is nonlinear, eigenvalues and Riccati equations describe only a local picture. This chapter builds the tools that replace them: Lyapunov's direct method and LaSalle's invariance principle as global stability certificates, feedback linearization that cancels the nonlinearity by coordinate change, sliding-mode control that enforces a surface in finite time and is robust to matched uncertainty, and input-to-state stability and backstepping as the constructive bridge to robust and adaptive design — with the Lyapunov function read as the control-theoretic cousin of the reinforcement-learning value function.
Where we are. Weeks 11–13 lived in the linear world: a state-space model, the structural tests of controllability and observability, and the linear-quadratic regulator that solved the optimal-control problem in closed form. Every one of those tools is, at bottom, an eigenvalue statement: stability is a property of the spectrum of $A$, and the LQR closed loop $A - BK$ is Schur because the Riccati solution $P$ makes it so. But the linear model is only the tangent picture at one operating point. A pendulum, a quadrotor, a robot arm, a power grid: linearize and you get a local certificate that says nothing about the basin of attraction, the swing-up, or behavior far from equilibrium. This chapter asks the Week-14 question: what replaces eigenvalues and Riccati equations when the plant is genuinely nonlinear? The answer is a shift from spectra to energy, from “where are the poles” to “does a scalar certificate decrease along every trajectory.”
Why the linear tools run out
A nonlinear system $\dot x = f(x, u)$, $f(0, 0) = 0$, linearized at the origin gives $\dot x \approx Ax + Bu$ with $A = \partial f/\partial x\,|_{(0,0)}$ and $B = \partial f/\partial u\,|_{(0,0)}$. Lyapunov’s indirect method says: if $A$ (closed-loop, $A - BK$) is Hurwitz, the origin is locally asymptotically stable, and that is all it says. It is silent on how large the basin is, blind to multiple equilibria and limit cycles, and useless when the linearization is marginal (eigenvalues on the imaginary axis), which is exactly when nonlinear terms decide stability. The pendulum makes this concrete: about the hanging equilibrium the linearization is a lightly damped oscillator, but the global behavior, every release angle spiraling to rest, is a nonlinear energy statement the eigenvalues cannot certify. We need a tool that sees the whole state space at once.
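A minimal numerical sketch of the marginal case, using two scalar systems chosen here purely for illustration (they are not taken from the chapter's companion code): $\dot x = -x^3$ and $\dot x = +x^3$ share the linearization $A = 0$ at the origin, yet the first decays and the second grows. The helper names, step size, and horizon are assumptions.

```python
def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for the scalar ODE dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def final_state(f, x0, t_end=5.0, dt=1e-3):
    """Integrate from x0 over [0, t_end] and return the final state."""
    x = x0
    for _ in range(round(t_end / dt)):
        x = rk4_step(f, x, dt)
    return x


x0 = 0.3
x_decay = final_state(lambda x: -x**3, x0)  # linearization at 0: A = 0
x_grow = final_state(lambda x: x**3, x0)    # linearization at 0: A = 0 as well

print(f"dx/dt = -x^3: x(5) = {x_decay:.4f}")
print(f"dx/dt = +x^3: x(5) = {x_grow:.4f}")
```

Both runs start from the same state and have the same (zero) Jacobian at the origin, so any difference in the outcome comes from the cubic term alone.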
Lyapunov’s direct method
The idea is mechanical energy made abstract. Find a scalar function $V(x)$ that is zero only at the equilibrium and decreases along trajectories; then trajectories cannot escape its sublevel sets, and if the decrease is strict they slide down to the equilibrium. No solution of the differential equation is required: the certificate is checked by differentiation alone.
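The origin of $\dot x = f(x)$, $f(0) = 0$, is stable (in the sense of Lyapunov) if for every $\varepsilon > 0$ there is a $\delta > 0$ such that $\|x(0)\| < \delta$ implies $\|x(t)\| < \varepsilon$ for all $t \ge 0$; asymptotically stable if it is stable and $x(t) \to 0$ for all $x(0)$ near the origin; and globally asymptotically stable if that holds for every $x(0)$.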
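Theorem 14.1 (Lyapunov's direct method). Let $V : D \to \mathbb{R}$ be continuously differentiable on a neighborhood $D$ of the origin, positive definite ($V(0) = 0$ and $V(x) > 0$ for $x \ne 0$), with derivative along the flow $\dot V(x) = \nabla V(x)^\top f(x)$. If $\dot V \le 0$ on $D$ the origin is stable; if $\dot V < 0$ for $x \ne 0$ it is asymptotically stable; and if additionally $D = \mathbb{R}^n$ and $V$ is radially unbounded ($V(x) \to \infty$ as $\|x\| \to \infty$) it is globally asymptotically stable.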
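Proof. Fix $\varepsilon > 0$ and a ball $B_\varepsilon = \{\|x\| \le \varepsilon\} \subset D$. Let $\alpha = \min_{\|x\| = \varepsilon} V(x) > 0$ by positive definiteness, and choose $\delta > 0$ so that $V(x) < \alpha$ on $\|x\| < \delta$. Since $\dot V \le 0$, $V(x(t))$ is non-increasing, so a trajectory starting inside $\|x\| < \delta$ has $V(x(t)) < \alpha$ for all $t \ge 0$ and can never reach the shell $\|x\| = \varepsilon$: stability. If $\dot V < 0$ strictly, $V(x(t))$ is a decreasing function bounded below by $0$, hence convergent; its limit must be a point where $\dot V = 0$, which is only the origin: asymptotic stability. Radial unboundedness makes every sublevel set compact, so the argument is global.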
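The pendulum is the worked example. With state $(\theta, \omega)$ and unit-inertia dynamics $\dot\theta = \omega$, $\dot\omega = -\frac{g}{l}\sin\theta - b\omega$ (damping $b > 0$), take the energy $V(\theta, \omega) = \tfrac12\omega^2 + \frac{g}{l}(1 - \cos\theta)$, positive definite about the hanging equilibrium. Along the damped dynamics its rate is $\dot V = -b\omega^2 \le 0$: negative semidefinite, not definite, because it vanishes on the whole line $\omega = 0$, not just at the origin. Theorem 14.1 then gives only stability, not asymptotic stability, even though we know every trajectory decays. The gap is closed by an invariance argument (Khalil, 2002).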
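Theorem 14.2 (LaSalle's invariance principle). Let $\Omega$ be a compact set, positively invariant under $\dot x = f(x)$, and let $V$ be continuously differentiable with $\dot V \le 0$ on $\Omega$. Then every trajectory starting in $\Omega$ converges to the largest invariant set contained in $E = \{x \in \Omega : \dot V(x) = 0\}$.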
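For the damped pendulum $E$ is $\{\omega = 0\}$; the only complete trajectory that stays there is the equilibrium (if $\omega \equiv 0$ then $\dot\omega = -\frac{g}{l}\sin\theta = 0$, which forces $\theta = 0$ on a sublevel set that excludes the upright equilibrium), so LaSalle upgrades stability to asymptotic stability with a semidefinite $\dot V$. The companion confirms it: energy decreases monotonically and the state converges to the origin.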
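The energy argument can be checked numerically with a short sketch. It is not the companion's code: it assumes a unit-inertia damped pendulum $\dot\theta = \omega$, $\dot\omega = -\frac{g}{l}\sin\theta - b\omega$, with illustrative values for $g/l$ and $b$, and it uses only the standard library.

```python
import math

G_OVER_L = 9.81  # assumed gravity coefficient g/l
B = 0.5          # assumed damping coefficient b > 0


def f(state):
    """Damped pendulum about the hanging equilibrium, unit inertia."""
    theta, omega = state
    return (omega, -G_OVER_L * math.sin(theta) - B * omega)


def energy(state):
    """Candidate Lyapunov function V = omega^2 / 2 + (g/l) (1 - cos theta)."""
    theta, omega = state
    return 0.5 * omega**2 + G_OVER_L * (1.0 - math.cos(theta))


def rk4_step(state, dt):
    """One classical Runge-Kutta step for the two-state system."""
    def shifted(k, h):
        return (state[0] + h * k[0], state[1] + h * k[1])

    k1 = f(state)
    k2 = f(shifted(k1, 0.5 * dt))
    k3 = f(shifted(k2, 0.5 * dt))
    k4 = f(shifted(k3, dt))
    return tuple(
        state[i] + dt * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
        for i in range(2)
    )


state = (2.0, 0.0)  # released from rest at 2 rad
dt = 1e-3
energies = [energy(state)]
for _ in range(20000):  # 20 seconds of simulated time
    state = rk4_step(state, dt)
    energies.append(energy(state))

# Largest step-to-step increase in V; expected to be non-positive up to rounding.
max_increase = max(b - a for a, b in zip(energies, energies[1:]))
print(f"V(0) = {energies[0]:.3f}, V(20) = {energies[-1]:.2e}")
print(f"largest increase in V between steps: {max_increase:.2e}")
print(f"final state: theta = {state[0]:.2e}, omega = {state[1]:.2e}")
```

The energy never rises even at the turning points where $\omega = 0$ and $\dot V$ vanishes, and the state still reaches the origin, which is the behavior the invariance argument predicts.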
The bridge to Chapters 1 and 13. A Lyapunov function is a value function read backwards. Chapter 1’s value function $V^\pi$ satisfies the Bellman equation and decreases in expectation under an improving policy; a Lyapunov function $V$ decreases along every deterministic trajectory, $\dot V \le 0$, the autonomous, worst-case analog of “the value goes down.” Chapter 13 fused the two: there we proved that $V(x) = x^\top P x$ drops by exactly the stage cost each step, so the optimal cost-to-go is a Lyapunov function. The difference is where the certificate comes from. Optimal control solves the whole Hamilton–Jacobi–Bellman problem and gets $V$ as a byproduct; Lyapunov design guesses $V$ (often a physical energy) and only checks a derivative, which is cheaper, exploits structure, and is not tied to optimality.
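The "drops by exactly the stage cost" claim can be illustrated on a scalar discrete-time LQR problem. The plant and weights below are hypothetical values picked for the sketch, not numbers from Chapter 13; the Riccati fixed point is found by plain iteration.

```python
# Hypothetical scalar plant x[k+1] = a x[k] + b u[k], stage cost q x^2 + r u^2.
a, b, q, r = 1.2, 1.0, 1.0, 1.0

# Iterate the discrete-time Riccati recursion to its fixed point p.
p = q
for _ in range(200):
    p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)

k = a * b * p / (r + b * b * p)  # optimal gain, u = -k x
a_cl = a - b * k                 # closed-loop multiplier


def value(x):
    """Quadratic cost-to-go V(x) = p x^2."""
    return p * x * x


x = 1.0
worst_gap = 0.0
for _ in range(20):
    u = -k * x
    x_next = a * x + b * u
    drop = value(x) - value(x_next)        # decrease of V over one step
    stage_cost = q * x * x + r * u * u     # cost paid over the same step
    worst_gap = max(worst_gap, abs(drop - stage_cost))
    x = x_next

print(f"p = {p:.6f}, gain k = {k:.6f}, closed loop a - b k = {a_cl:.6f}")
print(f"largest |V drop - stage cost| along the trajectory: {worst_gap:.2e}")
```

Because the stage cost is positive away from the origin, the same identity shows $V$ strictly decreasing along the closed loop, which is the Lyapunov property.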
Feedback linearization
Lyapunov’s method certifies; feedback linearization constructs a controller by canceling the nonlinearity outright. For an input-affine system $\dot x = f(x) + g(x)u$ with output $y = h(x)$, differentiate $y$ until the input appears.
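The system has relative degree $r$ at $x_0$ if $L_g L_f^{k} h(x) = 0$ for $k < r - 1$ and all $x$ near $x_0$, and $L_g L_f^{r-1} h(x_0) \ne 0$, where the Lie derivative $L_f h = \nabla h^\top f$ is the rate of change of $h$ along $f$. Equivalently, $r$ is the number of differentiations of $y$ before the input appears explicitly.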
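A small sketch of the relative-degree bookkeeping, using finite-difference Lie derivatives on an assumed pendulum model with output $y = \theta$. The model parameters, the `lie` helper, and the test point are all illustrative choices, not part of the companion code.

```python
import math

G_OVER_L = 9.81  # assumed gravity coefficient g/l
B = 0.5          # assumed damping coefficient


def f(x):
    """Drift vector field of the pendulum, state x = (theta, omega)."""
    theta, omega = x
    return (omega, -G_OVER_L * math.sin(theta) - B * omega)


def g(x):
    """Input vector field: the torque enters the omega equation only."""
    return (0.0, 1.0)


def h(x):
    """Output y = theta."""
    return x[0]


def lie(vec, fn, eps=1e-4):
    """Return the function x -> grad fn(x) . vec(x), via central differences."""
    def derivative(x):
        v = vec(x)
        total = 0.0
        for i in range(len(x)):
            xp = list(x)
            xm = list(x)
            xp[i] += eps
            xm[i] -= eps
            total += (fn(tuple(xp)) - fn(tuple(xm))) / (2.0 * eps) * v[i]
        return total
    return derivative


x = (0.7, -0.3)          # arbitrary test point
Lf_h = lie(f, h)         # first derivative of y along the drift
Lg_h = lie(g, h)         # zero: the input does not appear in dy/dt
LgLf_h = lie(g, Lf_h)    # nonzero: the input appears in the second derivative
Lf2_h = lie(f, Lf_h)     # drift part of the second derivative

print(f"L_g h      = {Lg_h(x):.6f}")
print(f"L_g L_f h  = {LgLf_h(x):.6f}")
print(f"L_f^2 h    = {Lf2_h(x):.6f}")
```

Here $L_g h = 0$ and $L_g L_f h \ne 0$, so the input first appears after two differentiations and the relative degree is 2, equal to the state dimension.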
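When the relative degree equals the state dimension, $r = n$, the change of coordinates $z = (h, L_f h, \dots, L_f^{n-1} h)$ turns the system into a chain of integrators, and the control
$$u = \frac{1}{L_g L_f^{n-1} h(x)}\left(-L_f^{n} h(x) + v\right)$$
makes the input–output map exactly linear, $y^{(n)} = v$; then place poles with a linear $v = -Kz$ (Sastry, 1999). The pendulum is the textbook case of computed torque: with $y = \theta$ and dynamics $\ddot\theta = -\frac{g}{l}\sin\theta - b\dot\theta + u$, choosing $u = \frac{g}{l}\sin\theta + b\dot\theta + v$ cancels gravity and damping, leaving $\ddot\theta = v$; then $v = -k_1\theta - k_2\dot\theta$ gives a chosen second-order linear closed loop (Slotine & Li, 1991).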
Under the computed-torque law above, the nonlinear closed loop equals the linear system $\ddot\theta + k_2\dot\theta + k_1\theta = 0$ exactly, with closed-loop poles the roots of $s^2 + k_2 s + k_1 = 0$. Stability is by design (choose $k_1, k_2 > 0$), not by linearization.
The companion integrates the true nonlinear plant under this law and the target linear system from the same initial state and finds the trajectories identical to numerical precision: the nonlinearity is gone, not merely small. The method has a price that should be stated plainly: feedback linearization needs an accurate model (it cancels exact terms), it can demand large control authority, and it is only valid where the relative degree is well-defined and the internal dynamics (the unobserved part when $r < n$) are stable. Cancel a nonlinearity you do not know precisely and the cancellation leaves a residual, which motivates a method that does not depend on exact cancellation.
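The exact-cancellation claim and its fragility can both be seen in a short sketch. This is an independent illustration, not the companion's implementation: the plant is an assumed unit-inertia pendulum, the gains place a double pole at $-2$, and the 20% model error is an arbitrary choice.

```python
import math

G_OVER_L = 9.81    # true gravity coefficient g/l (assumed value)
B = 0.5            # damping coefficient (assumed known exactly)
K1, K2 = 4.0, 4.0  # target polynomial s^2 + 4 s + 4, double pole at -2


def rk4_step(f, x, dt):
    """One classical Runge-Kutta step; f is re-evaluated at every stage."""
    def shifted(k, h):
        return tuple(xi + h * ki for xi, ki in zip(x, k))

    k1 = f(x)
    k2 = f(shifted(k1, 0.5 * dt))
    k3 = f(shifted(k2, 0.5 * dt))
    k4 = f(shifted(k3, dt))
    return tuple(
        xi + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


def closed_loop(model_g_over_l):
    """True plant under a computed-torque law built from the given model."""
    def f(x):
        theta, omega = x
        v = -K1 * theta - K2 * omega
        u = model_g_over_l * math.sin(theta) + B * omega + v
        return (omega, -G_OVER_L * math.sin(theta) - B * omega + u)
    return f


def linear_target(x):
    """The linear system the controller is meant to impose."""
    theta, omega = x
    return (omega, -K1 * theta - K2 * omega)


def max_gap(f_a, f_b, x0=(1.0, 0.0), dt=1e-3, steps=5000):
    """Largest componentwise distance between two trajectories from x0."""
    xa = xb = x0
    gap = 0.0
    for _ in range(steps):
        xa = rk4_step(f_a, xa, dt)
        xb = rk4_step(f_b, xb, dt)
        gap = max(gap, abs(xa[0] - xb[0]), abs(xa[1] - xb[1]))
    return gap


gap_exact = max_gap(closed_loop(G_OVER_L), linear_target)
gap_wrong = max_gap(closed_loop(0.8 * G_OVER_L), linear_target)
print(f"exact model:     max gap to linear target = {gap_exact:.2e}")
print(f"20% model error: max gap to linear target = {gap_wrong:.2e}")
```

With the exact model the gap sits at rounding level; with the wrong gravity coefficient a residual $-0.2\,\frac{g}{l}\sin\theta$ survives the cancellation and the trajectories separate visibly.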
Sliding-mode control
Sliding-mode control trades smoothness for robustness. Instead of canceling the dynamics, it forces the state onto a designer-chosen surface and keeps it there despite model error, as long as the uncertainty enters through the same channel as the control (matched uncertainty).
Let $s(x) = 0$ define a sliding surface on which the reduced dynamics are stable, and suppose the control enforces the reaching law $s\dot s \le -\eta|s|$ with $\eta > 0$. Then $\frac{d}{dt}|s| \le -\eta$ whenever $s \ne 0$, so $s$ reaches $0$ in finite time bounded by $|s(0)|/\eta$, after which the trajectory stays on $s = 0$ and the reduced dynamics carry it to the origin.
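For the pendulum, take $s = \omega + \lambda\theta$ ($\lambda > 0$); on $s = 0$ the reduced dynamics are $\dot\theta = -\lambda\theta$, which decays. The control $u = \hat u_{\mathrm{eq}} - k\,\mathrm{sat}(s/\phi)$, where $\hat u_{\mathrm{eq}}$ is the model-based term that cancels the nominal drift of $s$, drives $s$ into a boundary layer of width $\phi$. The robustness is the selling point: a wrong gravity estimate still drives $s$ into the layer, because the gravity term enters through the same input channel as $u$ and is dominated by a large enough $k$ (Slotine & Li, 1991). The companion verifies all three claims: finite-time reaching within the bound, the surface maintained inside the boundary layer thereafter, and convergence under a deliberately mismatched model.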
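A self-contained sketch of the three sliding-mode claims, written independently of the companion. It assumes a unit-inertia pendulum, a controller that knows the damping but uses a gravity coefficient that is 20% too small, and a gain $k = D + \eta$ where $D$ bounds the resulting matched mismatch; all numeric values are illustrative.

```python
import math

G_OVER_L = 9.81                  # true gravity coefficient (assumed value)
G_OVER_L_MODEL = 0.8 * G_OVER_L  # controller's deliberately wrong estimate
B = 0.5                          # damping, assumed known
LAM = 2.0                        # surface slope: s = omega + LAM * theta
PHI = 0.05                       # boundary-layer width
ETA = 1.0                        # guaranteed reaching rate
D = abs(G_OVER_L - G_OVER_L_MODEL)  # bound on the mismatch term in ds/dt
K = D + ETA                      # switching gain that dominates the mismatch


def sat(z):
    """Saturation to [-1, 1], the smooth-inside-the-layer stand-in for sign."""
    return max(-1.0, min(1.0, z))


def surface(x):
    return x[1] + LAM * x[0]


def closed_loop(x):
    """True plant under the boundary-layer sliding-mode law."""
    theta, omega = x
    u_eq = G_OVER_L_MODEL * math.sin(theta) + B * omega - LAM * omega
    u = u_eq - K * sat(surface(x) / PHI)
    return (omega, -G_OVER_L * math.sin(theta) - B * omega + u)


def rk4_step(f, x, dt):
    """One classical Runge-Kutta step; the controller runs at every stage."""
    def shifted(k, h):
        return tuple(xi + h * ki for xi, ki in zip(x, k))

    k1 = f(x)
    k2 = f(shifted(k1, 0.5 * dt))
    k3 = f(shifted(k2, 0.5 * dt))
    k4 = f(shifted(k3, dt))
    return tuple(
        xi + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


x = (1.0, 0.0)
s0 = surface(x)
dt = 1e-3
t_reach = None          # first time |s| <= PHI
max_abs_s_after = 0.0   # largest |s| seen once the layer has been entered
for step in range(1, 8001):  # 8 seconds
    x = rk4_step(closed_loop, x, dt)
    s = surface(x)
    if t_reach is None and abs(s) <= PHI:
        t_reach = step * dt
    if t_reach is not None:
        max_abs_s_after = max(max_abs_s_after, abs(s))

print(f"reached layer at t = {t_reach}, bound |s(0)|/eta = {abs(s0) / ETA}")
print(f"max |s| after reaching = {max_abs_s_after:.4f} (layer width {PHI})")
print(f"final state: theta = {x[0]:.2e}, omega = {x[1]:.2e}")
```

The gain is sized against the mismatch bound rather than against an exact model, so the reaching-time bound depends only on $\eta$ and $|s(0)|$.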
Lyapunov is doing the work both times. Computed torque imposes a stable linear closed loop, which carries a quadratic Lyapunov function; sliding mode uses $\tfrac12 s^2$ as a Lyapunov function for the surface. Each design is a recipe for a certificate, which is exactly what the next two ideas systematize.
Input-to-state stability and backstepping
Real systems have disturbances. Input-to-state stability (ISS) is the right generalization of asymptotic stability to forced systems, and it is stated with comparison functions: a class-$\mathcal{K}$ function is continuous, strictly increasing, and zero at zero.
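The system $\dot x = f(x, u)$ is input-to-state stable if there exist a class-$\mathcal{KL}$ function $\beta$ (class-$\mathcal{K}$ in its first argument, decreasing to zero in its second) and a class-$\mathcal{K}$ function $\gamma$ such that for every initial state and every bounded input,
$$\|x(t)\| \le \beta(\|x(0)\|, t) + \gamma\Big(\sup_{0 \le \tau \le t}\|u(\tau)\|\Big), \qquad t \ge 0.$$
The state is eventually bounded by a gain of the input size, and decays to zero when the input does.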
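The simplest instance of the ISS estimate is the scalar system $\dot x = -x + u$, for which $\beta(r, t) = r e^{-t}$ and $\gamma(r) = r$ work. The sketch below checks that bound along a simulated trajectory; the system, the input signal, and the initial state are illustrative assumptions rather than anything from the companion.

```python
import math

U_SUP = 0.5  # sup norm of the input signal defined below


def u(t):
    """Bounded disturbance input with |u(t)| <= U_SUP."""
    return U_SUP * math.sin(3.0 * t)


def f(x, t):
    """Scalar ISS example dx/dt = -x + u(t)."""
    return -x + u(t)


def rk4_step(x, t, dt):
    """One classical Runge-Kutta step for a time-varying scalar ODE."""
    k1 = f(x, t)
    k2 = f(x + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = f(x + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = f(x + dt * k3, t + dt)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


x0 = 2.0
x = x0
dt = 1e-3
worst_margin = float("inf")  # smallest value of (ISS bound - |x(t)|)
for step in range(10000):    # 10 seconds
    x = rk4_step(x, step * dt, dt)
    t = (step + 1) * dt
    bound = math.exp(-t) * abs(x0) + U_SUP  # beta(|x0|, t) + gamma(sup |u|)
    worst_margin = min(worst_margin, bound - abs(x))

print(f"smallest margin between the ISS bound and |x(t)|: {worst_margin:.4f}")
print(f"|x(10)| = {abs(x):.4f}, input gain term = {U_SUP}")
```

The transient term $\beta$ dies out and what remains is bounded by the gain of the input size, which is the "eventually bounded" reading of the definition.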
ISS, due to Sontag, makes “small disturbance, small deviation” precise and composable: a cascade of ISS subsystems is ISS, which is what lets large nonlinear designs be built and certified piece by piece (Sontag, 1998). Backstepping is the constructive engine that exploits this. For systems in strict-feedback (cascade) form, it builds the controller and a Lyapunov function recursively: stabilize the first subsystem treating the next state as a virtual control, define the error between that state and its desired value, augment the Lyapunov function with a quadratic in the error, and step inward until the real input appears (Kellett & Braun, 2023). The output is a controller and a Lyapunov certificate delivered together: Lyapunov design turned into an algorithm, and the classical counterpart of the “value function as a learnable object” that model-based RL will pursue in Part III.
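The recursion can be sketched on a standard two-state strict-feedback example, $\dot x_1 = x_1^2 + x_2$, $\dot x_2 = u$, chosen here for illustration (it does not appear in the chapter or its companion). The virtual control is $\alpha(x_1) = -x_1^2 - x_1$, the error is $z = x_2 - \alpha(x_1)$, the certificate is $V = \tfrac12 x_1^2 + \tfrac12 z^2$, and the control $u = \dot\alpha - x_1 - z$ gives $\dot V = -x_1^2 - z^2 = -2V$.

```python
import math


def alpha(x1):
    """Virtual control for the first subsystem: makes dx1/dt = -x1 + z."""
    return -x1**2 - x1


def control(x):
    """Backstepping law u = d(alpha)/dt - x1 - z."""
    x1, x2 = x
    z = x2 - alpha(x1)
    alpha_dot = -(2.0 * x1 + 1.0) * (x1**2 + x2)  # chain rule along dx1/dt
    return alpha_dot - x1 - z


def closed_loop(x):
    x1, x2 = x
    return (x1**2 + x2, control(x))


def lyapunov(x):
    """Composite certificate V = x1^2 / 2 + z^2 / 2."""
    x1, x2 = x
    z = x2 - alpha(x1)
    return 0.5 * x1**2 + 0.5 * z**2


def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for a tuple-valued state."""
    def shifted(k, h):
        return tuple(xi + h * ki for xi, ki in zip(x, k))

    k1 = f(x)
    k2 = f(shifted(k1, 0.5 * dt))
    k3 = f(shifted(k2, 0.5 * dt))
    k4 = f(shifted(k3, dt))
    return tuple(
        xi + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


x = (1.0, 0.5)
dt, t_end = 1e-3, 5.0
v0 = lyapunov(x)
v_prev = v0
max_increase = -float("inf")  # largest step-to-step change in V
for _ in range(round(t_end / dt)):
    x = rk4_step(closed_loop, x, dt)
    v = lyapunov(x)
    max_increase = max(max_increase, v - v_prev)
    v_prev = v

v_end = v_prev
v_predicted = v0 * math.exp(-2.0 * t_end)  # from dV/dt = -2 V
print(f"V(0) = {v0:.4f}, V(5) = {v_end:.3e}, predicted {v_predicted:.3e}")
print(f"largest step-to-step change in V: {max_increase:.2e}")
```

The controller and the certificate come out of the same two-step construction, and the simulated decay of $V$ matches the rate the construction promises.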
Nonlinear control versus deep RL
Lay the chapter beside the reinforcement-learning half of the curriculum. Both seek a feedback policy and a scalar certificate of good behavior; they differ in what they assume and what they pay.
- Nonlinear control assumes a model with structure — input-affine form, a known relative degree, matched uncertainty, a physical energy. Given that structure, it returns a controller with a stability proof and no sampling: computed torque, a sliding surface, a backstepping Lyapunov function. The certificate is designed, not learned.
- Deep RL assumes samples, not structure. It learns a policy $\pi$ and a value function $V$ from interaction, paying in data and variance, and buys the ability to handle dynamics no one can write down (contact, friction, pixels), where relative degree and clean cancellations do not exist.
The dividing line is whether the structure is available and trustworthy. When it is, control wins on guarantees and sample cost; when the structure is absent or too hard to obtain, sampling wins on reach. Part III is the synthesis: model predictive control (Week 15) turns a model into a policy by online optimization, and the convergence weeks graft learned value functions onto controllers with Lyapunov-style guarantees — RL’s reach with control’s certificates.
What’s next
- Week 15 (model predictive control). Rather than design one feedback law, re-solve a finite-horizon optimal control problem at every step and apply the first move. MPC is online approximate dynamic programming: Chapter 13’s Riccati value function becomes a horizon-$N$ optimization, the Lyapunov certificate of this chapter reappears as a terminal cost, and constraints, which neither LQR nor feedback linearization handles, become first-class.
Exercises
-
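(Derive) For the hanging pendulum with energy $V = \tfrac12\omega^2 + \frac{g}{l}(1 - \cos\theta)$ and dynamics $\dot\theta = \omega$, $\dot\omega = -\frac{g}{l}\sin\theta - b\omega$, compute $\dot V$ and show the gravity terms cancel, leaving $\dot V = -b\omega^2$.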
Solution
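$\nabla V = \left(\frac{g}{l}\sin\theta,\ \omega\right)$ and $f = \left(\omega,\ -\frac{g}{l}\sin\theta - b\omega\right)$. Their inner product is $\frac{g}{l}\omega\sin\theta - \frac{g}{l}\omega\sin\theta - b\omega^2 = -b\omega^2$. The cross terms cancel; only dissipation remains. For $b > 0$ this is negative semidefinite, so Theorem 14.1 gives stability and LaSalle (Theorem 14.2) upgrades it to asymptotic stability.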
-
(Prove) Show that $\dot V < 0$ for $x \ne 0$ implies the limit of $V(x(t))$ is a value at which $\dot V = 0$, and conclude the origin is asymptotically stable.
Solution
$V(x(t))$ is monotonically decreasing and bounded below by $0$, hence convergent to some $c \ge 0$. If $c > 0$ the trajectory stays in the compact shell $\{c \le V(x) \le V(x(0))\}$, where by continuity $\dot V \le -\mu < 0$, forcing $V(x(t)) \to -\infty$, a contradiction. So $c = 0$, and by positive definiteness $x(t) \to 0$. (This is the strict-decrease half of Theorem 14.1; LaSalle handles the semidefinite case.)
-
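(Compute) A system has relative degree $2$: $\dot x_1 = x_2$, $\dot x_2 = a(x) + u$, $y = x_1$, with $a(x)$ a known smooth nonlinearity. Find the feedback-linearizing control that imposes $\ddot y + k_2\dot y + k_1 y = 0$ and give the closed-loop poles.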
Solution
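$\ddot y = \dot x_2 = a(x) + u$, so $u = -a(x) + v$ with $v = -k_1 y - k_2\dot y$ yields $\ddot y + k_2\dot y + k_1 y = 0$. The Lie-derivative bookkeeping: $L_g h = 0$, $L_f h = x_2$, $L_g L_f h = 1 \ne 0$ (relative degree $2$). Poles are the roots of $s^2 + k_2 s + k_1$; pick $k_1, k_2 > 0$ for a Hurwitz pair.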
-
(Prove) For the reaching law $\dot s = -\eta\,\mathrm{sgn}(s)$, show $s$ hits zero by time $|s(0)|/\eta$. Why does matched uncertainty not change this bound?
Solution
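With $V = \tfrac12 s^2$, $\dot V = s\dot s = -\eta|s|$. Separating, $|s|$ decreases at constant rate $\eta$, so $s$ reaches $0$ in time $|s(0)|/\eta$. A matched disturbance enters as $\dot s = -k\,\mathrm{sgn}(s) + d$ with $|d| \le D$; choosing $k \ge D + \eta$ keeps $s\dot s \le -\eta|s|$, so the surface is still reached within the same bound: the gain dominates the uncertainty rather than canceling it.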
-
(Implement) In the companion, verify the three sliding-mode claims: finite-time reaching within $|s(0)|/\eta$, the surface maintained in the boundary layer afterward, and convergence under a mismatched model.
Solution
See experiments/python/week14/test_nonlinear.py: test_sliding_mode_reaches_surface_in_bounded_time checks the reaching bound and boundary-layer maintenance; test_sliding_mode_robust_to_matched_parameter_error builds the controller with a wrong length (so a wrong gravity coefficient) yet still reaches the surface and converges, which is matched-uncertainty robustness.
-
(Extend) Argue why the quadratic LQR cost-to-go $V(x) = x^\top P x$ of Chapter 13 is a Lyapunov function for the optimal closed loop, and use this to connect Lyapunov design to value functions.
Solution
From Chapter 13, $V(x_{k+1}) - V(x_k) = -(x_k^\top Q x_k + u_k^\top R u_k)$ along the optimal closed loop $x_{k+1} = (A - BK)x_k$, so with $Q \succ 0$ the function $V(x) = x^\top P x$ is positive definite with $V(x_{k+1}) - V(x_k) < 0$ for $x_k \ne 0$: a Lyapunov function that also equals the optimal value. Lyapunov design chooses such a certificate directly (energy, $\tfrac12 s^2$, a backstepping sum) instead of solving the optimal-control problem for it; it is the same object, reached without the full Hamilton–Jacobi–Bellman computation. This is the seam model-based RL works along in Part III.
Companion code
The Week-14 companion lives at experiments/python/week14/ (pure numpy, RK4 integration with the controller evaluated at each stage for a faithful continuous-time closed loop).
- nonlinear.py: the pendulum model (hanging and upright conventions); the energy Lyapunov function $V$ with its analytic and numeric $\dot V$; a certificate check that passes for the damped pendulum and fails for the anti-damped one; the computed-torque feedback-linearizing law and its target linear system; and the boundary-layer sliding-mode law with the sliding surface and reaching time.
- test_nonlinear.py: mathematical-correctness tests: the analytic $\dot V$ matches the numeric $\dot V$; the certificate discriminates damped from anti-damped; the damped pendulum’s energy is monotone and the state converges (LaSalle); feedback linearization reproduces the target linear system exactly with the prescribed poles; and sliding mode reaches the surface in bounded time, stays in the boundary layer, and is robust to matched parameter error.
# nonlinear-control algorithms + correctness tests
PYTHONPATH=. pytest experiments/python/week14/test_nonlinear.py -q
# worked Lyapunov / feedback-linearization / sliding-mode demonstrations on the pendulum
PYTHONPATH=. python experiments/python/week14/nonlinear.py