Part: control Week 12 Published structural.py test_structural.py

Stability, Controllability, and Observability

The structural properties of a linear model knowable before any controller exists: internal stability via eigenvalues and the Lyapunov equation, controllability and observability as reachability and state-identifiability, and the duality that makes them one theory — and makes the LQR regulator and the Kalman estimator one computation.

On this page

Internal stability
Controllability
Observability
Duality
The structural bridge to learning
What’s next
Exercises
Companion code

Stability, Controllability, and Observability

Where we are. Week 11 built the linear state-space model and showed the transfer function is its coordinate-free invariant. Before designing any controller, three structural questions decide what is achievable on that model: is it internally stable; can the input steer the whole state (controllability); can the output reconstruct the whole state (observability)? These are the linear-systems analogues of reachability and identifiability in reinforcement learning — they bound what any policy or estimator can do, before optimality is even on the table. LQR and the Kalman filter (Week 13) will need exactly the mild weakenings of these properties.

Chapter 12 — at a glance

Goal. Characterize internal stability by eigenvalue location and by the Lyapunov equation; define controllability and observability with their rank tests; and establish the duality that makes them one theory.

Reading time. ~40 minutes; ~70 with the proofs and exercises.

Key insight — the DP bridge. Controllability is reachability — which states an input sequence can attain, the support of the reachable set under known dynamics. Observability is state identifiability from outputs — the linear shadow of the MDP-versus-POMDP distinction. LQR/LQG (Week 13) need only the weaker stabilizability and detectability (any uncontrollable / unobservable mode is already stable), which are precisely the conditions under which the dynamic-programming value function — the Riccati solution — exists and stabilizes. Stability itself is the spectral test that the closed-loop Bellman map is a contraction.

Internal stability

Definition 12.1 (Asymptotic stability).

The autonomous system $\dot{\statevec} = \statemat\statevec$ (zero input) is asymptotically stable if $\statevec(t) \to 0$ as $t \to \infty$ from every initial state. Its discrete-time counterpart $\statevec_{k+1} = \statemat\statevec_k$ is asymptotically stable if $\statevec_k \to 0$ for every $\statevec_0$ .

For a linear system, asymptotic stability is decided entirely by the spectrum of $\statemat$ .

Theorem 12.1 (Spectral stability condition).

The continuous-time system $\dot{\statevec} = \statemat\statevec$ is asymptotically stable iff every eigenvalue of $\statemat$ has strictly negative real part — $\statemat$ is Hurwitz, $\spec(\statemat) \subset \{\lambda : \operatorname{Re}\lambda < 0\}$ . The discrete-time system is asymptotically stable iff every eigenvalue lies strictly inside the unit disk — $\statemat$ is Schur, the spectral radius $\spectralradius(\statemat) < 1$ .

The reason is the modal decomposition: solutions of $\dot{\statevec} = \statemat\statevec$ are combinations of $e^{\lambda t}$ over the eigenvalues $\lambda$ (with polynomial factors at repeated eigenvalues), and $e^{\lambda t} \to 0$ iff $\operatorname{Re}\lambda < 0$ ; in discrete time the modes are $\lambda^k$ , which vanish iff $|\lambda| < 1$ .

Eigenvalues answer the question but require an eigendecomposition. The Lyapunov equation gives an equivalent algebraic certificate — and it is the template for the Riccati equation of Week 13.

Theorem 12.2 (Lyapunov stability theorem).

$\statemat$ is Hurwitz iff for every symmetric $Q \succ 0$ the Lyapunov equation

\statemat^\top P + P\statemat = -Q

has a unique solution $P$ , and that $P$ is symmetric positive definite. Then $V(\statevec) = \statevec^\top P \statevec$ is a Lyapunov function: $V \succ 0$ and $\dot V = -\statevec^\top Q \statevec < 0$ along every nonzero trajectory.

Proof.

( $\Leftarrow$ , the construction.) Suppose $\statemat$ is Hurwitz and fix $Q \succ 0$ . Define

P \defeq \int_0^\infty e^{\statemat^\top t}\, Q\, e^{\statemat t}\,dt .

Hurwitzness gives $\lVert e^{\statemat t} \rVert \le M e^{-\alpha t}$ for some $M,\alpha > 0$ , so the integrand decays exponentially and the integral converges; $P$ is symmetric because $Q$ is. It is positive definite: for $\statevec \neq 0$ , $\statevec^\top P \statevec = \int_0^\infty (e^{\statemat t}\statevec)^\top Q\,(e^{\statemat t}\statevec)\,dt > 0$ since $Q \succ 0$ and $e^{\statemat t}\statevec \neq 0$ . Finally it solves the equation:

\begin{aligned} \statemat^\top P + P\statemat &= \int_0^\infty \big(\statemat^\top e^{\statemat^\top t} Q\, e^{\statemat t} + e^{\statemat^\top t} Q\, e^{\statemat t}\statemat\big)\,dt && \text{(linearity)} \\ &= \int_0^\infty \frac{d}{dt}\!\left( e^{\statemat^\top t} Q\, e^{\statemat t}\right) dt && \big(\tfrac{d}{dt}e^{\statemat t} = \statemat e^{\statemat t} = e^{\statemat t}\statemat\big) \\ &= \Big[\, e^{\statemat^\top t} Q\, e^{\statemat t}\,\Big]_0^\infty = 0 - Q = -Q . && \text{(Hurwitz: upper limit vanishes)} \end{aligned}

The Lyapunov-function claim follows by differentiating $V(\statevec) = \statevec^\top P \statevec$ along $\dot{\statevec} = \statemat\statevec$ : $\dot V = \statevec^\top(\statemat^\top P + P\statemat)\statevec = -\statevec^\top Q \statevec < 0$ . The converse — a positive-definite $P$ forcing the spectrum into the open left half-plane — is the standard direction in Sontag (1998) . $\qquad\blacksquare$

So stability has two faces: a spectral test (eigenvalues) and an algebraic certificate (a quadratic energy $V$ that strictly decreases). The Lyapunov view generalizes to nonlinear systems (Week 14) and, with a control term added, becomes the Riccati equation that solves LQR (Week 13).

Controllability

Definition 12.2 (Controllability).

The pair $(\statemat,\inputmat)$ is controllable if for any initial state $\statevec_0$ and any target $\statevec_f$ there is an input $u(\cdot)$ on some finite interval that drives the state from $\statevec_0$ to $\statevec_f$ . Equivalently, every state is reachable from the origin.

Theorem 12.3 (Kalman controllability rank condition).

$(\statemat,\inputmat)$ with $\statemat \in \R^{n\times n}$ is controllable iff the controllability matrix

\ctrbmat = \begin{pmatrix} \inputmat & \statemat\inputmat & \statemat^2\inputmat & \cdots & \statemat^{\,n-1}\inputmat \end{pmatrix} \in \R^{n\times nm}

has full row rank, $\rank\,\ctrbmat = n$ .

Only powers up to $\statemat^{n-1}$ appear: by the Cayley–Hamilton theorem $\statemat^n$ is a linear combination of $I, \statemat, \dots, \statemat^{n-1}$ , so higher powers add no new directions.

An equivalent test uses the controllability Gramian

W_c = \int_0^T e^{\statemat\tau}\inputmat\inputmat^\top e^{\statemat^\top\tau}\,d\tau

, which is positive definite iff

(\statemat,\inputmat)

is controllable; for Hurwitz

\statemat

the infinite-horizon Gramian solves a Lyapunov equation

\statemat W_c + W_c\statemat^\top = -\inputmat\inputmat^\top

, tying controllability back to the previous section.

Observability

Observability is the same question asked of the output map: can the measurements pin down the state?

Definition 12.3 (Observability).

The pair $(\statemat,\outputmat)$ is observable if the initial state $\statevec_0$ can be uniquely determined from the output $y(\cdot)$ over any finite interval (the input being known).

Theorem 12.4 (Observability rank condition).

$(\statemat,\outputmat)$ with $\statemat \in \R^{n\times n}$ , $\outputmat \in \R^{p\times n}$ is observable iff the observability matrix

\obsvmat = \begin{pmatrix} \outputmat \\ \outputmat\statemat \\ \outputmat\statemat^2 \\ \vdots \\ \outputmat\statemat^{\,n-1} \end{pmatrix} \in \R^{np\times n}

has full column rank, $\rank\,\obsvmat = n$ .

The two rank conditions are visibly mirror images — $\ctrbmat$ stacks $\statemat^k\inputmat$ horizontally, $\obsvmat$ stacks $\outputmat\statemat^k$ vertically. That mirror is not a coincidence.

Duality

Proposition 12.1 (Controllability–observability duality).

$(\statemat,\outputmat)$ is observable iff $(\statemat^\top,\outputmat^\top)$ is controllable; dually, $(\statemat,\inputmat)$ is controllable iff $(\statemat^\top,\inputmat^\top)$ is observable. Concretely, the observability matrix of $(\statemat,\outputmat)$ is the transpose of the controllability matrix of the dual pair,

\obsvmat(\statemat,\outputmat) = \ctrbmat(\statemat^\top,\outputmat^\top)^\top .

Proof.

Write the dual controllability matrix, and transpose its blocks one at a time:

\ctrbmat(\statemat^\top,\outputmat^\top) = \begin{pmatrix} \outputmat^\top & \statemat^\top\outputmat^\top & \cdots & (\statemat^\top)^{n-1}\outputmat^\top \end{pmatrix}, \qquad \big[(\statemat^\top)^k\outputmat^\top\big]^\top = \outputmat\statemat^k .

Transposing turns the horizontal blocks into vertical ones, $\big[\,\outputmat^\top \mid \statemat^\top\outputmat^\top \mid \cdots\,\big]^\top = (\outputmat;\ \outputmat\statemat;\ \dots;\ \outputmat\statemat^{\,n-1}) = \obsvmat(\statemat,\outputmat)$ . Since rank is invariant under transposition, $\rank\,\obsvmat(\statemat,\outputmat) = \rank\,\ctrbmat(\statemat^\top,\outputmat^\top)$ , and the rank conditions of Theorems 12.3–12.4 give the equivalence. $\qquad\blacksquare$

Duality halves the theory: every controllability result has a free observability twin. It is also why LQR (state feedback) and the Kalman filter (state estimation) are the same Riccati computation run on dual systems — the deepest structural fact Week 13 will exploit.

The structural bridge to learning

These properties are the control-theoretic statements of limits that also bound reinforcement learning:

Controllability ↔ reachability. An uncontrollable mode is a direction of state space no input — and therefore no policy — can affect. It is the linear, exact version of the reachable-set question that determines what any RL agent can possibly accomplish on a system.
Observability ↔ identifiability. An unobservable mode is a latent direction the outputs never reveal, so no estimator can recover it from data. This is precisely the gap between a fully observed MDP and a POMDP, where the agent must act on a belief over hidden state.
The weakenings LQR needs. Optimal control does not require full controllability/observability, only stabilizability (every uncontrollable mode is already stable) and detectability (every unobservable mode is already stable). These are the exact conditions under which the dynamic-programming value function of Week 13 exists and yields a stabilizing policy — a structural prerequisite, checked before any optimization, for the Bellman recursion to have a well-behaved fixed point.

What’s next

Week 13 (LQR/LQG). Put a quadratic cost on the controllable linear model and the Bellman equation collapses to the algebraic Riccati equation — the Lyapunov equation of this chapter with a control term subtracted. Stabilizability and detectability are exactly what make its stabilizing solution exist; duality makes the optimal regulator and the optimal estimator one calculation. This is where dynamic programming and control theory become the same equation.

Exercises

(Compute) For $\statemat = \begin{pmatrix} -1 & 2 \\ 0 & -3 \end{pmatrix}$ , decide continuous-time (Hurwitz) and discrete-time (Schur) asymptotic stability.

Solution
The eigenvalues are $-1, -3$ (triangular matrix). Both have $\operatorname{Re} < 0$ , so $\statemat$ is Hurwitz and the flow $\dot{\statevec} = \statemat\statevec$ is asymptotically stable. As a discrete map, $|-1| = 1$ and $|-3| = 3$ are not inside the unit disk, so $\statemat$ is not Schur and $\statevec_{k+1} = \statemat\statevec_k$ is unstable. The same matrix is stable as a flow but unstable as a map — stability is a property of the system type, not of $\statemat$ alone.
(Prove) With $V(\statevec) = \statevec^\top P \statevec$ , $P \succ 0$ solving $\statemat^\top P + P\statemat = -Q$ for $Q \succ 0$ , show $\dot V < 0$ along $\dot{\statevec} = \statemat\statevec$ and conclude asymptotic stability.

Solution
$\dot V = \dot{\statevec}^\top P \statevec + \statevec^\top P \dot{\statevec} = \statevec^\top(\statemat^\top P + P\statemat)\statevec = -\statevec^\top Q \statevec < 0$ for $\statevec \neq 0$ . A positive-definite function strictly decreasing along every trajectory forces $\statevec \to 0$ (Lyapunov’s direct method), so the origin is asymptotically stable — without computing a single eigenvalue.
(Compute) For $\statemat = \begin{pmatrix} -1 & 0 \\ 0 & -2 \end{pmatrix}$ , $\inputmat = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ , form $\ctrbmat$ and decide controllability.

Solution
$\ctrbmat = (\inputmat\ \ \statemat\inputmat) = \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}$ , which has rank $1 < 2$ , so $(\statemat,\inputmat)$ is uncontrollable. The second mode (the $-2$ eigendirection) has no input coupling, so no $u$ can move it. Because that mode is already stable, the system is still stabilizable — the weaker property LQR needs.
(Prove) Show $\obsvmat(\statemat,\outputmat) = \ctrbmat(\statemat^\top,\outputmat^\top)^\top$ , and hence observability of $(\statemat,\outputmat)$ is equivalent to controllability of $(\statemat^\top,\outputmat^\top)$ .

Solution
As in Proposition 12.1: $[(\statemat^\top)^k\outputmat^\top]^\top = \outputmat\statemat^k$ , so transposing the horizontal blocks of $\ctrbmat(\statemat^\top,\outputmat^\top)$ produces the vertical observability stack $\obsvmat(\statemat,\outputmat)$ . Rank is invariant under transposition, so the two full-rank conditions coincide.
(Implement) In the companion, verify that the controllable/uncontrollable and observable/unobservable examples have the predicted ranks, that the Lyapunov solution $P$ is positive definite iff $\statemat$ is Hurwitz, and that duality holds as a rank equality.

Solution
See experiments/python/week12/test_structural.py: is_controllable/is_observable match the Kalman rank conditions; lyapunov_solve returns a positive-definite $P$ for the Hurwitz oscillator and an indefinite $P$ for an unstable $\statemat$ (the certificate failing exactly when stability does); and observability_matrix(A.T, C.T) equals controllability_matrix(A, C).T.
(Extend) The rank tests are exact in arithmetic but decided numerically by a singular-value threshold. Show that the verdict of numpy.linalg.matrix_rank on a controllable system depends on the tolerance convention: scale the input by $\varepsilon$ and compare numpy’s default (relative) tolerance against a fixed absolute one.

Solution
Scaling the input column by $\varepsilon$ shrinks every singular value of $\ctrbmat$ in proportion, so $\ctrbmat$ stays full rank for every $\varepsilon > 0$ . numpy’s default tolerance is relative — $\sigma_{\max}\cdot\max(m,n)\cdot\varepsilon_{\text{mach}}$ — hence scale-invariant: it reports rank $n$ at every $\varepsilon$ . A fixed absolute threshold tells a different story: once the smallest singular value drops below it, the same controllable system reads as rank-deficient. The companion’s --scaling-demo prints both verdicts side by side (the default stays $2$ ; the absolute one flips to $1$ , then $0$ ). The lesson: controllability is binary in exact arithmetic but graded in finite precision, and “numerically uncontrollable” names a chosen tolerance, not the system — the Gramian’s smallest eigenvalue is the tolerance-free measure of how controllable a system is.

Companion code

The Week-12 companion lives at experiments/python/week12/ (Python, on scipy + numpy, cross-checked against python-control).

structural.py — Hurwitz / Schur stability tests; the Lyapunov solver for $\statemat^\top P + P\statemat = -Q$ (scipy.linalg.solve_continuous_lyapunov); controllability and observability matrices with their rank tests; the controllability Gramian; and a built (un)controllable / (un)observable dual pair.
test_structural.py — mathematical-correctness tests: the stability tests agree with the eigenvalues; the Lyapunov $P$ is positive definite iff $\statemat$ is Hurwitz, with residual $\statemat^\top P + P\statemat + Q \approx 0$ ; the rank conditions classify the controllable/observable examples and their rank-deficient duals; the Gramian solves $\statemat W_c + W_c\statemat^\top = -\inputmat\inputmat^\top$ and is positive definite iff controllable; and duality holds as $\obsvmat(\statemat,\outputmat) = \ctrbmat(\statemat^\top,\outputmat^\top)^\top$ .

# structural algorithms + correctness tests (scipy + python-control cross-check)
PYTHONPATH=. pytest experiments/python/week12/test_structural.py -q

# worked summary table for several small systems + the rank-under-scaling demo
PYTHONPATH=. python experiments/python/week12/structural.py --scaling-demo