
Optimal Control

Following [1], let \([0, T]\) be a finite and fixed time horizon and suppose \begin{align*} A: [0, T] \rightarrow \mathbb{R}^{n \times n}&, \;\; t \mapsto A(t) \\ B: [0, T] \rightarrow \mathbb{R}^{n \times m}&, \;\; t \mapsto B(t) \\ Q: [0, T] \rightarrow \mathbb{R}^{n \times n}&, \;\; t \mapsto Q(t) \\ R: [0, T] \rightarrow \mathbb{R}^{m \times m}&, \;\; t \mapsto R(t) \end{align*} are continuous matrix-valued functions defined on \([0, T]\). We assume that the matrices \(Q(t)\) and \(R(t)\) are symmetric and in addition that \(Q(t)\) is positive semidefinite and \(R(t)\) is positive definite for all \(t \in [0, T]\). Furthermore, let \(S_T \in \mathbb{R}^{n \times n}\) be a constant, symmetric, and positive semidefinite matrix.

Linear-Quadratic Regulator

The linear-quadratic regulator is then the following optimal control problem [LQ]: find a continuous function \(u: [0,T] \rightarrow \mathbb{R}^m\), the control, that minimizes a quadratic objective of the form \begin{equation} \begin{split} { J(u) = \frac{1}{2}\int_0^T \left[ x^\top(t)Q(t)x(t) + u^\top(t)R(t)u(t) \right] dt } \\ { + \frac{1}{2}x^\top(T)S_Tx(T) } \label{eq:cost_fcn} \end{split} \end{equation} subject to the linear dynamics \begin{equation*} { \dot{x}(t) = A(t)x(t) + B(t)u(t), \quad x(0) = x_0. } \end{equation*}

Theorem. The solution to the linear-quadratic optimal control problem [LQ] is given by the linear feedback control \begin{equation} u_*(t,x) = -R(t)^{-1}B(t)^\top S(t) x, \label{eq:lqr_ctrl} \end{equation} where \(S\) is the solution to the Riccati terminal value problem \begin{align} \begin{split} \dot{S} + SA(t) + A^\top(t)S - SB(t)R(t)^{-1}B^\top(t)S + Q(t) &= 0 \\ S(T) &= S_T. \end{split} \label{eq:dynamic_riccati} \end{align}

Proof. Let \(u: [0,T] \rightarrow \mathbb{R}^m\) be any continuous control and let \(x: [0,T] \rightarrow \mathbb{R}^n\) denote the corresponding trajectory. Dropping the argument \(t\) from the notation, we have for any differentiable, symmetric matrix-valued function \(S: [0,T] \rightarrow \mathbb{R}^{n \times n}\) that \begin{align*} \frac{d}{dt} \left( x^\top S x \right) &= \dot{x}^\top S x + x^\top \dot{S} x + x^\top S \dot{x} \\ &= \left( Ax + Bu \right)^\top S x + x^\top \dot{S} x + x^\top S \left( Ax + Bu \right). \end{align*} Integrating this identity from \(0\) to \(T\) gives \(x^\top(T)S(T)x(T) - x_0^\top S(0)x_0\); adding the integral to the objective and subtracting the boundary terms (a net change of zero), we can express the cost equivalently as \begin{equation*} { J(u) = \frac{1}{2} \int_0^T \left[ x^\top \left( Q + A^\top S + \dot{S} + SA \right)x + x^\top SB u + u^\top B^\top S x + u^\top R u \right] dt } \\ { + \frac{1}{2}x^\top(T) \left[ S_T - S(T) \right]x(T) + \frac{1}{2}x_0^\top S(0) x_0 }. \end{equation*} For the moment, let us assume that there exists a solution \(S\) to the matrix Riccati equation \eqref{eq:dynamic_riccati} on the full interval \([0,T]\). Completing the square, the objective then simplifies to \begin{equation*} J(u) = \frac{1}{2}\int_0^T \left( u + R^{-1}B^\top Sx \right)^\top R \left( u + R^{-1}B^\top Sx \right) dt + \frac{1}{2}x_0^\top S(0)x_0. \end{equation*} Since the matrix \(R(t)\) is continuous and positive definite on \([0,T]\), the minimum is attained if and only if \begin{equation*} u(t) = -R^{-1}(t)B^\top(t)S(t)x(t), \end{equation*} and the minimum value is \begin{equation*} J_* = \frac{1}{2}x_0^\top S(0)x_0. \end{equation*} For this argument to be valid, it remains to show that a solution \(S\) of the terminal value problem \eqref{eq:dynamic_riccati} indeed exists on all of \([0,T]\). It follows from general results on the existence of solutions to ordinary differential equations that a solution exists on some maximal interval \((\tau, T]\) and that, as \(t \searrow \tau\) (i.e., \(t \rightarrow \tau\) with \(t > \tau\)), at least one of the components of \(S(t)\) must diverge to \(+\infty\) or \(-\infty\). For if this were not the case, then by the local existence theorem for ODEs, the solution could be extended onto some small interval \((\tau-\epsilon, \tau+\epsilon)\), contradicting the maximality of \((\tau, T]\). In general, however, this explosion time \(\tau\) could lie in \([0, T)\), which would invalidate the argument above.
That this does not happen for the linear-quadratic regulator problem is a consequence of the positivity assumptions in the objective, specifically, the definiteness assumptions on the matrices \(R\), \(Q\), and \(S_T\).

$\blacksquare$
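To make the construction in the theorem concrete, here is a minimal numerical sketch that integrates the Riccati terminal value problem \eqref{eq:dynamic_riccati} backwards in time and assembles the feedback law \eqref{eq:lqr_ctrl}. The problem data (a double integrator with identity weights) and the helper names `riccati_rhs` and `feedback` are hypothetical choices for illustration only; constant matrices are used for brevity, but the same structure carries over to continuous \(A(t)\), \(B(t)\), \(Q(t)\), \(R(t)\).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical problem data: a double integrator with identity weights.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
S_T = np.zeros((2, 2))   # terminal weight S_T
T = 5.0                  # horizon length
n = A.shape[0]

def riccati_rhs(t, s_flat):
    """Riccati ODE: dS/dt = -(S A + A' S - S B R^{-1} B' S + Q)."""
    S = s_flat.reshape(n, n)
    dS = -(S @ A + A.T @ S - S @ B @ np.linalg.solve(R, B.T) @ S + Q)
    return dS.ravel()

# Integrate backwards in time from the terminal condition S(T) = S_T down to t = 0.
sol = solve_ivp(riccati_rhs, [T, 0.0], S_T.ravel(), dense_output=True, rtol=1e-8)

def feedback(t, x):
    """Optimal feedback u_*(t, x) = -R^{-1} B' S(t) x."""
    S = sol.sol(t).reshape(n, n)
    return -np.linalg.solve(R, B.T @ S @ x)

# Example: evaluate the control at t = 0 for an initial state x_0.
x0 = np.array([1.0, 0.0])
print(feedback(0.0, x0))
```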

Corollary. If \(A(t) \equiv A\), \(B(t) \equiv B\), \(Q(t) \equiv Q\), and \(R(t) \equiv R\) are constant and \(T \rightarrow \infty\), the same conclusion holds with the Riccati differential equation \eqref{eq:dynamic_riccati} replaced by the algebraic Riccati equation \begin{equation} SA + A^\top S - SBR^{-1}B^\top S + Q = 0, \label{eq:algebraic_riccati} \end{equation} and the optimal control becomes the constant linear feedback \(u_*(x) = -R^{-1}B^\top S x\).
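For the infinite-horizon case of the corollary, a short sketch (again with the hypothetical double-integrator data from above) that computes the stabilizing solution of \eqref{eq:algebraic_riccati} with SciPy's `solve_continuous_are` and forms the constant gain might look as follows.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical problem data: the same double-integrator example.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Stabilizing solution S of  S A + A' S - S B R^{-1} B' S + Q = 0.
S = solve_continuous_are(A, B, Q, R)

# Constant feedback gain K, so that u_*(x) = -K x.
K = np.linalg.solve(R, B.T @ S)

# The closed-loop matrix A - B K should have all eigenvalues in the open left half-plane.
print(np.linalg.eigvals(A - B @ K))
```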

Solving the Algebraic Riccati Equation

It is possible to find the solution to the algebraic Riccati equation via an eigendecomposition of an associated \(2n \times 2n\) matrix. We define the Hamiltonian matrix \begin{equation*} Z = \begin{bmatrix} A & -BR^{-1}B^\top \\ -Q & -A^\top \end{bmatrix}. \end{equation*} Since \(Z\) is Hamiltonian, its spectrum is symmetric with respect to the imaginary axis; hence, if it has no eigenvalues on the imaginary axis, exactly \(n\) of its eigenvalues have negative real part. If we denote by \( \begin{bmatrix} U_1^\top & U_2^\top \end{bmatrix}^\top \) a \(2n \times n\) matrix whose columns form a basis of the invariant subspace associated with these stable eigenvalues, then, provided \(U_1\) is invertible, \begin{equation*} S = U_2U_1^{-1} \end{equation*} is a solution of the algebraic Riccati equation \eqref{eq:algebraic_riccati}; furthermore, the eigenvalues of the closed-loop matrix \(A - BR^{-1}B^\top S\) are exactly the eigenvalues of \(Z\) with negative real part.
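The construction above can be sketched directly in code; the block below uses a plain eigendecomposition of \(Z\) and the hypothetical double-integrator data from the earlier examples. (A numerically robust implementation would typically use an ordered Schur decomposition instead; this is only an illustration of the formula \(S = U_2 U_1^{-1}\).)

```python
import numpy as np

# Hypothetical problem data: the same double-integrator example.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
n = A.shape[0]

# Hamiltonian matrix Z of the associated 2n x 2n eigenvalue problem.
Z = np.block([[A, -B @ np.linalg.solve(R, B.T)],
              [-Q, -A.T]])

# Eigenvectors belonging to eigenvalues with negative real part span
# the stable invariant subspace of Z.
eigvals, eigvecs = np.linalg.eig(Z)
stable = eigvecs[:, eigvals.real < 0]
U1, U2 = stable[:n, :], stable[n:, :]

# S = U2 U1^{-1}; the result is real up to round-off because the stable
# eigenvalues come in complex-conjugate pairs.
S = np.real(U2 @ np.linalg.inv(U1))

# The residual of the algebraic Riccati equation should be numerically zero.
residual = S @ A + A.T @ S - S @ B @ np.linalg.solve(R, B.T) @ S + Q
print(np.linalg.norm(residual))

# The closed-loop eigenvalues coincide with the stable eigenvalues of Z.
print(np.linalg.eigvals(A - B @ np.linalg.solve(R, B.T) @ S))
print(eigvals[eigvals.real < 0])
```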