Optimal Control
Following [1], let $[0,T]$ be a finite and fixed time horizon and suppose
$$
A:[0,T]\to\mathbb{R}^{n\times n},\; t\mapsto A(t), \qquad B:[0,T]\to\mathbb{R}^{n\times m},\; t\mapsto B(t),
$$
$$
Q:[0,T]\to\mathbb{R}^{n\times n},\; t\mapsto Q(t), \qquad R:[0,T]\to\mathbb{R}^{m\times m},\; t\mapsto R(t)
$$
are continuous matrix-valued functions defined on $[0,T]$. We assume that the matrices $Q(t)$ and $R(t)$ are symmetric and, in addition, that $Q(t)$ is positive semidefinite and $R(t)$ is positive definite for all $t\in[0,T]$. Furthermore, let $S_T\in\mathbb{R}^{n\times n}$ be a constant, symmetric, and positive semidefinite matrix.
Linear-Quadratic Regulator
The linear-quadratic regulator is then the following optimal control problem.
[LQ]: Find a continuous function $u:[0,T]\to\mathbb{R}^m$, the control, that minimizes a quadratic objective of the form
$$
J(u)=\frac12\int_0^T\left[x^\top(t)Q(t)x(t)+u^\top(t)R(t)u(t)\right]dt+\frac12\,x^\top(T)S_Tx(T)
$$
subject to the linear dynamics
$$
\dot x(t)=A(t)x(t)+B(t)u(t),\qquad x(0)=x_0.
$$
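To make the objective in [LQ] concrete, the following is a minimal sketch that evaluates $J(u)$ numerically for a given control by integrating the state together with the running cost. The function name `lq_cost`, the callable-based interface, and the scalar test data are assumptions made for illustration; they do not come from [1].

```python
import numpy as np
from scipy.integrate import solve_ivp


def lq_cost(A, B, Q, R, S_T, x0, u, T):
    """Evaluate J(u) for problem [LQ] by numerical integration.

    A, B, Q, R are callables t -> ndarray; u is a callable (t, x) -> ndarray,
    so both open-loop controls and feedback laws can be passed in.
    (Hypothetical helper written for this illustration.)
    """
    n = len(x0)

    def rhs(t, z):
        x = z[:n]
        ut = np.atleast_1d(u(t, x))
        dx = A(t) @ x + B(t) @ ut
        # Running cost: (1/2) (x^T Q x + u^T R u)
        dc = 0.5 * (x @ Q(t) @ x + ut @ R(t) @ ut)
        return np.concatenate([dx, [dc]])

    z0 = np.concatenate([np.asarray(x0, dtype=float), [0.0]])
    sol = solve_ivp(rhs, (0.0, T), z0, rtol=1e-9, atol=1e-12)
    xT = sol.y[:n, -1]
    # Accumulated running cost plus the terminal penalty (1/2) x(T)^T S_T x(T)
    return sol.y[n, -1] + 0.5 * xT @ S_T @ xT


# Illustrative scalar data (an assumption): x' = u, Q = R = 1, S_T = 0, T = 1.
zero = lambda t: np.zeros((1, 1))
one = lambda t: np.ones((1, 1))
J_zero_control = lq_cost(zero, one, one, one, np.zeros((1, 1)),
                         [1.0], lambda t, x: np.zeros(1), 1.0)
```

With $u\equiv 0$ the state stays at $x_0=1$, so the cost is just half the length of the horizon; any other control can be compared against this baseline by passing a different `u`.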
Theorem. The solution to the linear-quadratic optimal control problem [LQ] is given by the linear feedback control
$$
u^*(t,x)=-R(t)^{-1}B(t)^\top S(t)\,x,
$$
where $S$ is the solution to the Riccati terminal value problem
$$
\dot S+SA(t)+A^\top(t)S-SB(t)R(t)^{-1}B^\top(t)S+Q(t)=0,\qquad S(T)=S_T. \tag{3}
$$
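In practice the Riccati terminal value problem is usually integrated numerically, backward from $t=T$. The sketch below does this with SciPy's `solve_ivp` and then forms the feedback gain from the theorem. The helper names `solve_riccati_tvp` and `feedback_gain`, the tolerances, and the double-integrator data are assumptions chosen for illustration, not part of [1].

```python
import numpy as np
from scipy.integrate import solve_ivp


def solve_riccati_tvp(A, B, Q, R, S_T, T, num=201):
    """Integrate S' = -(S A + A^T S - S B R^{-1} B^T S + Q), S(T) = S_T,
    backward from t = T to t = 0. A, B, Q, R are callables t -> ndarray.
    Returns increasing times ts and the stacked matrices S(ts[i])."""
    n = S_T.shape[0]

    def rhs(t, s):
        S = s.reshape(n, n)
        At, Bt = A(t), B(t)
        # R^{-1} B^T S via a linear solve rather than an explicit inverse
        K = np.linalg.solve(R(t), Bt.T @ S)
        dS = -(S @ At + At.T @ S - S @ Bt @ K + Q(t))
        return dS.ravel()

    t_eval = np.linspace(T, 0.0, num)  # decreasing, since we integrate backward
    sol = solve_ivp(rhs, (T, 0.0), S_T.ravel(), t_eval=t_eval,
                    rtol=1e-9, atol=1e-12)
    ts = sol.t[::-1]                           # reorder to increasing time
    Ss = sol.y.T[::-1].reshape(-1, n, n)
    return ts, Ss


def feedback_gain(t, S, B, R):
    """Gain K(t) = R(t)^{-1} B(t)^T S(t), so that u*(t, x) = -K(t) x."""
    return np.linalg.solve(R(t), B(t).T @ S)


# Illustrative data (an assumption): double integrator with Q = I, R = 1.
A = lambda t: np.array([[0.0, 1.0], [0.0, 0.0]])
B = lambda t: np.array([[0.0], [1.0]])
Q = lambda t: np.eye(2)
R = lambda t: np.array([[1.0]])
ts, Ss = solve_riccati_tvp(A, B, Q, R, np.zeros((2, 2)), T=10.0)
K0 = feedback_gain(ts[0], Ss[0], B, R)   # gain at t = 0
```

Flattening $S$ into a vector is only a convenience for the ODE solver; a production implementation might instead exploit the symmetry of $S$ and integrate just its upper triangle.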
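Proof. Let $u:[0,T]\to\mathbb{R}^m$ be any continuous control and let $x:[0,T]\to\mathbb{R}^n$ denote the corresponding trajectory. Dropping the argument $t$ from the notation, we have for any differentiable matrix function $S:[0,T]\to\mathbb{R}^{n\times n}$ that
$$
\frac{d}{dt}\left(x^\top Sx\right)=\dot x^\top Sx+x^\top\dot Sx+x^\top S\dot x=(Ax+Bu)^\top Sx+x^\top\dot Sx+x^\top S(Ax+Bu),
$$
and thus, by adjoining this quantity to the Lagrangian in the objective, we can express the cost equivalently as
$$
J(u)=\frac12\int_0^T\left[x^\top\left(Q+A^\top S+\dot S+SA\right)x+x^\top SBu+u^\top B^\top Sx+u^\top Ru\right]dt+\frac12\,x^\top(T)\left[S_T-S(T)\right]x(T)+\frac12\,x_0^\top S(0)x_0.
$$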
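For the moment, let us assume that there exists a solution $S$ to the matrix Riccati equation (3) over the full interval $[0,T]$. Because $Q(t)$ and $S_T$ are symmetric, $S^\top$ solves the same terminal value problem, so $S$ is symmetric by uniqueness of solutions. The integrand then becomes a complete square and the boundary term at $T$ vanishes, so the objective simplifies to
$$
J(u)=\frac12\int_0^T\left(u+R^{-1}B^\top Sx\right)^\top R\left(u+R^{-1}B^\top Sx\right)dt+\frac12\,x_0^\top S(0)x_0.
$$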
Since the matrix $R$ is continuous and positive definite over $[0,T]$, the minimum is realized if and only if
$$
u(t)=-R^{-1}(t)B^\top(t)S(t)x(t),
$$
and the minimum value is given by
$$
J(u^*)=\frac12\,x_0^\top S(0)x_0.
$$
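As a sanity check on these formulas, consider a scalar instance that can be solved by hand. The data $n=m=1$, $A\equiv 0$, $B\equiv R\equiv Q\equiv 1$, $S_T=0$ are chosen here purely for illustration and do not come from [1]. The Riccati equation, the feedback law, the closed-loop trajectory, and the minimum value then read:

```latex
\begin{aligned}
\dot S &= S^2 - 1,\quad S(T) = 0
  &&\Longrightarrow\quad S(t) = \tanh(T - t),\\
u^*(t,x) &= -\tanh(T - t)\,x,
  &&\phantom{\Longrightarrow}\quad x(t) = x_0\,\frac{\cosh(T - t)}{\cosh T},\\
J(u^*) &= \tfrac12\,S(0)\,x_0^2 = \tfrac12\tanh(T)\,x_0^2 .
\end{aligned}
```

Substituting $x$ and $u^*$ into the objective confirms the value: the integrand is $\tfrac12 x_0^2\cosh\big(2(T-t)\big)/\cosh^2 T$, whose integral over $[0,T]$ is $\tfrac12 x_0^2\tanh T$.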
For this argument to be valid, it remains to argue that such a solution $S$ to the terminal value problem (3) does indeed exist on all of $[0,T]$. It follows from general results about the existence of solutions to ordinary differential equations that such a solution exists on some maximal interval $(\tau,T]$ and that, as $t\searrow\tau$ (i.e., $t\to\tau$ with $t>\tau$), at least one of the components of the solution $S(t)$ must diverge to $+\infty$ or $-\infty$. For if this were not the case, then by the local existence theorem for ODEs, the solution could be extended further onto some small interval $(\tau-\epsilon,\tau+\epsilon)$, contradicting the maximality of the interval $(\tau,T]$. In general, however, this explosion time $\tau$ could be nonnegative, invalidating the argument above. That this is not the case for the linear-quadratic regulator problem is a consequence of the positivity assumptions on the objective, specifically, the definiteness assumptions on the matrices $R$, $Q$, and $S_T$. Indeed, applied on $[t,T]$ with initial state $x$, the argument above shows that $\tfrac12x^\top S(t)x$ is the minimal cost-to-go, which is nonnegative and no larger than the cost of the control $u\equiv 0$; hence $S(t)$ stays bounded on $(\tau,T]$.
◼
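A scalar illustration shows why the definiteness assumptions cannot simply be dropped. The data below ($n=m=1$, $A\equiv 0$, $B\equiv R\equiv 1$, $S_T=0$, and $Q\equiv -1$, which violates positive semidefiniteness) are an assumption made for this example and do not come from [1]:

```latex
\begin{aligned}
Q \equiv -1:&\quad \dot S = S^2 + 1,\quad S(T) = 0
  \quad\Longrightarrow\quad S(t) = -\tan(T - t),
  \qquad t \in \left(T - \tfrac{\pi}{2},\, T\right],\\
&\quad S(t) \to -\infty \ \text{ as } \ t \searrow \tau = T - \tfrac{\pi}{2}.
\end{aligned}
```

For any horizon $T\ge\pi/2$ the explosion time $\tau$ is nonnegative, so no solution exists on all of $[0,T]$. With $Q\equiv +1$ instead, the same computation gives $\dot S=S^2-1$, whose solution $\tanh(T-t)$ is bounded for all $t\le T$.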
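Corollary. If $A(t)\equiv A$, $B(t)\equiv B$, $Q(t)\equiv Q$, and $R(t)\equiv R$ are constant and $T\to\infty$, the same conclusion holds with the Riccati terminal value problem (3) replaced by the algebraic Riccati equation
$$
SA+A^\top S-SBR^{-1}B^\top S+Q=0. \tag{4}
$$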
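For constant matrices, the algebraic Riccati equation can be solved with an off-the-shelf routine. The sketch below uses `scipy.linalg.solve_continuous_are`, which solves an equation of this form, and checks the residual. The double-integrator data are an assumption chosen for illustration; for them the equation can also be solved by hand, which gives $S=\begin{bmatrix}\sqrt3&1\\1&\sqrt3\end{bmatrix}$.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative data (an assumption): double integrator with Q = I, R = 1.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Stabilizing solution of  S A + A^T S - S B R^{-1} B^T S + Q = 0
S = solve_continuous_are(A, B, Q, R)

# Constant feedback gain: u*(x) = -K x with K = R^{-1} B^T S
K = np.linalg.solve(R, B.T @ S)

# Residual of the algebraic Riccati equation (should be close to zero)
residual = S @ A + A.T @ S - S @ B @ K + Q

# Eigenvalues of the closed-loop matrix A - B K
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
```

Inspecting `closed_loop_eigs` is a quick way to confirm that the returned solution is the stabilizing one, i.e., that all closed-loop eigenvalues have negative real part.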
Solving the Algebraic Riccati Equation
It is possible to find the solution to the algebraic Riccati equation from the eigendecomposition of a larger system. We define the Hamiltonian matrix
$$
Z=\begin{bmatrix}A&-BR^{-1}B^\top\\-Q&-A^\top\end{bmatrix}.
$$
Since $Z$ is Hamiltonian, if it does not have any eigenvalues on the imaginary axis, then exactly half of its eigenvalues have a negative real part. If we denote the $2n\times n$ matrix whose columns form a basis of the corresponding invariant subspace, in block-matrix notation, as $\begin{bmatrix}U_1^\top&U_2^\top\end{bmatrix}^\top$, then
$$
S=U_2U_1^{-1}
$$
is a solution of the algebraic Riccati equation (4); furthermore, the eigenvalues of $A-BR^{-1}B^\top S$ are the eigenvalues of $Z$ with negative real part.
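The construction above can be sketched in a few lines of NumPy. This is a minimal illustration that assumes $Z$ is diagonalizable and $U_1$ is invertible. The function name `care_via_hamiltonian` and the test data are assumptions, and robust solvers typically work with an ordered Schur form of $Z$ rather than with raw eigenvectors.

```python
import numpy as np


def care_via_hamiltonian(A, B, Q, R):
    """Solve S A + A^T S - S B R^{-1} B^T S + Q = 0 from the stable
    invariant subspace of the Hamiltonian matrix Z (eigenvector variant)."""
    n = A.shape[0]
    G = B @ np.linalg.solve(R, B.T)            # B R^{-1} B^T
    Z = np.block([[A, -G], [-Q, -A.T]])

    eigvals, eigvecs = np.linalg.eig(Z)
    stable = eigvals.real < 0
    if stable.sum() != n:
        raise ValueError("Z has eigenvalues on or near the imaginary axis")

    U = eigvecs[:, stable]                     # 2n x n basis of the stable subspace
    U1, U2 = U[:n, :], U[n:, :]
    # S = U2 U1^{-1}, computed as the transpose of the solution of U1^T X = U2^T
    S = np.linalg.solve(U1.T, U2.T).T
    # S is real and symmetric in exact arithmetic; remove round-off artifacts
    S = S.real
    return 0.5 * (S + S.T)


# Illustrative data (an assumption): double integrator with Q = I, R = 1.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
S = care_via_hamiltonian(A, B, Q, R)
closed_loop_eigs = np.linalg.eigvals(A - B @ np.linalg.solve(R, B.T @ S))
```

The stable eigenvectors come in complex-conjugate pairs here, but the subspace they span is closed under conjugation, which is why $U_2U_1^{-1}$ is real up to round-off.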