Partial-State Stabilization and Optimal Feedback Control for Stochastic Dynamical Systems

Rajpurohit, Tanmay; Haddad, Wassim M.

doi:10.1115/1.4036033

In this paper, we develop a unified framework to address the problem of optimal nonlinear analysis and feedback control for partial stability and partial-state stabilization of stochastic dynamical systems. Partial asymptotic stability in probability of the closed-loop nonlinear system is guaranteed by means of a Lyapunov function that is positive definite and decrescent with respect to part of the system state. This Lyapunov function can be seen to be the solution to the steady-state form of the stochastic Hamilton–Jacobi–Bellman equation, thereby guaranteeing both partial stability in probability and optimality. The overall framework provides the foundation for extending optimal linear-quadratic stochastic controller synthesis to nonlinear-nonquadratic optimal partial-state stochastic stabilization. Connections to optimal linear and nonlinear regulation for linear and nonlinear time-varying stochastic systems with quadratic and nonlinear-nonquadratic cost functionals are also provided. Finally, we also develop optimal feedback controllers for affine stochastic nonlinear systems using an inverse optimality framework tailored to the partial-state stochastic stabilization problem and use this result to address polynomial and multilinear forms in the performance criterion.

Introduction

In Ref. [1], we extended the framework developed in Refs. [2,3] to address the problem of optimal partial-state stabilization, wherein stabilization with respect to a subset of the system state variables is desired. Partial-state stabilization arises in many engineering applications [4,5]. Specifically, in spacecraft stabilization via gimballed gyroscopes, asymptotic stability of an equilibrium position of the spacecraft is sought while requiring Lyapunov stability of the axis of the gyroscope relative to the spacecraft [5]. Alternatively, in the control of rotating machinery with mass imbalance, spin stabilization about a nonprincipal axis of inertia requires motion stabilization with respect to a subspace instead of the origin [4]. The most common application where partial stabilization is necessary is adaptive control, wherein asymptotic stability of the closed-loop plant states is guaranteed without necessarily achieving parameter error convergence.

In this paper, we extend the framework developed in Ref. [1] to address the problem of optimal partial-state stochastic stabilization. Specifically, we consider a notion of optimality that is directly related to a given Lyapunov function that is positive definite and decrescent with respect to part of the system state. In particular, an optimal partial-state stochastic stabilization control problem is stated, and sufficient Hamilton–Jacobi–Bellman conditions are used to characterize an optimal feedback controller. Another important application of partial stability and partial stabilization theory is the unification it provides between time-invariant stability theory and stability theory for time-varying systems [3,6]. We exploit this unification and specialize our results to address optimal linear and nonlinear regulation for linear and nonlinear time-varying stochastic systems with quadratic and nonlinear-nonquadratic cost functionals.

Our approach focuses on the role of the Lyapunov function guaranteeing stochastic stability of the closed-loop system and its connection to the steady-state solution of the stochastic Hamilton–Jacobi–Bellman equation characterizing the optimal nonlinear feedback controller. In order to avoid the complexity in solving the steady-state stochastic Hamilton–Jacobi–Bellman equation, we do not attempt to minimize a given cost functional; rather, we parameterize a family of stochastically stabilizing controllers that minimizes a derived cost functional that provides flexibility in specifying the control law. This corresponds to addressing an inverse optimal stochastic control problem [7–13].

The inverse optimal control design approach provides a framework for constructing the Lyapunov function for the closed-loop system that serves as an optimal value function and, as shown in Refs. [11,12], achieves desired stability margins. Specifically, nonlinear inverse optimal controllers that minimize a meaningful (in the terminology of Refs. [11,12]) nonlinear-nonquadratic performance criterion involving a nonlinear-nonquadratic, non-negative-definite function of the state and a quadratic positive-definite function of the feedback control are shown to possess sector margin guarantees to component decoupled input nonlinearities in the conic sector $(1 / 2, \infty)$ ⁠.

The paper is organized as follows. In Sec. 2, we establish notation, definitions, and present some key results on partial stability of nonlinear stochastic dynamical systems. In Sec. 3, we consider a stochastic nonlinear system with a performance functional evaluated over the infinite horizon. The performance functional is then evaluated in terms of a Lyapunov function that guarantees partial asymptotic stability in probability. We then state a stochastic optimal control problem and provide sufficient conditions for characterizing an optimal nonlinear feedback controller guaranteeing partial asymptotic stability in probability of the closed-loop system. These results are then used to address a stochastic optimal control problem for uniform asymptotic stabilization in probability of nonlinear time-varying stochastic dynamical systems.

In Sec. 4, we develop optimal feedback controllers for affine stochastic nonlinear systems using an inverse optimality framework tailored to the partial-state stochastic stabilization problem. This result is then used to derive time-varying extensions of the results in Refs. [14,15] involving nonlinear feedback controllers minimizing polynomial and multilinear performance criteria. In Sec. 5, we provide two illustrative numerical examples that highlight the optimal partial-state stochastic stabilization framework. In Sec. 6, we present conclusions and highlight some future research directions. Finally, we note that a preliminary version of this paper appeared in Ref. [16]. The present paper considerably expands on Ref. [16] by providing detailed proofs of all the results along with examples and additional motivation.

Notation, Definitions, and Mathematical Preliminaries

In this section, we establish notation, definitions, and review some basic results on partial stability of nonlinear stochastic dynamical systems [17–22]. Specifically, $ℝ$ denotes the set of real numbers, $ℝ_{+}$ denotes the set of positive real numbers, ${\bar{ℝ}}_{+}$ denotes the set of non-negative numbers, $ℤ_{+}$ denotes the set of positive integers, $ℝ^{n}$ denotes the set of n × 1 real column vectors, $ℝ^{n \times m}$ denotes the set of n × m real matrices, $ℕ^{n}$ denotes the set of n × n non-negative-definite matrices, and $ℙ^{n}$ denotes the set of n × n positive-definite matrices. We write $B_{ε} (x)$ for the open ball centered at x with radius ε, $| | \cdot | |$ for the Euclidean vector norm or an induced matrix norm (depending on context), ${‖ \cdot ‖}_{F}$ for the Frobenius matrix norm, A^T for the transpose of the matrix A, ⊗ for the Kronecker product, ⊕ for the Kronecker sum, and I_n or I for the n × n identity matrix. Furthermore, $B^{n}$ denotes the σ-algebra of Borel sets in $D \subseteq ℝ^{n}$ ⁠, and $S$ denotes a σ-algebra generated on a set $S \subseteq ℝ^{n}$ ⁠.

We define a complete probability space as $(Ω, F, ℙ)$ ⁠, where Ω denotes the sample space, $F$ denotes a σ-algebra, and $ℙ$ defines a probability measure on the σ-algebra $F$ ⁠; that is, $ℙ$ is a non-negative countably additive set function on $F$ such that $ℙ (Ω) = 1$ [20]. Furthermore, we assume that w(⋅) is a standard d-dimensional Wiener process defined by $(w (\cdot), Ω, F, ℙ^{w_{0}})$ ⁠, where $ℙ^{w_{0}}$ is the classical Wiener measure [22, p. 10], with a continuous-time filtration ${F_{t}}_{t \geq 0}$ generated by the Wiener process w(t) up to time t. We denote a stochastic dynamical system by $G$ generating a filtration ${F_{t}}_{t \geq 0}$ adapted stochastic process $x : {\bar{ℝ}}_{+} \times Ω \to D$ on $(Ω, F, ℙ^{x_{0}})$ satisfying $F_{τ} \subset F_{t}, 0 \leq τ < t$ ⁠, such that ${ω \in Ω : x (t, ω) \in B} \in F_{t}, t \geq 0$ ⁠, for all Borel sets $B \subset ℝ^{n}$ contained in the Borel σ-algebra $B^{n}$ ⁠. Here, we use the notation x(t) to represent the stochastic process x(t, ω) omitting its dependence on ω.

We denote the set of equivalence classes of measurable, integrable, and square-integrable $ℝ^{n}$ or $ℝ^{n \times m}$ (depending on context) valued random processes on $(Ω, F, ℙ)$ over the semi-infinite parameter space [0, ∞) by $L^{0} (Ω, F, ℙ), L^{1} (Ω, F, ℙ)$ ⁠, and $L^{2} (Ω, F, ℙ)$ ⁠, respectively, where the equivalence relation is the one induced by $ℙ$ -almost-sure equality. In particular, elements of $L^{0} (Ω, F, ℙ)$ take finite values $ℙ$ -almost surely (a.s.). Hence, depending on the context, $ℝ^{n}$ will denote either the set of n × 1 real variables or the subspace of $L^{0} (Ω, F, ℙ)$ comprising $ℝ^{n}$ random processes that are constant almost surely. All inequalities and equalities involving random processes on $(Ω, F, ℙ)$ are to be understood to hold $ℙ$ -almost surely. Furthermore, $E [\cdot]$ and $E^{x_{0}} [\cdot]$ denote, respectively, the expectation with respect to the probability measure $ℙ$ and with respect to the classical Wiener measure $ℙ^{x_{0}}$ ⁠.

Finally, we write tr(⋅) for the trace operator, ${(\cdot)}^{- 1}$ for the inverse operator, $V' (x) ≜ ((\partial V (x)) / \partial x)$ for the Fréchet derivative of V at x, $V^{″} (x) ≜ ((\partial^{2} V (x)) / \partial x^{2})$ for the Hessian of V at x, and $H_{n}$ for the Hilbert space of random vectors $x \in ℝ^{n}$ with finite average power, that is, $H_{n} ≜ {x : Ω \to ℝ^{n} : E [x^{T} x] < \infty}$ ⁠. For an open set $D \subseteq ℝ^{n}, H_{n}^{D} ≜ {x \in H_{n} : x : Ω \to D}$ denotes the set of all the random vectors in $H_{n}$ induced by $D$ ⁠. Similarly, for every $x_{0} \in ℝ^{n}, H_{n}^{x_{0}} ≜ {x \in H_{n} : x = x_{0} a . s .}$ ⁠. Furthermore, C² denotes the space of real-valued functions $V : D \to ℝ$ that are two-times continuously differentiable with respect to $x \in D \subseteq ℝ^{n}$ ⁠.

In this paper, we consider nonlinear stochastic autonomous dynamical systems $G$ of the form

d x_{1} (t) = f_{1} (x_{1} (t), x_{2} (t)) d t + D_{1} (x_{1} (t), x_{2} (t)) d w (t), x_{1} (t_{0}) = x_{10} a . s ., t \geq t_{0}

(1)

d x_{2} (t) = f_{2} (x_{1} (t), x_{2} (t)) d t + D_{2} (x_{1} (t), x_{2} (t)) d w (t), x_{2} (t_{0}) = x_{20} a . s .

(2)

where, for every $t \geq t_{0}, x_{1} (t) \in H_{n_{1}}^{D}$ and $x_{2} (t) \in H_{n_{2}}$ are such that $x (t) ≜ {[x_{1}^{T} (t), x_{2}^{T} (t)]}^{T}$ is a $F_{t}$ -measurable random state vector, $x (t_{0}) \in H_{n_{1}}^{D} \times H_{n_{2}}, D \subseteq ℝ^{n_{1}}$ is an open set with $0 \in D$ ⁠, w(t) is a d-dimensional independent standard Wiener process (i.e., Brownian motion) defined on a complete filtered probability space $(Ω, F, {F_{t}}_{t \geq t_{0}}, ℙ), x (t_{0})$ is independent of $(w (t) - w (t_{0})), t \geq t_{0}$ ⁠, and $f_{1} : D \times ℝ^{n_{2}} \to ℝ^{n_{1}}$ is such that, for every $x_{2} \in ℝ^{n_{2}}, f_{1} (0, x_{2}) = 0$ and f₁(⋅, x₂) is locally Lipschitz continuous in x₁, and $f_{2} : D \times ℝ^{n_{2}} \to ℝ^{n_{2}}$ is such that, for every $x_{1} \in D, f_{2} (x_{1}, \cdot)$ is locally Lipschitz continuous in x₂. In addition, the function $D_{1} : D \times ℝ^{n_{2}} \to ℝ^{n_{1} \times d}$ is continuous such that, for every $x_{2} \in ℝ^{n_{2}}, D_{1} (0, x_{2}) = 0$ ⁠, and $D_{2} : D \times ℝ^{n_{2}} \to ℝ^{n_{2} \times d}$ is continuous.
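To illustrate the structure of Eqs. (1) and (2), the following minimal sketch simulates one sample path using the Euler–Maruyama scheme, a standard discretization that is not part of the framework developed here. It assumes scalar states and a scalar Wiener process (n₁ = n₂ = d = 1). The function name `euler_maruyama` and the example coefficients are hypothetical and are chosen only so that f₁(0, x₂) = 0 and D₁(0, x₂) = 0 hold for every x₂.

```python
import math
import random

def euler_maruyama(f1, f2, D1, D2, x10, x20, t_end, dt, rng):
    """Simulate one sample path of scalar versions of Eqs. (1) and (2).

    All states and the Wiener process are scalar (n1 = n2 = d = 1),
    which is an illustrative simplification.
    """
    x1, x2 = x10, x20
    path = [(0.0, x1, x2)]
    steps = int(round(t_end / dt))
    for k in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment ~ N(0, dt)
        # Evaluate all coefficients at the current state before updating.
        x1_next = x1 + f1(x1, x2) * dt + D1(x1, x2) * dw
        x2_next = x2 + f2(x1, x2) * dt + D2(x1, x2) * dw
        x1, x2 = x1_next, x2_next
        path.append(((k + 1) * dt, x1, x2))
    return path

# Hypothetical coefficients: f1(0, x2) = 0 and D1(0, x2) = 0 for every x2.
f1 = lambda x1, x2: -x1
D1 = lambda x1, x2: 0.5 * x1 * math.cos(x2)
f2 = lambda x1, x2: 1.0
D2 = lambda x1, x2: 0.0

path = euler_maruyama(f1, f2, D1, D2, 1.0, 0.0, 5.0, 1e-3, random.Random(0))
zero_path = euler_maruyama(f1, f2, D1, D2, 0.0, 3.0, 1.0, 1e-2, random.Random(1))
```

Starting from x₁₀ = 0, the simulated x₁ remains at zero regardless of x₂, which reflects the partial equilibrium condition on f₁ and D₁.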

A $ℝ^{n_{1} + n_{2}}$-valued stochastic process $x : [t_{0}, τ] \times Ω \to D \times ℝ^{n_{2}}$ is said to be a solution of Eqs. (1) and (2) on the interval [t₀, τ] with initial condition x(t₀) = x₀ a.s., if x(⋅) is progressively measurable (i.e., x(⋅) is nonanticipating and measurable in t and ω) with respect to ${F_{t}}_{t \geq t_{0}}, f (x_{1}, x_{2}) ≜ {[f_{1}^{T} (x_{1}, x_{2}), f_{2}^{T} (x_{1}, x_{2})]}^{T} \in L^{1} (Ω, F, ℙ), D (x_{1}, x_{2}) ≜ {[D_{1}^{T} (x_{1}, x_{2}), D_{2}^{T} (x_{1}, x_{2})]}^{T} \in L^{2} (Ω, F, ℙ)$, and

x (t) = x_{0} + \int_{t_{0}}^{t} f (x (s)) d s + \int_{t_{0}}^{t} D (x (s)) d w (s) a . s ., t \in [t_{0}, τ]

(3)

where the integrals in Eq. (3) are Itô integrals. Note that for each fixed t ≥ t₀, the random variable $ω \mapsto x (t, ω)$ assigns a vector x(ω) to every outcome ω ∈ Ω of an experiment, and for each fixed ω ∈Ω, the mapping $t \mapsto x (t, ω)$ is the sample path of the stochastic process x(t), t ≥ t₀. A pathwise solution $t \mapsto x (t)$ of Eqs. (1) and (2) in $(Ω, {F_{t}}_{t \geq t_{0}}, ℙ^{x_{0}})$ is said to be right maximally defined if x cannot be extended (either uniquely or nonuniquely) forward in time. We assume that all right maximal pathwise solutions to Eqs. (1) and (2) in $(Ω, {F_{t}}_{t \geq t_{0}}, ℙ^{x_{0}})$ exist on [t₀, ∞), and hence, we assume that Eqs. (1) and (2) are forward complete. Sufficient conditions for forward completeness or global solutions to Eqs. (1) and (2) are given by Corollary 6.3.5 of Ref. [20].

Furthermore, we assume that $f : D \times ℝ^{n_{2}} \to ℝ^{n_{1} + n_{2}}$ and $D : D \times ℝ^{n_{2}} \to ℝ^{(n_{1} + n_{2}) \times d}$ satisfy the uniform Lipschitz continuity condition

‖ f (x) - f (y) ‖ + {‖ D (x) - D (y) ‖}_{F} \leq L ‖ x - y ‖, x, y \in D \times ℝ^{n_{2}}

(4)

and the growth restriction condition

{‖ f (x) ‖}^{2} + {‖ D (x) ‖}_{F}^{2} \leq L^{2} (1 + {‖ x ‖}^{2}), x \in D \times ℝ^{n_{2}}

(5)

for some Lipschitz constant L > 0, and hence, since $x (t_{0}) \in H_{n_{1}}^{D} \times H_{n_{2}}$ and x(t₀) is independent of $(w (t) - w (t_{0})), t \geq t_{0}$ ⁠, it follows that there exists a unique solution $x \in L^{2} (Ω, F, ℙ)$ of Eqs. (1) and (2) in the following sense. For every $x \in H_{n_{1}}^{D} \times H_{n_{2}}$ ⁠, there exists τ_x > 0 such that, if $x_{I} : [t_{0}, τ_{1}] \times Ω \to D \times ℝ^{n_{2}}$ and $x_{II} : [t_{0}, τ_{2}] \times Ω \to D \times ℝ^{n_{2}}$ are two solutions of Eqs. (1) and (2); that is, if $x_{I}, x_{II} \in L^{2} (Ω, F, ℙ)$ ⁠, with continuous sample paths almost surely, solve Eqs. (1) and (2), then $τ_{x} \leq \min {τ_{1}, τ_{2}}$ and $ℙ (x_{I} (t) = x_{II} (t), t_{0} \leq t \leq τ_{x}) = 1$ ⁠. Sufficient conditions for forward existence and uniqueness in the absence of the uniform Lipschitz continuity condition and growth restriction condition can be found in Refs. [23,24].

A solution $t \mapsto {[x_{1}^{T} (t), x_{2}^{T} (t)]}^{T}$ is said to be regular if and only if $ℙ^{x_{0}} (τ^{e} = \infty) = 1$ for all $x (0) \in H_{n_{1}}^{D} \times H_{n_{2}}$ ⁠, where τ^e is the first stopping time of the solution to Eqs. (1) and (2) from every bounded domain in $D \times ℝ^{n_{2}}$ ⁠. Recall that regularity of solutions implies that solutions exist for t ≥ t₀ almost surely. Here, we assume regularity of solutions to Eqs. (1) and (2), and hence, τ_x = ∞ [18, p. 75]. Moreover, the unique solution determines a $ℝ^{n_{1} + n_{2}}$ -valued, time-homogeneous Feller continuous Markov process x(⋅), and hence, its stationary Feller transition probability function is given by (Refs. [18, Theorem 3.4] and [20, Theorem 9.2.8]) $ℙ (x (t) \in B | x (t_{0}) \overset{a . s .}{=} x_{0}) = ℙ (t - t_{0}, x_{0}, 0, B)$ for all $x_{0} \in D \times ℝ^{n_{2}}$ and t ≥ t₀, and all Borel subsets $B$ of $D \times ℝ^{n_{2}}$ ⁠, where $ℙ (s, x, t, B), t \geq s$ ⁠, denotes the probability of transition of the point $x \in D \times ℝ^{n_{2}}$ at time instant s into the set $B \subset D \times ℝ^{n_{2}}$ at time instant t. Finally, recall that every continuous process with Feller transition probability function is also a strong Markov process [18, p. 101].

Definition 2.1 [22, Definition 7.7]. Let x(⋅) be a time-homogeneous Markov process in $H_{n_{1}}^{D} \times H_{n_{2}}$ and let $V : D \times ℝ^{n_{2}} \to ℝ$. Then, the infinitesimal generator $L$ of x(t), t ≥ 0, with x(0) = x₀ a.s., is defined by

L V (x_{0}) ≜ \lim_{t \to 0^{+}} \frac{E^{x_{0}} [V (x (t))] - V (x_{0})}{t}, x_{0} \in D \times ℝ^{n_{2}}

(6)

If $V \in C^{2}$ and has a compact support, and x(t), t ≥ t₀, satisfies Eqs. (1) and (2), then the limit in Eq. (6) exists for all $x \in D \times ℝ^{n_{2}}$ and the infinitesimal generator $L$ of x(t), t ≥ t₀, can be characterized by the system drift and diffusion functions f(x) and D(x) defining the stochastic dynamical system given by Eqs. (1) and (2) with system state x(t), t ≥ t₀, and is given by [22, Theorem 7.9]

L V (x) ≜ \frac{\partial V (x)}{\partial x} f (x) + \frac{1}{2} tr D^{T} (x) \frac{\partial^{2} V (x)}{\partial x^{2}} D (x), x \in D \times ℝ^{n_{2}}

(7)
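As a concrete reading of Eq. (7), the following minimal sketch evaluates the generator at a point from a user-supplied gradient and Hessian of V. The function name `generator`, the nested-list representation of matrices, and the example system are assumptions made for illustration only.

```python
def generator(grad_V, hess_V, f, D, x):
    """Evaluate LV(x) = V'(x) f(x) + (1/2) tr(D(x)^T V''(x) D(x)) as in Eq. (7).

    grad_V(x): list of length n; hess_V(x): n x n nested list;
    f(x): list of length n; D(x): n x d nested list.
    """
    g, H, fx, Dx = grad_V(x), hess_V(x), f(x), D(x)
    n, d = len(fx), len(Dx[0])
    drift = sum(g[i] * fx[i] for i in range(n))
    # tr(D^T H D) = sum over k, i, j of D[i][k] * H[i][j] * D[j][k]
    trace = sum(Dx[i][k] * H[i][j] * Dx[j][k]
                for k in range(d) for i in range(n) for j in range(n))
    return drift + 0.5 * trace

# Hypothetical example with x = (x1, x2): V(x) = x1^2,
# f(x) = (-x1, 1), D(x) = (sigma * x1, 0)^T, and d = 1.
sigma = 0.5
grad_V = lambda x: [2.0 * x[0], 0.0]
hess_V = lambda x: [[2.0, 0.0], [0.0, 0.0]]
f = lambda x: [-x[0], 1.0]
D = lambda x: [[sigma * x[0]], [0.0]]

value = generator(grad_V, hess_V, f, D, [2.0, 7.0])
```

For this example, the generator reduces to (σ² − 2)x₁², so the value at x₁ = 2 with σ = 0.5 is −7 independently of x₂.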

In the following definition, we introduce the notion of stochastic partial stability.

Definition 2.2. (i) The nonlinear stochastic dynamical system $G$ given by Eqs. (1) and (2) is Lyapunov stable in probability with respect to x₁ uniformly in x₂₀ if, for every ε > 0 and ρ > 0, there exists $δ = δ (ρ, ε) > 0$ such that, for all $x_{10} \in B_{δ} (0)$

ℙ^{x_{0}} (\sup_{t \geq t_{0}} ‖ x_{1} (t) ‖ > ε) \leq ρ

(8)

for all t ≥ 0 and all $x_{20} \in ℝ^{n_{2}}$ ⁠.

(ii) $G$ is asymptotically stable in probability with respect to x₁ uniformly in x₂₀ if $G$ is Lyapunov stable in probability with respect to x₁ uniformly in x₂₀ and

\lim_{x_{10} \to 0} ℙ^{x_{0}} (\lim_{t \to \infty} ‖ x_{1} (t) ‖ = 0) = 1

(9)

uniformly in x₂₀ for all $x_{20} \in ℝ^{n_{2}}$ ⁠.

(iii) $G$ is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀ if $G$ is Lyapunov stable in probability with respect to x₁ uniformly in x₂₀ and $ℙ^{x_{0}} (\lim_{t \to \infty} ‖ x_{1} (t) ‖ = 0) = 1$ holds uniformly in x₂₀ for all $(x_{10}, x_{20}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}$ ⁠.

Remark 2.1. It is important to note that there is a key difference between the stochastic partial stability definitions given in Definition 2.2 and the definitions of stochastic partial stability given in Ref. [21]. In particular, the stochastic partial stability definitions given in Ref. [21] require that both the initial conditions x₁₀ and x₂₀ lie in a neighborhood of the origin, whereas in Definition 2.2 x₂₀ can be arbitrary. As will be seen below, this difference allows us to unify autonomous stochastic partial stability theory with time-varying stochastic stability theory. An additional difference between our formulation of the stochastic partial stability problem and the stochastic partial stability problem considered in Ref. [21] is in the treatment of the equilibrium of Eqs. (1) and (2). Specifically, in our formulation, we require the weaker partial equilibrium condition f₁(0, x₂) = 0 and D₁(0, x₂) = 0 for every $x_{2} \in ℝ^{n_{2}}$, whereas in Ref. [21] the author requires the stronger equilibrium condition $f_{1} (0, 0) = 0, f_{2} (0, 0) = 0, D_{1} (0, 0) = 0$, and D₂(0, 0) = 0.

Remark 2.2. A more general stochastic stability notion can also be introduced here involving stochastic stability and convergence to an invariant (stationary) distribution. In this case, state convergence is not to an equilibrium point but rather to a stationary distribution. This framework can relax the vanishing perturbation assumption $D_{1} (0, x_{2}) = 0, x_{2} \in ℝ^{n_{2}}$ ⁠, and requires a more involved analysis and synthesis framework showing stability of the underlying Markov semigroup [25].

As shown in Refs. [3] and [6], an important application of deterministic partial stability theory is the unification it provides between time-invariant stability theory and stability theory for time-varying systems. A similar unification can be provided for stochastic dynamical systems. Specifically, consider the nonlinear time-varying stochastic dynamical system given by

d x (t) = f (t, x (t)) d t + D (t, x (t)) d w (t), x (t_{0}) = x_{0} a . s ., t \geq t_{0}

(10)

where, for every $t \geq t_{0}, x (t) \in H_{n}^{D}, D \subseteq ℝ^{n}$ is an open set with $0 \in D, f (t, 0) = 0, D (t, 0) = 0$, and $f : [t_{0}, \infty) \times D \to ℝ^{n}$ and $D : [t_{0}, \infty) \times D \to ℝ^{n \times d}$ are jointly continuous in t and x, and satisfy Eqs. (4) and (5) for all $x \in D$ uniformly in t for all t in compact subsets of [t₀, ∞). Now, defining $x_{1} (τ) ≜ x (t)$ and $x_{2} (τ) ≜ t$ a.s., where $τ ≜ t - t_{0}$, it follows that the solution x(t), t ≥ t₀, to the nonlinear time-varying stochastic dynamical system given by Eq. (10) can be equivalently characterized by the solution $x_{1} (τ), τ \geq 0$, to the nonlinear autonomous stochastic dynamical system

d x_{1} (τ) = f (x_{2} (τ), x_{1} (τ)) d τ + D (x_{2} (τ), x_{1} (τ)) d w (t), x_{1} (0) = x_{0} a . s ., τ \geq 0

(11)

d x_{2} (τ) = d τ, x_{2} (0) = t_{0} a . s .

(12)

Note that Eqs. (11) and (12) are in the same form as the system given by Eqs. (1) and (2), and Definition 2.2 applied to Eqs. (11) and (12) specializes to the definitions of uniform Lyapunov stability in probability, uniform asymptotic stability in probability, and global uniform asymptotic stability in probability of Eq. (10); for details, see Refs. [17] and [20].
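The state augmentation leading to Eqs. (11) and (12) can be sketched in a few lines. The following is a minimal illustration for a scalar state and a scalar Wiener process; the helper name `augment` and the example coefficients are hypothetical.

```python
import math

def augment(f, D, t0):
    """Return autonomous coefficients and the initial x2 for Eqs. (11) and (12).

    f(t, x) and D(t, x) are the scalar time-varying drift and diffusion of
    Eq. (10). The augmented state is (x1, x2) with x1 = x and x2 = t.
    """
    f1 = lambda x1, x2: f(x2, x1)  # drift of x1, Eq. (11)
    D1 = lambda x1, x2: D(x2, x1)  # diffusion of x1, Eq. (11)
    f2 = lambda x1, x2: 1.0        # d x2 = d tau, Eq. (12)
    D2 = lambda x1, x2: 0.0        # x2 carries no noise
    return f1, f2, D1, D2, t0

# Hypothetical time-varying coefficients with f(t, 0) = 0 and D(t, 0) = 0.
f = lambda t, x: -(2.0 + math.sin(t)) * x
D = lambda t, x: 0.3 * x

f1, f2, D1, D2, x20 = augment(f, D, t0=1.0)
```

The conditions f(t, 0) = 0 and D(t, 0) = 0 on the time-varying system carry over to the partial equilibrium condition f₁(0, x₂) = 0 and D₁(0, x₂) = 0 on the augmented system, with x₂ playing the role of time.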

Next, we provide sufficient conditions for partial stability of the nonlinear stochastic dynamical system given by Eqs. (1) and (2). For the statement of this result, recall the definitions of a class $K$ and $K_{\infty}$ functions given in Ref. [3, p. 162].

Theorem 2.1. Consider the nonlinear stochastic dynamical system given by Eqs. (1) and (2). Then, the following statements hold:

(i) If there exist a two-times continuously differentiable function $V : D \times ℝ^{n_{2}} \to ℝ$ and class $K$ functions $α (\cdot), β (\cdot)$, and $γ (\cdot)$ such that, for all $(x_{1}, x_{2}) \in D \times ℝ^{n_{2}}$

α (‖ x_{1} ‖) \leq V (x_{1}, x_{2}) \leq β (‖ x_{1} ‖)

(13)

\begin{matrix} \frac{\partial V (x_{1}, x_{2})}{\partial x_{1}} f_{1} (x_{1}, x_{2}) + \frac{\partial V (x_{1}, x_{2})}{\partial x_{2}} f_{2} (x_{1}, x_{2}) + \frac{1}{2} tr D_{1}^{T} (x_{1}, x_{2}) \frac{\partial^{2} V (x_{1}, x_{2})}{\partial x_{1}^{2}} D_{1} (x_{1}, x_{2}) \\ + \frac{1}{2} tr D_{2}^{T} (x_{1}, x_{2}) \frac{\partial^{2} V (x_{1}, x_{2})}{\partial x_{2}^{2}} D_{2} (x_{1}, x_{2}) \leq - γ (‖ x_{1} ‖) \end{matrix}

(14)

then the nonlinear dynamical system given by Eqs. (1) and (2) is asymptotically stable in probability with respect to x₁ uniformly in x₂₀.

(ii) If there exist a two-times continuously differentiable function $V : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ$, class $K_{\infty}$ functions α(⋅) and β(⋅), and a class $K$ function γ(⋅) satisfying Eqs. (13) and (14), then the nonlinear dynamical system given by Eqs. (1) and (2) is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀.

Proof. (i) Let $x_{20} \in ℝ^{n_{2}}$, let ε > 0 be such that $B_{ε} (0) \subseteq D$, let ρ > 0, and define $D_{ε, ρ} ≜ {x_{1} \in B_{ε} (0) : V (x_{1}, x_{20}) < α (ε) ρ}$. Since V(⋅, ⋅) is continuous and $V (0, x_{2}) = 0, x_{2} \in ℝ^{n_{2}}$, it follows that $D_{ε, ρ}$ is nonempty and there exists δ = δ(ε, ρ) > 0 such that $V (x_{1}, x_{20}) < α (ε) ρ, x_{1} \in B_{δ} (0)$. Hence, $B_{δ} (0) \subseteq D_{ε, ρ}$. Next, it follows from Eq. (14) that V(x₁(t), x₂(t)) is a (positive) supermartingale [18, Lemma 5.4], and hence, for every $x_{1} (0) \in H_{n_{1}}^{B_{δ} (0)} \subseteq H_{n_{1}}^{D_{ε, ρ}}$, it follows from Eq. (13) and the extended version of the Markov inequality for monotonically increasing functions [26, p. 193] that

\begin{matrix} ℙ^{x_{0}} (\sup_{t \geq 0} | | x_{1} (t) | | \geq ε) \leq \sup_{t \geq 0} \frac{E^{x_{0}} [α (| | x_{1} (t) | |)]}{α (ε)} \leq \sup_{t \geq 0} \frac{E^{x_{0}} [V (x_{1} (t), x_{2} (t))]}{α (ε)} \\ \leq \frac{E^{x_{0}} [V (x_{1} (0), x_{2} (0))]}{α (ε)} \leq ρ \end{matrix}

which proves partial Lyapunov stability in probability with respect to x₁ uniformly in x₂₀.

To prove partial asymptotic stability in probability with respect to x₁, note that it follows from Eqs. (13) and (14) that

L V (x_{1}, x_{2}) \leq - γ (| | x_{1} | |) \leq - γ ° β^{- 1} (V (x_{1}, x_{2})), (x_{1}, x_{2}) \in D \times ℝ^{n_{2}}

Furthermore, it follows from partial Lyapunov stability in probability that $B_{ε} (0) \times ℝ^{n_{2}}$ is an invariant set with respect to the solutions of Eqs. (1) and (2) as ε → 0, and hence, using Corollary 4.2 of Ref. [27] with $η (\cdot) = γ ° β^{- 1} (\cdot)$ it follows that $\lim_{t \to \infty} γ ° β^{- 1} (V (x_{1} (t), x_{2} (t))) \overset{a . s .}{=} 0$ ⁠. Furthermore, using the properties of the class $K$ functions $α (\cdot), β (\cdot),$ and γ(⋅), it follows that $\lim_{t \to \infty} V (x_{1} (t), x_{2} (t)) \overset{a . s .}{=} 0$ ⁠, which yields $\lim_{t \to \infty} α (‖ x_{1} (t) ‖) \leq \lim_{t \to \infty} V (x_{1} (t), x_{2} (t)) \overset{a . s .}{=} 0$ ⁠. Hence, $\lim_{t \to \infty} x_{1} (t) \overset{a . s .}{\to} 0$ as x₁₀ → 0, which proves partial asymptotic stability in probability with respect to x₁ uniformly in x₂₀.

(ii) Finally, for $D = ℝ^{n_{1}}$, global asymptotic stability in probability with respect to x₁ uniformly in x₂₀ is a direct consequence of the radial unboundedness condition on V(⋅, ⋅) using standard arguments and the fact that α(⋅) and β(⋅) are class $K_{\infty}$ functions. ▪
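To make the hypotheses of Theorem 2.1 concrete, the following minimal sketch checks Eqs. (13) and (14) on a finite grid for a hypothetical scalar example with f₁ = −x₁, f₂ = 1, D₁ = σx₁cos(x₂), D₂ = 0, and V(x₁, x₂) = x₁², using α(r) = β(r) = r² and γ(r) = (2 − σ²)r². All of these choices are assumptions made for illustration, and a grid check is only a numerical sanity test, not a proof.

```python
import math

sigma = 0.5  # hypothetical noise intensity with sigma**2 < 2

def V(x1, x2):
    return x1 * x1

def LV(x1, x2):
    # Left-hand side of Eq. (14) for f1 = -x1, f2 = 1,
    # D1 = sigma * x1 * cos(x2), D2 = 0, and V = x1**2.
    # The x2-derivatives of V vanish, so only the x1 terms remain.
    dV_dx1, d2V_dx1 = 2.0 * x1, 2.0
    D1 = sigma * x1 * math.cos(x2)
    return dV_dx1 * (-x1) + 0.5 * D1 * d2V_dx1 * D1

alpha = beta = lambda r: r * r
gamma = lambda r: (2.0 - sigma ** 2) * r * r

def conditions_hold(x1_grid, x2_grid, tol=1e-12):
    """Return True if Eqs. (13) and (14) hold at every grid point."""
    for x1 in x1_grid:
        for x2 in x2_grid:
            r = abs(x1)
            ok13 = alpha(r) - tol <= V(x1, x2) <= beta(r) + tol
            ok14 = LV(x1, x2) <= -gamma(r) + tol
            if not (ok13 and ok14):
                return False
    return True

x1_grid = [-2.0 + 0.1 * i for i in range(41)]
x2_grid = [-10.0 + 0.5 * j for j in range(41)]
result = conditions_hold(x1_grid, x2_grid)
```

In this example, the bound in Eq. (14) follows analytically from cos²(x₂) ≤ 1, so the inequality holds uniformly in x₂, which is the property the theorem requires.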

Stochastic Optimal Partial-State Stabilization

In the first part of this section, we provide connections between Lyapunov functions and nonquadratic cost evaluation. Specifically, we consider the problem of evaluating a nonlinear-nonquadratic performance measure that depends on the solution of the stochastic nonlinear dynamical system given by Eqs. (1) and (2). In particular, we show that the nonlinear-nonquadratic performance measure

J (x_{10}, x_{20}) ≜ E^{x_{0}} [\int_{0}^{\infty} L (x_{1} (t), x_{2} (t)) d t]

(15)

where $L : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ$ is jointly continuous in x₁ and x₂, and x₁(t) and x₂(t), t ≥ 0, satisfy Eqs. (1) and (2), can be evaluated in a convenient form so long as Eqs. (1) and (2) are related to an underlying Lyapunov function that is positive definite and decrescent with respect to x₁ and proves asymptotic stability in probability of Eqs. (1) and (2) with respect to x₁ uniformly in x₂₀.

Theorem 3.1. Consider the nonlinear stochastic dynamical system $G$ given by Eqs. (1) and (2) with performance measure (15). Assume that there exist a two-times continuously differentiable function $V : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ$, class $K_{\infty}$ functions α(⋅) and β(⋅), and a class $K$ function γ(⋅) such that, for all $(x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}$

α (| | x_{1} | |) \leq V (x_{1}, x_{2}) \leq β (| | x_{1} | |)

(16)

\begin{matrix} \frac{\partial V (x_{1}, x_{2})}{\partial x_{1}} f_{1} (x_{1}, x_{2}) + \frac{\partial V (x_{1}, x_{2})}{\partial x_{2}} f_{2} (x_{1}, x_{2}) + \frac{1}{2} tr D_{1}^{T} (x_{1}, x_{2}) \frac{\partial^{2} V (x_{1}, x_{2})}{\partial x_{1}^{2}} D_{1} (x_{1}, x_{2}) + \frac{1}{2} tr D_{2}^{T} (x_{1}, x_{2}) \frac{\partial^{2} V (x_{1}, x_{2})}{\partial x_{2}^{2}} D_{2} (x_{1}, x_{2}) \leq - γ (| | x_{1} | |) \end{matrix}

(17)

\begin{matrix} L (x_{1}, x_{2}) + \frac{\partial V (x_{1}, x_{2})}{\partial x_{1}} f_{1} (x_{1}, x_{2}) + \frac{\partial V (x_{1}, x_{2})}{\partial x_{2}} f_{2} (x_{1}, x_{2}) \\ + \frac{1}{2} tr D_{1}^{T} (x_{1}, x_{2}) \frac{\partial^{2} V (x_{1}, x_{2})}{\partial x_{1}^{2}} D_{1} (x_{1}, x_{2}) + \frac{1}{2} tr D_{2}^{T} (x_{1}, x_{2}) \frac{\partial^{2} V (x_{1}, x_{2})}{\partial x_{2}^{2}} D_{2} (x_{1}, x_{2}) = 0 \end{matrix}

(18)

Then, the nonlinear stochastic dynamical system $G$ is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀ and, for all $(x_{10}, x_{20}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}$

J (x_{10}, x_{20}) = V (x_{10}, x_{20})

(19)

Proof. Let x₁(t) and x₂(t), t ≥ t₀, satisfy Eqs. (1) and (2). Then, Eqs. (16) and (17) are a restatement of Eqs. (13) and (14), and hence, it follows from Theorem 2.1 that the system $G$ is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀. Consequently, $ℙ^{x_{0}} (\lim_{t \to \infty} ‖ x_{1} (t) ‖ = 0) = 1$ holds for all initial conditions $(x_{10}, x_{20}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}$ ⁠.

Next, using Itô’s (chain rule) formula, it follows that the stochastic differential of $V (x_{1} (t), x_{2} (t))$ along the system trajectories x₁(t) and $x_{2} (t), t \geq t_{0}$, is given by

\begin{matrix} d V (x_{1} (t), x_{2} (t)) = (\frac{\partial V (x_{1} (t), x_{2} (t))}{\partial x_{1}} f_{1} (x_{1} (t), x_{2} (t)) + \frac{\partial V (x_{1} (t), x_{2} (t))}{\partial x_{2}} f_{2} (x_{1} (t), x_{2} (t)) \\ + \frac{1}{2} tr D_{1}^{T} (x_{1} (t), x_{2} (t)) \frac{\partial^{2} V (x_{1} (t), x_{2} (t))}{\partial x_{1}^{2}} D_{1} (x_{1} (t), x_{2} (t)) \\ + \frac{1}{2} tr D_{2}^{T} (x_{1} (t), x_{2} (t)) \frac{\partial^{2} V (x_{1} (t), x_{2} (t))}{\partial x_{2}^{2}} D_{2} (x_{1} (t), x_{2} (t))) d t + \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t)) d w (t) \end{matrix}

(20)

Hence, using Eq. (18), it follows that

\begin{matrix} L (x_{1} (t), x_{2} (t)) d t + d V (x_{1} (t), x_{2} (t)) = (L (x_{1} (t), x_{2} (t)) + \frac{\partial V (x_{1} (t), x_{2} (t))}{\partial x_{1}} f_{1} (x_{1} (t), x_{2} (t)) + \frac{\partial V (x_{1} (t), x_{2} (t))}{\partial x_{2}} f_{2} (x_{1} (t), x_{2} (t)) \\ + \frac{1}{2} tr D_{1}^{T} (x_{1} (t), x_{2} (t)) \frac{\partial^{2} V (x_{1} (t), x_{2} (t))}{\partial x_{1}^{2}} D_{1} (x_{1} (t), x_{2} (t)) \\ + \frac{1}{2} tr D_{2}^{T} (x_{1} (t), x_{2} (t)) \frac{\partial^{2} V (x_{1} (t), x_{2} (t))}{\partial x_{2}^{2}} D_{2} (x_{1} (t), x_{2} (t))) d t + \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t)) d w (t) \\ = \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t)) d w (t) \end{matrix}

(21)

Let ${t_{n}}_{n = 0}^{\infty}$ be a monotonic sequence of positive numbers with t_n → ∞ as $n \to \infty$, let $τ_{m} : Ω \to [t_{0}, \infty)$ be the first exit (stopping) time of the solution x₁(t) and x₂(t), t ≥ t₀, from the set $B_{m} (0) \times ℝ^{n_{2}}$, and let $τ ≜ \lim_{m \to \infty} τ_{m}$. Now, integrating Eq. (21) over $[t_{0}, \min {t_{n}, τ_{m}}]$, where $(n, m) \in ℤ_{+} \times ℤ_{+}$, yields

\begin{matrix} \int_{t_{0}}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t)) d t = - \int_{t_{0}}^{\min {t_{n}, τ_{m}}} d V (x_{1} (t), x_{2} (t)) + \int_{t_{0}}^{\min {t_{n}, τ_{m}}} \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t)) d w (t) \\ = V (x_{1} (t_{0}), x_{2} (t_{0})) - V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}})) + \int_{t_{0}}^{\min {t_{n}, τ_{m}}} \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t)) d w (t) \end{matrix}

(22)

Next, taking the expectation on both sides of Eq. (22) yields

\begin{matrix} E^{x_{0}} [\int_{t_{0}}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t)) d t] = E^{x_{0}} [V (x_{1} (t_{0}), x_{2} (t_{0})) - V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}})) \\ + \int_{t_{0}}^{\min {t_{n}, τ_{m}}} \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t)) d w (t)] \\ = V (x_{10}, x_{20}) - E^{x_{0}} [V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))] \end{matrix}

(23)

Now, noting that $L (x_{1}, x_{2}) \geq 0, (x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}$, which follows from Eqs. (17) and (18), the sequence of random variables ${f_{n, m}}_{n, m = 0}^{\infty} \subseteq H_{1}$, where $f_{n, m} ≜ \int_{t_{0}}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t)) d t$, is a pointwise nondecreasing sequence in n and m of non-negative $F_{t}$-measurable random variables on Ω. Moreover, defining the improper integral $\int_{t_{0}}^{\infty} L (x_{1} (t), x_{2} (t)) d t$ as the limit of a sequence of proper integrals, it follows from the Lebesgue monotone convergence theorem [28] that

\begin{matrix} \lim_{m \to \infty} \lim_{n \to \infty} E^{x_{0}} [\int_{t_{0}}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t)) d t] = \lim_{m \to \infty} E^{x_{0}} [\lim_{n \to \infty} \int_{t_{0}}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t)) d t] \\ = E^{x_{0}} [\lim_{m \to \infty} \int_{t_{0}}^{τ_{m}} L (x_{1} (t), x_{2} (t)) d t] = E^{x_{0}} [\int_{t_{0}}^{\infty} L (x_{1} (t), x_{2} (t)) d t] = J (x_{10}, x_{20}) \end{matrix}

(24)

Next, since

G

is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀, V(⋅, ⋅) is continuous, and

V (x_{1} (t), x_{2} (t)), t \geq t_{0}

⁠, is positive supermartingale by Eq. and Ref. [18, Lemma 5.4], it follows from Ref. [18, Theorem 5.1] that

\begin{matrix} \lim_{m \to \infty} \lim_{n \to \infty} E^{x_{0}} [V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))] \\ = \lim_{m \to \infty} E^{x_{0}} [\lim_{n \to \infty} V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))] \\ = E^{x_{0}} [\lim_{m \to \infty} \lim_{n \to \infty} V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))] \end{matrix}

(25)

Now, it follows from Eq. that

\begin{matrix} V (x_{10}, x_{20}) - E^{x_{0}} [\lim_{m \to \infty} \lim_{n \to \infty} β (| | x_{1} (\min {t_{n}, τ_{m}}) | |)] \\ \leq V (x_{10}, x_{20}) - E^{x_{0}} [\lim_{m \to \infty} \lim_{n \to \infty} V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))] \\ \leq V (x_{10}, x_{20}) - E^{x_{0}} [\lim_{m \to \infty} \lim_{n \to \infty} α (| | x_{1} (\min {t_{n}, τ_{m}}) | |)] \end{matrix}

(26)

and hence, taking the limit as n → ∞ and m → ∞ on both sides of Eq. , using Eqs. and , and using the continuity of α(⋅) and β(⋅), we obtain

\begin{matrix} V (x_{10}, x_{20}) - E^{x_{0}} [β (\lim_{m \to \infty} \lim_{n \to \infty} ‖ x_{1} (\min {t_{n}, τ_{m}}) ‖)] \leq J (x_{10}, x_{20}) \leq V (x_{10}, x_{20}) - E^{x_{0}} [α (\lim_{m \to \infty} \lim_{n \to \infty} ‖ x_{1} (\min {t_{n}, τ_{m}}) ‖)] \end{matrix}

(27)

Finally, using $ℙ^{x_{0}} (\lim_{t \to \infty} ‖ x_{1} (t) ‖ = 0) = 1$ for all $(x_{10}, x_{20}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}$ ⁠, Eq. (19) is a direct consequence of Eq. (27). ▪
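The identity in Eq. (19) can be checked numerically on a simple case. The following is a minimal sketch, assuming a hypothetical scalar example that does not appear in the paper: dx₁ = a x₁ dt + σ x₁ dw with L = q x₁² and V = p x₁², where x₂ is any additional state that enters neither V nor L. For this example the steady-state condition 0 = L + V′f + ½ tr DᵀV″D reduces to 0 = q + (2a + σ²)p, and the second moment has the closed form E[x₁²(t)] = x₁₀² exp((2a + σ²)t). The names `value_coefficient` and `expected_cost` are illustrative, and the expectation and the time integral are interchanged, which is legitimate here because L is non-negative.

```python
import math


def value_coefficient(a, sigma, q):
    """Solve 0 = q + (2a + sigma^2) p for p (requires 2a + sigma^2 < 0)."""
    rate = 2.0 * a + sigma ** 2
    if rate >= 0.0:
        raise ValueError("need 2a + sigma^2 < 0 for mean-square decay")
    return -q / rate


def expected_cost(a, sigma, q, x10, horizon=60.0, steps=200_000):
    """Trapezoidal approximation of int_0^T q E[x1(t)^2] dt, using the
    closed-form second moment E[x1^2] = x10^2 exp((2a + sigma^2) t)."""
    rate = 2.0 * a + sigma ** 2
    h = horizon / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * q * x10 ** 2 * math.exp(rate * k * h)
    return total * h


# Hypothetical data chosen only for illustration.
a, sigma, q, x10 = -1.0, 0.5, 2.0, 1.5
p = value_coefficient(a, sigma, q)
J = expected_cost(a, sigma, q, x10)
V = p * x10 ** 2
print(abs(J - V) < 1e-6)
```

The truncation horizon and step count are chosen so that the neglected tail and the quadrature error are far below the comparison tolerance. Neither choice is part of the result itself.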

The following corollary to Theorem 3.1 considers the nonautonomous stochastic dynamical system with performance measure

J (t_{0}, x_{0}) ≜ E^{x_{0}} [\int_{t_{0}}^{\infty} L (t, x (t)) d t]

(28)

where $L : [t_{0}, \infty) \times ℝ^{n} \to ℝ$ is jointly continuous in t and x, and x(t), t ≥ t₀, satisfies Eq. (10).

Corollary 3.1. Consider the nonlinear time-varying stochastic dynamical system (10) with performance measure (28). Assume that there exist a two-times continuously differentiable function

V : [t_{0}, \infty) \times ℝ^{n} \to ℝ

, class

K_{\infty}

functions α(⋅) and β(⋅), and a class

K

function γ(⋅) such that, for all

(t, x) \in [t_{0}, \infty) \times ℝ^{n}

α (| | x | |) \leq V (t, x) \leq β (| | x | |)

(29)

\frac{\partial V (t, x)}{\partial t} + \frac{\partial V (t, x)}{\partial x} f (t, x) + \frac{1}{2} tr D^{T} (t, x) \frac{\partial^{2} V (t, x)}{\partial x^{2}} D (t, x) \leq - γ (| | x | |)

(30)

- \frac{\partial V (t, x)}{\partial t} = L (t, x) + \frac{\partial V (t, x)}{\partial x} f (t, x) + \frac{1}{2} tr D^{T} (t, x) \frac{\partial^{2} V (t, x)}{\partial x^{2}} D (t, x)

(31)

Then, the stochastic nonlinear dynamical system (10) is globally uniformly asymptotically stable in probability and $J (t_{0}, x_{0}) = V (t_{0}, x_{0})$ for all $(t_{0}, x_{0}) \in [0, \infty) \times ℝ^{n}$ ⁠.

Proof. The result is a direct consequence of Theorem 3.1 with $n_{1} = n, n_{2} = 1, x_{1} (t - t_{0}) = x (t), x_{2} (t - t_{0}) = t, f_{1} (x_{1}, x_{2}) = f_{1} (x_{2}, x_{1}) = f (t, x), f_{2} (x_{1}, x_{2}) = 1, D_{1} (x_{1}, x_{2}) = D_{1} (x_{2}, x_{1}) = D (t, x), D_{2} (x_{1}, x_{2}) = 0$ ⁠, and $V (x_{1}, x_{2}) = V (x_{2}, x_{1}) = V (t, x)$ ⁠. ▪
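The embedding used in this proof can be written out term by term. The following is a sketch under the identifications stated in the proof, with the augmented drift and diffusion denoted by $\tilde{f}$ and $\tilde{D}$ (notation introduced here only for illustration) and time treated as the extra state x₂ = t.

```latex
% Augmented state: x_1 = x, x_2 = t, so that dx_2 = 1 dt with x_2(0) = t_0.
\tilde{f}(x_1, x_2) = \begin{bmatrix} f(t, x) \\ 1 \end{bmatrix}, \qquad
\tilde{D}(x_1, x_2) = \begin{bmatrix} D(t, x) \\ 0 \end{bmatrix}.

% Gradient term: the unit drift of x_2 contributes the partial derivative in t.
V'(x_1, x_2)\, \tilde{f}(x_1, x_2)
  = \frac{\partial V(t, x)}{\partial x} f(t, x) + \frac{\partial V(t, x)}{\partial t}.

% Hessian term: the zero block of \tilde{D} removes every second derivative involving t.
\frac{1}{2}\, \mathrm{tr}\, \tilde{D}^{T} V'' \tilde{D}
  = \frac{1}{2}\, \mathrm{tr}\, D^{T}(t, x)\, \frac{\partial^{2} V(t, x)}{\partial x^{2}}\, D(t, x).
```

The sum of these two terms is the left-hand side of Eq. (30), and adding L(t, x) and setting the result to zero gives Eq. (31).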

Next, we use the framework developed in Theorem 3.1 to obtain a characterization of stochastic optimal feedback controllers that guarantee closed-loop, partial-state stabilization in probability. Specifically, sufficient conditions for optimality are given in a form that corresponds to a steady-state version of the stochastic Hamilton–Jacobi–Bellman equation. To address the problem of characterizing partially stabilizing feedback controllers, consider the nonlinear controlled stochastic dynamical system

d x_{1} (t) = F_{1} (x_{1} (t), x_{2} (t), u (t)) d t + D_{1} (x_{1} (t), x_{2} (t), u (t)) d w (t), x_{1} (0) = x_{10} a . s ., t \geq 0

(32)

d x_{2} (t) = F_{2} (x_{1} (t), x_{2} (t), u (t)) d t + D_{2} (x_{1} (t), x_{2} (t), u (t)) d w (t), x_{2} (0) = x_{20} a . s .

(33)

where, for every $t \geq 0, x_{1} (t) \in H_{n_{1}}, x_{2} (t) \in H_{n_{2}}, u (t) \in H_{m}, F_{1} : ℝ^{n_{1}} \times ℝ^{n_{2}} \times ℝ^{m} \to ℝ^{n_{1}}, F_{2} : ℝ^{n_{1}} \times ℝ^{n_{2}} \times ℝ^{m} \to ℝ^{n_{2}}, D_{1} : ℝ^{n_{1}} \times ℝ^{n_{2}} \times ℝ^{m} \to ℝ^{n_{1} \times d}, D_{2} : ℝ^{n_{1}} \times ℝ^{n_{2}} \times ℝ^{m} \to ℝ^{n_{2} \times d}$ ⁠, and F₁(0, x₂, 0) = 0 and D₁(0, x₂, 0) = 0 for every $x_{2} \in ℝ^{n_{2}}$ ⁠.

Here, we assume that u(⋅) satisfies sufficient regularity conditions such that Eqs. (32) and (33) have a unique solution forward in time. Specifically, we assume that the control process u(⋅) in Eqs. (32) and (33) is restricted to the class of admissible controls consisting of measurable functions u(⋅) adapted to the filtration ${F_{t}}_{t \geq 0}$ such that $u (t) \in H_{m}, t \geq 0$, and, for all $t \geq s, w (t) - w (s)$ is independent of $u (τ), w (τ), τ \leq s$, and $x (0) = {[x_{1}^{T} (0), x_{2}^{T} (0)]}^{T}$, and hence, u(⋅) is nonanticipative. Furthermore, we assume u(⋅) takes values in a compact, metrizable set U and the uniform Lipschitz continuity and growth conditions (4) and (5) hold for the controlled drift and diffusion terms $F (x_{1}, x_{2}, u) ≜ {[F_{1}^{T} (x_{1}, x_{2}, u), F_{2}^{T} (x_{1}, x_{2}, u)]}^{T}$ and $D (x_{1}, x_{2}, u) ≜ {[D_{1}^{T} (x_{1}, x_{2}, u), D_{2}^{T} (x_{1}, x_{2}, u)]}^{T}$ uniformly in u. In this case, it follows from Theorem 2.2.4 of Ref. [29] that there exists a pathwise unique solution to Eqs. (32) and (33) in $(Ω, {F_{t}}_{t \geq 0}, ℙ^{x_{0}})$.

A measurable function

ϕ : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ^{m}

satisfying

ϕ (0, x_{2}) = 0, x_{2} \in ℝ^{n_{2}}

⁠, is called a control law. If

u (t) = ϕ (x_{1} (t), x_{2} (t))

⁠, t ≥ 0, where

ϕ (\cdot, \cdot)

is a control law and x₁(t) and x₂(t) satisfy Eqs. (32) and (33), then we call u(⋅) a feedback control law. Note that the feedback control law is an admissible control, since

ϕ (x_{1} (t), x_{2} (t)) \in H_{m}, t \geq 0

⁠. Given a control law

ϕ (\cdot, \cdot)

and a feedback control law

u (t) = ϕ (x_{1} (t), x_{2} (t)), t \geq 0

, the closed-loop system (32) and (33) is given by

\begin{matrix} d x_{1} (t) = F_{1} (x_{1} (t), x_{2} (t), ϕ (x_{1} (t), x_{2} (t))) d t + D_{1} (x_{1} (t), x_{2} (t), ϕ (x_{1} (t), x_{2} (t))) d w (t), x_{1} (0) = x_{10} a . s ., t \geq 0 \end{matrix}

(34)

\begin{matrix} d x_{2} (t) = F_{2} (x_{1} (t), x_{2} (t), ϕ (x_{1} (t), x_{2} (t))) d t + D_{2} (x_{1} (t), x_{2} (t), ϕ (x_{1} (t), x_{2} (t))) d w (t), x_{2} (0) = x_{20} a . s . \end{matrix}

(35)
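For intuition, sample paths of the closed-loop system (34) and (35) can be approximated with the Euler–Maruyama scheme. The following is a minimal sketch for the scalar case n₁ = n₂ = d = 1. The function `simulate_closed_loop`, the example drift `F`, diffusion `D`, control law `phi`, and the constant `SIGMA` are all hypothetical choices made for illustration and are not taken from the paper. In the example, x₁ is the regulated state and x₂ is a clock-like state that never converges.

```python
import math
import random


def simulate_closed_loop(F, D, phi, x10, x20, horizon, steps, rng):
    """Euler-Maruyama approximation of one path of the closed-loop system
    (34)-(35) for scalar x1, x2 and a scalar Wiener process."""
    h = horizon / steps
    x1, x2 = x10, x20
    for _ in range(steps):
        u = phi(x1, x2)                      # feedback u = phi(x1, x2)
        dw = rng.gauss(0.0, math.sqrt(h))    # Wiener increment with variance h
        f1, f2 = F(x1, x2, u)
        d1, d2 = D(x1, x2, u)
        x1, x2 = x1 + f1 * h + d1 * dw, x2 + f2 * h + d2 * dw
    return x1, x2


SIGMA = 0.3  # hypothetical noise intensity


def F(x1, x2, u):
    """Drift (F1, F2) with F1(0, x2, 0) = 0; x2 advances like a clock."""
    return x1 + u, 1.0


def D(x1, x2, u):
    """Diffusion (D1, D2) with D1(0, x2, 0) = 0 and no noise on x2."""
    return SIGMA * x1, 0.0


def phi(x1, x2):
    """Control law satisfying phi(0, x2) = 0."""
    return -2.0 * x1


x1_T, x2_T = simulate_closed_loop(F, D, phi, 1.0, 0.0, 5.0, 5000, random.Random(0))
print(x1_T, x2_T)
```

A single path only illustrates the behavior. Statements about stability in probability concern the distribution of paths and would require many independent runs or an analytical argument.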

Next, we present a main theorem for partial-state stabilization in probability characterizing feedback controllers that guarantee partial closed-loop stability in probability and minimize a nonlinear-nonquadratic performance functional. For the statement of this result, let

L : ℝ^{n_{1}} \times ℝ^{n_{2}} \times ℝ^{m} \to ℝ

be jointly continuous in x₁, x₂, and u, and define the set of partial regulation controllers given by

\begin{matrix} S (x_{1} (0), x_{2} (0)) ≜ \{u (\cdot) : u (\cdot) \text{ is admissible and } x_{1} (\cdot) \text{ given by Eq. (32) satisfies } ℙ^{x_{0}} (\lim_{t \to \infty} | | x_{1} (t) | | = 0) = 1\} \end{matrix}

Note that restricting our minimization problem to $u (\cdot) \in S (x_{1} (0), x_{2} (0))$ ⁠, that is, inputs corresponding to partial-state null convergent in probability solutions, can be interpreted as incorporating a partial-state system detectability condition through the cost.

Theorem 3.2. Consider the nonlinear controlled stochastic dynamical system

G

given by Eqs. (32) and (33) with performance functional

J (x_{10}, x_{20}, u (\cdot)) ≜ E^{x_{0}} [\int_{0}^{\infty} L (x_{1} (t), x_{2} (t), u (t)) d t]

(36)

where u(⋅) is an admissible control. Assume that there exist a two-times continuously differentiable function

V : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ

, class

K_{\infty}

functions α(⋅) and β(⋅), a class

K

function γ(⋅), and a control law

ϕ : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ^{m}

such that, for all

(x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}

α (| | x_{1} | |) \leq V (x_{1}, x_{2}) \leq β (| | x_{1} | |)

(37)

\begin{matrix} V' (x_{1}, x_{2}) F (x_{1}, x_{2}, ϕ (x_{1}, x_{2})) + \frac{1}{2} tr D^{T} (x_{1}, x_{2}, ϕ (x_{1}, x_{2})) V^{″} (x_{1}, x_{2}) D (x_{1}, x_{2}, ϕ (x_{1}, x_{2})) \leq - γ (| | x_{1} | |) \end{matrix}

(38)

ϕ (0, x_{2}) = 0

(39)

H (x_{1}, x_{2}, ϕ (x_{1}, x_{2})) = 0

(40)

H (x_{1}, x_{2}, u) \geq 0, (x_{1}, x_{2}, u) \in ℝ^{n_{1}} \times ℝ^{n_{2}} \times ℝ^{m}

(41)

where

\begin{matrix} H (x_{1}, x_{2}, u) ≜ L (x_{1}, x_{2}, u) + V' (x_{1}, x_{2}) F (x_{1}, x_{2}, u) + \frac{1}{2} tr D^{T} (x_{1}, x_{2}, u) V^{″} (x_{1}, x_{2}) D (x_{1}, x_{2}, u) \end{matrix}

(42)

Then, with the feedback control

u = ϕ (x_{1}, x_{2})

, the closed-loop system given by Eqs. (34) and (35) is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀ and

J (x_{10}, x_{20}, ϕ (x_{1} (\cdot), x_{2} (\cdot))) = V (x_{10}, x_{20}), (x_{10}, x_{20}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}

(43)

In addition, if

(x_{10}, x_{20}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}

, then the feedback control

u (\cdot) = ϕ (x_{1} (\cdot), x_{2} (\cdot))

minimizes

J (x_{10}, x_{20}, u (\cdot))

in the sense that

J (x_{10}, x_{20}, ϕ (\cdot, \cdot)) = \min_{u (\cdot) \in S (x_{1} (0), x_{2} (0))} J (x_{10}, x_{20}, u (\cdot))

(44)

Proof. Global asymptotic stability in probability with respect to x₁ uniformly in x₂₀ is a direct consequence of Eqs. (37) and (38) by applying Theorem 2.1 to the closed-loop system given by Eqs. (34) and (35). Furthermore, using Eq. (40), condition (43) is a restatement of Eq. (19) as applied to the closed-loop system.

Next, let

(x_{10}, x_{20}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}

⁠, let

u (\cdot) \in S (x_{1} (0), x_{2} (0))

, and let x₁(t) and x₂(t), t ≥ 0, be solutions of Eqs. (32) and (33). Then, using Itô's (chain rule) formula, the stochastic differential of

V (x_{1} (t), x_{2} (t))

along the system trajectories

(x_{1} (t), x_{2} (t)), t \geq 0

⁠, is given by

d V (x_{1} (t), x_{2} (t)) = \mathcal{L} V (x_{1} (t), x_{2} (t)) d t + \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t), u (t)) d w (t)

(45)

Hence, using Eqs. (42) and (45) yields

\begin{matrix} L (x_{1} (t), x_{2} (t), u (t)) d t = - d V (x_{1} (t), x_{2} (t)) + (L (x_{1} (t), x_{2} (t), u (t)) + \mathcal{L} V (x_{1} (t), x_{2} (t))) d t + \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t), u (t)) d w (t) \\ = - d V (x_{1} (t), x_{2} (t)) + H (x_{1} (t), x_{2} (t), u (t)) d t + \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t), u (t)) d w (t) \end{matrix}

(46)

Now, it follows from Eq. (37) that

E^{x_{0}} [\lim_{t \to \infty} α (| | x_{1} (t) | |)] \leq E^{x_{0}} [\lim_{t \to \infty} V (x_{1} (t), x_{2} (t))] \leq E^{x_{0}} [\lim_{t \to \infty} β (| | x_{1} (t) | |)]

(47)

Using the continuity of α(⋅) and β(⋅), and the fact that

ℙ^{x_{0}} (\lim_{t \to \infty} | | x_{1} (t) | | = 0) = 1

for all

u (\cdot) \in S (x_{1} (0), x_{2} (0))

, it follows from Eq. (47) that

0 = E^{x_{0}} [α (\lim_{t \to \infty} | | x_{1} (t) | |)] \leq E^{x_{0}} [\lim_{t \to \infty} V (x_{1} (t), x_{2} (t))] \leq E^{x_{0}} [β (\lim_{t \to \infty} | | x_{1} (t) | |)] = 0

(48)

Let

{t_{n}}_{n = 0}^{\infty}

be a monotonic sequence of positive numbers with t_n → ∞ as n → ∞, let

τ_{m} : Ω \to [0, \infty)

be the first exit (stopping) time of the solution x₁(t) and x₂(t), t ≥ 0, from the set

B_{m} (0) \times ℝ^{n_{2}}

⁠, and let

τ ≜ \lim_{m \to \infty} τ_{m}

. Now, integrating Eq. (46) over

[0, \min {t_{n}, τ_{m}}]

⁠, where

(n, m) \in ℤ_{+} \times ℤ_{+}

⁠, yields

\begin{matrix} \int_{0}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t), u (t)) d t = - \int_{0}^{\min {t_{n}, τ_{m}}} d V (x_{1} (t), x_{2} (t)) + \int_{0}^{\min {t_{n}, τ_{m}}} H (x_{1} (t), x_{2} (t), u (t)) d t \\ + \int_{0}^{\min {t_{n}, τ_{m}}} \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t), u (t)) d w (t) \\ = V (x_{1} (0), x_{2} (0)) - V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}})) \\ + \int_{0}^{\min {t_{n}, τ_{m}}} H (x_{1} (t), x_{2} (t), u (t)) d t + \int_{0}^{\min {t_{n}, τ_{m}}} \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t), u (t)) d w (t) \end{matrix}

(49)

Next, taking the expectation on both sides of Eq. (49) and using Eq. (41) yields

\begin{matrix} E^{x_{0}} [\int_{0}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t), u (t)) d t] = E^{x_{0}} [V (x_{1} (0), x_{2} (0)) - V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}})) \\ + \int_{0}^{\min {t_{n}, τ_{m}}} H (x_{1} (t), x_{2} (t), u (t)) d t \\ + \int_{0}^{\min {t_{n}, τ_{m}}} \frac{\partial V (x (t))}{\partial x} D (x_{1} (t), x_{2} (t), u (t)) d w (t)] \\ = V (x_{10}, x_{20}) - E^{x_{0}} [V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))] \\ + E^{x_{0}} [\int_{0}^{\min {t_{n}, τ_{m}}} H (x_{1} (t), x_{2} (t), u (t)) d t] \\ \geq V (x_{10}, x_{20}) - E^{x_{0}} [V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))] \end{matrix}

(50)

Now, noting that for all $u (\cdot) \in S (x_{1} (0), x_{2} (0)), \int_{0}^{\infty} | L (x_{1} (t), x_{2} (t), u (t)) | d t \overset{a . s .}{<} \infty$ ⁠, define the random variable $g ≜ \sup_{t \geq 0, m > 0} \int_{0}^{\min {t, τ_{m}}} | L (x_{1} (s), x_{2} (s), u (s)) | d s$ ⁠. In this case, the sequence of $F_{t}$ -measurable random variables ${f_{n, m}}_{n, m = 0}^{\infty} \subseteq H_{1}$ on Ω, where $f_{n, m} ≜ \int_{0}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t), u (t)) d t$ ⁠, satisfies $| f_{n, m} | \overset{a . s .}{<} g$ ⁠.

Next, defining the improper integral

\int_{0}^{\infty} L (x_{1} (t), x_{2} (t), u (t)) d t

as the limit of a sequence of proper integrals, it follows from the dominated convergence theorem [28] that

\begin{matrix} \lim_{m \to \infty} \lim_{n \to \infty} E^{x_{0}} [\int_{0}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t), u (t)) d t] = \lim_{m \to \infty} E^{x_{0}} [\lim_{n \to \infty} \int_{0}^{\min {t_{n}, τ_{m}}} L (x_{1} (t), x_{2} (t), u (t)) d t] \\ = E^{x_{0}} [\lim_{m \to \infty} \int_{0}^{τ_{m}} L (x_{1} (t), x_{2} (t), u (t)) d t] = E^{x_{0}} [\int_{0}^{\infty} L (x_{1} (t), x_{2} (t), u (t)) d t] \\ = J (x_{10}, x_{20}, u (\cdot)) \end{matrix}

(51)

Finally, using the fact that

u (\cdot) \in S (x_{1} (0), x_{2} (0))

and V(⋅, ⋅) is continuous, it follows that for every m > 0,

V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))

is bounded for all

{t_{n}}_{n = 0}^{\infty}

⁠. Thus, using the dominated convergence theorem, we obtain

\begin{matrix} \lim_{m \to \infty} \lim_{n \to \infty} E^{x_{0}} [V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))] = E^{x_{0}} [\lim_{m \to \infty} \lim_{n \to \infty} V (x_{1} (\min {t_{n}, τ_{m}}), x_{2} (\min {t_{n}, τ_{m}}))] \end{matrix}

(52)

Now, taking the limit as n → ∞ and m → ∞ on both sides of Eq. (50) and using the fact that $u (\cdot) \in S (x_{1} (0), x_{2} (0))$, Eqs. (48), (51), (52), and $J (x_{10}, x_{20}, ϕ (x_{1} (\cdot), x_{2} (\cdot))) = V (x_{10}, x_{20})$ yield Eq. (44). ▪
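The optimality statement (44) can be illustrated on a case where every cost is available in closed form. The following is a minimal sketch, assuming a hypothetical scalar system dx = (a x + b u) dt + σ x dw with L = q x² + r u² and V = p x², which is not an example from the paper. For this system, conditions (40) and (41) reduce to the scalar Riccati equation 0 = q + (2a + σ²)p − (b²/r)p² with ϕ(x) = −(b p/r)x. Any linear feedback u = −k x has cost (q + r k²)x₀² / (2bk − 2a − σ²) whenever 2(a − bk) + σ² < 0. The helper names `riccati_gain` and `cost_of_gain` are illustrative.

```python
import math


def riccati_gain(a, b, sigma, q, r):
    """Positive root p of 0 = q + (2a + sigma^2) p - (b^2 / r) p^2,
    together with the gain k_star = b p / r of phi(x) = -k_star x."""
    c = 2.0 * a + sigma ** 2
    s = b ** 2 / r
    p = (c + math.sqrt(c ** 2 + 4.0 * s * q)) / (2.0 * s)
    return p, b * p / r


def cost_of_gain(a, b, sigma, q, r, k, x0):
    """Cost of u = -k x, using E[x(t)^2] = x0^2 exp((2(a - b k) + sigma^2) t)."""
    rate = 2.0 * (a - b * k) + sigma ** 2
    if rate >= 0.0:
        return math.inf  # second moment does not decay, so the cost is infinite
    return (q + r * k ** 2) * x0 ** 2 / (-rate)


# Hypothetical data chosen only for illustration (open-loop unstable plant).
a, b, sigma, q, r, x0 = 1.0, 1.0, 0.5, 1.0, 1.0, 2.0
p, k_star = riccati_gain(a, b, sigma, q, r)
V0 = p * x0 ** 2
costs = [cost_of_gain(a, b, sigma, q, r, k_star + 0.05 * j, x0) for j in range(-40, 41)]
print(min(costs) >= V0 - 1e-9)
```

The comparison covers only linear gains, which form a small subset of S(x₁(0), x₂(0)). It is a consistency check on Eq. (44), not a proof of it.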

Note that Eq. (40) is the steady-state, stochastic Hamilton–Jacobi–Bellman equation for the nonlinear controlled stochastic dynamical system (32) and (33) with performance criterion (36). Furthermore, conditions (40) and (41) guarantee optimality with respect to the set of admissible partially asymptotically stabilizing in probability controllers

S (x_{1} (0), x_{2} (0))

⁠. However, it is important to note that an explicit characterization of

S (x_{1} (0), x_{2} (0))

is not required. In addition, the stochastic optimal asymptotically stabilizing in probability with respect to x₁ uniformly in x₂₀ feedback control law

u = ϕ (x_{1}, x_{2})

is independent of the initial condition (x₁₀, x₂₀) and is given by

\begin{matrix} ϕ (x_{1}, x_{2}) = \underset{u \in S (x_{1} (0), x_{2} (0))}{arg min} [L (x_{1}, x_{2}, u) + V' (x_{1}, x_{2}) F (x_{1}, x_{2}, u) \\ + \frac{1}{2} tr D^{T} (x_{1}, x_{2}, u) V^{″} (x_{1}, x_{2}) D (x_{1}, x_{2}, u)] \end{matrix}

(53)

Remark 3.1. Setting n₁ = n and n₂ = 0, the nonlinear controlled stochastic dynamical system given by Eqs. (32) and (33) reduces to

d x (t) = F (x (t), u (t)) d t + D (x (t), u (t)) d w (t), x (0) = x_{0} a . s ., t \geq 0

(54)

In this case, Eq. (37) implies that V(⋅) is positive definite with respect to x, and the conditions of Theorem 3.2 reduce to the conditions given in Chap. 4 of Ref. [17] characterizing the classical stochastic optimal control problem for time-invariant systems on an infinite interval.

Next, we specialize the results of Theorem 3.2 to nonlinear affine in the control stochastic dynamical systems of the form

\begin{matrix} d x_{1} (t) = [f_{1} (x_{1} (t), x_{2} (t)) + G_{1} (x_{1} (t), x_{2} (t)) u (t)] d t + D_{1} (x_{1} (t), x_{2} (t)) d w (t), x_{1} (0) = x_{10} a . s ., t \geq 0 \end{matrix}

(55)

\begin{matrix} d x_{2} (t) = [f_{2} (x_{1} (t), x_{2} (t)) + G_{2} (x_{1} (t), x_{2} (t)) u (t)] d t + D_{2} (x_{1} (t), x_{2} (t)) d w (t), x_{2} (0) = x_{20} a . s . \end{matrix}

(56)

where, for every

t \geq 0, x_{1} (t) \in H_{n_{1}}

and

x_{2} (t) \in H_{n_{2}}, u (t) \in H_{m}

⁠, and

f_{1} : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ^{n_{1}}, f_{2} : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ^{n_{2}}, G_{1} : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ^{n_{1} \times m}, G_{2} : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ^{n_{2} \times m}, D_{1} : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ^{n_{1} \times d}

⁠, and

D_{2} : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ^{n_{2} \times d}

are such that f₁(0, x₂) = 0 and D₁(0, x₂) = 0 for all

x_{2} \in ℝ^{n_{2}}

⁠; and

F (x_{1}, x_{2}, u) ≜ {[{(f_{1} (x_{1}, x_{2}) + G_{1} (x_{1}, x_{2}) u)}^{T}, {(f_{2} (x_{1}, x_{2}) + G_{2} (x_{1}, x_{2}) u)}^{T}]}^{T}, D (x_{1}, x_{2}, u) ≜ {[D_{1}^{T} (x_{1}, x_{2}, u), D_{2}^{T} (x_{1}, x_{2}, u)]}^{T}

satisfy Eqs. (4) and (5) uniformly in u. Furthermore, we consider performance integrands L(x₁, x₂, u) of the form

L (x_{1}, x_{2}, u) = L_{1} (x_{1}, x_{2}) + L_{2} (x_{1}, x_{2}) u + u^{T} R_{2} (x_{1}, x_{2}) u, (x_{1}, x_{2}, u) \in ℝ^{n_{1}} \times ℝ^{n_{2}} \times ℝ^{m}

(57)

where

L_{1} : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ, L_{2} : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ^{1 \times m}

⁠, and

R_{2} (x_{1}, x_{2}) \geq N (x_{1}) > 0, (x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}

, so that Eq. (36) becomes

\begin{matrix} J (x_{10}, x_{20}, u (\cdot)) = E^{x_{0}} [\int_{0}^{\infty} [L_{1} (x_{1} (t), x_{2} (t)) + L_{2} (x_{1} (t), x_{2} (t)) u (t) + u^{T} (t) R_{2} (x_{1} (t), x_{2} (t)) u (t)] d t] \end{matrix}

(58)

For the statement of the next result, define

\begin{matrix} f (x_{1}, x_{2}) ≜ {[f_{1}^{T} (x_{1}, x_{2}), f_{2}^{T} (x_{1}, x_{2})]}^{T}, G (x_{1}, x_{2}) ≜ {[G_{1}^{T} (x_{1}, x_{2}), G_{2}^{T} (x_{1}, x_{2})]}^{T}, D (x_{1}, x_{2}) ≜ {[D_{1}^{T} (x_{1}, x_{2}), D_{2}^{T} (x_{1}, x_{2})]}^{T} \end{matrix}

Corollary 3.2. Consider the controlled nonlinear affine stochastic dynamical systems (55) and (56) with performance measure (58). Assume that there exist a two-times continuously differentiable function

V : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ

, class

K_{\infty}

functions α(⋅) and β(⋅), and a class

K

function γ(⋅) such that, for all

(x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}

α (| | x_{1} | |) \leq V (x_{1}, x_{2}) \leq β (| | x_{1} | |)

(59)

\begin{matrix} V' (x_{1}, x_{2}) [f (x_{1}, x_{2}) - \frac{1}{2} G (x_{1}, x_{2}) R_{2}^{- 1} (x_{1}, x_{2}) L_{2}^{T} (x_{1}, x_{2}) \\ - \frac{1}{2} G (x_{1}, x_{2}) R_{2}^{- 1} (x_{1}, x_{2}) G^{T} (x_{1}, x_{2}) V^{' T} (x_{1}, x_{2})] \\ + \frac{1}{2} tr D^{T} (x_{1}, x_{2}) V^{″} (x_{1}, x_{2}) D (x_{1}, x_{2}) \leq - γ (| | x_{1} | |) \end{matrix}

(60)

L_{2} (0, x_{2}) = 0

(61)

\begin{matrix} 0 = L_{1} (x_{1}, x_{2}) + V' (x_{1}, x_{2}) f (x_{1}, x_{2}) + \frac{1}{2} tr D^{T} (x_{1}, x_{2}) V^{″} (x_{1}, x_{2}) D (x_{1}, x_{2}) \\ - \frac{1}{4} [V' (x_{1}, x_{2}) G (x_{1}, x_{2}) + L_{2} (x_{1}, x_{2})] R_{2}^{- 1} (x_{1}, x_{2}) {[V^{'} (x_{1}, x_{2}) G (x_{1}, x_{2}) + L_{2} (x_{1}, x_{2})]}^{T} \end{matrix}

(62)

Then, with the feedback control

u = ϕ (x_{1}, x_{2}) = - \frac{1}{2} R_{2}^{- 1} (x_{1}, x_{2}) {[L_{2} (x_{1}, x_{2}) + V^{'} (x_{1}, x_{2}) G (x_{1}, x_{2})]}^{T}

(63)

the closed-loop system

\begin{matrix} d x_{1} (t) = [f_{1} (x_{1} (t), x_{2} (t)) + G_{1} (x_{1} (t), x_{2} (t)) ϕ (x_{1} (t), x_{2} (t))] d t + D_{1} (x_{1} (t), x_{2} (t)) d w (t), x_{1} (0) = x_{10} a . s ., t \geq 0 \end{matrix}

(64)

\begin{matrix} d x_{2} (t) = [f_{2} (x_{1} (t), x_{2} (t)) + G_{2} (x_{1} (t), x_{2} (t)) ϕ (x_{1} (t), x_{2} (t))] d t + D_{2} (x_{1} (t), x_{2} (t)) d w (t), x_{2} (0) = x_{20} a . s . \end{matrix}

(65)

is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀, and the performance measure (58) is minimized in the sense of Eq. (44). Finally, Eq. (43) holds.

Proof. The result is a consequence of Theorem 3.2 with $F (x_{1}, x_{2}, u) = f (x_{1}, x_{2}) + G (x_{1}, x_{2}) u$ and $L (x_{1}, x_{2}, u) = L_{1} (x_{1}, x_{2}) + L_{2} (x_{1}, x_{2}) u + u^{T} R_{2} (x_{1}, x_{2}) u$ ⁠. ▪
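The form of the feedback law (63) can be recovered by minimizing the Hamiltonian (42) pointwise in u. The following is a sketch of that computation, assuming that R₂(x₁, x₂) is symmetric, which the positive-definiteness condition on R₂ suggests but the statement does not spell out. The arguments (x₁, x₂) are suppressed, and the shorthand H₀ is introduced here only for illustration.

```latex
% Hamiltonian (42) for F = f + G u and L = L_1 + L_2 u + u^T R_2 u;
% H_0 collects the terms that do not depend on u.
H(x_1, x_2, u) = H_0 + (L_2 + V' G)\, u + u^{T} R_2 u, \qquad
H_0 \triangleq L_1 + V' f + \tfrac{1}{2}\, \mathrm{tr}\, D^{T} V'' D.

% Stationarity in u; since R_2 > 0, the stationary point is the unique minimizer.
\frac{\partial H}{\partial u} = (L_2 + V' G) + 2 u^{T} R_2 = 0
\;\Longrightarrow\;
u = \phi = -\tfrac{1}{2} R_2^{-1} (L_2 + V' G)^{T}.

% Value at the minimizer, which vanishes by Eq. (62), giving condition (40).
H(x_1, x_2, \phi) = H_0 - \tfrac{1}{4} (L_2 + V' G) R_2^{-1} (L_2 + V' G)^{T} = 0.

% Completing the square gives condition (41).
H(x_1, x_2, u) = H(x_1, x_2, \phi) + (u - \phi)^{T} R_2 (u - \phi)
               = (u - \phi)^{T} R_2 (u - \phi) \geq 0.
```

Substituting ϕ into Eq. (38) gives Eq. (60). For condition (39), note that Eq. (59) forces V(0, x₂) = 0 for every x₂ while V is non-negative, so V′(0, x₂) = 0, and Eq. (61) then gives ϕ(0, x₂) = 0.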

Finally, we use Theorem 3.2 to provide a unification between optimal partial-state stochastic stabilization and stochastic optimal control for nonlinear time-varying systems. Specifically, consider the nonlinear time-varying controlled stochastic dynamical system

d x (t) = F (t, x (t), u (t)) d t + D (t, x (t), u (t)) d w (t), x (t_{0}) = x_{0} a . s ., t \geq t_{0}

(66)

with performance measure

J (t_{0}, x_{0}, u (\cdot)) ≜ E^{x_{0}} [\int_{t_{0}}^{\infty} L (t, x (t), u (t)) d t]

(67)

where, for every

t \geq t_{0}, x (t) \in H_{n}, u (t) \in H_{m}, L : [t_{0}, \infty) \times ℝ^{n} \times ℝ^{m} \to ℝ, F : [t_{0}, \infty) \times ℝ^{n} \times ℝ^{m} \to ℝ^{n}

⁠, and

D : [t_{0}, \infty) \times ℝ^{n} \times ℝ^{m} \to ℝ^{n \times d}

are jointly continuous in t, x, and u, F(t, ⋅, u) and D(t, ⋅, u) are Lipschitz continuous in x for every

(t, u) \in [t_{0}, \infty) \times ℝ^{m}

⁠, and F(t, x, ⋅) and D(t, x, ⋅) are Lipschitz continuous in u for every

(t, x) \in [t_{0}, \infty) \times ℝ^{n}

⁠. For the statement of the next result, define the set of regulation controllers

\begin{matrix} S (t_{0}, x (t_{0})) ≜ \{u (\cdot) : u (\cdot) \text{ is admissible and } x (\cdot) \text{ given by Eq. (66) satisfies } ℙ^{x_{0}} (\lim_{t \to \infty} | | x (t) | | = 0) = 1\} \end{matrix}

Corollary 3.3. Consider the nonlinear time-varying controlled stochastic dynamical system (66) with performance measure (67), where u(⋅) is an admissible control. Assume that there exist a two-times continuously differentiable function

V : [t_{0}, \infty) \times ℝ^{n} \to ℝ

, class

K_{\infty}

functions α(⋅) and β(⋅), a class

K

function γ(⋅), and a control law

ϕ : [t_{0}, \infty) \times ℝ^{n} \to ℝ^{m}

such that, for all

(t, x) \in [t_{0}, \infty) \times ℝ^{n}

α (| | x | |) \leq V (t, x) \leq β (| | x | |)

(68)

\begin{matrix} \frac{\partial V (t, x)}{\partial t} + \frac{\partial V (t, x)}{\partial x} F (t, x, ϕ (t, x)) + \frac{1}{2} tr D^{T} (t, x, ϕ (t, x)) \cdot \frac{\partial^{2} V (t, x)}{\partial x^{2}} D (t, x, ϕ (t, x)) \leq - γ (| | x | |) \end{matrix}

(69)

ϕ (t, 0) = 0

(70)

\begin{matrix} L (t, x, ϕ (t, x)) + \frac{\partial V (t, x)}{\partial t} + \frac{\partial V (t, x)}{\partial x} F (t, x, ϕ (t, x)) + \frac{1}{2} tr D^{T} (t, x, ϕ (t, x)) \frac{\partial^{2} V (t, x)}{\partial x^{2}} D (t, x, ϕ (t, x)) = 0 \end{matrix}

(71)

\begin{matrix} L (t, x, u) + \frac{\partial V (t, x)}{\partial t} + \frac{\partial V (t, x)}{\partial x} F (t, x, u) + \frac{1}{2} tr D^{T} (t, x, u) \frac{\partial^{2} V (t, x)}{\partial x^{2}} D (t, x, u) \geq 0, (t, x, u) \in [t_{0}, \infty) \times ℝ^{n} \times ℝ^{m} \end{matrix}

(72)

Then, with the feedback control

u = ϕ (t, x)

, the closed-loop system given by Eq. (66) is globally uniformly asymptotically stable in probability and

J (t_{0}, x_{0}, ϕ (\cdot, \cdot)) = V (t_{0}, x_{0})

for all

(t_{0}, x_{0}) \in [0, \infty) \times ℝ^{n}

. In addition, if

(t_{0}, x_{0}) \in [0, \infty) \times ℝ^{n}

, then the feedback control

u (\cdot) = ϕ (\cdot, x (\cdot))

minimizes J(t₀, x₀, u(⋅)) in the sense that

J (t_{0}, x_{0}, ϕ (\cdot, \cdot)) = \min_{u (\cdot) \in S (t_{0}, x (t_{0}))} J (t_{0}, x_{0}, u (\cdot))

(73)

Proof. The proof is a direct consequence of Theorem 3.2 with $n_{1} = n, n_{2} = 1, x_{1} (t - t_{0}) = x (t), x_{2} (t - t_{0}) = t, F_{1} (x_{1}, x_{2}, u) = F_{1} (x_{2}, x_{1}, u) = F (t, x, u), F_{2} (x_{1}, x_{2}, u) = 1, D_{1} (x_{1}, x_{2}, u) = D_{1} (x_{2}, x_{1}, u) = D (t, x, u), D_{2} (x_{1}, x_{2}, u) = 0, ϕ (x_{1}, x_{2}) = ϕ (x_{2}, x_{1}) = ϕ (t, x)$ ⁠, and $V (x_{1}, x_{2}) = V (x_{2}, x_{1}) = V (t, x)$ ⁠. ▪

Note that Eqs. (71) and (72) give the stochastic Hamilton–Jacobi–Bellman equation

\begin{matrix} - \frac{\partial V (t, x)}{\partial t} = \min_{u \in S (t_{0}, x (t_{0}))} [L (t, x, u) + \frac{\partial V (t, x)}{\partial x} F (t, x, u) \\ + \frac{1}{2} tr D^{T} (t, x, u) \frac{\partial^{2} V (t, x)}{\partial x^{2}} D (t, x, u)], (t, x) \in [t_{0}, \infty) \times ℝ^{n} \end{matrix}

(74)

which characterizes the optimal control

\begin{matrix} ϕ (t, x) = \underset{u \in S (t_{0}, x (t_{0}))}{arg min} [L (t, x, u) + \frac{\partial V (t, x)}{\partial x} F (t, x, u) \\ + \frac{1}{2} tr D^{T} (t, x, u) \frac{\partial^{2} V (t, x)}{\partial x^{2}} D (t, x, u)] \end{matrix}

(75)

for time-varying stochastic systems on a finite or infinite interval.

Inverse Optimal Stochastic Control

In this section, we construct state feedback controllers for nonlinear affine in the control stochastic dynamical systems that are predicated on an inverse optimal control problem [7–13]. In particular, as noted in the Introduction, to avoid the complexity in solving the steady-state, stochastic Hamilton–Jacobi–Bellman equation (62), we do not attempt to minimize a given cost functional, but rather, we parameterize a family of stabilizing controllers that minimize some derived cost functional that provides flexibility in specifying the control law. The performance integrand is shown to explicitly depend on the nonlinear system dynamics, the Lyapunov function of the closed-loop system, and the stabilizing feedback control law, wherein the coupling is introduced via the stochastic Hamilton–Jacobi–Bellman equation. Hence, by varying the parameters in the Lyapunov function and the performance integrand, the proposed framework can be used to characterize a class of globally partial-state stabilizing (in probability) controllers that can meet closed-loop system response constraints.

Theorem 4.1. Consider the nonlinear controlled affine stochastic dynamical systems (55) and (56) with performance measure (58). Assume there exist a two-times continuously differentiable function

V : ℝ^{n_{1}} \times ℝ^{n_{2}} \to ℝ

, class

K_{\infty}

functions α(⋅) and β(⋅), and a class

K

function γ(⋅) such that, for all

(x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}

α (| | x_{1} | |) \leq V (x_{1}, x_{2}) \leq β (| | x_{1} | |)

(76)

\begin{matrix} V' (x_{1}, x_{2}) [f (x_{1}, x_{2}) - \frac{1}{2} G (x_{1}, x_{2}) R_{2}^{- 1} (x_{1}, x_{2}) L_{2}^{T} (x_{1}, x_{2}) - \frac{1}{2} G (x_{1}, x_{2}) R_{2}^{- 1} (x_{1}, x_{2}) \\ \cdot G^{T} (x_{1}, x_{2}) V^{' T} (x_{1}, x_{2})] + \frac{1}{2} tr D^{T} (x_{1}, x_{2}) V^{″} (x_{1}, x_{2}) D (x_{1}, x_{2}) \leq - γ (| | x_{1} | |) \end{matrix}

(77)

L_{2} (0, x_{2}) = 0

(78)

Then, with the feedback control

u = ϕ (x_{1}, x_{2}) = - \frac{1}{2} R_{2}^{- 1} (x_{1}, x_{2}) {[L_{2} (x_{1}, x_{2}) + V' (x_{1}, x_{2}) G (x_{1}, x_{2})]}^{T}

(79)

the closed-loop system given by Eqs. (64) and (65) is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀ and the performance functional (58) with

\begin{matrix} L_{1} (x_{1}, x_{2}) = ϕ^{T} (x_{1}, x_{2}) R_{2} (x_{1}, x_{2}) ϕ (x_{1}, x_{2}) - V' (x_{1}, x_{2}) f (x_{1}, x_{2}) - \frac{1}{2} tr D^{T} (x_{1}, x_{2}) V^{″} (x_{1}, x_{2}) D (x_{1}, x_{2}) \end{matrix}

(80)

is minimized in the sense of Eq. (44). Finally, Eq. (43) holds.

Proof. The proof is identical to the proof of Corollary 3.2. ▪
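The inverse optimal construction can be traced through a small example: choose V and R₂ first, read off the control law from Eq. (79), and then derive the cost term L₁ from Eq. (80). The following is a minimal sketch, assuming a hypothetical scalar system that is not taken from the paper: f₁ = a x₁, G₁ = b, D₁ = σ x₁, with a clock-like second state (f₂ = 1, G₂ = 0, D₂ = 0), V = p x₁², L₂ = 0, and constant R₂ = r₂. The function names and numerical values are illustrative.

```python
def inverse_optimal_design(a, b, sigma, p, r2):
    """Scalar instance of Theorem 4.1 with V = p x1^2, L2 = 0, R2 = r2.
    Returns the gain of phi = -gain * x1 (Eq. (79)), the coefficient c1 of
    L1 = c1 * x1^2 (Eq. (80)), and the left side of Eq. (77) divided by x1^2."""
    gain = p * b / r2                                    # -(1/2) r2^{-1} (2 p x1 b) = -gain * x1
    c1 = r2 * gain ** 2 - 2.0 * p * a - sigma ** 2 * p   # phi R2 phi - V' f - (1/2) tr D^T V'' D
    decay = 2.0 * p * (a - b * gain) + sigma ** 2 * p    # must be negative for Eq. (77)
    return gain, c1, decay


def hamiltonian(a, b, sigma, p, r2, c1, x1, u):
    """H = L1 + u R2 u + V'(f + G u) + (1/2) tr D^T V'' D for the scalar example."""
    running_cost = c1 * x1 ** 2 + r2 * u ** 2
    generator = 2.0 * p * x1 * (a * x1 + b * u) + sigma ** 2 * p * x1 ** 2
    return running_cost + generator


# Hypothetical data chosen only for illustration (open-loop unstable plant).
a, b, sigma, p, r2 = 1.0, 1.0, 0.5, 3.0, 1.0
gain, c1, decay = inverse_optimal_design(a, b, sigma, p, r2)
grid = [0.25 * j for j in range(-8, 9)]
on_law = max(abs(hamiltonian(a, b, sigma, p, r2, c1, x, -gain * x)) for x in grid)
off_law = min(hamiltonian(a, b, sigma, p, r2, c1, x, u) for x in grid for u in grid)
print(decay < 0.0, on_law < 1e-12, off_law >= -1e-12)
```

In this example, p acts as the design parameter. Changing it changes both the feedback gain and the derived cost coefficient c₁, which is the flexibility described above. The value p = 3 is chosen so that c₁ is positive, which keeps the derived running cost non-negative.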

Next, we specialize Theorem 4.1 to linear time-varying stochastic systems controlled by nonlinear controllers that minimize a polynomial cost functional generalizing the results of Refs. [1] and [3] to the stochastic setting. Specifically, consider the linear time-varying stochastic dynamical system

d x (t) = [A (t) x (t) + B (t) u (t)] d t + x (t) σ^{T} (t) d w (t), x (t_{0}) = x_{0} a . s ., t \geq t_{0}

(81)

where, for all

t \geq t_{0}, x (t) \in H_{n}, u (t) \in H_{m}

⁠, and

σ : [t_{0}, \infty) \to ℝ^{d}, A : [t_{0}, \infty) \to ℝ^{n \times n}

⁠, and

B : [t_{0}, \infty) \to ℝ^{n \times m}

are continuous and uniformly bounded. For the following result, let

R_{1} : [t_{0}, \infty) \to ℝ^{n \times n}, R_{2} : [t_{0}, \infty) \to ℝ^{m \times m}

⁠, and

{\hat{R}}_{q} : [t_{0}, \infty) \to ℝ^{n \times n}, q = 2, \dots, r

⁠, where r is a positive integer, be continuous, uniformly bounded, and positive-definite matrices, that is, there exist γ, μ,

{\hat{μ}}_{q} > 0, q = 2, \dots, r

⁠, such that

R_{1} (t) \geq γ I_{n} > 0, R_{2} (t) \geq μ I_{m} > 0

⁠, and

{\hat{R}}_{q} (t) \geq {\hat{μ}}_{q} I_{n} > 0

, for all t ≥ t₀. Furthermore, we consider performance integrands in Eq. (67) of the form

L (t, x, u) = L_{1} (t, x) + L_{2} (t, x) u + u^{T} R_{2} (t, x) u, (t, x, u) \in [t_{0}, \infty) \times ℝ^{n} \times ℝ^{m}

(82)

where

L_{1} : [t_{0}, \infty) \times ℝ^{n} \to ℝ, L_{2} : [t_{0}, \infty) \times ℝ^{n} \to ℝ^{1 \times m}

⁠, and

R_{2} (t, x) \geq N (x) > 0, (t, x) \in [t_{0}, \infty) \times ℝ^{n}

, so that Eq. (67) becomes

J (t_{0}, x_{0}, u (\cdot)) = E^{x_{0}} [\int_{t_{0}}^{\infty} [L_{1} (t, x (t)) + L_{2} (t, x (t)) u (t) + u^{T} (t) R_{2} (t, x (t)) u (t)] d t]

(83)

Corollary 4.1. Consider the linear controlled time-varying stochastic dynamical system (81), where u(⋅) is admissible. Assume that there exist a uniformly bounded, continuously differentiable, positive definite

P : [t_{0}, \infty) \to ℝ^{n \times n}

and continuously differentiable, uniformly bounded, non-negative definite

M_{q} : [t_{0}, \infty) \to ℝ^{n \times n}, q = 2, \dots, r

, such that

\begin{matrix} - \dot{P} (t) = {(A (t) + \frac{1}{2} {‖ σ (t) ‖}^{2} I_{n})}^{T} P (t) + P (t) (A (t) + \frac{1}{2} {‖ σ (t) ‖}^{2} I_{n}) + R_{1} (t) - P (t) S (t) P (t), \lim_{t_{f} \to \infty} P (t_{f}) = \bar{P}, t \in [t_{0}, \infty) \end{matrix}

(84)

and

\begin{matrix} - {\dot{M}}_{q} (t) = {(A (t) + \frac{1}{2} (2 q - 1) {‖ σ (t) ‖}^{2} I_{n} - S (t) P (t))}^{T} M_{q} (t) + M_{q} (t) (A (t) + \frac{1}{2} (2 q - 1) {‖ σ (t) ‖}^{2} I_{n} - S (t) P (t)) + {\hat{R}}_{q} (t), \\ \lim_{t_{f} \to \infty} M_{q} (t_{f}) = {\bar{M}}_{q}, q = 2, \dots, r, t \in [t_{0}, \infty) \end{matrix}

(85)

where

S (t) ≜ B (t) R_{2}^{- 1} (t) B^{T} (t)

⁠, and

\bar{P}

and

{\bar{M}}_{q}

satisfy Eqs.and, respectively. Then, the zero solution

x (t) \equiv 0

of the closed-loop system

d x (t) = [A (t) x (t) + B (t) ϕ (t, x)] d t + x (t) σ^{T} (t) d w (t), x (t_{0}) = x_{0} a . s ., t \geq t_{0}

(86)

is globally uniformly asymptotically stable in probability with feedback control

u = ϕ (t, x) = - R_{2}^{- 1} (t) B^{T} (t) (P (t) + \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t)) x

(87)

and the performance functional (83) with

R_{2} (t, x) = R_{2} (t), L_{2} (t, x) = 0

, and

\begin{matrix} L_{1} (t, x) = x^{T} (R_{1} (t) + \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} {\hat{R}}_{q} (t) + [\sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t)]^{T} S (t) [\sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t)]) x \end{matrix}

(88)

is minimized in the sense of Eq. (73). Finally,

J (t_{0}, x_{0}, ϕ (\cdot, \cdot)) = x_{0}^{T} P (t_{0}) x_{0} + \sum_{q = 2}^{r} \frac{1}{q} {(x_{0}^{T} M_{q} (t_{0}) x_{0})}^{q}, (t_{0}, x_{0}) \in [0, \infty) \times ℝ^{n}

(89)
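The structure of Corollary 4.1 can be checked on a time-invariant scalar case with r = 2, where Eqs. (84) and (85) with zero time derivatives become algebraic. The following is a minimal sketch, assuming hypothetical constant data a, b, σ, r₁, r₂, and r̂₂ that are not taken from the paper. It computes p and m, forms the control law (87), the cost term (88), and V = p x² + ½(m x²)², and evaluates the residual L₁ + ϕᵀR₂ϕ + ℒV, which should vanish. The function names are illustrative.

```python
import math


def polynomial_design(a, b, sigma, r1, r2, r_hat):
    """Constant scalar solutions p and m (q = 2) of Eqs. (84) and (85)
    with the time derivatives set to zero."""
    s = b ** 2 / r2                               # S = B R2^{-1} B^T
    a1 = a + 0.5 * sigma ** 2
    p = (a1 + math.sqrt(a1 ** 2 + s * r1)) / s    # positive root of 0 = 2 a1 p + r1 - s p^2
    a2 = a + 1.5 * sigma ** 2 - s * p             # (2q - 1)/2 = 3/2 for q = 2
    if a2 >= 0.0:
        raise ValueError("no non-negative solution m for these data")
    m = -r_hat / (2.0 * a2)                       # 0 = 2 a2 m + r_hat
    return s, p, m


def hjb_residual(a, b, sigma, r1, r2, r_hat, s, p, m, x):
    """L1 + phi R2 phi + generator of V at x, with V = p x^2 + (1/2)(m x^2)^2."""
    n = m ** 2 * x ** 2                                   # (x M x)^{q-1} M for q = 2
    phi = -(b / r2) * (p + n) * x                         # control law (87)
    l1 = x ** 2 * (r1 + m * x ** 2 * r_hat + s * n ** 2)  # cost term (88)
    v_x = 2.0 * p * x + 2.0 * m ** 2 * x ** 3
    v_xx = 2.0 * p + 6.0 * m ** 2 * x ** 2
    generator = v_x * (a * x + b * phi) + 0.5 * sigma ** 2 * x ** 2 * v_xx
    return l1 + r2 * phi ** 2 + generator


# Hypothetical data chosen only for illustration.
a, b, sigma, r1, r2, r_hat = 0.5, 1.0, 0.3, 1.0, 1.0, 1.0
s, p, m = polynomial_design(a, b, sigma, r1, r2, r_hat)
residuals = [abs(hjb_residual(a, b, sigma, r1, r2, r_hat, s, p, m, 0.1 * j)) for j in range(-20, 21)]
x0 = 1.2
J_opt = p * x0 ** 2 + 0.5 * (m * x0 ** 2) ** 2            # Eq. (89) with r = 2
print(max(residuals) < 1e-9, J_opt)
```

The guard on `a2` reflects the requirement that M_q be non-negative definite. For data that violate it, this scalar sketch has no admissible constant solution.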

Proof. The result is a consequence of Theorem 4.1 with

n_{1} = n, n_{2} = 1, x_{1} (t - t_{0}) = x (t), x_{2} (t - t_{0}) = t, f_{1} (x_{1}, x_{2}) = f_{1} (x_{2}, x_{1}) = A (t) x, f_{2} (x_{1}, x_{2}) = 1, G_{1} (x_{1}, x_{2}) = G_{1} (x_{2}, x_{1}) = B (t)

⁠,

G_{2} (x_{1}, x_{2}) = 0, D_{1} (x_{1}, x_{2}) = D_{1} (x_{2}, x_{1}) = x σ^{T} (t), D_{2} (x_{1}, x_{2}) = 0, L_{1} (x_{1}, x_{2}) = L_{1} (x_{2}, x_{1}) = L_{1} (t, x)

, where L₁(t, x) is given by Eq. (88),

L_{2} (x_{1}, x_{2}) = 0, R_{2} (x_{1}, x_{2}) = R_{2} (x_{2}, x_{1}) = R_{2} (t), V (x_{1}, x_{2}) = V (x_{2}, x_{1}) = x^{T} P (t) x + \sum_{q = 2}^{r} (1 / q) {(x^{T} M_{q} (t) x)}^{q}, α (| | x_{1} | |) = α | | x | |^{2}

⁠,

β (| | x_{1} | |) = β | | x | |^{2} + \sum_{q = 2}^{r} (1 / q) {\hat{β}}_{q}^{q} | | x | |^{2 q}

⁠, and

γ (| | x_{1} | |) = - γ | | x | |^{2} - \sum_{q = 2}^{r} {\hat{σ}}_{q} {\hat{β}}_{q}^{q - 1} | | x | |^{2 q}

⁠, for some α, β, γ,

{\hat{β}}_{q}

⁠, and

{\hat{σ}}_{q} > 0, q = 2, \dots, r

⁠. Specifically, since P(⋅) and M_q(⋅) are uniformly bounded and, respectively, positive and non-negative definite, there exist constants α, β, and

{\hat{β}}_{q} > 0, q = 2, \dots, r

⁠, such that

α I_{n} \leq P (t) \leq β I_{n}

and

0 \leq M_{q} (t) \leq {\hat{β}}_{q} I_{n}, t \geq t_{0}

⁠, and hence

α | | x | |^{2} \leq V (t, x) \leq β | | x | |^{2} + \sum_{q = 2}^{r} \frac{1}{q} {\hat{β}}_{q}^{q} | | x | |^{2 q}, (t, x) \in [t_{0}, \infty) \times ℝ^{n}

(90)

which verifies Eq. (76).

Next, Eq. is a restatement of Eq. . Now, let

ϕ (t, x) = ϕ_{1} (t, x) + ϕ_{2} (t, x)

⁠, where

ϕ_{1} (t, x) ≜ - R_{2}^{- 1} (t) B^{T} (t) P (t) x

(91)

ϕ_{2} (t, x) ≜ - R_{2}^{- 1} (t) B^{T} (t) \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t) x

(92)

Computing the infinitesimal generator

L V (t, x)

along the trajectories of the closed-loop system gives

\begin{matrix} L V (t, x) = x^{T} (\dot{P} (t) + P (t) A (t) + A^{T} (t) P (t)) x + 2 x^{T} P (t) B (t) ϕ (t, x) + {‖ σ (t) ‖}^{2} x^{T} P (t) x \\ + \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} [x^{T} ({\dot{M}}_{q} (t) + M_{q} (t) A (t) + A^{T} (t) M_{q} (t)) x \\ + 2 x^{T} M_{q} (t) B (t) ϕ (t, x) + (2 q - 1) {‖ σ (t) ‖}^{2} x^{T} M_{q} (t) x] \\ = x^{T} (\dot{P} (t) + P (t) (A (t) + \frac{1}{2} {‖ σ (t) ‖}^{2} I_{n}) + {(A (t) + \frac{1}{2} {‖ σ (t) ‖}^{2} I_{n})}^{T} P (t) \\ - P (t) S (t) P (t)) x - x^{T} P (t) S (t) P (t) x + 2 x^{T} P (t) B (t) ϕ_{2} (t, x) \\ + \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} [x^{T} ({\dot{M}}_{q} (t) + M_{q} (t) (A (t) + \frac{1}{2} (2 q - 1) {‖ σ (t) ‖}^{2} I_{n} - S (t) P (t)) \\ + {(A (t) + \frac{1}{2} (2 q - 1) {‖ σ (t) ‖}^{2} I_{n} - S (t) P (t))}^{T} M_{q} (t)) x + 2 x^{T} M_{q} (t) B (t) ϕ_{2} (t, x)], \\ (t, x) \in [t_{0}, \infty) \times ℝ^{n} \end{matrix}

(93)

Now, using Eqs. (84) and (85), Eq. (93) yields

\begin{matrix} L V (t, x) = - x^{T} (R_{1} (t) + \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} {\hat{R}}_{q} (t)) x - x^{T} P (t) S (t) P (t) x - 2 x^{T} [\sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t)]^{T} S (t) [\sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t)] x \\ - 2 x^{T} P (t) S (t) \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t) x \leq - x^{T} R_{1} (t) x - x^{T} \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} {\hat{R}}_{q} (t) x \\ \leq - γ | | x | |^{2} - \sum_{q = 2}^{r} {({\hat{β}}_{q} | | x | |^{2})}^{q - 1} {\hat{σ}}_{q} | | x | |^{2} \leq - γ | | x | |^{2} - \sum_{q = 2}^{r} {\hat{σ}}_{q} {\hat{β}}_{q}^{q - 1} | | x | |^{2 q}, (t, x) \in [t_{0}, \infty) \times ℝ^{n} \end{matrix}

(94)

and hence, Eq. (77) holds.

Finally, note that

\begin{matrix} ϕ^{T} (t, x) R_{2} (t) ϕ (t, x) = x^{T} P (t) S (t) P (t) x + 2 x^{T} P (t) S (t) \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t) x \\ + x^{T} [\sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t)]^{T} S (t) [\sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t)] x \end{matrix}

(95)

which, using the first equality in Eq. (94), implies

\begin{matrix} L V (t, x) = - x^{T} R_{1} (t) x - x^{T} \sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} {\hat{R}}_{q} (t) x - ϕ^{T} (t, x) R_{2} (t) ϕ (t, x) - x^{T} [\sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t)]^{T} S (t) [\sum_{q = 2}^{r} (x^{T} M_{q} (t) x)^{q - 1} M_{q} (t)] x \\ = - L_{1} (t, x) - ϕ^{T} (t, x) R_{2} (t) ϕ (t, x) \end{matrix}

(96)

where L₁(t, x) is given by Eq. (88), and thus, Eq. (80) is verified. The result now follows as a direct consequence of Theorem 4.1. ▪
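As a sanity check of the identity in Eq. (96), the following minimal sketch specializes Corollary 4.1 to a scalar, time-invariant system with r = 2. The numerical data (`a`, `b`, `sigma`, `R1`, `R2`, `R2_hat`) are hypothetical and not taken from the paper, and the constants `P` and `M` are assumed to solve the steady-state forms of Eqs. (84) and (85), that is, with the time derivatives set to zero.

```python
import math

# Hypothetical scalar, time-invariant data (n = m = 1, r = 2); not from the paper.
a, b, sigma = 0.5, 1.0, 0.4      # dx = (a x + b u) dt + sigma x dw
R1, R2, R2_hat = 1.0, 1.0, 1.0   # R2_hat plays the role of \hat{R}_2 in Eq. (85)
S = b * b / R2                   # S = B R_2^{-1} B^T

# Steady-state form of Eq. (84): 0 = 2 (a + sigma^2 / 2) P + R1 - S P^2 (positive root).
a_s = a + 0.5 * sigma**2
P = (a_s + math.sqrt(a_s**2 + S * R1)) / S

# Steady-state form of Eq. (85) with q = 2: 0 = 2 (a + 3 sigma^2 / 2 - S P) M + R2_hat.
a_cl = a + 1.5 * sigma**2 - S * P
M = -R2_hat / (2.0 * a_cl)       # nonnegative here because a_cl < 0 for this data


def phi(x):
    """Feedback law (87) for r = 2 in the scalar case."""
    return -(b / R2) * (P + (M * x * x) * M) * x


def generator(x):
    """L V for V(x) = P x^2 + (1/2) (M x^2)^2 along dx = (a x + b phi) dt + sigma x dw."""
    dV = 2.0 * P * x + 2.0 * M**2 * x**3
    d2V = 2.0 * P + 6.0 * M**2 * x**2
    return dV * (a * x + b * phi(x)) + 0.5 * sigma**2 * x**2 * d2V


def L1(x):
    """Cost integrand (88) for r = 2 in the scalar case."""
    w = (M * x * x) * M
    return x * x * (R1 + (M * x * x) * R2_hat + w * S * w)


def residual(x):
    """Eq. (96) rearranged: L V + L1 + phi R2 phi, which should vanish."""
    return generator(x) + L1(x) + phi(x) * R2 * phi(x)


def cost(x0):
    """Optimal cost (89) for r = 2 in the scalar case."""
    return P * x0 * x0 + 0.5 * (M * x0 * x0) ** 2
```

Evaluating `residual(x)` at a few points returns values at the level of floating-point round-off, which is the scalar counterpart of Eq. (96).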

Finally, we specialize Theorem 4.1 to linear time-varying stochastic systems controlled by nonlinear controllers that minimize a multilinear cost functional. For the following result, define $x^{[k]} ≜ x \otimes x \otimes \dots \otimes x$ and $\overset{k}{\oplus} A ≜ A \oplus A \oplus \dots \oplus A$, with x and A appearing k times, where k is a positive integer and ⊗ and ⊕ denote the Kronecker product and the Kronecker sum, respectively. Furthermore, define $N^{(k, n)} ≜ \{Ψ \in ℝ^{1 \times n^{k}} : Ψ x^{[k]} \geq 0, x \in ℝ^{n}\}$ and let ${\hat{P}}_{q} : [t_{0}, \infty) \to ℝ^{1 \times n^{2 q}}, {\hat{R}}_{2 q} : [t_{0}, \infty) \to ℝ^{1 \times n^{2 q}}, q = 2, \dots, r$, where r is a positive integer, and $R_{2} : [t_{0}, \infty) \to ℝ^{m \times m}$ be continuous and uniformly bounded, with ${\hat{R}}_{2 q} (t), {\hat{P}}_{q} (t) \in N^{(2 q, n)}$ and $R_{2} (t) \geq μ I_{m} > 0$, for some μ > 0 and for all t ≥ t₀.
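To make the Kronecker notation concrete, the following minimal sketch builds the Kronecker power $x^{[k]}$ and the k-fold Kronecker sum of a matrix in plain Python and checks the identity $(\overset{k}{\oplus} A) x^{[k]} = \sum_{i = 1}^{k} x \otimes \dots \otimes A x \otimes \dots \otimes x$ (with A x in the i-th slot), which is the step that turns derivative terms into Kronecker sums. It assumes that ⊗ and ⊕ are the standard Kronecker product and Kronecker sum; the helper names and the test data `A` and `x` are illustrative only.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def kron_power(x, k):
    """x^{[k]}: Kronecker product of k copies of the column vector x."""
    out = x
    for _ in range(k - 1):
        out = kron(out, x)
    return out


def kron_sum(A, k):
    """k-fold Kronecker sum, built from C (+) A = C (x) I + I (x) A."""
    n = len(A)
    out = A
    for _ in range(k - 1):
        out = add(kron(out, identity(n)), kron(identity(len(out)), A))
    return out


def slotwise_sum(A, x, k):
    """Sum over i of x (x) ... (x) (A x) (x) ... (x) x, with A x in the i-th slot."""
    Ax = matmul(A, x)
    total = None
    for i in range(k):
        term = Ax if i == 0 else x
        for j in range(1, k):
            term = kron(term, Ax if j == i else x)
        total = term if total is None else add(total, term)
    return total


# Illustrative integer data, so the comparison below is exact.
A = [[1, 2], [3, 4]]
x = [[1], [2]]
lhs = matmul(kron_sum(A, 3), kron_power(x, 3))
rhs = slotwise_sum(A, x, 3)
```

Here `lhs` and `rhs` agree entry by entry. In the special case A = c I_n the k-fold Kronecker sum is k c times the identity, which is how a scalar multiple of the identity in each slot collects into a single factor.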

Corollary 4.2. Consider the linear controlled time-varying stochastic dynamical system (81), where u(⋅) is admissible. Assume that there exist a continuously differentiable, uniformly bounded, positive definite

P : [t_{0}, \infty) \to ℝ^{n \times n}

and continuously differentiable, uniformly bounded

{\hat{P}}_{q} : [t_{0}, \infty) \to ℝ^{1 \times n^{2 q}}, q = 2, \dots, r

, such that

{\hat{P}}_{q} (t) \in N^{(2 q, n)}

\begin{matrix} - \dot{P} (t) = {(A (t) + \frac{1}{2} {‖ σ (t) ‖}^{2} I_{n})}^{T} P (t) + P (t) (A (t) + \frac{1}{2} {‖ σ (t) ‖}^{2} I_{n}) + R_{1} (t) - P (t) S (t) P (t), \lim_{t_{f} \to \infty} P (t_{f}) = \bar{P}, t \in [t_{0}, \infty) \end{matrix}

(97)

and

\begin{matrix} - {\dot{\hat{P}}}_{q} (t) = {\hat{P}}_{q} (t) [\overset{2 q}{\oplus} (A (t) + \frac{1}{2} (2 q - 1) {‖ σ (t) ‖}^{2} I_{n} - S (t) P (t))] + {\hat{R}}_{2 q} (t), \lim_{t_{f} \to \infty} {\hat{P}}_{q} (t_{f}) = {\bar{\hat{P}}}_{q}, q = 2, \dots, r, t \in [t_{0}, \infty) \end{matrix}

(98)

where

S (t) ≜ B (t) R_{2}^{- 1} (t) B^{T} (t)

, and

\bar{P}

and

{\bar{\hat{P}}}_{q}

satisfy Eqs.and, respectively. Then, the zero solution

x (t) \equiv 0

of the closed-loop system (86) is globally uniformly asymptotically stable in probability with the feedback control law

ϕ (t, x) = - R_{2}^{- 1} (t) B^{T} (t) (P (t) x + \frac{1}{2} g^{' T} (t, x))

(99)

where

g (t, x) ≜ \sum_{q = 2}^{r} {\hat{P}}_{q} (t) x^{[2 q]}

, and the performance functional (83) with

R_{2} (t, x) = R_{2} (t), L_{2} (t, x) = 0

, and

L_{1} (t, x) = x^{T} R_{1} (t) x + \sum_{q = 2}^{r} {\hat{R}}_{2 q} (t) x^{[2 q]} + \frac{1}{4} g' (t, x) S (t) g^{' T} (t, x)

(100)

is minimized in the sense of Eq.. Finally

J (t_{0}, x_{0}, ϕ (\cdot, \cdot)) = x_{0}^{T} P (t_{0}) x_{0} + \sum_{q = 2}^{r} {\hat{P}}_{q} (t_{0}) x_{0}^{[2 q]}, (t_{0}, x_{0}) \in [0, \infty) \times ℝ^{n}

(101)

Proof. The result is a consequence of Theorem 4.1 with

n_{1} = n, n_{2} = 1, x_{1} (t - t_{0}) = x (t), x_{2} (t - t_{0}) = t, f_{1} (x_{1}, x_{2}) = f_{1} (x_{2}, x_{1}) = A (t) x, f_{2} (x_{1}, x_{2}) = 1, G_{1} (x_{1}, x_{2}) = G_{1} (x_{2}, x_{1}) = B (t), G_{2} (x_{1}, x_{2}) = 0, D_{1} (x_{1}, x_{2}) = D_{1} (x_{2}, x_{1}) = x σ^{T} (t), D_{2} (x_{1}, x_{2}) = 0, L_{1} (x_{1}, x_{2}) = L_{1} (x_{2}, x_{1}) = L_{1} (t, x)

, where L₁(t, x) is given by Eq. (100),

L_{2} (x_{1}, x_{2}) = 0, R_{2} (x_{1}, x_{2}) = R_{2} (x_{2}, x_{1}) = R_{2} (t), V (x_{1}, x_{2}) = V (x_{2}, x_{1}) = x^{T} P (t) x + \sum_{q = 2}^{r} {\hat{P}}_{q} (t) x^{[2 q]}, α (| | x_{1} | |) = α | | x | |^{2}

⁠,

β (| | x_{1} | |) = β | | x | |^{2}

⁠, and

γ (| | x_{1} | |) = - γ | | x | |^{2}

⁠, for some α, β, γ > 0. Specifically, since P(⋅) is uniformly bounded and positive definite, there exist constants α, β > 0 such that

α I_{n} \leq P (t) \leq β I_{n}

⁠. In addition, since

{\hat{P}}_{q} (t) \in N^{(2 q, n)}, q = 2, \dots, r

⁠, for all t ≥ t₀, it follows that

α | | x | |^{2} \leq V (t, x) \leq β | | x | |^{2}, (t, x) \in [t_{0}, \infty) \times ℝ^{n}

(102)

which verifies Eq. (76).

Computing the infinitesimal generator

L V (t, x)

along the trajectories of the closed-loop system gives

\begin{matrix} L V (t, x) = x^{T} (\dot{P} (t) + P (t) A (t) + A^{T} (t) P (t)) x + 2 x^{T} P (t) B (t) ϕ (t, x) + \frac{1}{2} tr {(x σ^{T} (t))}^{T} 2 P (t) x σ^{T} (t) + \sum_{q = 2}^{r} {\dot{\hat{P}}}_{q} (t) x^{[2 q]} + g^{'} (t, x) (A (t) x + B (t) ϕ (t, x)) \\ + \frac{1}{2} tr {(x σ^{T} (t))}^{T} g^{″} (t, x) x σ^{T} (t) \\ = x^{T} (\dot{P} (t) + P (t) (A (t) + \frac{1}{2} {‖ σ (t) ‖}^{2} I_{n}) + {(A (t) + \frac{1}{2} {‖ σ (t) ‖}^{2} I_{n})}^{T} P (t) \\ - P (t) S (t) P (t)) x - x^{T} P (t) S (t) P (t) x - x^{T} P (t) S (t) {g^{'}}^{T} (t, x) \\ + \sum_{q = 2}^{r} {\dot{\hat{P}}}_{q} (t) x^{[2 q]} + g^{'} (t, x) [(A (t) - S (t) P (t)) x - \frac{1}{2} S (t) {g^{'}}^{T} (t, x)] + \frac{1}{2} tr {(x σ^{T} (t))}^{T} g^{″} (t, x) x σ^{T} (t) \end{matrix}

(103)

for all

(t, x) \in [t_{0}, \infty) \times ℝ^{n}

⁠. Next, noting that

\begin{matrix} g^{'} (t, x) (A (t) - S (t) P (t)) x + \frac{1}{2} tr {(x σ^{T} (t))}^{T} g^{″} (t, x) x σ^{T} (t) \\ = \frac{\partial}{\partial x} [\sum_{q = 2}^{r} {\hat{P}}_{q} (t) x^{[2 q]}] (A (t) - S (t) P (t)) x + \frac{1}{2} x^{T} \frac{\partial^{2}}{\partial x^{2}} [\sum_{q = 2}^{r} {\hat{P}}_{q} (t) x^{[2 q]}] x {‖ σ (t) ‖}^{2} \\ = \sum_{q = 2}^{r} {\hat{P}}_{q} (t) (\sum_{i_{q} = 1}^{2 q} x \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{I_{n}}} \otimes \dots \otimes x) (A (t) - S (t) P (t)) x \\ + \sum_{q = 2}^{r} \frac{1}{2} {‖ σ (t) ‖}^{2} (\sum_{i = 1}^{n} \sum_{j = 1}^{n} \sum_{i_{q} = 1}^{2 q} \sum_{j_{q} = 1, j_{q} \neq i_{q}}^{2 q} x_{i} {\hat{P}}_{q} (t) (x \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{e_{i}}} \otimes \dots \otimes \overset{j_{q}^{th} entry}{\overset{︷}{e_{j}}} \otimes \dots \otimes x) x_{j}) \\ = \sum_{q = 2}^{r} {\hat{P}}_{q} (t) (\sum_{i_{q} = 1}^{2 q} x \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{(A (t) - S (t) P (t)) x}} \otimes \dots \otimes x) \\ + \sum_{q = 2}^{r} \frac{1}{2} {‖ σ (t) ‖}^{2} (\sum_{i_{q} = 1}^{2 q} \sum_{j_{q} = 1, j_{q} \neq i_{q}}^{2 q} \sum_{i = 1}^{n} \sum_{j = 1}^{n} {\hat{P}}_{q} (t) (x \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{x_{i} e_{i}}} \otimes \dots \otimes \overset{j_{q}^{th} entry}{\overset{︷}{x_{j} e_{j}}} \otimes \dots \otimes x)) \\ = \sum_{q = 2}^{r} {\hat{P}}_{q} (t) (\sum_{i_{q} = 1}^{2 q} I_{n} \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{(A (t) - S (t) P (t))}} \otimes \dots \otimes I_{n}) x^{[2 q]} \\ + \sum_{q = 2}^{r} \frac{1}{2} {‖ σ (t) ‖}^{2} (\sum_{i_{q} = 1}^{2 q} \sum_{j_{q} = 1, j_{q} \neq i_{q}}^{2 q} {\hat{P}}_{q} (t) (x \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{(\sum_{i = 1}^{n} x_{i} e_{i})}} \otimes \dots \otimes \overset{j_{q}^{th} entry}{\overset{︷}{(\sum_{j = 1}^{n} x_{j} e_{j})}} \otimes \dots \otimes x)) \\ = \sum_{q = 2}^{r} {\hat{P}}_{q} (t) (\sum_{i_{q} = 1}^{2 q} I_{n} \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{(A (t) - S (t) P (t))}} \otimes \dots \otimes I_{n}) x^{[2 q]} \\ + \sum_{q = 2}^{r} \frac{1}{2} {‖ σ (t) ‖}^{2} {\hat{P}}_{q} (t) (\sum_{i_{q} = 1}^{2 q} \sum_{j_{q} = 1, j_{q} \neq i_{q}}^{2 q} x \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{x}} \otimes \dots \otimes \overset{j_{q}^{th} entry}{\overset{︷}{x}} \otimes \dots \otimes x) \\ = \sum_{q = 2}^{r} {\hat{P}}_{q} (t) (\sum_{i_{q} = 1}^{2 q} I_{n} \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{(A (t) - S (t) P (t))}} \otimes \dots \otimes I_{n}) x^{[2 q]} \\ + \sum_{q = 2}^{r} {\hat{P}}_{q} (t) (\sum_{i_{q} = 1}^{2 q} I_{n} \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{\frac{1}{2} (2 q - 1) {‖ σ (t) ‖}^{2} I_{n}}} \otimes \dots \otimes I_{n}) x^{[2 q]} \\ = \sum_{q = 2}^{r} {\hat{P}}_{q} (t) (\sum_{i_{q} = 1}^{2 q} I_{n} \otimes \dots \otimes \overset{i_{q}^{th} entry}{\overset{︷}{((A (t) - S (t) P (t)) + \frac{1}{2} (2 q - 1) {‖ σ (t) ‖}^{2} I_{n})}} \otimes \dots \otimes I_{n}) x^{[2 q]} \\ = \sum_{q = 2}^{r} {\hat{P}}_{q} (t) [\overset{2 q}{\oplus} (A (t) + \frac{1}{2} (2 q - 1) {‖ σ (t) ‖}^{2} I_{n} - S (t) P (t))] x^{[2 q]} \end{matrix}

(104)

it follows from Eqs. , , and , that

\begin{matrix} L V (t, x) = - x^{T} R_{1} (t) x - x^{T} P (t) S (t) P (t) x - x^{T} P (t) S (t) g^{' T} (t, x) \\ + \sum_{q = 2}^{r} ({\dot{\hat{P}}}_{q} (t) + {\hat{P}}_{q} (t) [\overset{2 q}{\oplus} (A (t) + \frac{1}{2} (2 q - 1) {‖ σ (t) ‖}^{2} I_{n} - S (t) P (t))]) x^{[2 q]} \\ - \frac{1}{2} g^{'} (t, x) S (t) g^{' T} (t, x) \\ = - x^{T} R_{1} (t) x - x^{T} P (t) S (t) P (t) x - x^{T} P (t) S (t) g^{' T} (t, x) - \sum_{q = 2}^{r} {\hat{R}}_{2 q} (t) x^{[2 q]} - \frac{1}{2} g^{'} (t, x) S (t) g^{' T} (t, x) \end{matrix}

(105)

Finally, note that

\begin{matrix} ϕ^{T} (t, x) R_{2} (t) ϕ (t, x) = (x^{T} P (t) + \frac{1}{2} g^{'} (t, x)) S (t) (P (t) x + \frac{1}{2} g^{' T} (t, x)) \\ = x^{T} P (t) S (t) P (t) x + \frac{1}{4} g^{'} (t, x) S (t) g^{' T} (t, x) + x^{T} P (t) S (t) g^{' T} (t, x) \end{matrix}

(106)

which, using Eq. (105), implies that

L V (t, x) = - x^{T} R_{1} (t) x - \sum_{q = 2}^{r} {\hat{R}}_{2 q} (t) x^{[2 q]} - \frac{1}{4} g^{'} (t, x) S (t) g^{' T} (t, x) - ϕ^{T} (t, x) R_{2} (t) ϕ (t, x)

(107)

for all

(t, x) \in [t_{0}, \infty) \times ℝ^{n}

⁠, and hence, Eq. holds with

γ (| | x | |) = - γ | | x | |^{2}

. In addition, Eq. (107) can be written as

L V (t, x) = - L_{1} (t, x) - ϕ^{T} (t, x) R_{2} (t) ϕ (t, x)

(108)

where L₁(t, x) is given by Eq. (100), and thus, Eq. (80) is verified. The result now follows as a direct consequence of Theorem 4.1. ▪

Illustrative Numerical Examples

In this section, we provide two illustrative numerical examples to highlight the optimal and inverse optimal partial-state asymptotic stabilization framework developed in the paper.

Optimal Partial Stabilization of a Rigid Spacecraft.

Consider the rigid spacecraft with stochastic disturbances given by

d ω_{1} (t) = [I_{23} ω_{2} (t) ω_{3} (t) - α_{1} ω_{1} (t) + u_{1} (t)] d t + σ_{1} ω_{1} (t) d w (t), ω_{1} (0) = ω_{10} a . s ., t \geq 0

(109)

d ω_{2} (t) = [I_{31} ω_{3} (t) ω_{1} (t) - α_{2} ω_{2} (t) + u_{2} (t)] d t + σ_{2} ω_{2} (t) d w (t), ω_{2} (0) = ω_{20} a . s .

(110)

d ω_{3} (t) = [I_{12} ω_{1} (t) ω_{2} (t)] d t + σ_{3} ω_{3} (t) d w (t), ω_{3} (0) = ω_{30} a . s .

(111)

where

I_{23} ≜ (I_{2} - I_{3}) / I_{1}, I_{31} ≜ (I_{3} - I_{1}) / I_{2}, I_{12} ≜ (I_{1} - I_{2}) / I_{3}

, I₁, I₂, and I₃ are the principal moments of inertia of the spacecraft such that I₁ > I₂ > I₃ > 0, α₁ ≥ 0 and α₂ ≥ 0 reflect dissipation in the ω₁ and ω₂ coordinates of the spacecraft, u₁ and u₂ are the spacecraft control moments, and w(t) is a standard Wiener process. Here, the state-dependent disturbances can be used to capture perturbations in atmospheric drag from the Earth's residual atmosphere for low-altitude (i.e., below 600 km) satellites, as well as J₂ perturbations due to the nonspherical mass distribution of the Earth and its nonuniform mass density. For details, see Refs. [30,31]. For this example, we seek a state feedback controller

u = {[u_{1}, u_{2}]}^{T} = ϕ (x_{1}, x_{2})

⁠, where

x_{1} = {[ω_{1}, ω_{2}]}^{T}

and x₂ = ω₃, such that the performance measure

J (x_{10}, x_{20}, u (\cdot)) = E^{x_{0}} [\int_{0}^{\infty} [x_{1}^{T} (t) R_{1} x_{1} (t) + u^{T} (t) u (t)] d t]

(112)

where R₁ > 0, is minimized in the sense of Eq. (44) and the system (109)–(111) is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀.

Note that Eqs. – with performance measure can be cast in the form of Eqs. and with performance measure . In this case, Theorem 3.2 can be applied with n₁ = 2, n₂ = 1, m = 2,

f (x_{1}, x_{2}) = \tilde{f} (x_{1}, x_{2}) - A x_{1}, \tilde{f} (x_{1}, x_{2}) ≜ {[I_{23} ω_{2} ω_{3}, I_{31} ω_{3} ω_{1}, I_{12} ω_{1} ω_{2}]}^{T}

⁠,

A ≜ {[\begin{matrix} α_{1} & 0 & 0 \\ 0 & α_{2} & 0 \end{matrix}]}^{T}

⁠,

G (x_{1}, x_{2}) = {[\begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{matrix}]}^{T}, D (x_{1}, x_{2}) = {[\begin{matrix} σ_{1} ω_{1} & σ_{2} ω_{2} & σ_{3} ω_{3} \end{matrix}]}^{T}

⁠,

L_{1} (x_{1}, x_{2}) = x_{1}^{T} R_{1} x_{1}, L_{2} (x_{1}, x_{2}) = 0

⁠, and R₂(x₁, x₂) = I₂ to characterize the optimal partially stabilizing controller. Specifically, in this case, Eq. reduces to

\begin{matrix} 0 = x_{1}^{T} R_{1} x_{1} + V' (x_{1}, x_{2}) \tilde{f} (x_{1}, x_{2}) - V' (x_{1}, x_{2}) A x_{1} + \frac{1}{2} tr D^{T} (x_{1}, x_{2}) V^{″} (x_{1}, x_{2}) D (x_{1}, x_{2}) \\ - \frac{1}{4} V' (x_{1}, x_{2}) G (x_{1}, x_{2}) G^{T} (x_{1}, x_{2}) V^{' T} (x_{1}, x_{2}), (x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}} \end{matrix}

(113)

Now, choosing

V (x_{1}, x_{2}) = x_{1}^{T} P x_{1}

, where P > 0, it follows from Eq. (113) that

0 = x_{1}^{T} R_{1} x_{1} + V' (x_{1}, x_{2}) \tilde{f} (x_{1}, x_{2}) - 2 x_{1}^{T} P H x_{1} + x_{1}^{T} Σ P Σ x_{1} - x_{1}^{T} P P x_{1}

(114)

where

H ≜ [\begin{matrix} α_{1} & 0 \\ 0 & α_{2} \end{matrix}], Σ ≜ [\begin{matrix} σ_{1} & 0 \\ 0 & σ_{2} \end{matrix}]

⁠, and

V' (x_{1}, x_{2}) \tilde{f} (x_{1}, x_{2}) = 0

only if P = ρJ, where ρ > 0 and

J ≜ [\begin{matrix} - I_{31} & 0 \\ 0 & I_{23} \end{matrix}]

. In this case, Eq. (114) and P = ρJ imply that

0 = R_{1} - 2 ρ J \tilde{H} - ρ^{2} J^{2}

(115)

where $\tilde{H} = H - (1 / 2) Σ^{2}$ ⁠. Hence, Eq. (59) holds with $α (| | x_{1} | |) = ρ λ_{min} (J) | | x_{1} | |^{2}$ and $β (| | x_{1} | |) = ρ λ_{max} (J) | | x_{1} | |^{2}$ ⁠, where λ_min(⋅) and λ_max(⋅) denote minimum and maximum eigenvalues, respectively, and Eq. (60) holds with $γ (| | x_{1} | |) = λ_{min} (R_{1}) | | x_{1} | |^{2}$ ⁠.

Since all of the conditions of Theorem 3.2 hold, it follows that the feedback control law given by

ϕ (x_{1}, x_{2}) = - \frac{1}{2} R_{2}^{- 1} (x_{1}, x_{2}) G^{T} (x_{1}, x_{2}) V^{' T} (x_{1}, x_{2}) = - ρ J x_{1}, (x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}

(116)

guarantees that the stochastic dynamical system (109)–(111) is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀ and $J (x_{10}, x_{20}, ϕ (x_{1} (\cdot), x_{2} (\cdot))) = x_{10}^{T} P x_{10}$ for all $(x_{10}, x_{20}) \in ℝ^{n_{1}} \times ℝ^{n_{2}}$.

Let $I_{1} = 20 kg m^{2}, I_{2} = 15 kg m^{2}, I_{3} = 10 kg m^{2}$, $ω_{10} = π / 3 Hz, ω_{20} = π / 4 Hz, ω_{30} = π / 5 Hz$, $α_{1} = 1.1668 Hz, α_{2} = 0.2 Hz, σ_{1} = 1, σ_{2} = 0.4, σ_{3} = 0.1$, and $R_{1} = [\begin{matrix} 5 & 0 \\ 0 & 0.54 \end{matrix}] {Hz}^{2}$. Figure 1 shows the sample average along with the standard deviation of the controlled system state versus time for 20 sample paths for $ρ = 2.5 Hz / (N \cdot m^{2})$. Note that $x_{1} (t) = {[ω_{1} (t), ω_{2} (t)]}^{T} \to 0$ a.s. as t → ∞, whereas $x_{2} (t) = ω_{3} (t)$ does not converge to zero. Figure 2 shows the sample average along with the standard deviation of the corresponding control signal versus time. Finally, $J (x_{10}, x_{20}, ϕ (x_{1} (\cdot), x_{2} (\cdot))) = 2.2132 {Hz}^{3}$.
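The numerical values above can be cross-checked against Eq. (115) and the cost expression $x_{10}^{T} P x_{10}$ with P = ρJ. The following minimal sketch does this in plain Python; since every matrix involved is diagonal, the matrices are stored as pairs of diagonal entries. The variable and function names are illustrative, and the check is a consistency test of the quoted data rather than a reproduction of the simulations in Figs. 1 and 2.

```python
import math

# Data quoted in the example.
I1, I2, I3 = 20.0, 15.0, 10.0
alpha1, alpha2 = 1.1668, 0.2
sigma1, sigma2 = 1.0, 0.4
rho = 2.5
w10, w20 = math.pi / 3, math.pi / 4

I23 = (I2 - I3) / I1
I31 = (I3 - I1) / I2

# Diagonal entries of J = diag(-I31, I23) and of H~ = H - (1/2) Sigma^2.
J = (-I31, I23)
H_tilde = (alpha1 - 0.5 * sigma1**2, alpha2 - 0.5 * sigma2**2)

# Eq. (115) solved for R1: R1 = 2 rho J H~ + rho^2 J^2 (diagonal entries).
R1 = tuple(2.0 * rho * j * h + (rho * j) ** 2 for j, h in zip(J, H_tilde))

# Optimal cost x10^T P x10 with P = rho J.
cost = rho * (J[0] * w10**2 + J[1] * w20**2)


def gyroscopic_term(w1, w2, w3):
    """V' f~ for V = rho x1^T J x1; the two products cancel for P = rho J."""
    return 2.0 * rho * (J[0] * w1 * I23 * w2 * w3 + J[1] * w2 * I31 * w3 * w1)
```

With these data, `R1` agrees with diag(5, 0.54) to within the rounding of the quoted values, `cost` rounds to 2.2132, and `gyroscopic_term` vanishes up to round-off for any angular velocities, which is why the choice P = ρJ removes the cubic term from Eq. (114).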

Thermoacoustic Combustion Model.

In this example, we consider control of thermoacoustic instabilities in combustion processes. Engineering applications involving steam and gas turbines and jet and ramjet engines for power generation and propulsion technology involve combustion processes. Due to the inherent coupling between several intricate physical phenomena in these processes involving acoustics, thermodynamics, fluid mechanics, and chemical kinetics, the dynamic behavior of combustion systems is characterized by highly complex nonlinear models [32,33]. The unstable dynamic coupling between heat release in combustion processes generated by reacting mixtures releasing chemical energy and unsteady motions in the combustor develops acoustic pressure and velocity oscillations that can severely affect operating conditions and system performance.

Consider the nonlinear stochastic dynamical system adopted from Refs. [3] and [32] given by

\begin{matrix} d q_{1} (t) = [- α_{1} q_{1} (t) - β q_{1} (t) q_{2} (t) cos q_{3} (t) + u (t)] d t + σ_{1} q_{1} (t) d w (t), q_{1} (0) \overset{a . s .}{=} q_{10}, t \geq 0 \end{matrix}

(117)

d q_{2} (t) = [- α_{2} q_{2} (t) + β q_{1}^{2} (t) cos q_{3} (t) + u (t)] d t + σ_{2} q_{2} (t) d w (t), q_{2} (0) \overset{a . s .}{=} q_{20} \neq 0

(118)

d q_{3} (t) = [2 θ_{1} - θ_{2} - β (\frac{q_{1}^{2} (t)}{q_{2} (t)} - 2 q_{2} (t)) sin q_{3} (t)] d t + σ_{3} q_{1} (t) q_{2} (t) d w (t), q_{3} (0) \overset{a . s .}{=} q_{30}

(119)

representing a time-averaged, two-mode thermoacoustic combustion model with state-dependent stochastic disturbances, where α₁ > 0 and α₂ > 0 represent decay constants, θ₁ and $θ_{2} \in ℝ$ represent frequency shift constants, $β = ((γ + 1) / 8 γ) ω_{1}$ ⁠, where γ denotes the ratio of specific heats and ω₁ is the frequency of the fundamental mode, σ₁, σ₂, and σ₃ are such that $α_{1} > (1 / 2) σ_{1}^{2}$ and $α_{2} > (1 / 2) σ_{2}^{2}$ and represent augmentation factors of the variance of the state-dependent stochastic disturbance, and u is the control input signal. As shown in Refs. [32,33], only the first two states q₁ and q₂ representing the modal amplitudes of a two-mode thermoacoustic combustion model are relevant in characterizing system instabilities, since the third state q₃ represents the phase difference between the two modes [34]. Hence, we require asymptotic stability of $q_{1} (t), t \geq 0$ ⁠, and $q_{2} (t), t \geq 0$ ⁠, which necessitates partial stabilization.

For this example, we seek a state feedback controller

u = ϕ (x_{1}, x_{2})

⁠, where

x_{1} = {[q_{1}, q_{2}]}^{T}

and x₂ = q₃, such that the performance measure

J (x_{1} (0), x_{2} (0), u (\cdot)) = \int_{0}^{\infty} [x_{1}^{T} (t) R_{1} x_{1} (t) + u^{2} (t)] d t

(120)

where

R_{1} = ρ [\begin{matrix} 2 α_{1} - σ_{1}^{2} + ρ & ρ \\ ρ & 2 α_{2} - σ_{2}^{2} + ρ \end{matrix}], ρ > 0

(121)

is minimized in the sense of Eq. (44), and Eqs. (117)–(119) are globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀.

Note that Eqs. – with performance measure can be cast in the form of Eqs. and with performance measure . In this case, Theorem 3.2 can be applied with n₁ = 2, n₂ = 1, m = 1,

f (x_{1}, x_{2}) = {[- α_{1} q_{1} - β q_{1} q_{2} cos q_{3}, - α_{2} q_{2} + β q_{1}^{2} cos q_{3}, 2 θ_{1} - θ_{2} - β (q_{1}^{2} / q_{2} - 2 q_{2}) sin q_{3}]}^{T}

⁠,

G (x_{1}, x_{2}) = {[\begin{matrix} 1 & 1 & 0 \end{matrix}]}^{T}, D (x_{1}, x_{2}) = {[\begin{matrix} σ_{1} q_{1} & σ_{2} q_{2} & σ_{3} q_{1} q_{2} \end{matrix}]}^{T}

⁠,

L_{1} (x_{1}, x_{2}) = x_{1}^{T} R_{1} x_{1}, L_{2} (x_{1}, x_{2}) = 0

⁠, and

R_{2} (x_{1}, x_{2}) = 1

to characterize the optimal partially stabilizing controller. Specifically, Eq. reduces to

\begin{matrix} 0 = x_{1}^{T} R_{1} x_{1} + V' (x_{1}, x_{2}) f (x_{1}, x_{2}) + \frac{1}{2} tr D^{T} (x_{1}, x_{2}) V^{″} (x_{1}, x_{2}) D (x_{1}, x_{2}) \\ - \frac{1}{4} V' (x_{1}, x_{2}) G (x_{1}, x_{2}) G^{T} (x_{1}, x_{2}) V^{' T} (x_{1}, x_{2}), (x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}} \end{matrix}

(122)

which implies that $V' (x_{1}, x_{2}) = 2 ρ [q_{1}, q_{2}, 0]$. Furthermore, since $V (0, x_{2}) = 0, x_{2} \in ℝ$, it follows that $V (x_{1}, x_{2}) = ρ x_{1}^{T} x_{1}$, which is positive definite with respect to x₁, and hence, Eq. (59) holds.
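The claim that $V (x_{1}, x_{2}) = ρ x_{1}^{T} x_{1}$ satisfies Eq. (122) with R₁ given by Eq. (121) can be checked pointwise. The following minimal sketch evaluates the right-hand side of Eq. (122) term by term; the function name and the sample points are illustrative assumptions, and the third drift component is omitted because the third entry of V′ is zero.

```python
import math
import random


def hjb_residual(q1, q2, q3, alpha1, alpha2, sigma1, sigma2, beta, rho):
    """Right-hand side of Eq. (122) for V = rho (q1^2 + q2^2)."""
    # x1^T R1 x1 with R1 from Eq. (121).
    r11 = rho * (2.0 * alpha1 - sigma1**2 + rho)
    r12 = rho * rho
    r22 = rho * (2.0 * alpha2 - sigma2**2 + rho)
    running = r11 * q1**2 + 2.0 * r12 * q1 * q2 + r22 * q2**2
    # V' f: only the first two drift components contribute.
    f1 = -alpha1 * q1 - beta * q1 * q2 * math.cos(q3)
    f2 = -alpha2 * q2 + beta * q1**2 * math.cos(q3)
    drift = 2.0 * rho * (q1 * f1 + q2 * f2)
    # (1/2) tr D^T V'' D with V'' = diag(2 rho, 2 rho, 0).
    diffusion = rho * ((sigma1 * q1) ** 2 + (sigma2 * q2) ** 2)
    # (1/4) V' G G^T V'^T with G = [1 1 0]^T.
    control = 0.25 * (2.0 * rho * (q1 + q2)) ** 2
    return running + drift + diffusion - control


# Sample evaluation with arbitrary (hypothetical) test points and parameters.
rng = random.Random(0)
samples = [
    hjb_residual(
        rng.uniform(-5, 5), rng.uniform(-5, 5), rng.uniform(-10, 10),
        alpha1=5.0, alpha2=45.0, sigma1=2.0, sigma2=5.0,
        beta=(1.4 + 1.0) / (8.0 * 1.4), rho=1.0,
    )
    for _ in range(5)
]
```

Each entry of `samples` is zero up to round-off. The cubic terms in β cancel between the two drift components, so the residual vanishes for any value of β and of q₃, which reflects the fact that V does not depend on x₂.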

Since all of the conditions of Theorem 3.2 hold, it follows that the feedback control given by

\begin{matrix} ϕ (x_{1}, x_{2}) = - \frac{1}{2} R_{2}^{- 1} (x_{1}, x_{2}) G^{T} (x_{1}, x_{2}) V^{' T} (x_{1}, x_{2}) \\ = - ρ [\begin{matrix} 1 & 1 & 0 \end{matrix}] {[\begin{matrix} q_{1} & q_{2} & 0 \end{matrix}]}^{T} \\ = - ρ [\begin{matrix} 1 & 1 & 0 \end{matrix}] [\begin{matrix} x_{1} \\ 0 \end{matrix}], (x_{1}, x_{2}) \in ℝ^{n_{1}} \times ℝ^{n_{2}} \end{matrix}

(123)

guarantees that the stochastic dynamical system (117)–(119) is globally asymptotically stable in probability with respect to x₁ uniformly in x₂₀ and $J (x_{10}, x_{20}, ϕ (x_{1} (\cdot), x_{2} (\cdot))) = ρ x_{10}^{T} x_{10}$ for all $(x_{10}, x_{20}) \in ℝ^{2} \times ℝ$.

Let $α_{1} = 5 Hz, α_{2} = 45 Hz, σ_{1} = 2, σ_{2} = 5, σ_{3} = 1, γ = 1.4, ω_{1} = 1 Hz, θ_{1} = 4 Hz, θ_{2} = 32 Hz, ρ = 1 Hz, q_{10} = 4, q_{20} = 2$ ⁠, and q₃₀ = 10. Figure 3 shows the sample average along with the standard deviation of the controlled system state versus time, whereas Fig. 4 shows the sample average along with the standard deviation of the corresponding control signal versus time for 20 sample paths. Note that $x_{1} (t) = {[q_{1} (t), q_{2} (t)]}^{T} \overset{a . s .}{\to} 0$ as t → ∞, whereas $x_{2} (t) = q_{3} (t)$ is unstable. Finally, $J (x_{1} (0), x_{2} (0), ϕ (x_{1} (\cdot), x_{2} (\cdot))) = 20 Hz$ ⁠.
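A single closed-loop sample path for these data can be generated with an Euler–Maruyama discretization of Eqs. (117)–(119) under the feedback u = −ρ(q₁ + q₂). The following is a minimal sketch only: the integration scheme, the step size `dt`, the horizon `T`, and the seed are assumptions made here, since the paper does not state how Figs. 3 and 4 were produced. One scalar Wiener increment drives all three states, as in the model.

```python
import math
import random


def simulate(T=3.0, dt=1e-3, seed=0):
    """Euler-Maruyama sample path of (117)-(119) with u = -rho (q1 + q2)."""
    alpha1, alpha2 = 5.0, 45.0
    sigma1, sigma2, sigma3 = 2.0, 5.0, 1.0
    gamma, omega1 = 1.4, 1.0
    theta1, theta2, rho = 4.0, 32.0, 1.0
    beta = (gamma + 1.0) / (8.0 * gamma) * omega1
    q1, q2, q3 = 4.0, 2.0, 10.0
    rng = random.Random(seed)
    path = [(0.0, q1, q2, q3)]
    for k in range(int(round(T / dt))):
        u = -rho * (q1 + q2)
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over one step
        d1 = (-alpha1 * q1 - beta * q1 * q2 * math.cos(q3) + u) * dt \
            + sigma1 * q1 * dw
        d2 = (-alpha2 * q2 + beta * q1**2 * math.cos(q3) + u) * dt \
            + sigma2 * q2 * dw
        # The phase drift is singular at q2 = 0, exactly as in Eq. (119).
        d3 = (2.0 * theta1 - theta2
              - beta * (q1**2 / q2 - 2.0 * q2) * math.sin(q3)) * dt \
            + sigma3 * q1 * q2 * dw
        q1, q2, q3 = q1 + d1, q2 + d2, q3 + d3
        path.append(((k + 1) * dt, q1, q2, q3))
    return path


path = simulate()
t_end, q1_end, q2_end, q3_end = path[-1]
```

Averaging `q1` and `q2` over several seeds gives sample statistics of the kind plotted in Fig. 3, and `-rho * (q1 + q2)` along each path gives the control signal. The step size should be kept small relative to 1/α₂, since the second mode is the fastest state.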

Conclusion

In this paper, an optimal control problem for partial-state stochastic stabilization is stated, and sufficient conditions are derived to characterize an optimal nonlinear feedback controller that guarantees asymptotic stability in probability of part of the closed-loop system state. Specifically, we utilized a steady-state stochastic Hamilton–Jacobi–Bellman framework to characterize optimal nonlinear feedback controllers with a notion of optimality that is directly related to a given Lyapunov function that is positive definite and decrescent with respect to part of the system state. This result was then used to address optimal linear and nonlinear regulation for linear and nonlinear time-varying stochastic systems with quadratic and nonlinear-nonquadratic performance measures. In addition, we developed inverse optimal feedback controllers for affine nonlinear systems and linear time-varying stochastic systems with polynomial and multilinear performance criteria. Extensions of this framework for addressing discrete-time systems with computation constraints as well as optimal adaptive controllers for stochastic dynamical systems are currently under development.

Acknowledgment

This work was supported in part by the Air Force Office of Scientific Research under Grant No. FA9550-16-1-0100.

References

1. L'Afflitto, A., Haddad, W. M., and Bakolas, E., 2016, “Partial-State Stabilization and Optimal Feedback Control,” Int. J. Robust Nonlinear Control, 26(5), pp. 1026–1050.

2. Bernstein, D. S., 1993, “Nonquadratic Cost and Nonlinear Feedback Control,” Int. J. Robust Nonlinear Control, 3(3), pp. 211–229.

3. Haddad, W. M., and Chellaboina, V., 2008, Nonlinear Dynamical Systems and Control: A Lyapunov-Based Approach, Princeton University Press, Princeton, NJ.

4. Lum, K.-Y., Bernstein, D. S., and Coppola, V. T., 1995, “Global Stabilization of the Spinning Top With Mass Imbalance,” Dyn. Stab. Syst., 10(4), pp. 339–365.

5. Vorotnikov, V. I., 1998, Partial Stability and Control, Birkhäuser, Boston, MA.

6. Chellaboina, V., and Haddad, W. M., 2002, “A Unification Between Partial Stability and Stability Theory for Time-Varying Systems,” IEEE Control Syst., 22(6), pp. 66–75.

7. Molinari, B., 1973, “The Stable Regulator Problem and Its Inverse,” IEEE Trans. Autom. Control, 18(5), pp. 454–459.

8. Moylan, P. J., and Anderson, B., 1973, “Nonlinear Regulator Theory and an Inverse Optimal Control Problem,” IEEE Trans. Autom. Control, 18(5), pp. 460–465.

9. Jacobson, D. H., 1977, Extensions of Linear-Quadratic Control Optimization and Matrix Theory, Academic Press, New York.

10. Jacobson, D. H., Martin, D. H., Pachter, M., and Geveci, T., 1980, Extensions of Linear-Quadratic Control Theory, Springer-Verlag, Berlin.

11. Freeman, R. A., and Kokotović, P. V., 1996, “Inverse Optimality in Robust Stabilization,” SIAM J. Control Optim., 34(4), pp. 1365–1391.

12. Sepulchre, R., Jankovic, M., and Kokotovic, P., 1997, Constructive Nonlinear Control, Springer, London.

13. Deng, H., and Krstić, M., 1997, “Stochastic Nonlinear Stabilization—Part II: Inverse Optimality,” Syst. Control Lett., 32(3), pp. 151–159.

14. Speyer, J., 1976, “A Nonlinear Control Law for a Stochastic Infinite Time Problem,” IEEE Trans. Autom. Control, 21(4), pp. 560–564.

15. Bass, R., and Webber, R., 1966, “Optimal Nonlinear Feedback Control Derived From Quartic and Higher-Order Performance Criteria,” IEEE Trans. Autom. Control, 11(3), pp. 448–454.

16. Rajpurohit, T., and Haddad, W. M., 2016, “Partial-State Stabilization and Optimal Feedback Control for Stochastic Dynamical Systems,” American Control Conference (ACC), Boston, MA, July 6–8, pp. 6562–6567.

17. Kushner, H. J., 1967, Stochastic Stability and Control, Academic Press, New York.

18. Khasminskii, R. Z., 2012, Stochastic Stability of Differential Equations, Springer-Verlag, Berlin.

19. Kushner, H. J., 1971, Introduction to Stochastic Control, Holt, Rinehart and Winston, New York.

20. Arnold, L., 1974, Stochastic Differential Equations: Theory and Applications, Wiley Interscience, New York.

21. Sharov, V., 1978, “Stability and Stabilization of Stochastic Systems Vis-a-Vis Some of the Variables,” Avtom. Telemekh., 11(1), pp. 63–71 (in Russian).

22. Øksendal, B., 1995, Stochastic Differential Equations: An Introduction With Applications, Springer-Verlag, Berlin.

23. Yamada, T., and Watanabe, S., 1971, “On the Uniqueness of Solutions of Stochastic Differential Equations,” J. Math. Kyoto Univ., 11(1), pp. 155–167.

24. Watanabe, S., and Yamada, T., 1971, “On the Uniqueness of Solutions of Stochastic Differential Equations II,” J. Math. Kyoto Univ., 11(3), pp. 553–563.

25. Meyn, S. P., and Tweedie, R. L., 1993, Markov Chains and Stochastic Stability, Springer-Verlag, London.

26. Folland, G. B., 1999, Real Analysis: Modern Techniques and Their Applications, Wiley Interscience, New York.

27. Mao, X., 1999, “Stochastic Versions of the LaSalle Theorem,” J. Differ. Equations, 153(1), pp. 175–195.

28. Apostol, T. M., 1957, Mathematical Analysis, Addison-Wesley, Reading, MA.

29. Arapostathis, A., Borkar, V. S., and Ghosh, M. K., 2012, Ergodic Control of Diffusion Processes, Cambridge University Press, Cambridge, UK.

30. Curtis, H. D., 2014, Orbital Mechanics for Engineering Students, Elsevier, Oxford, UK.

31. Junkins, J., and Schaub, H., 2009, Analytical Mechanics of Space Systems, AIAA Education Series, Reston, VA.

32. Culick, F. E. C., 1976, “Nonlinear Behavior of Acoustic Waves in Combustion Chambers—I,” Acta Astronaut., 3(9–10), pp. 715–734.

33. Paparizos, L. G., and Culick, F. E. C., 1989, “The Two-Mode Approximation to Nonlinear Acoustics in Combustion Chambers—I: Exact Solution for Second Order Acoustics,” Combust. Sci. Technol., 65(1–3), pp. 39–65.

34. Yang, V., Kim, S. I., and Culick, F. E. C., 1987, “Third-Order Nonlinear Acoustic Waves and Triggering of Pressure Oscillations in Combustion Chambers—Part I: Longitudinal Modes,” AIAA Paper No. 87-1873.
