**4. Sequential optimization processes**

$$\frac{\partial F}{\partial u} + \lambda^T \frac{\partial f}{\partial u} = 0, \tag{31}$$

and

(iv) *Transversality boundary conditions*:

$$\left[ F\left(T\right) + \lambda^T\left(T\right) f\left(T\right) \right] \delta T - \lambda^T\left(T\right) \delta \mathbf{x}\left(T\right) = 0 \tag{32}$$

The necessary conditions (i) to (iv) can be simplified further by introducing the Hamiltonian

$$H\left(\mathbf{x}, u, t\right) = F\left(\mathbf{x}, u, t\right) + \lambda^T\left(t\right) f\left(\mathbf{x}, u, t\right) \tag{33}$$

such that

**i.** *Euler's equation*:

$$\dot{\lambda} = -\frac{\partial H}{\partial \mathbf{x}} \tag{34}$$

**ii.** *Constraints relations*:

$$\dot{\mathbf{x}} = \frac{\partial H}{\partial \lambda} = f \tag{35}$$

**iii.** *Optimal control*:

$$\frac{\partial H}{\partial u} = 0 \tag{36}$$

**iv.** *Boundary conditions*:

$$H\left(T\right) \delta T - \lambda^T\left(T\right) \delta \mathbf{x}\left(T\right) = 0 \tag{37}$$

Furthermore, with the assumption that all the necessary conditions for optimality exist and are sufficient for a unique optimal control, a sequential decision process for an optimal response strategy can be developed.
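As a concrete illustration, conditions (i) to (iv) can be solved numerically for a simple scalar problem. The sketch below is a hypothetical example, not from this chapter: it takes *F* = *x*² + *u*² and *f* = *u*, so that *H* = *x*² + *u*² + *λu*. Condition (iii) gives *u* = −*λ*/2, condition (i) gives *λ̇* = −2*x*, condition (ii) gives *ẋ* = *u*, and, with *T* fixed and the terminal state free, the transversality condition (iv) reduces to *λ*(*T*) = 0, which is met here by a bisection (shooting) search on *λ*(0).

```python
import math

def shoot(lam0, x0=1.0, T=1.0, n=4000):
    """Forward-Euler integration of the state/costate pair.

    x'   = dH/dlam = u,  with u = -lam/2 from dH/du = 0  (conditions ii, iii)
    lam' = -dH/dx  = -2x                                  (condition i)
    Returns (x(T), lam(T)).
    """
    dt = T / n
    x, lam = x0, lam0
    for _ in range(n):
        u = -lam / 2.0                       # optimal control from dH/du = 0
        x, lam = x + dt * u, lam - dt * 2.0 * x
    return x, lam

def solve_bvp(x0=1.0, T=1.0):
    """Bisect on lam(0) until the transversality condition lam(T) = 0 holds."""
    lo, hi = 0.0, 4.0                        # lam(T) is increasing in lam(0)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        _, lamT = shoot(mid, x0, T)
        if lamT < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# This LQ problem has a closed-form answer to check against:
# lam(0) = 2*tanh(T) and x(T) = x0/cosh(T).
lam0_star = 2.0 * math.tanh(1.0)
```

Because the example is linear-quadratic, the shooting result can be compared against the closed-form solution; for a nonlinear *f* the same two-point boundary-value structure would remain but no analytic check would be available.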
Sequential decision processes are mathematical abstractions of situations in which decisions must be made in several stages while incurring a certain cost at each stage. The philosophy here is to establish a sequential decision policy to be used as a combating technique strategy in oil spill control.

First, consider the state *x*(*t*) at time *t* ∈ [0, *T*], where *T* specifies the time horizon for the situation. For a control *u*(*t*) defined on [0, *T*], the state equation given in Eq. (39) describes the rate of variation of the system. Thus, *x*(*t*) ∈ ℝ*<sup>n</sup>* denotes the state of the oil spill in waters, *x*˙(*t*) ∈ ℝ*<sup>n</sup>* represents the vector of first-order time derivatives of *x*(*t*), and *u*(*t*) ∈ *U* ⊂ ℝ*<sup>m</sup>* denotes the control vector. With the assumption that the initial value *x*(0) and the control trajectory over the time interval 0 ≤ *t* ≤ *T* are known, the optimization problem over the control trajectory is given as

$$\min\_{u} \int\_{0}^{T} f\left(\mathbf{x}(t), u(t), t\right) dt \tag{38}$$

$$\text{subject to}\quad\dot{\mathbf{x}}(t) = \mathbf{g}\left(\mathbf{x}(t), u(t), t\right) \tag{39}$$

where **g** is a given function of *u*, *t*, and possibly *x*. This model establishes a sequential decision path for an optimal policy to be used in the application of oil spill combating techniques.

By introducing a value function *V*, we have

$$\begin{aligned} V\left(0, \mathbf{x}\_0\right) &:= \min\_u \int\_0^T f\left(t, \mathbf{x}\left(t\right), u\left(t\right)\right) dt \\ \text{subject to} \quad \dot{\mathbf{x}}\left(t\right) &= \mathbf{g}\left(t, \mathbf{x}\left(t\right), u\left(t\right)\right), \end{aligned} \tag{40}$$

and by fixing Δ*t* >0, we get

$$V\left(0, \mathbf{x}\_0\right) = \min\_{u} \left\{ \int\_0^{\Delta t} f\left(t, \mathbf{x}\left(t\right), u\left(t\right)\right) dt + \int\_{\Delta t}^T f\left(t, \mathbf{x}\left(t\right), u\left(t\right)\right) dt \right\} \tag{41}$$

Also, with the application of the principle of optimality,<sup>1</sup> we have

<sup>1</sup> See [9] for detailed discussion on principle of optimality.

$$V\left(0, \mathbf{x}\_0\right) = \min\_{u} \left\{ \int\_0^{\Delta t} f\left(t, \mathbf{x}\left(t\right), u\left(t\right)\right) dt + V\left(\Delta t, \mathbf{x}\left(\Delta t\right)\right) \right\} \tag{42}$$
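The recursion in (42) can be illustrated with a small discrete-stage analogue. In the sketch below, the states, controls, and cost function are all hypothetical; the backward Bellman recursion reuses the value of the tail problem exactly as the principle of optimality prescribes, and it agrees with exhaustive enumeration of all control sequences.

```python
from itertools import product

N_STAGES, STATES, CONTROLS = 3, range(5), (-1, 0, 1)

def step(x, u):
    return min(max(x + u, 0), 4)          # clipped state transition

def stage_cost(x, u):
    return (x - 2) ** 2 + abs(u)          # prefer state 2, penalize effort

def bellman(x0):
    """Backward recursion: V_k(x) = min_u { cost + V_{k+1}(next state) }."""
    V = {x: 0.0 for x in STATES}          # terminal value V(T, x) = 0
    for _ in range(N_STAGES):             # sweep backward in time
        V = {x: min(stage_cost(x, u) + V[step(x, u)] for u in CONTROLS)
             for x in STATES}
    return V[x0]

def brute_force(x0):
    """Enumerate every control sequence and keep the cheapest total cost."""
    best = float("inf")
    for seq in product(CONTROLS, repeat=N_STAGES):
        x, J = x0, 0.0
        for u in seq:
            J += stage_cost(x, u)
            x = step(x, u)
        best = min(best, J)
    return best
```

The recursion evaluates each state once per stage, whereas enumeration grows exponentially in the number of stages; both give the same optimal value.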

Discretizing via Taylor series expansion, we get

$$V\left(0, \mathbf{x}\_0\right) = \min\_{u} \left\{ f\left(t\_0, \mathbf{x}\_0, u\right) \Delta t + V\left(t\_0, \mathbf{x}\_0\right) + V\_t\left(t\_0, \mathbf{x}\_0\right) \Delta t + V\_x\left(t\_0, \mathbf{x}\_0\right) \Delta \mathbf{x} + \cdots \right\} \tag{43}$$

where Δ*x* = *x*(*t*<sup>0</sup> + Δ*t*) − *x*(*t*0). Thus, subtracting *V*(*t*0, *x*0) from both sides, dividing by Δ*t*, and letting Δ*t* → 0, we have

$$-V\_t\left(\mathbf{x},t\right) = \min\_{u} \left\{ f\left(t,\mathbf{x},u\right) + V\_x\left(\mathbf{x},t\right) \mathbf{g}\left(t,\mathbf{x},u\right) \right\} \tag{44}$$

with boundary condition

$$V\left(T, \mathbf{x}\_T\right) = 0.\tag{45}$$
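Eqs. (44) and (45) can be solved numerically by sweeping backward in time over a state grid. The sketch below uses an illustrative scalar problem, *f* = *x*² + *u*² and *g* = *u* (not from the chapter), for which the exact value function is *V*(*t*, *x*) = tanh(*T* − *t*)*x*², giving a check on the scheme.

```python
import math

XMIN, XMAX, DX = -2.0, 2.0, 0.02
XS = [XMIN + i * DX for i in range(201)]          # state grid
US = [-2.0 + 0.1 * j for j in range(41)]          # control grid
DT, T = 0.01, 1.0

def interp(V, x):
    """Linear interpolation of the value table V at x, clamped to the grid."""
    x = min(max(x, XMIN), XMAX)
    i = min(int((x - XMIN) / DX), len(XS) - 2)
    w = (x - XS[i]) / DX
    return (1.0 - w) * V[i] + w * V[i + 1]

def hjb_value(x0):
    V = [0.0] * len(XS)                           # boundary condition: V(T, x) = 0
    for _ in range(int(round(T / DT))):           # backward sweep from t = T to t = 0
        V = [min((x * x + u * u) * DT + interp(V, x + u * DT) for u in US)
             for x in XS]
    return interp(V, x0)
```

Each backward step applies the discrete minimization min<sub>*u*</sub>{ *f* Δ*t* + *V*(*t* + Δ*t*, *x* + *g* Δ*t*) }, i.e. the right-hand side of (42) with the tail integral replaced by the value function.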

**Theorem 1 [8]:** Let [*t*0, *t*1] denote the interval over which a sequence of controls is applied. Then, for any process with *t*<sup>0</sup> ≤*τ*<sup>1</sup> ≤*τ*<sup>2</sup> ≤*t*1:

$$V\left(\tau\_1, \mathbf{x}\left(\tau\_1\right)\right) \le V\left(\tau\_2, \mathbf{x}\left(\tau\_2\right)\right) \tag{46}$$

and for any *t* such that *t*<sup>0</sup> ≤*t* ≤*t*1, the set Λ*t*,*x*(*t*) is nonempty, since the restriction of the control to the time interval is feasible for *x*(*t*).

### *Proof:*

Let *ū* be any optimal control in Λ*τ*2,*x*(*τ*2), and let *u* \* be defined on [*τ*1, *τ̄*1] by

$$u^\*\left(\xi\right) = \begin{cases} u\left(\xi\right), & \text{if } \tau\_1 \le \xi \le \tau\_2 \\ \bar{u}\left(\xi\right), & \text{if } \tau\_2 \le \xi \le \bar{\tau}\_1 \end{cases} \tag{47}$$

Then, *u* \* ∈Λ*τ*1,*x*(*τ*1). Hence,

$$V\left(\tau\_1, \mathbf{x}\left(\tau\_1\right)\right) \leq \phi\_1\left(\bar{\tau}\_1, \mathbf{x}^\*\left(\bar{\tau}\_1\right)\right) \tag{48}$$

where *ϕ*<sup>1</sup> ( ⋅ ) is a value function defined on [*τ*1, *τ̄*1]. Because *u* \* was constructed from any optimal control in Λ*τ*2,*x*(*τ*2), taking the infimum over the controls in Λ*τ*2,*x*(*τ*2) gives

$$V\left(\tau\_1, \mathbf{x}\left(\tau\_1\right)\right) \le V\left(\tau\_2, \mathbf{x}\left(\tau\_2\right)\right) \tag{49}$$

This implies that, if *u* \* is any optimal control for the sequential optimization process, the value function *V* evaluated along the state and control trajectories will be a nondecreasing function of time.

Theorem 1 summarizes the expected future utility at any node of the decision tree under the assumption that an optimal policy will be followed. The implication is that a continuous selection of a sequence of controls at different assessment points will optimize the performance index of the control strategy. This, however, requires a decision rule, and the next section contains further explanation on this.
