**A Real-Time Gradient Method for Nonlinear Model Predictive Control**

Knut Graichen and Bartosz Käpernick *Institute of Measurement, Control and Microtechnology University of Ulm Germany*

### **1. Introduction**

Model predictive control (MPC) is a modern control scheme that relies on the solution of an optimal control problem (OCP) on a receding horizon. MPC schemes have been developed in various formulations (regarding continuous/discrete-time systems, finite/infinite horizon length, terminal set/equality constraints, etc.). Comprehensive overviews and references on MPC can, for instance, be found in Diehl et al. (2009); Grüne & Pannek (2011); Kothare & Morari (2000); Mayne et al. (2000); Rawlings & Mayne (2009).

Although the methodology of MPC is naturally suited to handle constraints and multiple-input systems, the iterative solution of the underlying OCP is in general computationally expensive. An intuitive approach to reducing the computational load is to solve the OCP approximately, for instance, by using a fixed number of iterations in each sampling step. In the next MPC step, the previous solution can be used for a warm-start of the optimization algorithm in order to successively reduce the suboptimality of the predicted trajectories. This incremental strategy differs from the "optimal" MPC case where the (numerically exact) OCP solution is assumed to be known.

There exist various suboptimal and real-time approaches in the literature with different kinds of terminal constraints and demands on the optimization algorithm (Cannon & Kouvaritakis, 2002; DeHaan & Guay, 2007; Diehl et al., 2005; Graichen & Kugi, 2010; Lee et al., 2002; Michalska & Mayne, 1993; Ohtsuka, 2004; Scokaert et al., 1999). In particular, the approaches of Ohtsuka (2004) and Diehl et al. (2005) are related to the MPC scheme presented in this chapter. In Ohtsuka (2004), an algorithm is developed that traces the solution of the discretized optimality conditions over the single sampling steps. The real-time iteration scheme presented by Diehl et al. (2005) uses a Newton scheme together with terminal constraints in order to compute an approximate solution that is refined in each sampling step.

Suboptimal MPC schemes require special attention regarding their convergence and stability properties. This is particularly important if an MPC formulation without terminal constraints is used in order to minimize the computational complexity and to allow for a real-time implementation for very fast dynamical systems. In this context, a suboptimal MPC approach without terminal constraints was investigated in Graichen & Kugi (2010). Starting from the assumption that an optimization algorithm with a linear rate of convergence exists, it is shown that exponential stability of the closed-loop system as well as exponential decay of the suboptimality can be guaranteed if the number of iterations per sampling step satisfies a lower bound (Graichen & Kugi, 2010). The decay of the suboptimality also illustrates the incremental improvement of the MPC scheme.

Based on these theoretical considerations (Graichen & Kugi, 2010), this chapter presents a real-time MPC scheme that relies on the gradient method in optimal control (Dunn, 1996; Graichen et al., 2010; Nikol'skii, 2007). This algorithm is particularly suited for a real-time implementation, as it takes full advantage of the MPC formulation without terminal constraints. In addition, the gradient method allows for a memory and time efficient computation of the single iterations, which is of importance in order to employ the MPC scheme for fast dynamical systems.

In this chapter, the gradient-based MPC algorithm is described for continuous-time nonlinear systems subject to control constraints. Starting from the general formulation of the MPC problem, the stability properties in the optimal MPC case are summarized before the suboptimal MPC strategy is discussed. As a starting point for the derivation of the gradient method, the necessary optimality conditions for the underlying OCP formulation without terminal constraints are derived from Pontryagin's Maximum Principle. Based on the optimality conditions, the gradient algorithm is described and its particular implementation within a real-time MPC scheme is detailed. The algorithm as well as its properties and incremental improvement in the MPC scheme are numerically investigated for the double pendulum on a cart, which is a benchmark in nonlinear control. The simulation results as well as the CPU time requirements reveal the efficiency of the gradient-based MPC scheme.

#### **2. MPC formulation**

We consider a nonlinear continuous-time system of the form

$$\dot{x}(t) = f(x(t), u(t)), \quad x(t_0) = x_0, \quad t \ge t_0 \tag{1}$$

with the state *<sup>x</sup>* <sup>∈</sup> **<sup>R</sup>***<sup>n</sup>* and the control *<sup>u</sup>* <sup>∈</sup> **<sup>R</sup>***<sup>m</sup>* subject to the control constraints

$$u(t) \in \left[u^-, u^+\right]. \tag{2}$$

Without loss of generality, we assume that the origin is an equilibrium of the system (1) with *f*(0, 0) = 0. Moreover, the system function *f* is supposed to be continuously differentiable in its arguments. This section summarizes the MPC formulation as well as basic assumptions and basic results for the stability of the MPC scheme in closed-loop.
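The setting (1)-(2) can be sketched in a few lines of Python; the damped pendulum dynamics, the input bounds, and the stabilizing feedback below are assumptions made purely for this illustration, not taken from the chapter:

```python
import math

# Hypothetical example system in the form xdot = f(x, u) with f(0, 0) = 0:
# a damped pendulum with state x = (theta, omega) and input torque u.
def f(x, u):
    theta, omega = x
    return (omega, -math.sin(theta) - 0.2 * omega + u)

# Control constraint u(t) in [u_minus, u_plus], cf. (2).
U_MIN, U_MAX = -1.0, 1.0

def saturate(u):
    return min(max(u, U_MIN), U_MAX)

def euler_step(x, u, dt):
    # one explicit Euler step of (1) under the saturated input
    dx = f(x, saturate(u))
    return (x[0] + dt * dx[0], x[1] + dt * dx[1])

# Simulate with an (assumed) saturated PD feedback u = -2*theta - omega.
x = (1.0, 0.0)
for _ in range(100):
    x = euler_step(x, -2.0 * x[0] - 1.0 * x[1], 0.05)
```

The saturation models the hard input constraint (2); any feedback applied to the system only acts through its clipped value.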

#### **2.1 Optimal control problem**

For stabilizing the origin of the system (1), an MPC scheme based on the following optimal control problem (OCP) is used

$$\min_{\bar{u} \in \mathcal{U}_{[0,T]}} \quad J(x_k, \bar{u}) = V(\bar{x}(T)) + \int_0^T l(\bar{x}(\tau), \bar{u}(\tau)) \, \mathrm{d}\tau \tag{3a}$$

$$\text{s.t.} \qquad \dot{\bar{x}}(\tau) = f(\bar{x}(\tau), \bar{u}(\tau)), \quad \bar{x}(0) = x_k = x(t_k), \tag{3b}$$

where U[0,*T*] is the admissible input space


$$\mathcal{U}_{[0,T]} := \left\{ u(\cdot) \in L_\infty^m[0,T] : u(t) \in [u^-, u^+], \ t \in [0,T] \right\}. \tag{4}$$

The initial condition *x*(*t<sub>k</sub>*) = *x<sub>k</sub>* in (3b) denotes the measured (or observed) state of the system (1) at time *t<sub>k</sub>* = *t*<sub>0</sub> + *k*Δ*t* with the sampling time Δ*t*. The barred variables *x̄*(*τ*), *ū*(*τ*) represent internal variables of the controller with the MPC prediction time coordinate *τ* ∈ [0, *T*] and the horizon length *T* ≥ Δ*t*.

The integral and the terminal cost functions in (3a) are assumed to be continuously differentiable and to satisfy the quadratic bounds

$$m_l \left( \|x\|^2 + \|u\|^2 \right) \le l(x, u) \le M_l \left( \|x\|^2 + \|u\|^2 \right)$$

$$m_V \|x\|^2 \le V(x) \le M_V \|x\|^2 \tag{5}$$

for some constants *ml*, *Ml* > 0 and *mV*, *MV* > 0. The optimal solution of OCP (3) is denoted by

$$\bar{u}_k^*(\tau) := \bar{u}^*(\tau; x_k), \quad \bar{x}_k^*(\tau) := \bar{x}^*(\tau; x_k, \bar{u}_k^*), \quad \tau \in [0, T], \quad J^*(x_k) := J(x_k, \bar{u}_k^*). \tag{6}$$
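The quadratic bounds above are satisfied, for instance, by the standard choices *l*(*x*, *u*) = *x*ᵀ*Qx* + *u*ᵀ*Ru* and *V*(*x*) = *x*ᵀ*Px*, with the constants given by the extreme eigenvalues of the weighting matrices. A minimal numerical check, assuming diagonal example weights *Q*, *R*, *P* (for diagonal matrices the eigenvalues are just the diagonal entries):

```python
import random

# Assumed diagonal weights for a 2-state, 1-input example.
Q = [2.0, 0.5]      # diagonal of Q
R = [1.0]           # diagonal of R
P = [3.0, 1.5]      # diagonal of P

def l(x, u):
    # stage cost l(x, u) = x'Qx + u'Ru
    return sum(q * xi ** 2 for q, xi in zip(Q, x)) + sum(r * ui ** 2 for r, ui in zip(R, u))

def V(x):
    # terminal cost V(x) = x'Px
    return sum(p * xi ** 2 for p, xi in zip(P, x))

def sq_norm(v):
    return sum(vi ** 2 for vi in v)

# Bound constants: smallest/largest diagonal entries of the weights.
m_l, M_l = min(Q + R), max(Q + R)
m_V, M_V = min(P), max(P)

random.seed(0)
for _ in range(100):
    x = [random.uniform(-5, 5) for _ in range(2)]
    u = [random.uniform(-5, 5)]
    z = sq_norm(x) + sq_norm(u)
    assert m_l * z <= l(x, u) + 1e-9 and l(x, u) <= M_l * z + 1e-9
    assert m_V * sq_norm(x) <= V(x) + 1e-9 and V(x) <= M_V * sq_norm(x) + 1e-9
```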

To obtain a stabilizing MPC feedback law on the sampling interval [*t<sub>k</sub>*, *t<sub>k+1</sub>*), the first part of the optimal control *ū*<sub>*k*</sub><sup>∗</sup>(*τ*) is used as control input for the system (1)

$$u(t_k + \tau) = \bar{u}_k^*(\tau) =: \kappa(\bar{x}_k^*(\tau); x_k), \quad \tau \in [0, \Delta t), \tag{7}$$

which can be interpreted as a nonlinear "sampled" control law with *κ*(0; *x<sub>k</sub>*) = 0. In the next MPC step at time *t<sub>k+1</sub>*, OCP (3) is solved again with the new initial condition *x<sub>k+1</sub>*. In the absence of model errors and disturbances, the next point *x<sub>k+1</sub>* is given by *x<sub>k+1</sub>* = *x̄*<sub>*k*</sub><sup>∗</sup>(Δ*t*) and the closed-loop trajectories are

$$\begin{aligned} \mathbf{x}(t) &= \mathbf{x}(t\_k + \tau) = \bar{\mathbf{x}}^\*(\tau; \mathbf{x}\_k) \\ \mathbf{u}(t) &= \mathbf{u}(t\_k + \tau) = \bar{\mathbf{u}}^\*(\tau; \mathbf{x}\_k) \ , \quad \tau \in [0, \Delta t) \ , \quad k \in \mathbb{N}\_0^+ \ . \end{aligned} \tag{8}$$
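The receding-horizon mechanism (7)-(8) can be simulated exactly for a toy problem where the optimal solution is known in closed form. For the assumed scalar example ẋ = *u* with *l* = *x*² + *u*² and *V*(*x*) = *x*², the terminal cost coincides with the infinite-horizon value function, so the finite-horizon optimal solution is *ū*\*(*τ*; *x<sub>k</sub>*) = −*x<sub>k</sub>* e<sup>−*τ*</sup> with *x̄*\*(*τ*; *x<sub>k</sub>*) = *x<sub>k</sub>* e<sup>−*τ*</sup>:

```python
import math

DT = 0.1   # sampling time Δt (an assumed value)

def u_opt(tau, xk):
    # closed-form optimal control ū*(τ; x_k) = -x_k * exp(-τ) for this toy OCP
    return -xk * math.exp(-tau)

def x_opt(tau, xk):
    # corresponding optimal state trajectory x̄*(τ; x_k) = x_k * exp(-τ)
    return xk * math.exp(-tau)

def mpc_closed_loop(x0, steps):
    # Sampled MPC law (7)-(8): apply the first Δt of the optimal input,
    # then re-solve from the measured state x_{k+1} = x̄*(Δt; x_k).
    xk = x0
    traj = [x0]
    for _ in range(steps):
        xk = x_opt(DT, xk)   # exact flow under ū*_k on [0, Δt)
        traj.append(xk)
    return traj

traj = mpc_closed_loop(1.0, 20)
```

With the exact OCP solution, the sampled closed loop reproduces the continuous optimal trajectory *x*(*t*) = *x*<sub>0</sub> e<sup>−*t*</sup>, illustrating that the "optimal" MPC case of this section loses nothing through sampling.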

#### **2.2 Domain of attraction and stability**

The following lines summarize important results for the "optimal" MPC case without terminal constraints, i.e. when the optimal solution (6) of OCP (3) is assumed to be known in each sampling step. These results are the basis for the suboptimal MPC case treated in Section 3. Some basic assumptions are necessary to proceed:

**Assumption 1.** *For every x<sub>0</sub> ∈ **R**<sup>n</sup> and u ∈ U<sub>[0,T]</sub>, the system* (1) *has a bounded solution over* [0, *T*]*.*

**Assumption 2.** *OCP* (3) *has an optimal solution* (6) *for all x<sub>k</sub> ∈ **R**<sup>n</sup>.*

Since *u* is constrained, Assumption 1 is always satisfied for systems without finite escape time. Moreover, note that the existence of a solution of OCP (3) in Assumption 2 is not very restrictive, as no terminal constraints are considered and all functions are assumed to be continuously differentiable.<sup>1</sup>

<sup>1</sup> Theorems on existence and uniqueness of solutions for certain classes of OCPs can, for instance, be found in Berkovitz (1974); Lee & Markus (1967).


An MPC formulation without terminal constraints has been the subject of research by several authors, see for instance Graichen & Kugi (2010); Ito & Kunisch (2002); Jadbabaie et al. (2001); Limon et al. (2006); Parisini & Zoppoli (1995). Instead of imposing a terminal constraint, it is often assumed that the terminal cost *V* represents a (local) Control Lyapunov Function (CLF) on an invariant set *S<sub>β</sub>* containing the origin.

**Assumption 3.** *There exists a compact non-empty set S<sup>β</sup>* <sup>=</sup> {*<sup>x</sup>* <sup>∈</sup> **<sup>R</sup>***<sup>n</sup>* : *<sup>V</sup>*(*x*) <sup>≤</sup> *<sup>β</sup>*} *and a (local) feedback law q*(*x*) <sup>∈</sup> [*u*−, *<sup>u</sup>*+] *such that* <sup>∀</sup>*<sup>x</sup>* <sup>∈</sup> *<sup>S</sup><sup>β</sup>*

$$\frac{\partial V}{\partial \mathbf{x}} f(\mathbf{x}, q(\mathbf{x})) + l(\mathbf{x}, q(\mathbf{x})) \le 0. \tag{9}$$

There exist several approaches in the literature for constructing a CLF as terminal cost, for instance Chen & Allgöwer (1998); Primbs (1999). In particular, *V*(*x*) can be designed as a quadratic function *V*(*x*) = *x*T*Px* with the symmetric and positive definite matrix *P* following from a Lyapunov or Riccati equation provided that the linearization of the system (1) about the origin is stabilizable.
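As a sketch of this construction (not the chapter's own computation), *P* can be obtained for the double integrator by integrating the Riccati differential equation Ṗ = *A*ᵀ*P* + *PA* − *PBR*⁻¹*B*ᵀ*P* + *Q* to steady state in pure Python; the weights *Q* = *I*, *R* = 1 are assumptions for this example, and the stationary solution is known analytically to be *P* = [[√3, 1], [1, √3]]:

```python
# Linearization xdot = Ax + Bu of a double integrator, with assumed weights.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [0.0, 1.0]          # single input
Q = [[1.0, 0.0], [0.0, 1.0]]
R = 1.0

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def riccati_rhs(P):
    # Pdot = A'P + PA - (PB)(PB)'/R + Q
    PB = [P[0][0] * B[0] + P[0][1] * B[1], P[1][0] * B[0] + P[1][1] * B[1]]
    AtP = mat_mul(transpose(A), P)
    PA = mat_mul(P, A)
    return [[AtP[i][j] + PA[i][j] - PB[i] * PB[j] / R + Q[i][j] for j in range(2)]
            for i in range(2)]

# Explicit Euler integration from P(0) = 0 until (approximately) stationary;
# the fixed point of this iteration is the algebraic Riccati solution.
P = [[0.0, 0.0], [0.0, 0.0]]
dt = 1e-3
for _ in range(30000):
    F = riccati_rhs(P)
    P = [[P[i][j] + dt * F[i][j] for j in range(2)] for i in range(2)]
```

At the stationary solution, the CLF inequality (9) holds with equality for the linearized system under the LQR feedback *q*(*x*) = −*R*⁻¹*B*ᵀ*Px*, which is what makes this *P* a natural terminal cost.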

An important requirement for the stability of an MPC scheme without terminal constraints is to ensure that the endpoint of the optimal state trajectory *x*¯∗ *<sup>k</sup>* (*T*) reaches the CLF region *Sβ*. The following theorem states this property more clearly and relates it to the overall stability of the (optimal) MPC scheme.

**Theorem 1** (**Stability of MPC scheme – optimal case).** *Suppose that Assumptions 1-3 are satisfied and consider the compact set*

$$\Gamma_\alpha = \{ x \in \mathbb{R}^n : J^*(x) \le \alpha \}, \quad \alpha := \beta \left( 1 + \frac{m_l}{M_V} T \right). \tag{10}$$

*Then, for all x<sub>0</sub> ∈ Γ<sub>α</sub> the following holds:*

*1. For all MPC steps, it holds that x<sub>k</sub> ∈ Γ<sub>α</sub>. Moreover, the endpoint of the optimal state trajectory x̄<sub>k</sub><sup>∗</sup>(τ), τ ∈ [0, T] reaches the CLF region, i.e. x̄<sub>k</sub><sup>∗</sup>(T) ∈ S<sub>β</sub>.*

*2. Γ<sub>α</sub> contains the CLF region, i.e. S<sub>β</sub> ⊆ Γ<sub>α</sub>.*

*3. The optimal cost satisfies*

$$J^*(\bar{x}_k^*(\Delta t)) \le J^*(x_k) - \int_0^{\Delta t} l(\bar{x}_k^*(\tau), \bar{u}_k^*(\tau)) \, \mathrm{d}\tau \quad \forall\, x_k \in \Gamma_\alpha. \tag{11}$$

*4. The origin of the system* (1) *under the optimal MPC law* (7) *is asymptotically stable in the sense that the closed-loop trajectories* (8) *satisfy* lim<sub>*t*→∞</sub> ||*x*(*t*)|| = 0*.*

The single statements 1-4 in Theorem 1 are discussed in the following:

1. The sublevel set Γ<sub>*α*</sub> defines the domain of attraction for the MPC scheme without terminal constraints (Graichen & Kugi, 2010; Limon et al., 2006). The proof of this statement is given in Appendix A.

2. Although *α* in (10) leads to a rather conservative estimate of Γ<sub>*α*</sub> due to the nature of the proof (see Appendix A), it nevertheless reveals that Γ<sub>*α*</sub> can be enlarged by increasing the horizon length *T*.

3. The decrease condition (11) for the optimal cost at the next point *x<sub>k+1</sub>* = *x̄*<sub>*k*</sub><sup>∗</sup>(Δ*t*) follows from the CLF property (9) on the set *S<sub>β</sub>* (Jadbabaie et al., 2001). Indeed, consider the trajectories

$$\hat{x}(\tau) = \begin{cases} \bar{x}_k^*(\tau + \Delta t), & \tau \in [0, T - \Delta t) \\ \bar{x}^q(\tau - T + \Delta t), & \tau \in [T - \Delta t, T] \end{cases} \qquad \hat{u}(\tau) = \begin{cases} \bar{u}_k^*(\tau + \Delta t), & \tau \in [0, T - \Delta t) \\ \bar{u}^q(\tau - T + \Delta t), & \tau \in [T - \Delta t, T] \end{cases}$$

where *x̄<sup>q</sup>*(*τ*) with *x̄<sup>q</sup>*(0) = *x̄*<sub>*k*</sub><sup>∗</sup>(*T*) is the state trajectory that results from applying the local CLF law *ū<sup>q</sup>*(*τ*) = *q*(*x̄<sup>q</sup>*(*τ*)). Note that *x̄<sup>q</sup>*(*τ*) ∈ *S<sub>β</sub>* for all *τ* ≥ 0, i.e. *S<sub>β</sub>* is positively invariant due to the definition of *S<sub>β</sub>* and the CLF inequality (9), which can be expressed in the form

$$\frac{\mathrm{d}}{\mathrm{d}\tau} V(\bar{x}^q(\tau)) \le -l(\bar{x}^q(\tau), \bar{u}^q(\tau)). \tag{12}$$

Hence, the following estimates hold


$$\begin{aligned} J^*(\bar{x}_k^*(\Delta t)) &\le \int_0^T l(\hat{x}(\tau), \hat{u}(\tau)) \, \mathrm{d}\tau + V(\hat{x}(T)) \\ &= J^*(x_k) - \int_0^{\Delta t} l(\bar{x}_k^*(\tau), \bar{u}_k^*(\tau)) \, \mathrm{d}\tau \\ &\quad + \underbrace{V(\bar{x}^q(\Delta t)) - V(\bar{x}^q(0)) + \int_0^{\Delta t} l(\bar{x}^q(\tau), \bar{u}^q(\tau)) \, \mathrm{d}\tau}_{\le 0}. \end{aligned} \tag{13}$$

4. Based on (11), Barbalat's Lemma allows one to conclude that the closed-loop trajectories (8) satisfy lim<sub>*t*→∞</sub> ||*x*(*t*)|| = 0, see e.g. Chen & Allgöwer (1998); Fontes (2001). Note that this property is weaker than asymptotic stability in the sense of Lyapunov, which can be proved if the optimal cost *J*<sup>∗</sup>(*x<sub>k</sub>*) is continuously differentiable (Findeisen, 2006; Fontes et al., 2007).

#### **3. Suboptimal MPC for real-time feasibility**

In practice, the exact solution of the receding horizon optimal control problem is typically approximated by a sufficiently accurate numerical solution of a suitable optimization algorithm. If the sampling time Δ*t* is large enough, this numerical approximation will be sufficiently close to the optimal MPC case considered in the previous section. However, for large-scale or highly dynamical systems, an accurate near-optimal solution often cannot be determined fast enough. This problem, often encountered in practice, gives rise to suboptimal MPC strategies, where only an approximate solution is computed in each sampling step. This section develops the necessary changes and differences to the ideal case that arise from an incremental solution of the underlying OCP for a class of optimization algorithms.

#### **3.1 Suboptimal solution strategy**

Several suboptimal MPC strategies were already mentioned in the introduction (Cannon & Kouvaritakis, 2002; DeHaan & Guay, 2007; Diehl et al., 2005; Lee et al., 2002; Michalska & Mayne, 1993; Scokaert et al., 1999). Moreover, a suboptimal MPC scheme without terminal constraints – as considered in this chapter – was investigated in Graichen & Kugi (2010).


Instead of relying on one particular optimization method, it is assumed in Graichen & Kugi (2010) that an optimization algorithm exists that computes a control and state trajectory

$$\bar{u}_k^{(j)}(\tau) := \bar{u}^{(j)}(\tau; x_k), \quad \bar{x}_k^{(j)}(\tau) := \bar{x}^{(j)}(\tau; x_k, \bar{u}_k^{(j)}), \quad \tau \in [0, T], \quad j \in \mathbb{N}_0 \tag{14}$$

in each iteration *j* while satisfying a linear rate of convergence

$$J(x_k, \bar{u}_k^{(j+1)}) - J^*(x_k) \le p \left( J(x_k, \bar{u}_k^{(j)}) - J^*(x_k) \right), \quad j \in \mathbb{N}_0 \tag{15}$$

with a convergence rate $p \in (0, 1)$ and the limit $\lim_{j \to \infty} J(x_k, \bar{u}_k^{(j)}) = J^*(x_k)$.
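For intuition, the contraction (15) can be reproduced numerically. The following sketch is not from the chapter; it applies plain gradient descent to an assumed two-dimensional quadratic cost, for which the linear rate holds with $p = \max_i (1 - \alpha \lambda_i)^2$ over the Hessian eigenvalues $\lambda_i$:

```python
import numpy as np

# Minimal sketch (assumed example): gradient descent on the quadratic cost
# J(u) = 0.5*u'Qu - b'u contracts the cost error linearly, as in (15).
Q = np.diag([1.0, 4.0])          # Hessian with eigenvalues 1 and 4
b = np.array([1.0, -2.0])
u_star = np.linalg.solve(Q, b)   # unconstrained minimizer
J = lambda u: 0.5 * u @ Q @ u - b @ u
J_star = J(u_star)

alpha = 0.2                                            # fixed step size
p = max((1 - alpha * lam) ** 2 for lam in (1.0, 4.0))  # contraction rate

u = np.array([5.0, 5.0])
errs = []
for _ in range(20):
    errs.append(J(u) - J_star)
    u = u - alpha * (Q @ u - b)  # gradient step
errs.append(J(u) - J_star)

# verify J(u_{j+1}) - J* <= p * (J(u_j) - J*) for every iteration
assert all(e1 <= p * e0 + 1e-12 for e0, e1 in zip(errs, errs[1:]))
```

The same one-step contraction argument is what (15) postulates for the (infinite-dimensional) optimal control problem.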

In the spirit of a real-time feasible MPC implementation, the optimization algorithm is stopped after a fixed number of iterations, $j = N$, and the first part of the suboptimal control trajectory $\bar{u}_k^{(N)}(\tau)$ is used as control input

$$u(t_k + \tau) = \bar{u}_k^{(N)}(\tau), \quad \tau \in [0, \Delta t), \quad k \in \mathbb{N}_0 \tag{16}$$

to the system (1). In the absence of model errors and disturbances, the next point $x_{k+1}$ is given by $x_{k+1} = \bar{x}_k^{(N)}(\Delta t)$ and the closed-loop trajectories are

$$\begin{aligned} x(t) &= x(t_k + \tau) = \bar{x}^{(N)}(\tau; x_k), \\ u(t) &= u(t_k + \tau) = \bar{u}^{(N)}(\tau; x_k), \end{aligned} \quad \tau \in [0, \Delta t), \quad k \in \mathbb{N}_0 \,. \tag{17}$$

Compared to the "optimal" MPC case, where the optimal trajectories (6) are computed in each MPC step *k*, the trajectories (14) are suboptimal, which can be characterized by the *optimization error*

$$\Delta J^{(N)}(x_k) := J(x_k, \bar{u}_k^{(N)}) - J^*(x_k) \ge 0 \,. \tag{18}$$

In the next MPC step, the last control $\bar{u}_k^{(N)}$ (shifted by $\Delta t$) is re-used to construct a new initial control

$$\bar{u}_{k+1}^{(0)}(\tau) = \begin{cases} \bar{u}_k^{(N)}(\tau + \Delta t) & \text{if } \tau \in [0, T - \Delta t) \\ q(\bar{x}_k^{(N)}(T)) & \text{if } \tau \in [T - \Delta t, T] \end{cases} \tag{19}$$

where the last part of $\bar{u}_{k+1}^{(0)}$ is determined by the local CLF feedback law. The goal of the suboptimal MPC strategy therefore is to successively reduce the optimization error $\Delta J^{(N)}(x_k)$ in order to improve the MPC scheme in terms of optimality. Figure 1 illustrates this context.
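The interplay of (16)–(19) — a fixed number of $N$ iterations per sampling step, injection of the first control segment, and warm-starting by shifting — can be sketched in a few lines. This is a schematic illustration under assumed ingredients (scalar integrator dynamics $\dot{x} = u$, quadratic costs $l = x^2 + u^2$ and $V = x^2$, Euler integration, and an assumed CLF feedback `q(x) = -x`), not the chapter's implementation:

```python
import numpy as np

# Schematic sketch of the suboptimal MPC loop (16)-(19); all model and
# algorithm choices below are illustrative assumptions.
h = 0.05                       # integration step on the horizon
K = 20                         # horizon grid intervals, T = K*h = 1.0
m = 2                          # shift: sampling time dt = m*h = 0.1
N = 2                          # fixed number of gradient iterations per step
q = lambda x: -x               # assumed local CLF feedback law

def forward(x0, u):            # Euler simulation of x' = u over the horizon
    x = np.empty(K + 1)
    x[0] = x0
    for i in range(K):
        x[i + 1] = x[i] + h * u[i]
    return x

def gradient_step(x0, u, alpha=0.3):
    # one gradient iteration: adjoint lam' = -2x, lam(T) = 2x(T), H_u = 2u + lam
    x = forward(x0, u)
    lam = np.empty(K + 1)
    lam[-1] = 2 * x[-1]
    for i in range(K, 0, -1):  # backward integration of the adjoint
        lam[i - 1] = lam[i] + h * 2 * x[i - 1]
    return u - alpha * (2 * u + lam[:-1])

x = 2.0
u = np.zeros(K)                # initial control trajectory
traj = [x]
for k in range(40):            # MPC loop over the sampling instants t_k
    for _ in range(N):         # stop after N iterations (suboptimal solution)
        u = gradient_step(x, u)
    for i in range(m):         # apply the first part of the control, as in (16)
        x = x + h * u[i]
    xT = forward(x, u)[-1]
    # warm start (19): shift by dt, fill the tail with the CLF feedback
    u = np.concatenate([u[m:], np.full(m, q(xT))])
    traj.append(x)
```

Even with only $N = 2$ iterations per step, the warm start lets the closed loop drive the state toward the origin, which is exactly the incremental-improvement mechanism discussed above.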

#### **3.2 Stability and incremental improvement**

Several further assumptions are necessary to investigate the stability and the evolution of the optimization error for the suboptimal MPC scheme.

**Assumption 4.** *The optimal control law in* (7) *is locally Lipschitz continuous.*

**Assumption 5.** *For every* $\bar{u} \in \mathcal{U}_{[0,T]}$*, the cost* $J(x_k, \bar{u})$ *is twice continuously differentiable in* $x_k$*.*

**Assumption 6.** *For all* $\bar{u} \in \mathcal{U}_{[0,T]}$ *and all* $x_k \in \Gamma_\alpha$*, the cost* $J(x_k, \bar{u})$ *satisfies the quadratic growth condition* $C \, \|\bar{u} - \bar{u}_k^*\|^2_{L_2^m[0,T]} \le J(x_k, \bar{u}) - J^*(x_k)$ *for some constant* $C > 0$*.*


Fig. 1. Illustration of the suboptimal MPC implementation.

Assumption 6 is always satisfied for linear systems with quadratic cost functional, as proved in Appendix B. In general, the quadratic growth property in Assumption 6 represents a smoothness assumption which, however, is weaker than assuming strong convexity (it is well known that strong convexity on a compact set implies quadratic growth, see, e.g., Allaire (2007) and Appendix B).<sup>2</sup>

The stability analysis for the suboptimal MPC case is more involved than in the "optimal" MPC case due to the non-vanishing optimization error $\Delta J^{(N)}(x_k)$. An important question in this context is under which conditions the CLF region $S_\beta$ can be reached by the suboptimal state trajectory $\bar{x}_k^{(N)}(\tau)$. The following theorem addresses this question and also gives sufficient conditions for the stability of the suboptimal MPC scheme.

**Theorem 2** (**Stability of MPC scheme – suboptimal case**)**.** *Suppose that Assumptions 1-6 are satisfied and consider the subset of the domain* (10)

$$\Gamma_{\hat{\alpha}} = \{ x \in \mathbb{R}^n : J^*(x) \le \hat{\alpha} \}, \quad \hat{\alpha} = \frac{m_V}{4 M_V}\, \alpha < \alpha \,. \tag{20}$$

*Then, there exists a minimum number of iterations* $\hat{N} \ge 1$ *and a maximum admissible optimization error* $\Delta\hat{J} \ge 0$*, such that for all* $x_0 \in \Gamma_{\hat{\alpha}}$ *and all initial control trajectories* $\bar{u}_0^{(0)} \in \mathcal{U}_{[0,T]}$ *satisfying* $\Delta J^{(0)}(x_0) \le p^{-N} \Delta\hat{J}$ *the following holds:*

*1. The suboptimal state trajectory* $\bar{x}_k^{(N)}(\tau)$*,* $\tau \in [0, T]$*, reaches the CLF region, i.e.* $\bar{x}_k^{(N)}(T) \in S_\beta$*.*

*2.* $\Gamma_{\hat{\alpha}}$ *contains the CLF region, i.e.* $S_\beta \subseteq \Gamma_{\hat{\alpha}}$*, if the horizon length satisfies* $T \ge \big( \tfrac{4 M_V}{m_V} - 1 \big) \tfrac{M_V}{m_l}$*.*

*3. The origin of the system* (1) *under the suboptimal MPC law* (16) *is exponentially stable.*

*4. The optimization error* (18) *decays exponentially.*
The proof of Theorem 2 consists of several intermediate lemmas and steps that are given in detail in Graichen & Kugi (2010). The statements 1-4 in Theorem 2 summarize several important points that deserve some comments.

<sup>2</sup> A simple example is the function $f(x) = x^2 + 10 \sin^2 x$ with the global minimum $f(x^*) = 0$ at $x^* = 0$. Let $x$ be restricted to the interval $x \in [-5, 5]$. Clearly, the quadratic growth property $\frac{1}{2}|x - x^*|^2 \le f(x) - f(x^*)$ is satisfied for $x \in [-5, 5]$, although $f(x)$ is not convex on this interval.
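The footnote's example is easy to check numerically; the following sketch verifies both the quadratic growth bound and the non-convexity of $f$ on $[-5, 5]$:

```python
import numpy as np

# Numerical check of the footnote's example: f(x) = x^2 + 10*sin(x)^2 has
# its global minimum f(x*) = 0 at x* = 0 and satisfies the quadratic growth
# bound 0.5*|x - x*|^2 <= f(x) - f(x*) on [-5, 5], yet is not convex there.
x = np.linspace(-5.0, 5.0, 10001)
f = x**2 + 10 * np.sin(x)**2

# quadratic growth: f - 0.5*x^2 = 0.5*x^2 + 10*sin(x)^2 >= 0 everywhere
assert np.all(0.5 * x**2 <= f + 1e-12)

# non-convexity: f'' = 2 + 20*cos(2x) is negative, e.g. near x = pi/2
fpp = 2 + 20 * np.cos(2 * x)
assert fpp.min() < 0
```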


#### **4. Gradient projection method**

The efficient numerical implementation of the MPC scheme is important to guarantee real-time feasibility for fast dynamical systems. This section describes the well-known gradient projection method in optimal control as well as its suboptimal implementation in the context of MPC.

#### **4.1 Optimality conditions and algorithm**

The MPC formulation without terminal constraints has particular advantages for the structure of the optimality conditions of the OCP (3). To this end, we define the Hamiltonian

$$H(\mathbf{x}, \lambda, \mu) = l(\mathbf{x}, \mu) + \lambda^{\mathsf{T}} f(\mathbf{x}, \mu) \tag{21}$$

with the adjoint state $\lambda \in \mathbb{R}^n$. Pontryagin's Maximum Principle<sup>3</sup> states that if $\bar{u}_k^*(\tau)$, $\tau \in [0, T]$ is an optimal control for OCP (3), then there exists an adjoint trajectory $\bar{\lambda}_k^*(\tau)$, $\tau \in [0, T]$ such that $\bar{x}_k^*(\tau)$ and $\bar{\lambda}_k^*(\tau)$ satisfy the canonical boundary value problem (BVP)

$$\dot{\bar{x}}_k^*(\tau) = f(\bar{x}_k^*(\tau), \bar{u}_k^*(\tau)), \quad \bar{x}_k^*(0) = x_k \tag{22}$$

$$\dot{\bar{\lambda}}_k^*(\tau) = -H_x(\bar{x}_k^*(\tau), \bar{\lambda}_k^*(\tau), \bar{u}_k^*(\tau)), \quad \bar{\lambda}_k^*(T) = V_x(\bar{x}_k^*(T)) \tag{23}$$

and $\bar{u}_k^*(\tau)$ minimizes the Hamiltonian for all times $\tau \in [0, T]$, i.e.

$$H(\bar{x}_k^*(\tau), \bar{\lambda}_k^*(\tau), \bar{u}_k^*(\tau)) \le H(\bar{x}_k^*(\tau), \bar{\lambda}_k^*(\tau), u), \quad \forall\, u \in [u^-, u^+], \quad \forall\, \tau \in [0, T]. \tag{24}$$

<sup>3</sup> The general formulation of Pontryagin's Maximum Principle often uses the Hamiltonian definition *H*(*x*, *λ*, *u*, *λ*0) = *λ*<sup>0</sup> *l*(*x*, *u*) + *λ*<sup>T</sup> *f*(*x*, *u*), where *λ*<sup>0</sup> accounts for "abnormal" problems as, for instance, detailed in Hsu & Meyer (1968). Typically, *λ*<sup>0</sup> is set to *λ*<sup>0</sup> = 1, which corresponds to the definition (21).

The functions $H_x$ and $V_x$ denote the partial derivatives of $H$ and $V$ with respect to $x$. The minimization condition (24) also allows one to conclude that the partial derivative $H_u = [H_{u,1}, \dots, H_{u,m}]^{\mathsf{T}}$ of the Hamiltonian with respect to the control $u = [u_1, \dots, u_m]^{\mathsf{T}}$ has to satisfy

$$H_{u,i}(\bar{x}_k^*(\tau), \bar{\lambda}_k^*(\tau), \bar{u}_k^*(\tau)) \;\begin{cases} \ge 0 & \text{if } \bar{u}_{k,i}^*(\tau) = u_i^- \\ = 0 & \text{if } \bar{u}_{k,i}^*(\tau) \in (u_i^-, u_i^+) \\ \le 0 & \text{if } \bar{u}_{k,i}^*(\tau) = u_i^+ \end{cases} \qquad i = 1, \dots, m, \quad \tau \in [0, T].$$

The adjoint dynamics in (23) possess $n$ terminal conditions, which is due to the free endpoint formulation of OCP (3). The gradient method takes advantage of this property by solving the canonical BVP (22)-(23) iteratively forward and backward in time. Table 1 summarizes the algorithm of the gradient (projection) method.

The search direction $\bar{s}_k^{(j)}(\tau)$, $\tau \in [0, T]$ is the direction of improvement for the current control $\bar{u}_k^{(j)}(\tau)$. The step size $\alpha_k^{(j)}$ is computed in the subsequent line search problem (28) in order to achieve the maximum possible descent of the cost functional (3a). The function $\psi$ denotes the pointwise projection of the control onto the admissible set $[u^-, u^+]$.

1) **Initialization for** $j = 0$**:**

– Choose initial control trajectory $\bar{u}_k^{(0)} \in \mathcal{U}_{[0,T]}$

– Set convergence tolerance $\varepsilon_J$ (e.g. $\varepsilon_J = 10^{-6}$)

– Integrate forward in time
$$\dot{\bar{x}}_k^{(0)}(\tau) = f(\bar{x}_k^{(0)}(\tau), \bar{u}_k^{(0)}(\tau)), \quad \bar{x}_k^{(0)}(0) = x_k \tag{25}$$

2) **Gradient step: While** $j \le N$ **Do**

– Integrate backward in time
$$\dot{\bar{\lambda}}_k^{(j)}(\tau) = -H_x(\bar{x}_k^{(j)}(\tau), \bar{\lambda}_k^{(j)}(\tau), \bar{u}_k^{(j)}(\tau)), \quad \bar{\lambda}_k^{(j)}(T) = V_x(\bar{x}_k^{(j)}(T)) \tag{26}$$

– Compute the search direction
$$\bar{s}_k^{(j)}(\tau) = -H_u(\bar{x}_k^{(j)}(\tau), \bar{\lambda}_k^{(j)}(\tau), \bar{u}_k^{(j)}(\tau)), \quad \tau \in [0, T] \tag{27}$$

– Compute the step size $\alpha_k^{(j)}$ by (approximately) solving the line search problem
$$\alpha_k^{(j)} = \arg\min_{\alpha > 0} J\big( x_k, \psi(\bar{u}_k^{(j)} + \alpha \bar{s}_k^{(j)}) \big) \tag{28}$$

– Compute the new control trajectory
$$\bar{u}_k^{(j+1)}(\tau) = \psi\big( \bar{u}_k^{(j)}(\tau) + \alpha_k^{(j)} \bar{s}_k^{(j)}(\tau) \big) \tag{29}$$

– Integrate forward in time
$$\dot{\bar{x}}_k^{(j+1)}(\tau) = f(\bar{x}_k^{(j+1)}(\tau), \bar{u}_k^{(j+1)}(\tau)), \quad \bar{x}_k^{(j+1)}(0) = x_k \tag{30}$$

– Quit if $|J(x_k, \bar{u}_k^{(j+1)}) - J(x_k, \bar{u}_k^{(j)})| \le \varepsilon_J$. Otherwise set $j \leftarrow j + 1$ and return to 2).

Table 1. Gradient projection method for solving OCP (3).
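The algorithm of Table 1 can be sketched in code. The following is an illustrative sketch under assumed ingredients (scalar OCP $\dot{x} = u$ with $l = x^2 + u^2$, $V = x^2$, $u \in [-1, 1]$, Euler integration, and a simple backtracking search in place of an exact solve of (28)), not the chapter's implementation:

```python
import numpy as np

# Sketch of the gradient projection method of Table 1 for an assumed toy
# OCP: x' = u, l = x^2 + u^2, V = x^2, u in [-1, 1].
h, K = 0.02, 50                # grid step and intervals, horizon T = 1.0
u_lo, u_hi = -1.0, 1.0
psi = lambda u: np.clip(u, u_lo, u_hi)   # projection onto the control set

def simulate(xk, u):           # (25)/(30): forward Euler integration
    x = np.empty(K + 1)
    x[0] = xk
    for i in range(K):
        x[i + 1] = x[i] + h * u[i]
    return x

def cost(xk, u):               # integral cost plus terminal cost
    x = simulate(xk, u)
    return h * np.sum(x[:-1]**2 + u**2) + x[-1]**2

def adjoint(x):                # (26): backward integration, lam(T) = V_x
    lam = np.empty(K + 1)
    lam[-1] = 2 * x[-1]
    for i in range(K, 0, -1):
        lam[i - 1] = lam[i] + h * 2 * x[i - 1]   # lam' = -H_x = -2x
    return lam

def gradient_projection(xk, u, N=50, eps_J=1e-6):
    J_old = cost(xk, u)
    for _ in range(N):
        x = simulate(xk, u)
        s = -(2 * u + adjoint(x)[:-1])    # (27): s = -H_u
        alpha = 1.0                       # (28): crude backtracking search
        while cost(xk, psi(u + alpha * s)) > J_old and alpha > 1e-8:
            alpha *= 0.5
        u = psi(u + alpha * s)            # (29): projected control update
        J_new = cost(xk, u)
        if abs(J_new - J_old) <= eps_J:   # stopping criterion of Table 1
            break
        J_old = J_new
    return u, J_old

u_opt, J_opt = gradient_projection(2.0, np.zeros(K))
assert J_opt < cost(2.0, np.zeros(K))     # the cost has been reduced
```

With the large initial state $x_k = 2$, the computed control saturates at the lower bound at the beginning of the horizon, which illustrates the role of the projection $\psi$.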


The statements 1-4 in Theorem 2 deserve the following comments:

1. The reduced size of $\Gamma_{\hat{\alpha}}$ compared to $\Gamma_\alpha$ is the necessary "safety" margin to account for the suboptimality of the trajectories (14) characterized by $\Delta J^{(N)}(x_k)$. Thus, the domain of attraction $\Gamma_{\hat{\alpha}}$ together with an admissible upper bound on the optimization error guarantees the reachability of the CLF region $S_\beta$.

2. An interesting fact is that it can still be guaranteed that $\Gamma_{\hat{\alpha}}$ is at least as large as the CLF region $S_\beta$, provided that the horizon time $T$ satisfies a lower bound that depends on the quadratic estimates (5) of the integral and terminal cost functions. It is apparent from the bound $T \ge \big( \tfrac{4 M_V}{m_V} - 1 \big) \tfrac{M_V}{m_l}$ that the more dominant the terminal cost $V(x)$ is with respect to the integral cost function $l(x, u)$, the larger this bound on the horizon length $T$ will be.

3. The minimum number of iterations $\hat{N}$ for which stability can be guaranteed ensures – roughly speaking – that the numerical speed of convergence is faster than the system dynamics. In the proof of the theorem (Graichen & Kugi, 2010), the existence of the lower bound $\hat{N}$ is shown by means of Lipschitz estimates, which usually are too conservative to be used for design purposes. For many practical problems, however, one or two iterations per MPC step are sufficient to ensure stability and a good control performance.

4. The exponential reduction of the optimization error $\Delta J^{(N)}(x_k)$ follows as part of the proof of stability and reveals the incremental improvement of the suboptimal MPC scheme over the MPC runtime.


