#### **3. Discrete PID controller tuning using Pole Assignment technique**

Suppose a conventional feedback control loop with the discrete PID controller (7) and a controlled system described by numerator *B*(*z*-1) and denominator *A*(*z*-1) – see Fig. 4.

Fig. 4. Feedback control loop with discrete PID controller (reference *R*(*z*-1), control error *E*(*z*-1), discrete PID controller *Q*(*z*-1)/*P*(*z*-1), action value *U*(*z*-1), system *B*(*z*-1)/*A*(*z*-1), system output *YS*(*z*-1))

Then, the Z-transfer function of the closed control loop is

$$\frac{Y_S(z^{-1})}{R(z^{-1})} = \frac{B(z^{-1}) \cdot Q(z^{-1})}{A(z^{-1}) \cdot P(z^{-1}) + B(z^{-1}) \cdot Q(z^{-1})} \tag{8}$$

The denominator of the Z-transfer function (8) is the characteristic polynomial

$$D(z^{-1}) = A(z^{-1}) \cdot P(z^{-1}) + B(z^{-1}) \cdot Q(z^{-1}) \tag{9}$$

It is well known that the dynamics of the closed-loop behaviour are defined by the characteristic polynomial (9). It has three tuneable variables, which are the PID controller parameters *q*0, *q*1, *q*2. The roots of the polynomial (9) are responsible for the control dynamics, and one can assign those roots (so-called poles, see Fig. 5) by suitable tuning of the parameters *q*0, *q*1, *q*2.

Fig. 5. The effect of characteristic polynomial poles on the control dynamics

Thus, discrete PID controller tuning using Pole Assignment means choosing the desired control dynamics (a desired form of the characteristic polynomial) and subsequently computing the discrete PID controller parameters.

Let us show an example: suppose we need control dynamics defined by the characteristic polynomial (10), where *d*1, *d*2, … are chosen coefficients (there are many ways to choose those parameters; one of them is introduced in the case study at the end of this contribution).

$$D(z^{-1}) = 1 + d_1 z^{-1} + d_2 z^{-2} \tag{10}$$

So we have to solve the Diophantine equation (11) to obtain all controller parameters.

$$1 + d_1 z^{-1} + d_2 z^{-2} = A(z^{-1}) \cdot P(z^{-1}) + B(z^{-1}) \cdot Q(z^{-1}) \tag{11}$$

If any solution exists, it provides the expected set of controller parameters. A comprehensive foundation of the pole assignment technique is given in (Hunt, 1993).
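To make the coefficient matching in (11) concrete, here is a minimal sketch for a second-order plant, assuming the common incremental PID structure *P*(*z*-1) = 1 − *z*-1 and *Q*(*z*-1) = *q*0 + *q*1*z*-1 + *q*2*z*-2 (controller (7) is not reproduced in this excerpt, so this structure and all names are our assumptions):

```python
import numpy as np

def tune_pid_pole_assignment(a1, a2, b1, b2, d):
    """Solve A*P + B*Q = D, eq. (11), for q0, q1, q2.

    a1, a2, b1, b2: coefficients of A(z^-1) = 1 + a1*z^-1 + a2*z^-2 and
    B(z^-1) = b1*z^-1 + b2*z^-2; d = [d1, d2, d3, d4] are the desired
    coefficients of D(z^-1) (set d3 = d4 = 0 if D is second order).
    """
    # Matching the powers z^-1 .. z^-4 of A*P + B*Q = D gives four
    # linear equations for the three unknowns q0, q1, q2:
    M = np.array([[b1, 0.0, 0.0],
                  [b2, b1, 0.0],
                  [0.0, b2, b1],
                  [0.0, 0.0, b2]])
    rhs = np.array([d[0] - (a1 - 1.0),
                    d[1] - (a2 - a1),
                    d[2] + a2,
                    d[3]])
    q, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return q  # q0, q1, q2
```

With four equations and three unknowns, an exact solution exists only for a compatible *D*(*z*-1), which is why the text stresses "if any solution exists"; `lstsq` returns the exact parameters in that case and the closest fit otherwise.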

#### **4. Continuous linearization using artificial neural network**

The tuning technique described in section 3 requires a linear model of the controlled system in the form of a Z-transfer function. If the controlled system is a highly nonlinear process, the linear model has to be updated continuously as the operating point shifts. Besides some classical techniques of continuous linearization (Gain Scheduling, Recursive Least Squares Method, …), an approach based on an artificial neural network can be used, which is described in this section.

#### **4.1 Artificial neural network for approximation**

According to Kolmogorov's superposition theorem, any real continuous multidimensional function can be evaluated by a sum of real continuous one-dimensional functions (Hecht-Nielsen, 1987). If the theorem is applied to an artificial neural network (ANN), it can be said that any real continuous multidimensional function can be approximated by a certain three-layered ANN with arbitrary precision. The topology of that ANN is depicted in Fig. 6. The input layer brings the external inputs *x*1, *x*2, …, *xP* into the ANN. The hidden layer contains *S* neurons, which process the sums of weighted inputs using a continuous, bounded and monotonic activation function. The output layer contains one neuron, which processes the sum of weighted outputs from the hidden neurons. Its activation function has to be continuous and monotonic.

Fig. 6. Three-layered ANN (input layer *x*1, …, *xP*; hidden layer of *S* neurons; output layer with one neuron)

So the ANN in Fig. 6 takes *P* inputs; those inputs are processed by *S* neurons in the hidden layer and then by one output neuron. The dataflow between input *i* and hidden neuron *j* is weighted by *w*<sup>1</sup>*j,i*, and the dataflow between hidden neuron *k* and the output neuron is weighted by *w*<sup>2</sup>1,*k*. The output of the network can be expressed by the following equations.

$$y_{a\,j}^{1} = \sum_{i=1}^{P} w_{j,i}^{1} \cdot x_i + w_{j}^{1} \tag{12}$$

$$y_{j}^{1} = \varphi^{1}\left(y_{a\,j}^{1}\right) \tag{13}$$

$$y_{a\,1}^{2} = \sum_{i=1}^{S} w_{1,i}^{2} \cdot y_{i}^{1} + w_{1}^{2} \tag{14}$$

$$y = y_{1}^{2} = \varphi^{2}\left(y_{a\,1}^{2}\right) \tag{15}$$

In the equations above, *φ*1(.) means the activation functions of the hidden neurons and *φ*2(.) means the output neuron activation function.

As mentioned, there are some conditions the activation functions must satisfy. To meet them, the hyperbolic tangent activation function (Eq. 16) is mostly used for the neurons in the hidden layer and the identity activation function (Eq. 17) for the output neuron.

$$y_{j}^{1} = \tanh\left(y_{a\,j}^{1}\right) \tag{16}$$

$$y = y_{a\,1}^{2} \tag{17}$$

The mentioned theorem does not define how to set the number of hidden neurons or how to tune the weights. However, many papers have been published which focus especially on gradient training methods (the Back-Propagation Gradient Descent algorithm) or methods derived from them (the Levenberg-Marquardt algorithm) – see (Haykin, 1994).
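As an illustration, the forward pass (12)–(17) takes only a few lines. This is a minimal sketch; the array shapes and names (`W1`, `bias1`, `w2`, `bias2`) are ours, not the chapter's.

```python
import numpy as np

def ann_forward(x, W1, bias1, w2, bias2):
    """Three-layered ANN of Fig. 6.

    x: P inputs; W1: S x P hidden weights; bias1: S hidden biases;
    w2: S output weights; bias2: scalar output bias.
    """
    ya1 = W1 @ x + bias1   # (12): weighted input sums of the hidden neurons
    y1 = np.tanh(ya1)      # (13) with (16): hyperbolic tangent activation
    ya2 = w2 @ y1 + bias2  # (14): weighted sum of the hidden outputs
    return ya2             # (15) with (17): identity output activation
```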

#### **4.2 System identification by artificial neural network**

System identification essentially means a procedure which leads to a dynamic model of the system. ANNs are used widely in system identification because of their outstanding approximation qualities. There are several ways to use an ANN for system identification. One of them assumes that the system to be identified (with input *u* and output *yS*) is determined by the following nonlinear discrete-time difference equation.

$$y_S(k) = \psi\left[y_S(k-1), \dots, y_S(k-n), u(k-1), \dots, u(k-m)\right], \quad m \le n \tag{18}$$

In equation (18), *ψ*(.) is a nonlinear function, *k* is the discrete time (formally, *k·T* would be more precise) and *n* is the difference equation order.

The aim of the identification is to design an ANN which approximates the nonlinear function *ψ*(.). Then, the neural model can be expressed by (Eq. 19).

$$y_M(k) = \hat{\psi}\left[y_M(k-1), \dots, y_M(k-n), u(k-1), \dots, u(k-m)\right], \quad m \le n \tag{19}$$

In (Eq. 19), *ψ̂* represents the well-trained ANN and *yM* is its output. The formal scheme of the neural model is shown in Fig. 7. It is obvious that the ANN in Fig. 7 has to be trained to provide *yM* as close to *yS* as possible. The existence of such a neural network is guaranteed by Kolmogorov's superposition theorem, and the whole process of neural model design is described in detail in (Haykin, 1994) or (Nguyen et al., 2003).

Fig. 7. Formal scheme of neural model (*u*(*k*) and *yM*(*k*) fed back through *z*-1 delay blocks)
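Before section 4.3 builds on this model, it may help to see how the training data for (19) are laid out. The sketch below arranges the measured sequences into one-step-ahead regressors and targets (a series-parallel arrangement, with measured *yS* used for the delayed terms during training); the function name is ours.

```python
import numpy as np

def build_narx_dataset(u, ys, n=2, m=2):
    """Regressors X and targets t for the neural model (18)-(19), m <= n."""
    X, t = [], []
    for k in range(n, len(ys)):
        past_y = [ys[k - i] for i in range(1, n + 1)]  # y(k-1) .. y(k-n)
        past_u = [u[k - i] for i in range(1, m + 1)]   # u(k-1) .. u(k-m)
        X.append(past_y + past_u)
        t.append(ys[k])
    return np.array(X), np.array(t)
```

Any gradient method mentioned above (back-propagation, Levenberg-Marquardt) can then fit the three-layered ANN of section 4.1 to map the rows of `X` to `t`.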

#### **4.3 Piecewise-linear neural model for discrete PID controller tuning**

As mentioned in section 4.1, it is recommended to use the hyperbolic tangent activation function for the neurons in the hidden layer and the identity activation function for the output neuron of the ANN used in the neural model. However, if the linear saturated activation function (Eq. 20) is used instead, the ANN behaves similarly because of the resembling courses of both activation functions (see Fig. 8).


$$y_{j}^{1} = \begin{cases} 1 & \text{for} \quad y_{a\,j}^{1} > 1 \\ y_{a\,j}^{1} & \text{for} \quad -1 \le y_{a\,j}^{1} \le 1 \\ -1 & \text{for} \quad y_{a\,j}^{1} < -1 \end{cases} \tag{20}$$

Fig. 8. Activation functions comparison (hyperbolic tangent vs. linear saturated function)

The output of the linear saturated activation function is either constant or equal to its input, so a neural model which uses an ANN with linear saturated activation functions in the hidden neurons acts as a piecewise-linear model. One linear submodel turns into another whenever any hidden neuron becomes saturated or leaves saturation.
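In code, the linear saturated activation (20) is just a clip; a sketch, meant to replace `np.tanh` in the forward pass of section 4.1:

```python
import numpy as np

def satlin(x):
    """Linear saturated activation (20): identity clipped to [-1, 1]."""
    return np.clip(x, -1.0, 1.0)
```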

Let us presume the existence of a dynamical neural model which uses an ANN with linear saturated activation functions in the hidden neurons and the identity activation function in the output neuron – see Fig. 9. Let us also presume *m* = *n* = 2 to keep the process simple. The ANN output can be computed using Eqs. (12), (13), (14), (15). However, another way of computing the ANN output is useful. Let us define a saturation vector **z** of *S* elements, which indicates the saturation states of the hidden neurons – see (Eq. 21).

$$z_i = \begin{cases} 1 & \text{for} \quad y_{a\,i}^{1} > 1 \\ 0 & \text{for} \quad -1 \le y_{a\,i}^{1} \le 1 \\ -1 & \text{for} \quad y_{a\,i}^{1} < -1 \end{cases} \tag{21}$$

Then, the ANN output can be expressed by (Eq. 22).

$$y\_M(k) = -a\_1 \cdot y\_M(k-1) - a\_2 \cdot y\_M(k-2) + b\_1 \cdot u(k-1) + b\_2 \cdot u(k-2) + c \tag{22}$$

where

$$a_1 = -\sum_{i=1}^{S} w_{1,i}^{2} \cdot \left(1-\left|z_i\right|\right) \cdot w_{i,1}^{1}, \qquad a_2 = -\sum_{i=1}^{S} w_{1,i}^{2} \cdot \left(1-\left|z_i\right|\right) \cdot w_{i,2}^{1}$$

$$b_1 = \sum_{i=1}^{S} w_{1,i}^{2} \cdot \left(1-\left|z_i\right|\right) \cdot w_{i,3}^{1}, \qquad b_2 = \sum_{i=1}^{S} w_{1,i}^{2} \cdot \left(1-\left|z_i\right|\right) \cdot w_{i,4}^{1}$$

$$c = w_{1}^{2} + \sum_{i=1}^{S} \left( w_{1,i}^{2} \cdot z_i + \left(1-\left|z_i\right|\right) \cdot w_{1,i}^{2} \cdot w_{i}^{1} \right)$$

Thus, difference equation (22) defines the ANN output, and it is linear in some neighbourhood of the actual state (the neighbourhood in which the saturation vector **z** stays constant). Difference equation (22) can clearly be extended to any order.
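The whole local linearization is a handful of array operations. A minimal sketch, with the inputs ordered as in Fig. 9 (*yM*(*k*−1), *yM*(*k*−2), *u*(*k*−1), *u*(*k*−2)) and our own function and variable names:

```python
import numpy as np

def local_linear_model(x, W1, bias1, w2, bias2):
    """Saturation vector (21) and coefficients of (22) at regressor x."""
    ya1 = W1 @ x + bias1                                 # hidden pre-activations
    z = np.where(ya1 > 1, 1, np.where(ya1 < -1, -1, 0))  # (21)
    g = w2 * (1 - np.abs(z))                # factors w2_{1,i} * (1 - |z_i|)
    a1 = -np.sum(g * W1[:, 0])              # multiplies y_M(k-1)
    a2 = -np.sum(g * W1[:, 1])              # multiplies y_M(k-2)
    b1 = np.sum(g * W1[:, 2])               # multiplies u(k-1)
    b2 = np.sum(g * W1[:, 3])               # multiplies u(k-2)
    c = bias2 + np.sum(w2 * z + g * bias1)  # constant term
    return a1, a2, b1, b2, c
```

As long as **z** does not change, the returned coefficients stay valid; a new set is produced the moment any hidden neuron crosses a saturation boundary.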

Fig. 9. Piecewise-linear neural model (inputs *yM*(*k*−1), *yM*(*k*−2), *u*(*k*−1), *u*(*k*−2) through *z*-1 delay blocks)

In other words, if the neural model of any nonlinear system in the form of Fig. 9 is designed, then it is simple to determine the parameters of a linear difference equation which approximates the system behaviour in some neighbourhood of the actual state. This difference equation can then be used to set the actual control action by means of many classical or modern control techniques.

In the following examples, a discrete PID controller with parameters tuned according to the algorithm introduced in paragraph 3 is studied. As mentioned above, a discrete model of the controlled system in the form of a Z-transfer function is required. So first, difference equation (22) should be transformed in the following way. Let us define

$$\tilde{u}(k) = u(k) - u_0 \tag{23}$$


where *u*0 is a constant. Then, (Eq. 22) turns into

$$y_M(k) = -a_1 \cdot y_M(k-1) - a_2 \cdot y_M(k-2) + b_1 \cdot \tilde{u}(k-1) + b_2 \cdot \tilde{u}(k-2) + c + (b_1 + b_2) \cdot u_0 \tag{24}$$

Equation (24) becomes free of the constant term if (Eq. 25) is satisfied.

$$u_0 = -\frac{c}{b_1 + b_2} \tag{25}$$

In the Z domain, model (24) with respect to (Eq. 25) is defined by Z-transfer function (26).

$$\frac{Y_M(z^{-1})}{\tilde{U}(z^{-1})} = \frac{b_1 z^{-1} + b_2 z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2}} \tag{26}$$
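In code, this transformation is one line of algebra plus bookkeeping; a sketch with our own naming, assuming *b*1 + *b*2 ≠ 0:

```python
def to_transfer_function(a1, a2, b1, b2, c):
    """Offset u0 by (25) and coefficient lists of (26) in powers of z^-1."""
    u0 = -c / (b1 + b2)          # (25); assumes b1 + b2 != 0
    num = [0.0, b1, b2]          # numerator   B(z^-1) = b1*z^-1 + b2*z^-2
    den = [1.0, a1, a2]          # denominator A(z^-1) = 1 + a1*z^-1 + a2*z^-2
    return num, den, u0
```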

The whole design and control procedure can be summarized in the following steps; a code-level skeleton of steps 4–10 is sketched after the list.

1. Create the neural model of the controlled plant in the form of Fig. 9.
2. Determine the polynomial *D*(*z*-1) of (10).
3. Set *k* = 0.
4. Measure the system output *yS*(*k*).
5. Determine the parameters *ai*, *bi* and *c* of difference equation (22).
6. Transform (Eq. 22) into Z-transfer function (26).
7. Determine the discrete PID controller parameters by solving (Eq. 11), where *A*(*z*-1) and *B*(*z*-1) are the denominator and numerator of Z-transfer function (26), respectively.
8. Determine *ũ*(*k*) using the discrete PID controller tuned in the previous step.
9. Transform *ũ*(*k*) into *u*(*k*) using (Eq. 23) and perform the control action.
10. *k* = *k* + 1, go to 4.

The introduced algorithm is suitable especially for the control of highly nonlinear systems.
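A skeleton of steps 4–10, wired from the earlier sketches (`local_linear_model`, `to_transfer_function`, `tune_pid_pole_assignment`); it assumes the second-order model of Fig. 9, the incremental PID with *P*(*z*-1) = 1 − *z*-1, and a `plant_step` callable standing in for the real plant interface – all of these are our assumptions, not the chapter's code.

```python
import numpy as np

def control_loop(plant_step, W1, bias1, w2, bias2, d, ref, steps):
    """One-sample-per-iteration adaptive loop (steps 4-10).

    plant_step(u) advances the real plant by one sampling period under
    input u and returns the measured output y_S.
    """
    y1 = y2 = u1 = u2 = e1 = e2 = ut1 = ys = 0.0
    for k in range(steps):                                        # steps 3, 10
        x = np.array([y1, y2, u1, u2])
        a1, a2, b1, b2, c = local_linear_model(x, W1, bias1, w2, bias2)  # step 5
        _, _, u0 = to_transfer_function(a1, a2, b1, b2, c)        # step 6
        q0, q1, q2 = tune_pid_pole_assignment(a1, a2, b1, b2, d)  # step 7
        e0 = ref - ys                                             # control error
        ut = ut1 + q0 * e0 + q1 * e1 + q2 * e2                    # step 8
        u = ut + u0                                               # step 9: undo (23)
        y2, y1 = y1, ys                                           # shift states
        u2, u1 = u1, u
        e2, e1, ut1 = e1, e0, ut
        ys = plant_step(u)                                        # step 4 (next sample)
    return ys
```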

#### **6. Case study**

The discrete PID controller tuned continuously by the technique introduced above is now applied to the control of two nonlinear systems. Both of them are compiled as a combination of a nonlinear static part and a linear dynamical system – see Fig. 11.

Fig. 11. System to control (nonlinear static element *u*(*t*) → *u*\*(*t*) followed by linear dynamical element *u*\*(*t*) → *yS*(*t*))

#### **6.1 First order nonlinear system**

The static element of the first demo system is defined by (Eq. 27) and the dynamical system is defined by differential equation (28).

$$u^*(t) = \frac{2}{1 + e^{-2u(t)}} - 1 \tag{27}$$

$$10 \cdot \frac{dy(t)}{dt} + y(t) = u^*(t) \tag{28}$$

Graphic characteristics of the system are shown in Fig. 12.

The control loop is designed as shown in paragraph 5. At first, the dynamical piecewise-linear neural model in the shape of Fig. 9 is created. This procedure involves training and testing set acquisition, neural network training and pruning, and neural model validation. As this sequence of processes is illustrated closely in many other publications (Haykin, 1994), (Nguyen et al., 2003), it is not described here in detail. Briefly, the training set is gained by exciting the controlled system with a set of step functions of various amplitudes while both *u* and *yS* are measured (sampling interval *T* = 1 s) – see Fig. 13. Then, the order of the neural model is set: *n* = 1 (Eq. 19), because the controlled system is a first-order one, too. After that, the artificial neural network is trained repeatedly by the Backpropagation Gradient Descent Algorithm (see Fig. 14) while pruning is applied – the optimal neural network topology is determined as two inputs, …
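To experiment with the described procedure, the demo plant and the step-function excitation can be sketched as follows; the constants in `static_element` correspond to (27) as reconstructed from the garbled source, so treat them as assumptions.

```python
import numpy as np

def static_element(u):
    """Nonlinear static part (27)."""
    return 2.0 / (1.0 + np.exp(-2.0 * u)) - 1.0

def simulate(u_seq, T=1.0, substeps=100):
    """Euler integration of 10*dy/dt + y = u*(t), eq. (28), sampled at T."""
    y, ys, h = 0.0, [], T / substeps
    for u in u_seq:
        ustar = static_element(u)
        for _ in range(substeps):
            y += h * (ustar - y) / 10.0
        ys.append(y)
    return np.array(ys)

# Excitation by steps of various amplitudes, measured with T = 1 s:
rng = np.random.default_rng(0)
u_train = np.repeat(rng.uniform(-3.0, 3.0, size=40), 25)  # 40 steps, 25 s each
ys_train = simulate(u_train)
# build_narx_dataset(u_train, ys_train, n=1, m=1) then yields the training
# set for the first-order neural model described above.
```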
