We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists.


## **Meet the editor**

Dr. Ginalber Serra was born in São Luis, Brazil, in 1976. He received the B.Sc. and M.Sc. degrees in electrical engineering from the Federal University of Maranhão, Brazil, in 1999 and 2001, respectively, and the Ph.D. degree in electrical engineering from the State University of Campinas (UNICAMP), Campinas, Brazil, in September 2005. He completed his postdoctoral research on multivariable neuro-fuzzy adaptive control of nonlinear systems at the Department of Machines, Components and Intelligent Systems, State University of Campinas, in September 2006. From 2006 to 2007, he was a researcher in the Department of Electrical Engineering, University of Santiago, Chile. He is currently a professor and head of the research group on Computational Intelligence Applied to Technology at the Federal Institute of Education, Sciences and Technology (IFMA), Maranhão, Brazil. Dr. Serra has served as a reviewer for many prestigious international journals and conferences. His research interests include fuzzy systems, neural networks, genetic algorithms, instrumentation for high-performance control, real-time control, signal processing, and applications in automation and industrial control.

### Contents

#### **Preface**


#### Chapter 8 **Adaptive Coordinated Cooperative Control of Multi-Mobile Manipulators**  Víctor H. Andaluz, Paulo Leica, Flavio Roberti, Marcos Toibero and Ricardo Carelli


### Preface

Current control problems show a natural trend of increasing complexity, driven by performance criteria that are becoming more sophisticated. The need of practitioners and engineers to deal with complex dynamic systems has motivated the design of controllers whose structures account for multiobjective constraints, expert knowledge, uncertainties, nonlinearities, time-varying parameters, time delays, multivariable systems, and so on. The classic and modern control theories, characterized by input-output and state-space representations, respectively, have contributed to the proposal of several control methodologies that take the complexity of the dynamic system into account. Nowadays, the explosion of new technologies has made the use of computational intelligence in the controller structure possible, considering the impact of neural networks, genetic algorithms, fuzzy systems, and other tools inspired by human intelligence or evolutionary behavior. The fusion of classical and modern control theories with computational intelligence has also promoted new discoveries and important insights for the proposal of new advanced control techniques in the context of robust, adaptive, optimal, predictive, and intelligent control. These techniques have led to successful implementations of controllers and have drawn great attention from industry and academia toward new theories and applications of advanced control systems.

In recent years, control theory has received significant attention from academia and industry, and researchers continue to contribute to this emerging area. In this regard, there is a need for a book covering this technology. Although there are many journal and conference articles in the literature, they often appear fragmented and are not easy to follow. In particular, a newcomer who plans to do research in this field cannot immediately keep pace with the evolution of the related research issues. This book, Frontiers in Advanced Control Systems, aims to bring state-of-the-art research results on advanced control from both theoretical and practical perspectives. Fundamental and advanced research results, as well as contributions on the technical evolution of control theory, are of particular interest.

Chapter one highlights some aspects of fuzzy model based advanced control systems. The interest in this brief discussion is motivated by the applicability of fuzzy systems to representing dynamic systems with complex characteristics such as nonlinearity, uncertainty, and time delay, so that controllers designed from such models can ensure the stability and robustness of the control system. Finally, experimental results of a case study on adaptive fuzzy model based control of a multivariable nonlinear pH process, commonly found in industrial environments, are presented.

Chapter two brings together cooperative control, reinforcement learning, and game theory to solve multi-player differential games on communication graph topologies. Coupled Riccati equations are developed, and the stability and the Nash equilibrium solution are proven. A policy iteration algorithm for the solution of graphical games is proposed and its convergence is proven. A simulation example illustrates the effectiveness of the proposed algorithms in learning the solutions of graphical games in real time.

Chapter three presents an application of adaptive neural networks to the estimation of the product compositions in a binary methanol-water continuous distillation column from available temperature measurements. A software sensor is applied to train a neural network model, and a genetic algorithm (GA) searches for the optimal dual control law applied to the distillation column. Experimental results show the performance of the designed neural network based control system for both set-point tracking and disturbance rejection.

Chapter four proposes new methods for optimizing the controller norm, considering different stability criteria as well as the inclusion of a decay rate in the LMI formulation. A practical application to a 3-DOF helicopter shows the advantage of the proposed method regarding implementation cost and the effort required from the motors. These characteristics of optimality and robustness make the design methodology attractive for practical applications to systems subject to structural failure, guaranteeing robust stability and small oscillations in the occurrence of faults.

Chapter five presents a study of stability and control design for switched affine systems. A new theorem for designing switched affine control systems is proposed. Finally, simulation results involving four types of converters, namely Buck, Boost, Buck-Boost, and Sepic, illustrate the simplicity, quality, and usefulness of the proposed methodology.

Chapter six proposes a new method of model based PID controller tuning for a large class of processes (stable processes, processes with oscillatory dynamics, and integrating and unstable processes), organized in a classification plane, to guarantee the desired performance/robustness tradeoff in a parameter plane. Experimental results show the advantage and efficiency of the proposed methodology for the PID control of a real thermal plant using a look-up table of parameters.

In chapter seven, Bio-inspired Optimization Methods (BiOM) are used for controller tuning in chemical engineering problems. To this end, three problems are studied, with emphasis on a realistic application: the control design of heat exchangers at pilot scale. Experimental results include a comparative analysis with classical methods, illustrating that the proposed methodology is an interesting alternative for this purpose.


In chapter eight, a novel method for centralized-decentralized coordinated cooperative control of multiple wheeled mobile manipulators is proposed. In this strategy, the desired motions are specified as a function of cluster attributes such as position, orientation, and geometry. These attributes guide the selection of a set of independent system state variables suitable for specification, control, and monitoring. The control is based on a virtual three-dimensional structure, where position control (or tracking control) is carried out considering the centroid of the upper side of a prism-shaped geometric structure corresponding to a three-mobile-manipulator formation. Simulation results show the good performance of the proposed multi-layer control scheme.

Chapter nine proposes a Model Predictive Control (MPC) strategy formulated under a stabilizing control law, assuming that this law (the underlying input sequence) is present throughout the predictions. The proposed MPC is an Infinite Horizon MPC (IHMPC) that includes an underlying control sequence as a (deficient) reference candidate to be improved for tracking control. Then, by solving a constrained optimization problem online, the input sequence is corrected and the learning update is performed.

Chapter ten focuses on a PID average output feedback controller, implemented in an FPGA, to stabilize the output voltage of a buck power converter around a desired constant output reference voltage. Experimental results show the effectiveness of the FPGA realization of the PID controller in the design of switched-mode power supplies with efficiency greater than 95%.

Chapter eleven discusses parameter estimation techniques for generating suitable models for predictive controllers, based on the most notable approaches in the Model Predictive Control (MPC) relevant identification literature. The first contribution to be emphasized is that these methods are described in a multivariable context. Furthermore, the comparisons between the presented techniques are another main contribution, since they provide insights into the numerical issues and exactness of each parameter estimation approach for predictive control of multivariable plants.

Chapter twelve presents a contribution to system identification using Orthonormal Basis Filters (OBF). Considerations are made on the several characteristics that make them very promising for system identification and for application in predictive control scenarios.

This book can serve as a bridge between people working on theoretical and practical research in control theory, and facilitate the development of new control techniques and their applications. In addition, the book has educational value, helping students and researchers to know the frontiers of control technology. Its target audience comprises professionals and researchers working in the fields of automation, control, and instrumentation, to whom it offers the state of the art in control theory from both theoretical and practical aspects. Moreover, it can serve as a research handbook on trends in control theory and on solutions for research problems that require immediate results.

#### **Prof. Ginalber Luiz de Oliveira Serra**

Federal Institute of Education, Sciences and Technology, Brazil


## **Highlighted Aspects from Black Box Fuzzy Modeling for Advanced Control Systems Design**

Ginalber Luiz de Oliveira Serra

*Federal Institute of Education, Science and Technology, Laboratory of Computational Intelligence Applied to Technology, São Luis, Maranhão, Brazil*

#### **1. Introduction**

This chapter presents an overview of a specific application of computational intelligence techniques, specifically fuzzy systems: **fuzzy model based advanced control systems design**. In the last two decades, fuzzy systems have proved useful for the identification and control of complex nonlinear dynamical systems. This rapid growth, and the interest in this discussion, is motivated by the fact that in practical control design, in the presence of nonlinearity and uncertainty in the dynamical system, fuzzy models can represent the dynamic behavior well enough that controllers designed from such models can guarantee, mathematically, the stability and robustness of the control system (Åström et al., 2001; Castillo-Toledo & Meda-Campaña, 2004; Kadmiry & Driankov, 2004; Ren & Chen, 2004; Tong & Li, 2002; Wang & Luoh, 2004; Yoneyama, 2004).

Automatic control systems have become an essential part of our daily life. They are applied from everyday electronic equipment up to the most complex problems, such as aircraft and rockets. There are different control system schemes, but all of them share the function of driving a dynamic system to meet certain performance specifications. An intermediate and important step in control systems design is to obtain some knowledge of the plant to be controlled, that is, of the dynamic behavior of the plant under different operating conditions. If such knowledge is not available, it becomes difficult to create an efficient control law with which the control system achieves the desired performance. A practical approach is to design controllers from a mathematical model of the plant to be controlled.

Mathematical modeling is a set of heuristic and/or computational procedures applied to a real plant in order to obtain mathematical equations (models) that accurately represent its dynamic behavior in operation. There are three basic approaches to mathematical modeling:

• White box modeling. In this case, models can be satisfactorily obtained from the physical laws governing the dynamic behavior of the plant. However, this may be a limiting factor in practice, considering plants with uncertainties, nonlinearities, time delay, parametric variations, and other complex dynamic characteristics. A poor understanding of the physical phenomena that govern the plant behavior, and the resulting model complexity, make the white box approach a difficult and time-consuming task. In addition, a complete understanding of the physical behavior of a real plant is almost impossible in many practical applications.

• Black box modeling. In this case, when models from physical laws are difficult or even impossible to obtain, a model must be extracted from experimental data on the dynamic behavior of the plant. The modeling problem consists in choosing an appropriate structure for the model, so that enough information about the dynamic behavior of the plant can be extracted efficiently from the experimental data. Once the structure is determined, there remains the parameter estimation problem, in which a quadratic cost function of the approximation error between the outputs of the plant and the model is minimized. This problem is known as **identification**, and several techniques have been proposed for linear and nonlinear plant modeling. A limitation of this approach is that the structure and parameters of the obtained models usually have no physical meaning and are not associated with physical variables of the plant.

• Gray box modeling. In this case, some information on the dynamic behavior of the plant is available, but the model structure and parameters must be determined from experimental data. This approach, also known as hybrid modeling, combines the features of the white box and black box approaches.

The area of mathematical modeling covers topics from linear regression up to sophisticated concepts related to qualitative information from experts, and great attention has been given to this issue in academia and industry (Abonyi et al., 2000; Brown & Harris, 1994; Pedrycz & Gomide, 1998; Wang, 1996). A mathematical model can be used for:

• Analysis and better understanding of phenomena (models in engineering, economics, biology, sociology, physics, and chemistry);
• Prediction of behavior (adaptive control of time-varying plants);
• Estimation of quantities from indirect measurements, where no sensor is available;
• Hypothesis testing (fault diagnostics, medical diagnostics, and quality control);
• Teaching through simulators for aircraft, nuclear energy plants, and patients in critical conditions of health;
• Signal processing (cancellation of noise, filtering, and interpolation);
• Control and regulation around some operating point, optimal control, and robust control.

Modeling techniques are widely used in control systems design, and successful applications have appeared over the past two decades. There are cases in which the identification procedure is implemented in real time as part of the controller design. This technique, known as adaptive control, is suitable for nonlinear and/or time-varying plants. In adaptive control schemes, a plant model valid over several operating conditions is identified online. The controller is designed according to the currently identified model, in order to guarantee the performance specifications. There is a vast literature on modeling and control design (Åström & Wittenmark, 1995; Keesman, 2011; Sastry & Bodson, 1989; Isermann & Münchhof, 2011; Zhu, 2011; Chalam, 1987; Ioannou, 1996; Lewis & Syrmos, 1995; Ljung, 1999; Söderström & Stoica, 1989; Van Overschee & De Moor, 1996; Walter & Pronzato, 1997). Most approaches focus on models and controllers described by linear differential or finite difference equations, based on transfer functions or state-space representations. Moreover, motivated by the fact that every plant presents some type of nonlinear behavior, there are several approaches to the analysis, modeling, and control of nonlinear plants (Tee et al., 2011; Isidori, 1995; Khalil, 2002; Sjöberg et al., 1995; Ogunfunmi, 2007; Vidyasagar, 2002), and one of the key elements for these applications is fuzzy systems (Lee et al., 2011; Hellendoorn & Driankov, 1997; Grigorie, 2010; Vukadinovic, 2011; Michels, 2006; Serra & Ferreira, 2011; Nelles, 2011).

#### **2. Fuzzy inference systems**
The theory of fuzzy systems was proposed by Lotfi A. Zadeh (Zadeh, 1965; 1973) as a way of processing vague, imprecise, or linguistic information, and since the 1970s it has found wide industrial application. This theory provides the basis for knowledge representation and for developing the essential mechanisms to infer decisions about appropriate actions to be taken on a real problem. Fuzzy inference systems are typical examples of techniques that make use of human knowledge and deductive processes. Their structure allows the mathematical modeling of a large class of dynamic behaviors in many applications, and provides great flexibility in designing high-performance control with a certain degree of transparency for interpretation and analysis; that is, they can be used to explain solutions or be built from expert knowledge in a particular field of interest. For example, even without knowing the exact mathematical model of an oven, one can describe its behavior as follows: "**IF** more power is applied to the heater **THEN** the temperature increases", where **more** and **increases** are linguistic terms that, while imprecise, convey important information about the behavior of the oven. In fact, for many control problems, an expert can determine a set of efficient control rules based on linguistic descriptions of the plant to be controlled. Traditional mathematical models cannot incorporate such linguistic descriptions directly into their formulations. Fuzzy inference systems are powerful tools to achieve this goal, since the logical structure of their **IF** <antecedent proposition> **THEN** <consequent proposition> rules facilitates the understanding and analysis of the problem at hand. According to the consequent proposition, there are two types of fuzzy inference systems:
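Before the two types are presented, the oven rule above can be made concrete with a small sketch. The membership function below for the linguistic term "more power" is a hypothetical illustration (the ramp breakpoints are arbitrary assumptions, not from the chapter); it shows how an imprecise term is graded to a degree in [0, 1].

```python
# Illustrative sketch: grading the rule "IF more power is applied THEN the
# temperature increases". The fuzzy set 'more power' and its breakpoints
# (1 kW to 3 kW) are hypothetical choices for illustration only.

def mu_more_power(p):
    """Membership degree of a power level p (kW) in the fuzzy set 'more power'."""
    if p <= 1.0:
        return 0.0
    if p >= 3.0:
        return 1.0
    return (p - 1.0) / 2.0  # linear ramp between the breakpoints

# Degree to which the rule antecedent holds for a given crisp input:
firing = mu_more_power(2.0)  # 2 kW is "more power" to degree 0.5
```

The firing degree would then weight the consequent "the temperature increases", which is exactly the mechanism formalized by the two inference system types below.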


#### **2.1 Mamdani fuzzy inference systems**

The Mamdani fuzzy inference system was proposed by E. H. Mamdani (Mamdani, 1977) to capture the qualitative knowledge available in a given application. Without loss of generality, this inference system presents a set of rules of the form:

$$\mathfrak{R}^{i}: \textbf{IF } \tilde{x}_{1} \text{ is } F^{i}_{j|\tilde{x}_{1}} \textbf{ AND} \ldots \textbf{AND } \tilde{x}_{n} \text{ is } F^{i}_{j|\tilde{x}_{n}} \textbf{ THEN } \tilde{y} \text{ is } G^{i}_{j|\tilde{y}} \tag{1}$$

In each rule $i \,|\, [i=1,2,\ldots,l]$, where $l$ is the number of rules, $\tilde{x}_{1}, \tilde{x}_{2}, \ldots, \tilde{x}_{n}$ are the linguistic variables of the antecedent (input) and $\tilde{y}$ is the linguistic variable of the consequent (output), defined, respectively, on their own universes of discourse $\mathbb{U}_{\tilde{x}_{1}}, \ldots, \mathbb{U}_{\tilde{x}_{n}}$ and $\mathbb{Y}$. The fuzzy sets $F^{i}_{j|\tilde{x}_{1}}, F^{i}_{j|\tilde{x}_{2}}, \ldots, F^{i}_{j|\tilde{x}_{n}}$ and $G^{i}_{j|\tilde{y}}$ are the linguistic values (terms) used to partition the universes of discourse of the antecedent and consequent linguistic variables in the inference system, that is, $F^{i}_{j|\tilde{x}_{t}} \in \{F^{i}_{1|\tilde{x}_{t}}, F^{i}_{2|\tilde{x}_{t}}, \ldots, F^{i}_{p_{\tilde{x}_{t}}|\tilde{x}_{t}}\}_{t=1,2,\ldots,n}$ and $G^{i}_{j|\tilde{y}} \in \{G^{i}_{1|\tilde{y}}, G^{i}_{2|\tilde{y}}, \ldots, G^{i}_{p_{\tilde{y}}|\tilde{y}}\}$, where $p_{\tilde{x}_{t}}$ and $p_{\tilde{y}}$ are the numbers of partitions of the universes of discourse associated with the linguistic variables $\tilde{x}_{t}$ and $\tilde{y}$, respectively. The variable $\tilde{x}_{t}$ belongs to the fuzzy set $F^{i}_{j|\tilde{x}_{t}}$ with a degree $\mu^{i}_{F_{j|\tilde{x}_{t}}}$ defined by the membership function $\mu^{i}_{\tilde{x}_{t}}: \mathbb{R} \to [0,1]$, where $\mu^{i}_{F_{j|\tilde{x}_{t}}} \in \{\mu^{i}_{F_{1|\tilde{x}_{t}}}, \mu^{i}_{F_{2|\tilde{x}_{t}}}, \ldots, \mu^{i}_{F_{p_{\tilde{x}_{t}}|\tilde{x}_{t}}}\}$. The variable $\tilde{y}$ belongs to the fuzzy set $G^{i}_{j|\tilde{y}}$ with a degree $\mu^{i}_{G_{j|\tilde{y}}}$ defined by the membership function $\mu^{i}_{\tilde{y}}: \mathbb{R} \to [0,1]$, where $\mu^{i}_{G_{j|\tilde{y}}} \in \{\mu^{i}_{G_{1|\tilde{y}}}, \mu^{i}_{G_{2|\tilde{y}}}, \ldots, \mu^{i}_{G_{p_{\tilde{y}}|\tilde{y}}}\}$. Each rule is interpreted by a fuzzy implication

$$\mathfrak{R}^{i}: \mu^{i}_{F_{j|\tilde{x}_{1}}} \star \mu^{i}_{F_{j|\tilde{x}_{2}}} \star \cdots \star \mu^{i}_{F_{j|\tilde{x}_{n}}} \to \mu^{i}_{G_{j|\tilde{y}}} \tag{2}$$

where $\star$ is a T-norm, $\mu^{i}_{F_{j|\tilde{x}_{1}}} \star \mu^{i}_{F_{j|\tilde{x}_{2}}} \star \cdots \star \mu^{i}_{F_{j|\tilde{x}_{n}}}$ is the fuzzy relation between the linguistic inputs, on the universes of discourse $\mathbb{U}_{\tilde{x}_{1}} \times \mathbb{U}_{\tilde{x}_{2}} \times \cdots \times \mathbb{U}_{\tilde{x}_{n}}$, and $\mu^{i}_{G_{j|\tilde{y}}}$ is the linguistic output defined on the universe of discourse $\mathbb{Y}$. Mamdani inference systems can represent MISO (Multiple Input and Single Output) systems directly, and the set of implications corresponds to a unique fuzzy relation in $\mathbb{U}_{\tilde{x}_{1}} \times \mathbb{U}_{\tilde{x}_{2}} \times \cdots \times \mathbb{U}_{\tilde{x}_{n}} \times \mathbb{Y}$ of the form

$$\mathfrak{R}_{MISO}: \bigvee_{i=1}^{l}\left[\mu^{i}_{F_{j|\tilde{x}_{1}}} \star \mu^{i}_{F_{j|\tilde{x}_{2}}} \star \cdots \star \mu^{i}_{F_{j|\tilde{x}_{n}}} \star \mu^{i}_{G_{j|\tilde{y}}}\right] \tag{3}$$

where $\vee$ is an S-norm.
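As a concrete numerical illustration (the values are assumptions, not from the text), the T-norm $\star$ and S-norm $\vee$ above are most commonly instantiated as min/product and max, respectively:

```python
# Common instantiations of the T-norm (star) and S-norm (vee) in eqs. (2)-(3);
# the membership degrees below are illustrative values only.

def t_min(a, b):  return min(a, b)   # Mamdani's min T-norm
def t_prod(a, b): return a * b       # product T-norm
def s_max(a, b):  return max(a, b)   # standard max S-norm

mu_a, mu_b = 0.7, 0.4                # degrees of two antecedent terms
assert t_min(mu_a, 1.0) == mu_a      # 1 is the identity element of a T-norm
assert s_max(mu_b, 0.0) == mu_b      # 0 is the identity element of an S-norm

firing_min  = t_min(mu_a, mu_b)      # 0.4
firing_prod = t_prod(mu_a, mu_b)     # ~0.28: product penalizes partial matches more
```

The choice between min and product changes how strongly a partially satisfied antecedent attenuates the rule, which is why both appear as standard options in practice.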

The fuzzy output $G(\tilde{y}_{m}) \,|\, [m=1,2,\ldots,r]$ is given by

$$G(\tilde{y}_{m}) = \mathfrak{R}_{MISO} \circ \left(\mu^{i}_{F_{j|\tilde{x}^{*}_{1}}} \star \mu^{i}_{F_{j|\tilde{x}^{*}_{2}}} \star \cdots \star \mu^{i}_{F_{j|\tilde{x}^{*}_{n}}}\right) \tag{4}$$

where $\circ$ is an inference-based composition operator, which can be of the *max-min* or *max-product* type, and $\tilde{x}^{*}_{t}$ is any point in $\mathbb{U}_{\tilde{x}_{t}}$. Mamdani inference systems can represent MIMO (Multiple Input and Multiple Output) systems with $r$ outputs by a set of $r$ coupled MISO sub-rule bases $\mathfrak{R}^{j}_{MISO} \,|\, [j=1,2,\ldots,l]$, that is,

$$\mathbf{G}(\tilde{\mathbf{y}}) = \mathfrak{R}_{MIMO} \circ \left(\mu^{i}_{F_{j|\tilde{x}^{*}_{1}}} \star \mu^{i}_{F_{j|\tilde{x}^{*}_{2}}} \star \cdots \star \mu^{i}_{F_{j|\tilde{x}^{*}_{n}}}\right) \tag{5}$$

with $\mathbf{G}(\tilde{\mathbf{y}}) = [G(\tilde{y}_{1}), \ldots, G(\tilde{y}_{r})]^{T}$ and

$$\mathfrak{R}_{MIMO}: \bigcup_{m=1}^{r}\left\{\bigvee_{i=1}^{l}\left[\mu^{i}_{F_{j|\tilde{x}_{1}}} \star \mu^{i}_{F_{j|\tilde{x}_{2}}} \star \cdots \star \mu^{i}_{F_{j|\tilde{x}_{n}}} \star \mu^{i}_{G_{j|\tilde{y}_{m}}}\right]\right\} \tag{6}$$

where the operator $\bigcup$ represents the set of all fuzzy relations $\mathfrak{R}^{j}_{MISO}$ associated with each output $\tilde{y}_{m}$.
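As a minimal executable sketch of the MISO case in eqs. (1)-(4), the following assumes triangular membership functions, min as the T-norm $\star$, max as the S-norm $\vee$, and max-min composition; the linguistic terms, set shapes, and numeric values are illustrative assumptions, not taken from the chapter.

```python
# Max-min Mamdani inference for a two-input, single-output (MISO) rule base,
# per eqs. (1)-(4): min as T-norm, max as S-norm, max-min composition.
# Linguistic terms and membership parameters are illustrative assumptions.

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Antecedent sets (shared by both inputs) and consequent sets.
IN_SETS = {"low": (-0.5, 0.0, 0.5), "high": (0.5, 1.0, 1.5)}
OUT_SETS = {"cold": (-25.0, 0.0, 25.0), "hot": (25.0, 50.0, 75.0)}

# Rule base: R1: IF x1 is low  AND x2 is low  THEN y is cold
#            R2: IF x1 is high AND x2 is high THEN y is hot
RULES = [(("low", "low"), "cold"), (("high", "high"), "hot")]

def fuzzy_output(x1, x2, y):
    """Membership degree of y in the fuzzy output G(y) for crisp inputs (x1, x2)."""
    degrees = []
    for (f1, f2), g in RULES:
        # Rule firing degree: T-norm (min) over the antecedent memberships, eq. (2).
        w = min(tri(x1, *IN_SETS[f1]), tri(x2, *IN_SETS[f2]))
        # Clip the consequent set by the firing degree: max-min composition, eq. (4).
        degrees.append(min(w, tri(y, *OUT_SETS[g])))
    # Aggregate over rules with the S-norm (max), eq. (3).
    return max(degrees)
```

For example, with $\tilde{x}^{*}_{1} = \tilde{x}^{*}_{2} = 0.75$ both "high" memberships equal 0.5, so the rule base yields $G(50) = 0.5$; in a complete controller a defuzzification step (e.g. centroid) would follow to produce a crisp output.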

#### **2.2 Takagi-Sugeno fuzzy inference systems**


*Gj*|*y*˜ ∈ {*μ<sup>i</sup>*

... *μ<sup>i</sup>*

 *l*

*i*=1 [*μi Fj*|*x*˜1 *μ<sup>i</sup> Fj*|*x*˜2

*<sup>G</sup>*(*y*˜*m*) = <sup>R</sup>*MISO* ◦ (*μ<sup>i</sup>*

<sup>G</sup>(y˜) = <sup>R</sup>*MIMO* ◦ (*μ<sup>i</sup>*

*i*=1 [*μi Fj*|*x*˜1 *μ<sup>i</sup> Fj*|*x*˜2

*r m*=1 { *l*

where the operator represents the set of all fuzzy relations <sup>R</sup>*<sup>j</sup>*

The Takagi-Sugeno fuzzy inference system uses, in the consequent proposition, a functional expression of the linguistic variables defined in the antecedent proposition (Takagi & Sugeno, 1985). Without loss of generality, the $i$-th rule, $i = 1, 2, \dots, l$, of this inference system, where $l$ is the maximum number of rules, is given by:

$$\mathbf{R}^{i}: \text{IF } \tilde{x}_1 \text{ is } F^i_{j|\tilde{x}_1} \text{ AND } \dots \text{ AND } \tilde{x}_n \text{ is } F^i_{j|\tilde{x}_n} \text{ THEN } \tilde{y}_i = f_i(\tilde{\mathbf{x}}) \tag{7}$$

The vector $\tilde{\mathbf{x}} \in \Re^n$ contains the linguistic variables of the antecedent proposition. Each linguistic variable has its own universe of discourse $\mathbb{U}_{\tilde{x}_1}, \dots, \mathbb{U}_{\tilde{x}_n}$, partitioned by fuzzy sets which represent the linguistic terms. The variable $\tilde{x}_t\big|_{t=1,2,\dots,n}$ belongs to the fuzzy set $F^i_{j|\tilde{x}_t}$ with value $\mu^i_{F_{j|\tilde{x}_t}}$ defined by a membership function $\mu^i_{\tilde{x}_t}: \mathbb{R} \rightarrow [0,1]$, with $\mu^i_{F_{j|\tilde{x}_t}} \in \{\mu^i_{F_{1|\tilde{x}_t}}, \mu^i_{F_{2|\tilde{x}_t}}, \dots, \mu^i_{F_{p_{\tilde{x}_t}}|\tilde{x}_t}\}$, where $p_{\tilde{x}_t}$ is the number of partitions of the universe of discourse associated to the linguistic variable $\tilde{x}_t$. The activation degree $h_i$ of the rule $i$ is given by:

$$h_i(\tilde{\mathbf{x}}) = \mu^i_{F_{j|\tilde{x}^*_1}} \star \mu^i_{F_{j|\tilde{x}^*_2}} \star \dots \star \mu^i_{F_{j|\tilde{x}^*_n}} \tag{8}$$

where $\tilde{x}^*_t$ is any point in $\mathbb{U}_{\tilde{x}_t}$. The normalized activation degree of the rule $i$ is defined as:

$$\gamma\_i(\tilde{\mathbf{x}}) = \frac{h\_i(\tilde{\mathbf{x}})}{\sum\_{r=1}^{l} h\_r(\tilde{\mathbf{x}})} \tag{9}$$

This normalization implies that

$$\sum\_{i=1}^{l} \gamma\_i(\tilde{\mathbf{x}}) = 1 \tag{10}$$

The response of the Takagi-Sugeno fuzzy inference system is a weighted sum of the functional expressions defined on the consequent proposition of each rule, that is, a convex combination of local functions *fi*:

$$y = \sum\_{i=1}^{l} \gamma\_i(\tilde{x}) f\_i(\tilde{x}) \tag{11}$$

Such an inference system can be seen as a linear parameter-varying system. In this sense, the Takagi-Sugeno fuzzy inference system can be considered a mapping from the antecedent (input) space to a convex region (polytope) defined by the local functional expressions in the consequent space. This property allows the Takagi-Sugeno fuzzy inference system to be analyzed as a robust system, which can be applied in the modeling and controller design of complex plants.
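As an illustration of equations (8)–(11), the following sketch computes the activation degrees with a product T-norm, normalizes them, and forms the convex combination of local models. It is a minimal sketch, not code from the chapter; the names (`tri`, `ts_infer`) and the triangular membership functions are illustrative assumptions:

```python
import numpy as np

def tri(x, a, b, c):
    # Triangular membership function on [a, c] with peak at b.
    return max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))

def ts_infer(x, rules):
    # rules: list of (membership_fns, f_i) pairs; h_i is the T-norm
    # (here: product) of the antecedent memberships, eq. (8).
    h = np.array([np.prod([mu(xt) for mu, xt in zip(mus, x)])
                  for mus, _ in rules])
    gamma = h / h.sum()          # normalized activation degrees, eq. (9)
    y_loc = np.array([f(x) for _, f in rules])
    return float(gamma @ y_loc)  # convex combination of local models, eq. (11)

# Two rules over a single input: local models y = x and y = 2x.
rules = [([lambda t: tri(t, -1.0, 0.0, 2.0)], lambda x: x[0]),
         ([lambda t: tri(t, 0.0, 2.0, 3.0)], lambda x: 2.0 * x[0])]
y = ts_infer([1.0], rules)
```

At `x = 1` both rules fire with equal degree, so the output interpolates halfway between the two local models, which is exactly the polytope-interpolation property discussed above.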

#### **3. Fuzzy computational modeling based control**

Many human skills are learned from examples, so it is natural to establish this "didactic principle" in a computer program, so that it can learn how to provide the desired output as a function of a given input. Computational intelligence techniques, basically derived from the theory of fuzzy systems and associated with computer programs, are able to process numerical data and/or linguistic information, and their parameters can be adjusted from examples. The examples represent how these systems should respond when subjected to a particular input. These techniques use a numeric representation of knowledge and demonstrate adaptability and fault tolerance, in contrast to the classical theory of artificial intelligence, which uses a symbolic representation of knowledge. Human knowledge, in turn, can be classified into two categories:


1. *Objective knowledge*: This kind of knowledge is used in the formulation of engineering problems and is defined by mathematical equations (the mathematical model of a submarine, aircraft or robot; statistical analysis of communication channel behaviour; Newton's laws for motion analysis and Kirchhoff's laws for circuit analysis).

2. *Subjective knowledge*: This kind of knowledge represents the linguistic information defined through sets of rules, knowledge from experts and design specifications, which are usually impossible to describe quantitatively.

Fuzzy systems are able to coordinate both types of knowledge to solve real problems. The need of experts and engineers to deal with increasingly complex control problems has enabled, via computational intelligence techniques, the identification and control of real plants that are difficult to model mathematically. Computational intelligence techniques, once related to classical and modern control techniques, allow the use of constraints in their formulation and the satisfaction of robustness and stability requirements in an efficient and practical form. The implementation of intelligent systems, especially since the 1970s, has been driven by the growing need to improve the efficiency of industrial control systems in the following aspects: increasing product quality, reducing losses, and other factors related to overcoming the shortcomings of classical identification and control methods. Intelligent identification and control methodologies are based on techniques motivated by biological systems and human intelligence, and have been introduced exploring alternative representation schemes based on natural language, rules, semantic networks or qualitative models.

The research on fuzzy inference systems has developed in two main directions. The first is linguistic or qualitative information, in which the fuzzy inference system is developed from a collection of rules (propositions). The second is quantitative information, related to the theory of classical and modern systems. The combination of qualitative and quantitative information, which is the main motivation for the use of intelligent systems, has resulted in several contributions on the stability and robustness of advanced control systems. In (Ding, 2011), output feedback predictive control for a fuzzy system with bounded noise is addressed. The controller optimizes an infinite-horizon objective function while respecting the input and state constraints. The control law is parameterized as a dynamic output feedback dependent on the membership functions, and closed-loop stability is specified by the notion of quadratic boundedness. In (Wang et al., 2011), the problem of fuzzy control design is considered for a class of nonlinear distributed parameter systems described by first-order hyperbolic partial differential equations (PDEs), where the control actuators are continuously distributed in space. The goal is to develop a fuzzy state-feedback control design methodology for these systems by combining PDE theory with concepts from Takagi-Sugeno fuzzy control. First, a Takagi-Sugeno fuzzy hyperbolic PDE model is proposed to accurately represent the nonlinear first-order hyperbolic PDE system. Subsequently, based on the Takagi-Sugeno fuzzy-PDE model, a Lyapunov technique is used to analyze the closed-loop exponential stability with a given decay rate. Then, a fuzzy state-feedback control design procedure is developed in terms of a set of spatial differential linear matrix inequalities (SDLMIs) derived from the resulting stability conditions. The developed design methodology is successfully applied to the control of a nonisothermal plug-flow reactor. In (Sadeghian & Fatehi, 2011), a nonlinear system identification method is used to predict and detect process faults of a cement rotary kiln at the White Saveh Cement Company. After selecting proper inputs and output, an input-output locally linear neuro-fuzzy (LLNF) model is identified for the plant at various operating points of the kiln. In (Li & Lee, 2011), an observer-based adaptive controller is developed from a hierarchical fuzzy-neural network (HFNN) to solve the controller time-delay problem for a class of multi-input multi-output (MIMO) non-affine nonlinear systems under the constraint that only system outputs are available for measurement. By using the implicit function theorem and Taylor series expansion, the observer-based control law and the weight update law of the HFNN adaptive controller are derived. According to the design of the HFNN, the observer-based adaptive controller can alleviate the online computation burden and can guarantee that all signals involved are bounded and that the outputs of the closed-loop system asymptotically track the desired output trajectories.

Fuzzy inference systems are widely found in the following areas: Control Applications - aircraft (Rockwell Corp.), cement industry and motor/valve control (Asea Brown Boveri Ltd.), water treatment and robot control (Fuji Electric), subway systems (Hitachi), on-board control (Nissan), washing machines (Matsushita, Hitachi), air conditioning systems (Mitsubishi); Medical Technology - cancer diagnosis (Kawasaki Medical School); Modeling and Optimization - prediction systems for earthquake recognition (Institute of Seismology, Bureau of Metrology, Japan); Signal Processing for Adjustment and Interpretation - vibration compensation in video cameras (Matsushita), video image stabilization (Matsushita/Panasonic), object and voice recognition (CSK, Hitachi, Hosa Univ., Ricoh), image adjustment on TVs (Sony). Due to this development, the many practical possibilities and the commercial success of their applications, the theory of fuzzy systems has gained wide acceptance both in the academic community and in industrial applications for modeling and advanced control systems design.

#### **4. Takagi-Sugeno fuzzy black box modeling**

This section illustrates the problem of black box modeling, also known as system identification, addressing the use of Takagi-Sugeno fuzzy inference systems. The nonlinear input-output representation is often used for building TS fuzzy models from data, where the regression vector is composed of a finite number of past inputs and outputs of the system. In this work, the nonlinear autoregressive with exogenous input (NARX) model structure is used. This structure is applied in most nonlinear identification methods, such as neural networks, radial basis functions, the cerebellar model articulation controller (CMAC), and also fuzzy logic. The NARX model establishes a relation between the collection of past scalar input-output data and the predicted output

$$y_{k+1} = F[y_k, \dots, y_{k-n_y+1}, u_k, \dots, u_{k-n_u+1}] \tag{12}$$

where $k$ denotes the discrete time sample, and $n_y$ and $n_u$ are integers related to the system's order. In terms of rules, the model is given by

$$\mathcal{R}^i: \text{IF } y_k \text{ is } F_1^i \text{ AND } \cdots \text{ AND } y_{k-n_y+1} \text{ is } F_{n_y}^i \text{ AND } u_k \text{ is } G_1^i \text{ AND } \cdots \text{ AND } u_{k-n_u+1} \text{ is } G_{n_u}^i$$

$$\text{THEN } \hat{y}_{k+1}^i = \sum_{j=1}^{n_y} a_{i,j} y_{k-j+1} + \sum_{j=1}^{n_u} b_{i,j} u_{k-j+1} + c_i \tag{13}$$

where $a_{i,j}$, $b_{i,j}$ and $c_i$ are the consequent parameters to be determined. The inference formula of the TS fuzzy model is a straightforward extension of (11) and is given by

$$y_{k+1} = \frac{\sum_{i=1}^{l} h_i(\mathbf{x})\hat{y}^i_{k+1}}{\sum_{i=1}^{l} h_i(\mathbf{x})} \tag{14}$$

or

$$y_{k+1} = \sum_{i=1}^{l} \gamma_i(\mathbf{x})\hat{y}^i_{k+1} \tag{15}$$

with

$$\mathbf{x} = [y_k, \dots, y_{k-n_y+1}, u_k, \dots, u_{k-n_u+1}] \tag{16}$$

and $h_i(\mathbf{x})$ is given by (8). This NARX model represents multiple input and single output (MISO) systems directly, and multiple input and multiple output (MIMO) systems in a decomposed form, as a set of coupled MISO models.
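A one-step-ahead prediction with this TS-NARX structure, built from the regressor of (16), the local affine consequents of (13), and the weighted sum of (15), can be sketched as follows. This is a minimal sketch: the Gaussian membership functions and all names (`narx_regressor`, `ts_narx_predict`) are illustrative assumptions, not part of the chapter:

```python
import numpy as np

def narx_regressor(y_hist, u_hist, ny, nu):
    # Regression vector x = [y_k, ..., y_{k-ny+1}, u_k, ..., u_{k-nu+1}], eq. (16).
    return np.concatenate([y_hist[:ny], u_hist[:nu]])

def ts_narx_predict(x, centers, sigma, theta):
    # h_i from Gaussian memberships (an illustrative choice), eqs. (8)-(9);
    # each row of theta holds the consequent [a_{i,1..ny}, b_{i,1..nu}, c_i], eq. (13).
    h = np.exp(-0.5 * np.sum((x - centers) ** 2, axis=1) / sigma ** 2)
    gamma = h / h.sum()
    y_loc = theta[:, :-1] @ x + theta[:, -1]   # local affine models
    return float(gamma @ y_loc)                # weighted sum, eq. (15)

# ny = nu = 1, two rules sharing the local model y_{k+1} = 0.5 y_k + u_k.
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
theta = np.array([[0.5, 1.0, 0.0], [0.5, 1.0, 0.0]])
x = narx_regressor(np.array([0.8]), np.array([0.2]), 1, 1)
y_next = ts_narx_predict(x, centers, 1.0, theta)
```

Because the normalized degrees sum to one, identical local models reproduce the common prediction exactly, whatever the memberships.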

#### **4.1 Antecedent parameters estimation problem**

The antecedent parameters can be estimated from experimental data by fuzzy clustering algorithms. A cluster is a group of similar objects, where "similarity" should be understood as mathematical similarity measured in some well-defined sense. In metric spaces, similarity is often defined by means of a distance norm, measured from a data vector to some cluster prototype (center). Data can reveal clusters of different geometric shapes, sizes and densities. Clusters can also be characterized as linear and nonlinear subspaces of the data space.

The objective of clustering is to partition the data set Z into $c$ clusters. Assume that $c$ is known, based on a priori knowledge. The fuzzy partition of Z can be defined as a family of subsets $\{A_i \,|\, 1 \leq i \leq c\} \subset P(Z)$, with the following properties:

$$\bigcup_{i=1}^{c} A_i = Z \tag{17}$$

$$A_i \cap A_j = \emptyset, \quad i \neq j \tag{18}$$


$$\emptyset \subset A_i \subset Z \tag{19}$$

Equation (17) means that the subsets $A_i$ collectively contain all the data in Z. The subsets must be disjoint, as stated by (18), and none of them is empty nor contains all the data in Z, as stated by (19). In terms of membership functions, $\mu_{A_i}$ is the membership function of $A_i$. To simplify the notation, $\mu_{ik}$ is used here instead of $\mu_i(z_k)$. The $c \times N$ matrix $\mathbf{U} = [\mu_{ik}]$ represents a fuzzy partitioning space if and only if:

$$M\_{fc} = \left\{ U \in \mathfrak{R}^{c \times N} | \mu\_{ik} \in [0, 1], \forall i, k; \sum\_{i=1}^{c} \mu\_{ik} = 1, \forall k; 0 < \sum\_{k=1}^{N} \mu\_{ik} < N, \forall i \right\} \tag{20}$$

The *i*-th row of the fuzzy partition matrix U contains values of the *i*-th membership function of the fuzzy subset *Ai* of Z. The clustering algorithm optimizes an initial set of centroids by minimizing a cost function *J* in an iterative process. This function is usually formulated as:

$$\mathbf{J}\left(\mathbf{Z};\mathbf{U},\mathbf{V},\mathbf{A}\right) = \sum\_{i=1}^{c} \sum\_{k=1}^{N} \mu\_{ik}^{m} D\_{ikA\_{i}}^{2} \tag{21}$$

where $\mathbf{Z} = \{z_1, z_2, \dots, z_N\}$ is a finite data set; $\mathbf{U} = [\mu_{ik}] \in M_{fc}$ is a fuzzy partition of $\mathbf{Z}$; $\mathbf{V} = \{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_c\}$, $\mathbf{v}_i \in \Re^n$, is a vector of cluster prototypes (centers); $\mathbf{A}$ denotes a $c$-tuple of norm-inducing matrices, $\mathbf{A} = (\mathbf{A}_1, \mathbf{A}_2, \dots, \mathbf{A}_c)$; and $D^2_{ik\mathbf{A}_i}$ is a squared inner-product distance norm. The exponent $m \in [1, \infty)$ is a weighting exponent which determines the fuzziness of the clusters. The clustering algorithms differ in the choice of the distance norm, which influences the clustering criterion by changing the measure of dissimilarity. The Euclidean norm induces hyperspherical clusters. It characterizes the FCM algorithm, where the norm-inducing matrix $\mathbf{A}_{i_{FCM}}$ is equal to the identity matrix ($\mathbf{A}_{i_{FCM}} = \mathbf{I}$), which strictly imposes a circular shape on all clusters. The Euclidean distance norm is given by:

$$D\_{ik\_{FCM}}^2 = (z\_k - v\_i)^T \mathbf{A}\_{i\_{FCM}} (z\_k - v\_i) \tag{22}$$
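A minimal FCM iteration, assuming the standard alternating-optimization updates for the cost (21) under the Euclidean distance (22), can be sketched as below. The function name, the random initialization, and the toy data are illustrative assumptions:

```python
import numpy as np

def fcm(Z, c, m=2.0, iters=50, seed=0):
    # Fuzzy c-means: alternately update the centers v_i and the partition
    # matrix U = [mu_ik] so that J in eq. (21) decreases; the distance is
    # the Euclidean norm of eq. (22).
    rng = np.random.default_rng(seed)
    N = Z.shape[0]
    U = rng.random((c, N))
    U /= U.sum(axis=0)                               # columns sum to 1, eq. (20)
    for _ in range(iters):
        Um = U ** m
        V = (Um @ Z) / Um.sum(axis=1, keepdims=True)  # weighted centers
        D2 = ((Z[None, :, :] - V[:, None, :]) ** 2).sum(axis=2) + 1e-12
        U = 1.0 / (D2 ** (1.0 / (m - 1)))             # mu_ik ∝ D_ik^{-2/(m-1)}
        U /= U.sum(axis=0)                            # re-normalize columns
    return U, V

# Two well-separated one-dimensional blobs around 0 and 10.
Z = np.array([[0.0], [0.1], [-0.1], [10.0], [10.1], [9.9]])
U, V = fcm(Z, c=2)
```

For well-separated data the prototypes converge close to the blob means, and each column of `U` remains a fuzzy membership distribution over the two clusters.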

The *GK algorithm* is characterized by an adaptive distance norm, which detects clusters of different geometrical shapes in a data set:

$$D\_{ik\_{GK}}^2 = \left(z\_k - v\_i\right)^T \mathbf{A}\_{i\_{GK}} \left(z\_k - v\_i\right) \tag{23}$$

In this algorithm, each cluster has its own norm-inducing matrix $\mathbf{A}_{i_{GK}}$, so each cluster adapts the distance norm to the local topological structure of the data set. $\mathbf{A}_{i_{GK}}$ is given by:

$$\mathbf{A}\_{i\_{\mathbb{C}K}} = \left[\rho\_i \det \left(\mathbf{F}\_i\right)\right]^{1/n} \mathbf{F}\_i^{-1} \,, \tag{24}$$

where $\rho_i$ is the cluster volume, usually fixed at 1, $n$ is the data dimension, and $\mathbf{F}_i$ is the fuzzy covariance matrix of the $i$-th cluster, defined by:

$$\mathbf{F}_i = \frac{\sum_{k=1}^{N} (\mu_{ik})^m (z_k - v_i)(z_k - v_i)^T}{\sum_{k=1}^{N} (\mu_{ik})^m} \tag{25}$$
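Given the memberships of one cluster, the fuzzy covariance (25) and the norm-inducing matrix (24) can be computed as sketched below (a minimal sketch; all names are illustrative). Note that, by construction, $\det(\mathbf{A}_{i_{GK}}) = \rho_i$, which is how the GK algorithm fixes each cluster's volume while letting its shape adapt:

```python
import numpy as np

def gk_norm_matrix(Z, mu_i, v_i, m=2.0, rho=1.0):
    # Fuzzy covariance matrix F_i of eq. (25), weighted by mu_ik^m.
    w = mu_i ** m
    d = Z - v_i
    F = (w[:, None] * d).T @ d / w.sum()
    # Norm-inducing matrix A_i of eq. (24); det(A_i) = rho by construction.
    n = Z.shape[1]
    A = (rho * np.linalg.det(F)) ** (1.0 / n) * np.linalg.inv(F)
    return F, A

def gk_dist2(z, v_i, A):
    # Squared GK distance of eq. (23).
    d = z - v_i
    return float(d @ A @ d)

# Unit-square data with uniform memberships: F_i = 0.25 I, so A_i = I.
Z = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
F_i, A_i = gk_norm_matrix(Z, np.ones(4), Z.mean(axis=0))
```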

The eigenstructure of the cluster covariance matrix provides information about the shape and orientation of the cluster. The ratio of the hyperellipsoid axes is given by the ratio of the square roots of the eigenvalues of $\mathbf{F}_i$, and the directions of the axes are given by the eigenvectors of $\mathbf{F}_i$. The eigenvector corresponding to the smallest eigenvalue determines the normal to the hyperplane, and it can be used to compute optimal local linear models from the covariance matrix. The fuzzy maximum likelihood estimates (FLME) algorithm employs a distance norm based on maximum likelihood estimates:

$$D_{ik_{FLME}} = \frac{\sqrt{\det(\mathbf{F}_{i_{FLME}})}}{P_i} \exp\left[\frac{1}{2}(\mathbf{z}_k - \mathbf{v}_i)^T \mathbf{F}_{i_{FLME}}^{-1} (\mathbf{z}_k - \mathbf{v}_i)\right] \tag{26}$$

Note that, contrary to the GK algorithm, this distance norm involves an exponential term and thus decreases faster than the inner-product norm. $\mathbf{F}_{i_{FLME}}$ denotes the fuzzy covariance matrix of the $i$-th cluster, given by (25). When $m$ is equal to 1, the strict FLME algorithm is obtained; if $m$ is greater than 1, an extended FLME, or Gath-Geva (GG), algorithm is obtained. Gath and Geva reported that the FLME algorithm is able to detect clusters of varying shapes, sizes and densities, because the cluster covariance matrix is used in conjunction with an "exponential" distance and the clusters are not constrained in volume. $P_i$ is the prior probability of selecting cluster $i$, given by:

$$P\_i = \frac{1}{N} \sum\_{k=1}^{N} \left(\mu\_{ik}\right)^m \tag{27}$$
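The FLME distance (26) and the prior (27) can be sketched as follows. This is a minimal sketch: the determinant-based scaling follows the standard Gath-Geva form, and the function names are illustrative assumptions:

```python
import numpy as np

def flme_distance(z, v_i, F_i, P_i):
    # FLME distance of eq. (26): exponential of half the Mahalanobis term,
    # scaled by sqrt(det F_i) / P_i.
    d = z - v_i
    maha = d @ np.linalg.inv(F_i) @ d
    return float(np.sqrt(np.linalg.det(F_i)) / P_i * np.exp(0.5 * maha))

def cluster_prior(mu_i, m=2.0):
    # Prior probability of selecting cluster i, eq. (27):
    # P_i = (1/N) * sum_k mu_ik^m.
    return float(np.mean(mu_i ** m))
```

At the cluster center the exponential term equals 1, so the distance reduces to $\sqrt{\det(\mathbf{F}_i)}/P_i$: clusters with larger priors attract points more strongly.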

#### **4.2 Consequent parameters estimation problem**

The inference formula of the TS fuzzy model in (15) can be expressed as

$$\begin{aligned} y_{k+1} = {}& \gamma_1(\mathbf{x}_k)[a_{1,1}y_k + \dots + a_{1,n_y}y_{k-n_y+1} + b_{1,1}u_k + \dots + b_{1,n_u}u_{k-n_u+1} + c_1] \\ & + \gamma_2(\mathbf{x}_k)[a_{2,1}y_k + \dots + a_{2,n_y}y_{k-n_y+1} + b_{2,1}u_k + \dots + b_{2,n_u}u_{k-n_u+1} + c_2] \\ & \;\;\vdots \\ & + \gamma_l(\mathbf{x}_k)[a_{l,1}y_k + \dots + a_{l,n_y}y_{k-n_y+1} + b_{l,1}u_k + \dots + b_{l,n_u}u_{k-n_u+1} + c_l] \end{aligned} \tag{28}$$

which is linear in the consequent parameters $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$. For a set of $N$ available input-output data pairs $\{(\mathbf{x}_k, y_k) \,|\, k = 1, 2, \dots, N\}$, the following vectorial form is obtained

$$\mathbf{Y} = [\psi\_1 \mathbf{X}, \psi\_2 \mathbf{X}, \dots, \psi\_l \mathbf{X}] \boldsymbol{\theta} + \boldsymbol{\Xi} \tag{29}$$

where $\psi_i = \mathrm{diag}(\gamma_i(\mathbf{x}_k)) \in \Re^{N \times N}$, $\mathbf{X} = [\mathbf{y}_k, \dots, \mathbf{y}_{k-n_y+1}, \mathbf{u}_k, \dots, \mathbf{u}_{k-n_u+1}, \mathbf{1}] \in \Re^{N \times (n_y+n_u+1)}$, $\mathbf{Y} \in \Re^{N \times 1}$, $\boldsymbol{\Xi} \in \Re^{N \times 1}$ and $\boldsymbol{\theta} \in \Re^{l(n_y+n_u+1) \times 1}$ are the normalized membership degree matrix of (9), the data matrix, the output vector, the approximation error vector and the estimated parameter vector, respectively. If the variables associated with the unknown parameters were *exactly known* quantities, the least squares method could be used directly. However, in practice, and in the present context, the elements of $\mathbf{X}$ are not exactly known quantities, so the output can be expressed as

$$y\_k = \chi\_k^T \boldsymbol{\theta} + \eta\_k \tag{30}$$

where, at the $k$-th sampling instant, $\boldsymbol{\chi}_k^T = [\gamma_k^1(\mathbf{x}_k + \boldsymbol{\xi}_k), \dots, \gamma_k^l(\mathbf{x}_k + \boldsymbol{\xi}_k)]$ is the vector of data with errors in variables, $\mathbf{x}_k = [y_{k-1}, \dots, y_{k-n_y}, u_{k-1}, \dots, u_{k-n_u}, 1]^T$ is the vector of exactly known data, e.g., noise-free input-output data, $\boldsymbol{\xi}_k$ is a vector of noise associated with the observation of $\mathbf{x}_k$, and $\eta_k$ is a disturbance noise.

The normal equations are formulated as

$$\left[\sum_{j=1}^{k} \boldsymbol{\chi}_j \boldsymbol{\chi}_j^T\right] \hat{\boldsymbol{\theta}}_k = \sum_{j=1}^{k} \boldsymbol{\chi}_j y_j \tag{31}$$

and multiplying by $\frac{1}{k}$ gives

$$\left\{\frac{1}{k}\sum\_{j=1}^{k}[\gamma\_{j}^{1}(\boldsymbol{x}\_{j}+\boldsymbol{\xi}\_{j}),\ldots,\gamma\_{j}^{l}(\boldsymbol{x}\_{j}+\boldsymbol{\xi}\_{j})][\gamma\_{j}^{1}(\boldsymbol{x}\_{j}+\boldsymbol{\xi}\_{j}),\ldots,\gamma\_{j}^{l}(\boldsymbol{x}\_{j}+\boldsymbol{\xi}\_{j})]^{T}\right\}\hat{\boldsymbol{\theta}}\_{k} = $$

$$\frac{1}{k}\sum\_{j=1}^{k}[\gamma\_{j}^{1}(\boldsymbol{x}\_{j}+\boldsymbol{\xi}\_{j}),\ldots,\gamma\_{j}^{l}(\boldsymbol{x}\_{j}+\boldsymbol{\xi}\_{j})]y\_{j}\tag{32}$$

Noting that $y_j = \boldsymbol{\chi}_j^T \boldsymbol{\theta} + \eta_j$,

$$\begin{aligned} &\left\{\frac{1}{k}\sum_{j=1}^{k}[\gamma_j^1(\mathbf{x}_j+\boldsymbol{\xi}_j),\dots,\gamma_j^l(\mathbf{x}_j+\boldsymbol{\xi}_j)][\gamma_j^1(\mathbf{x}_j+\boldsymbol{\xi}_j),\dots,\gamma_j^l(\mathbf{x}_j+\boldsymbol{\xi}_j)]^T\right\}\hat{\boldsymbol{\theta}}_k = \\ &\qquad \frac{1}{k}\sum_{j=1}^{k}[\gamma_j^1(\mathbf{x}_j+\boldsymbol{\xi}_j),\dots,\gamma_j^l(\mathbf{x}_j+\boldsymbol{\xi}_j)][\gamma_j^1(\mathbf{x}_j+\boldsymbol{\xi}_j),\dots,\gamma_j^l(\mathbf{x}_j+\boldsymbol{\xi}_j)]^T\boldsymbol{\theta} \\ &\qquad + \frac{1}{k}\sum_{j=1}^{k}[\gamma_j^1(\mathbf{x}_j+\boldsymbol{\xi}_j),\dots,\gamma_j^l(\mathbf{x}_j+\boldsymbol{\xi}_j)]\eta_j \end{aligned} \tag{33}$$

and


$$\tilde{\boldsymbol{\theta}}\_{k} = \left\{ \frac{1}{k} \sum\_{j=1}^{k} [\gamma\_{j}^{1}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j}), \dots, \gamma\_{j}^{l}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j})] [\gamma\_{j}^{1}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j}), \dots, \gamma\_{j}^{l}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j})]^{T} \right\}^{-1} \frac{1}{k} \sum\_{j=1}^{k} [\gamma\_{j}^{1}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j}), \dots, \gamma\_{j}^{l}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j})]\eta\_{j} \tag{34}$$

where $\tilde{\theta}\_k = \hat{\theta}\_k - \theta$ is the parameter error. Taking the probability limit as $k \to \infty$,

$$\text{p.lim } \tilde{\theta}\_k = \text{p.lim } \left\{ \left[\frac{1}{k} \mathbf{C}\_k\right]^{-1} \frac{1}{k} \mathbf{b}\_k \right\} \tag{35}$$

with

$$\begin{aligned} \mathbf{C}\_{k} &= \sum\_{j=1}^{k} [\gamma\_{j}^{1}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j}), \dots, \gamma\_{j}^{l}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j})] [\gamma\_{j}^{1}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j}), \dots, \gamma\_{j}^{l}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j})]^{T} \\ \mathbf{b}\_{k} &= \sum\_{j=1}^{k} [\gamma\_{j}^{1}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j}), \dots, \gamma\_{j}^{l}(\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j})] \boldsymbol{\eta}\_{j} \end{aligned}$$

Applying Slutsky's theorem and assuming that the elements of $\frac{1}{k}\mathbf{C}\_k$ and $\frac{1}{k}\mathbf{b}\_k$ converge in probability, we have

$$\text{p.lim } \tilde{\theta}\_k = \left[\text{p.lim } \frac{1}{k} \mathbf{C}\_k\right]^{-1} \text{p.lim } \frac{1}{k} \mathbf{b}\_k \tag{36}$$

Thus,

$$\begin{aligned} \text{p.lim}\,\frac{1}{k}\mathbf{C}\_{k} &= \text{p.lim}\,\frac{1}{k}\sum\_{j=1}^{k}[\gamma\_{j}^{1}(\mathbf{x}\_{j}+\boldsymbol{\xi}\_{j}),\ldots,\gamma\_{j}^{l}(\mathbf{x}\_{j}+\boldsymbol{\xi}\_{j})][\gamma\_{j}^{1}(\mathbf{x}\_{j}+\boldsymbol{\xi}\_{j}),\ldots,\gamma\_{j}^{l}(\mathbf{x}\_{j}+\boldsymbol{\xi}\_{j})]^{T} \\\\ \text{p.lim}\,\frac{1}{k}\mathbf{C}\_{k} &= \text{p.lim}\,\frac{1}{k}\sum\_{j=1}^{k}(\gamma\_{j}^{1})^{2}(\mathbf{x}\_{j}+\boldsymbol{\xi}\_{j})(\mathbf{x}\_{j}+\boldsymbol{\xi}\_{j})^{T} + \dots + \text{p.lim}\,\frac{1}{k}\sum\_{j=1}^{k}(\gamma\_{j}^{l})^{2}(\mathbf{x}\_{j}+\boldsymbol{\xi}\_{j})(\mathbf{x}\_{j}+\boldsymbol{\xi}\_{j})^{T} \end{aligned}$$

Assuming x*<sup>j</sup>* and ξ*<sup>j</sup>* statistically independent,

$$\begin{aligned} \text{p.lim } \frac{1}{k} \mathbf{C}\_{k} &= \text{p.lim } \frac{1}{k} \sum\_{j=1}^{k} (\gamma\_{j}^{1})^{2} [\mathbf{x}\_{j}\mathbf{x}\_{j}^{T} + \boldsymbol{\xi}\_{j}\boldsymbol{\xi}\_{j}^{T}] + \dots + \text{p.lim } \frac{1}{k} \sum\_{j=1}^{k} (\gamma\_{j}^{l})^{2} [\mathbf{x}\_{j}\mathbf{x}\_{j}^{T} + \boldsymbol{\xi}\_{j}\boldsymbol{\xi}\_{j}^{T}] \\\\ \text{p.lim } \frac{1}{k} \mathbf{C}\_{k} &= \text{p.lim } \frac{1}{k} \sum\_{j=1}^{k} \mathbf{x}\_{j}\mathbf{x}\_{j}^{T} [(\gamma\_{j}^{1})^{2} + \dots + (\gamma\_{j}^{l})^{2}] + \text{p.lim } \frac{1}{k} \sum\_{j=1}^{k} \boldsymbol{\xi}\_{j}\boldsymbol{\xi}\_{j}^{T} [(\gamma\_{j}^{1})^{2} + \dots + (\gamma\_{j}^{l})^{2}] \end{aligned} \tag{37}$$

with $\sum\_{i=1}^{l} \gamma\_{j}^{i} = 1$. Hence, the asymptotic analysis of the TS fuzzy model consequent parameter estimation is based on a weighted sum of the fuzzy covariance matrices of x and ξ. Similarly,

$$\text{p.lim } \frac{1}{k} \mathbf{b}\_k = \text{p.lim } \frac{1}{k} \sum\_{j=1}^k [\gamma\_j^1 (\mathbf{x}\_j + \boldsymbol{\xi}\_j), \dots, \gamma\_j^l (\mathbf{x}\_j + \boldsymbol{\xi}\_j)] \eta\_j$$

$$\text{p.lim } \frac{1}{k} \mathbf{b}\_k = \text{p.lim } \frac{1}{k} \sum\_{j=1}^k [\gamma\_j^1 \boldsymbol{\xi}\_j \eta\_j, \dots, \gamma\_j^l \boldsymbol{\xi}\_j \eta\_j] \tag{38}$$

Substituting (37) and (38) into (36) results in

$$\begin{aligned} \text{p.lim}\,\tilde{\theta}\_{k} = \{&\text{p.lim}\,\frac{1}{k}\sum\_{j=1}^{k}\boldsymbol{x}\_{j}\boldsymbol{x}\_{j}^{T}[(\gamma\_{j}^{1})^{2} + \dots + (\gamma\_{j}^{l})^{2}] \\ +\, &\text{p.lim}\,\frac{1}{k}\sum\_{j=1}^{k}\boldsymbol{\xi}\_{j}\boldsymbol{\xi}\_{j}^{T}[(\gamma\_{j}^{1})^{2} + \dots + (\gamma\_{j}^{l})^{2}]\}^{-1}\,\text{p.lim}\,\frac{1}{k}\sum\_{j=1}^{k}[\gamma\_{j}^{1}\boldsymbol{\xi}\_{j}\eta\_{j}, \dots, \gamma\_{j}^{l}\boldsymbol{\xi}\_{j}\eta\_{j}] \end{aligned} \tag{39}$$

with $\sum\_{i=1}^{l} \gamma\_{j}^{i} = 1$. For the case of only one rule ($l = 1$), the analysis simplifies to the standard linear one, with $\gamma\_{j}^{i} = 1$ for $j = 1, \dots, k$. Thus, this analysis, which is a contribution of this article, extends the standard linear analysis, and several studies can follow from it on fuzzy filtering and modeling in a noisy environment, fuzzy signal enhancement in communication channels, and so forth.
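As a numerical illustration of this conclusion, the following sketch (a hypothetical single-rule case, $l = 1$, with assumed signal and noise variances, not taken from the chapter) shows that the least squares bias caused by regressor noise does not vanish as the number of samples grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def ls_slope(k, theta=2.0, noise_std=0.5):
    """LS estimate of theta in y = theta*x when only the noisy
    regressor w = x + xi is observed (single rule, l = 1)."""
    x = rng.standard_normal(k)               # noise-free regressor, var = 1
    xi = noise_std * rng.standard_normal(k)  # regressor noise, var = 0.25
    w = x + xi                               # observed noisy regressor
    y = theta * x
    return np.sum(w * y) / np.sum(w * w)

# The probability limit of the estimate is theta * 1/(1 + 0.25) = 1.6,
# not theta = 2, and the gap persists no matter how large k becomes.
est_small, est_large = ls_slope(10_000), ls_slope(1_000_000)
```

Increasing the sample size by two orders of magnitude tightens the estimate around the *biased* limit 1.6, not around the true parameter.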


As a result, the fuzzy covariance matrix $\sum\_{j=1}^{k} \mathbf{x}\_{j}\mathbf{x}\_{j}^{T}[(\gamma\_{j}^{1})^{2} + \dots + (\gamma\_{j}^{l})^{2}]$ will also be non-singular and its inverse will exist. Thus, the only way in which the asymptotic error can be zero is for $\boldsymbol{\xi}\_{j}\eta\_{j}$ to be identically zero. But ξ*j* and *ηj* are, in general, correlated, so the asymptotic error will not be zero and the least squares estimates will be asymptotically biased to an extent determined by the ratio of the noise variance to the signal variance. In other words, the least squares method is not appropriate for estimating the TS fuzzy model consequent parameters in a noisy environment, because the estimates will be inconsistent and the bias error will remain no matter how much data is used in the estimation.

As a consequence of this analysis, the definition of the vector $[\beta\_{j}^{1}\mathbf{z}\_{j}, \dots, \beta\_{j}^{l}\mathbf{z}\_{j}]$ as the *fuzzy instrumental variable vector*, or simply the *fuzzy instrumental variable* (FIV), is proposed. Clearly, with the FIV vector used in the form suggested, it becomes possible to eliminate the asymptotic bias while preserving the existence of a solution. However, the statistical efficiency of the solution depends on the degree of correlation between $[\beta\_{j}^{1}\mathbf{z}\_{j}, \dots, \beta\_{j}^{l}\mathbf{z}\_{j}]$ and $[\gamma\_{j}^{1}\mathbf{x}\_{j}, \dots, \gamma\_{j}^{l}\mathbf{x}\_{j}]$. In particular, the lowest-variance estimates obtained from this approach occur only when $\mathbf{z}\_{j} = \mathbf{x}\_{j}$ and $\beta\_{j}^{i} = \gamma\_{j}^{i}$ for $i = 1, \dots, l$ and $j = 1, \dots, k$, i.e., when the $\mathbf{z}\_{j}$ are equal to the "noise free" dynamic system variables, which are unavailable in practice. Depending on the situation, several fuzzy instrumental variables can be chosen. An effective choice of FIV is one based on the delayed input sequence

$$\mathbf{z}\_j = [u\_{k-\tau}, \dots, u\_{k-\tau-n}, u\_k, \dots, u\_{k-n}]^T$$

where *τ* is chosen so that the elements of the fuzzy covariance matrix $\mathbf{C}\_{zx}$ are maximized. In this case, the input signal is considered persistently exciting, i.e., it continuously perturbs or excites the system. Another FIV is one based on the delayed input-output sequence

$$\mathbf{z}\_j = [y\_{k-1-dl}, \dots, y\_{k-n\_y-dl}, u\_{k-1-dl}, \dots, u\_{k-n\_u-dl}]^T$$

where *dl* is the applied delay. Yet another FIV is one based on the input-output data of a "fuzzy auxiliary model" with the same structure as the one used to identify the nonlinear dynamic system. Thus,

$$\mathbf{z}\_j = [\hat{y}\_{k-1}, \dots, \hat{y}\_{k-n\_y}, u\_{k-1}, \dots, u\_{k-n\_u}]^T$$

where $\hat{y}\_k$ is the output of the fuzzy auxiliary model and $u\_k$ is the input of the dynamic system. The inference formula of this fuzzy auxiliary model is given by

$$\begin{aligned} \hat{y}\_{k+1} &= \beta\_1(\mathbf{z}\_k)[\alpha\_{1,1}\hat{y}\_k + \dots + \alpha\_{1,n\_y}\hat{y}\_{k-n\_y+1} + \rho\_{1,1}u\_k + \dots + \rho\_{1,n\_u}u\_{k-n\_u+1} + \delta\_1]\; + \\ &\quad\;\beta\_2(\mathbf{z}\_k)[\alpha\_{2,1}\hat{y}\_k + \dots + \alpha\_{2,n\_y}\hat{y}\_{k-n\_y+1} + \rho\_{2,1}u\_k + \dots + \rho\_{2,n\_u}u\_{k-n\_u+1} + \delta\_2]\; + \\ &\quad\;\vdots \\ &\quad\;\beta\_l(\mathbf{z}\_k)[\alpha\_{l,1}\hat{y}\_k + \dots + \alpha\_{l,n\_y}\hat{y}\_{k-n\_y+1} + \rho\_{l,1}u\_k + \dots + \rho\_{l,n\_u}u\_{k-n\_u+1} + \delta\_l] \end{aligned} \tag{40}$$

which is also linear in the consequent parameters: *α*, *ρ* and *δ*. The closer these parameters are to the actual, but unknown, system parameters (a, b, c), the more correlated $\mathbf{z}\_k$ and $\mathbf{x}\_k$ will be, and the closer the obtained FIV estimates will be to the optimum.
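The bias-removal idea behind the FIV can be sketched numerically for a single rule ($l = 1$) using the delayed-input instrument described above; the signals and noise levels here are illustrative assumptions, not the chapter's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(1)
k, theta = 200_000, 2.0

u = rng.standard_normal(k + 1)                 # persistently exciting input
x = u[1:] + 0.5 * u[:-1]                       # noise-free regressor
w = x + 0.7 * rng.standard_normal(k)           # observed noisy regressor
y = theta * x + 0.3 * rng.standard_normal(k)   # noisy output

z = u[1:]                                      # instrument: the input itself
theta_ls = np.sum(w * y) / np.sum(w * w)       # biased by the regressor noise
theta_iv = np.sum(z * y) / np.sum(z * w)       # consistent: z is correlated
                                               # with x but not with the noise
```

Because the instrument is correlated with the noise-free regressor but uncorrelated with the measurement noise, the IV estimate converges to the true parameter while the LS estimate does not.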

#### **4.2.1 Batch processing scheme**

The normal equations are formulated as

$$\sum\_{j=1}^{k} [\beta\_j^1 \mathbf{z}\_j, \dots, \beta\_j^l \mathbf{z}\_j] [\gamma\_j^1 (\mathbf{x}\_j + \boldsymbol{\xi}\_j), \dots, \gamma\_j^l (\mathbf{x}\_j + \boldsymbol{\xi}\_j)]^T \hat{\boldsymbol{\theta}}\_k - \sum\_{j=1}^{k} [\beta\_j^1 \mathbf{z}\_j, \dots, \beta\_j^l \mathbf{z}\_j] y\_j = \mathbf{0} \tag{41}$$

or, with $\zeta\_j = [\beta\_j^1 \mathbf{z}\_j, \dots, \beta\_j^l \mathbf{z}\_j]$,

$$\left[\sum\_{j=1}^{k}\zeta\_{j}\chi\_{j}^{T}\right]\hat{\theta}\_{k} - \sum\_{j=1}^{k}\zeta\_{j}y\_{j} = 0 \tag{42}$$

so that the FIV estimate is obtained as

$$\hat{\boldsymbol{\theta}}\_{k} = \{ \sum\_{j=1}^{k} [\beta\_{j}^{1} \boldsymbol{z}\_{j}, \dots, \beta\_{j}^{l} \boldsymbol{z}\_{j}] [\gamma\_{j}^{1} (\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j}), \dots, \gamma\_{j}^{l} (\boldsymbol{x}\_{j} + \boldsymbol{\xi}\_{j})]^{T} \}^{-1} \sum\_{j=1}^{k} [\beta\_{j}^{1} \boldsymbol{z}\_{j}, \dots, \beta\_{j}^{l} \boldsymbol{z}\_{j}] y\_{j} \tag{43}$$

and, in vectorial form, the problem of interest may be stated as

$$\boldsymbol{\hat{\theta}} = (\boldsymbol{\Gamma}^T \boldsymbol{\Sigma})^{-1} \boldsymbol{\Gamma}^T \mathbf{Y} \tag{44}$$

where $\Gamma^T \in \mathbb{R}^{l(n\_y+n\_u+1)\times N}$ is the fuzzy extended instrumental variable matrix with rows given by $\zeta\_j$, $\Sigma \in \mathbb{R}^{N\times l(n\_y+n\_u+1)}$ is the fuzzy extended data matrix with rows given by $\chi\_j$, $\mathbf{Y} \in \mathbb{R}^{N\times 1}$ is the output vector, and $\hat{\theta} \in \mathbb{R}^{l(n\_y+n\_u+1)\times 1}$ is the parameters vector. The models can be obtained by the following two approaches:

• *Global approach* : In this approach all linear consequent parameters are estimated simultaneously, minimizing the criterion:

$$\hat{\theta} = \arg\min \parallel \Gamma^T \Sigma \theta - \Gamma^T \mathbf{Y} \parallel\_2^2 \tag{45}$$

• *Local approach* : In this approach the consequent parameters are estimated for each rule *i*, and hence independently of each other, minimizing a set of weighted local criteria (*i* = 1, 2, . . . , *l*):

$$\hat{\boldsymbol{\theta}}\_{i} = \arg\min \parallel \mathbf{Z}^{T} \boldsymbol{\Psi}\_{i} \mathbf{X} \boldsymbol{\theta}\_{i} - \mathbf{Z}^{T} \boldsymbol{\Psi}\_{i} \mathbf{Y} \parallel\_{2}^{2} \tag{46}$$

where Z*<sup>T</sup>* has rows given by z*<sup>j</sup>* and Ψ*<sup>i</sup>* is the normalized membership degree diagonal matrix according to z*j*.

*Example 1.* To help the reader understand the definitions of global and local fuzzy model estimation, consider the following second-order polynomial:

$$y = 2u\_k^2 - 4u\_k + 3\tag{47}$$

where $u\_k$ is the input and $y\_k$ is the output. The TS fuzzy model used to approximate this polynomial has the following structure, with 2 rules:

$$R^i: \text{ IF } u\_k \text{ is } F\_i \text{ THEN } y\_k = a\_{i,0} + a\_{i,1} u\_k$$

where *i* = 1, 2. The points $u\_k = -0.5$ and $u\_k = 0.5$ were chosen to analyze the consequent models obtained by global and local estimation, and triangular membership functions were defined for $-0.5 \leq u\_k \leq 0.5$ in the antecedent. The following rules were obtained:

Local estimation:


$$R^1: \text{ IF } u\_k \text{ is } -0.5 \text{ THEN } \hat{y} = 3.1000 - 4.4012u\_k$$
$$R^2: \text{ IF } u\_k \text{ is } +0.5 \text{ THEN } \hat{y} = 3.1000 - 3.5988u\_k$$

Global estimation:

$$R^1: \text{ IF } u\_k \text{ is } -0.5 \text{ THEN } \hat{y} = 4.6051 - 1.7503u\_k$$
$$R^2: \text{ IF } u\_k \text{ is } +0.5 \text{ THEN } \hat{y} = 1.3464 + 0.3807u\_k$$

The application of local and global estimation to the TS fuzzy model results in the consequent models given in Fig. 1. The consequent models obtained by local estimation properly describe the local behavior of the function, and the fuzzy model can easily be interpreted in terms of the local behavior (the rule consequents). The consequent models obtained by global estimation are not relevant for the local behavior of the nonlinear function. The fit of the function is

Fig. 1. The nonlinear function and the result of global (top) and local (bottom) estimation of the consequent parameters of the TS fuzzy models.

shown in Fig. 2. The global estimation gives a good fit and a minimal prediction error, but it biases the estimates of the consequents as parameters of local models. Local estimation yields a larger prediction error than global estimation, but it gives locally relevant consequent parameters. This is the tradeoff between local and global estimation. All the results of Example 1 extend to any nonlinear estimation problem and will be considered in the analysis of the computational and experimental results in this work.
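The tradeoff can be reproduced with a short sketch of Example 1. The specific triangular membership functions and the uniform sampling of $[-0.5, 0.5]$ are assumptions, and the local criterion is used in its classical weighted form (i.e., with Z = X in (46)), so the fitted numbers will not exactly match the rules above:

```python
import numpy as np

# Assumed setup for Example 1: two triangular membership functions
# centered at -0.5 and +0.5, affine consequents [1, u].
u = np.linspace(-0.5, 0.5, 101)
y = 2 * u**2 - 4 * u + 3

g2 = (u + 0.5) / 1.0          # membership of the rule centered at +0.5
g1 = 1.0 - g2                 # membership of the rule centered at -0.5
X = np.column_stack([np.ones_like(u), u])

# Local estimation: one weighted LS problem per rule
theta_loc = [np.linalg.lstsq(np.sqrt(g)[:, None] * X,
                             np.sqrt(g) * y, rcond=None)[0]
             for g in (g1, g2)]

# Global estimation: all consequents fitted at once
Phi = np.column_stack([g1[:, None] * X, g2[:, None] * X])
theta_glob = np.linalg.lstsq(Phi, y, rcond=None)[0]

def predict(t1, t2):
    return g1 * (X @ t1) + g2 * (X @ t2)

sse_loc = np.sum((y - predict(*theta_loc))**2)
sse_glob = np.sum((y - predict(theta_glob[:2], theta_glob[2:]))**2)
```

With linear blending of two affine consequents the global model class contains the quadratic exactly, so the global fit is essentially perfect while the locally fitted consequents trade some global accuracy for local interpretability.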

Fig. 2. The nonlinear function approximation result by global (top) and local (bottom) estimation of the consequent parameters of the TS fuzzy models.

#### **4.2.2 Recursive processing scheme**

An online FIV scheme can be obtained by utilizing the recursive solution of the FIV equations and then updating the fuzzy auxiliary model continuously on the basis of these recursive consequent parameter estimates. The FIV estimate in (43) can take the form

$$
\hat{\theta}\_k = \mathbf{P}\_k \mathbf{b}\_k \tag{48}
$$

where

$$\mathbf{P}\_k = \{ \sum\_{j=1}^k [\beta\_j^1 \mathbf{z}\_j, \dots, \beta\_j^l \mathbf{z}\_j] [\gamma\_j^1 (\mathbf{x}\_j + \boldsymbol{\xi}\_j), \dots, \gamma\_j^l (\mathbf{x}\_j + \boldsymbol{\xi}\_j)]^T \}^{-1}$$

and

$$\mathbf{b}\_k = \sum\_{j=1}^{k} [\beta\_j^1 \mathbf{z}\_j, \dots, \beta\_j^l \mathbf{z}\_j] y\_j$$

which can be expressed as

$$\mathbf{P}\_k^{-1} = \mathbf{P}\_{k-1}^{-1} + [\beta\_k^1 \mathbf{z}\_k, \dots, \beta\_k^l \mathbf{z}\_k][\gamma\_k^1 (\mathbf{x}\_k + \boldsymbol{\xi}\_k), \dots, \gamma\_k^l (\mathbf{x}\_k + \boldsymbol{\xi}\_k)]^T \tag{49}$$

and

$$\mathbf{b}\_{k} = \mathbf{b}\_{k-1} + [\boldsymbol{\beta}\_{k}^{1} \mathbf{z}\_{k} \dots \boldsymbol{\beta}\_{k}^{l} \mathbf{z}\_{k}] y\_{k} \tag{50}$$

$$\mathbf{P}\_{k-1} = \mathbf{P}\_k + \mathbf{P}\_k[\beta\_k^1 \mathbf{z}\_k, \dots, \beta\_k^l \mathbf{z}\_k][\gamma\_k^1 (\mathbf{x}\_k + \boldsymbol{\xi}\_k), \dots, \gamma\_k^l (\mathbf{x}\_k + \boldsymbol{\xi}\_k)]^T \mathbf{P}\_{k-1} \tag{51}$$

Post-multiplying (51) by the FIV vector $[\beta\_k^1 \mathbf{z}\_k, \dots, \beta\_k^l \mathbf{z}\_k]$ results in

$$\begin{aligned} \mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}] &= \mathbf{P}\_{k}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}] \\ &\quad+ \mathbf{P}\_{k}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}][\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}] \end{aligned} \tag{52}$$

$$\mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}] = \mathbf{P}\_{k}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\{1 + [\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\} \tag{53}$$

Then, post-multiplying by

$$\{1 + [\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\}^{-1}[\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1} \tag{54}$$

we obtain


$$\begin{aligned} \mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\{1 + [\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\}^{-1} \\ [\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1} = \mathbf{P}\_{k}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}][\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1} \end{aligned} \tag{55}$$

Substituting (51) in (55), we have

$$\begin{aligned} \mathbf{P}\_{k} = \mathbf{P}\_{k-1} - \mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\{1 + [\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1} \\ [\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\}^{-1}[\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1} \end{aligned} \tag{56}$$

Substituting (56) and (50) in (48), the recursive consequent parameters estimates will be:

$$\begin{aligned} \hat{\boldsymbol{\theta}}\_{k} = \{&\mathbf{P}\_{k-1} - \mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\{1 + [\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\}^{-1} \\ &[\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1}\}\{\mathbf{b}\_{k-1} + [\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]y\_{k}\} \end{aligned} \tag{57}$$

so that finally,

$$\hat{\theta}\_{k} = \hat{\theta}\_{k-1} - \mathbf{K}\_{k}\{[\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\hat{\theta}\_{k-1} - y\_{k}\} \tag{58}$$

where

$$\mathbf{K}\_{k} = \mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\{1 + [\gamma\_{k}^{1}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k}), \dots, \gamma\_{k}^{l}(\mathbf{x}\_{k} + \boldsymbol{\xi}\_{k})]^{T}\mathbf{P}\_{k-1}[\beta\_{k}^{1}\mathbf{z}\_{k}, \dots, \beta\_{k}^{l}\mathbf{z}\_{k}]\}^{-1} \tag{59}$$

Equations (56)-(59) compose the recursive algorithm to be implemented so that the consequent parameters of a Takagi-Sugeno fuzzy model can be estimated from experimental data.
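A minimal sketch of this recursion follows, with generic regressor ($\chi\_j$), instrument ($\zeta\_j$) and output sequences assumed for illustration; $P\_0$ is initialized as a large multiple of the identity, a common choice rather than the chapter's prescription. The recursive result is checked against the batch solution (43):

```python
import numpy as np

def fiv_rls(chis, zetas, ys, delta=1e6):
    """Recursive instrumental-variable updates of (56)-(59):
    chis are the (noisy) regressor vectors chi_j, zetas the FIV
    vectors zeta_j, ys the measured outputs y_j."""
    n = chis.shape[1]
    P = delta * np.eye(n)       # P_0: large initial "covariance"
    theta = np.zeros(n)
    for chi, zeta, y in zip(chis, zetas, ys):
        denom = 1.0 + chi @ P @ zeta
        K = P @ zeta / denom                          # gain, eq. (59)
        theta = theta - K * (chi @ theta - y)         # update, eq. (58)
        P = P - np.outer(P @ zeta, chi @ P) / denom   # eq. (56)
    return theta

# Sanity check against the batch FIV solution on synthetic data
rng = np.random.default_rng(2)
chis = rng.standard_normal((500, 3))
zetas = chis + 0.01 * rng.standard_normal((500, 3))  # instruments ~ regressors
ys = chis @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.standard_normal(500)

theta_rec = fiv_rls(chis, zetas, ys)
theta_batch = np.linalg.solve(zetas.T @ chis, zetas.T @ ys)
```

The rank-one update of P is the Sherman-Morrison form of (49), so after processing all samples the recursive estimate agrees with the batch estimate up to the negligible effect of the $P\_0^{-1}$ initialization.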

#### **5. Results**

In the sequel, some results will be presented to demonstrate the effectiveness of black box fuzzy modeling for advanced control systems design.

#### **5.1 Computational results**

#### **5.1.1 Stochastic nonlinear SISO system identification**

The plant to be identified consists of a second-order, highly nonlinear discrete-time system

$$y\_{k+1} = \frac{y\_k y\_{k-1} (y\_k + 2.5)}{1 + y\_k^2 + y\_{k-1}^2} + u\_k + e\_k \tag{60}$$

which is a benchmark problem in neural and fuzzy modeling, where $y_k$ is the output and $u_k = \sin(2\pi k/25)$ is the applied input. In this case $e_k$ is white noise with zero mean and variance $\sigma^2$. The TS model has two inputs, $y_k$ and $y_{k-1}$, and a single output $y_{k+1}$, and the antecedent part of the fuzzy model (the fuzzy sets) is designed based on the evolving clustering method (ECM). The model is composed of rules of the form:

$$\mathcal{R}^l: \text{ IF } y\_k \text{ is } F\_1^l \text{ AND } y\_{k-1} \text{ is } F\_2^l \text{ THEN}$$

$$\hat{y}\_{k+1}^{l} = a\_{l,1}y\_k + a\_{l,2}y\_{k-1} + b\_{l,1}u\_k + c\_l \tag{61}$$

where $F\_{1,2}^{l}$ are Gaussian fuzzy sets.
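The benchmark plant (60) is easy to simulate for data generation. The sketch below is a possible implementation; the function name and the zero initial conditions are assumptions for illustration:

```python
import numpy as np

def simulate_plant(N=500, sigma2=0.1, seed=0):
    """Generate identification data from the benchmark plant (60):
    y_{k+1} = y_k y_{k-1}(y_k + 2.5) / (1 + y_k^2 + y_{k-1}^2) + u_k + e_k,
    with u_k = sin(2*pi*k/25) and e_k white noise of variance sigma2.
    """
    rng = np.random.default_rng(seed)
    u = np.sin(2.0 * np.pi * np.arange(N) / 25.0)    # applied input
    e = rng.normal(0.0, np.sqrt(sigma2), N)          # zero-mean white noise
    y = np.zeros(N + 1)                              # y[0] = y[1] = 0 assumed
    for k in range(1, N):
        y[k + 1] = (y[k] * y[k - 1] * (y[k] + 2.5)
                    / (1.0 + y[k] ** 2 + y[k - 1] ** 2) + u[k] + e[k])
    return u, y
```

Running it with `sigma2 = 0` gives the noise-free nominal output against which the noisy-data models are compared below.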

Experimental data sets of $N$ points each are created from (60), with $\sigma^2 \in [0, 0.20]$. This means that the applied noise takes values between 0 and ±30% of the nominal output value, which is an acceptable practical noise level. These data sets are presented to the proposed algorithm, to obtain an IV fuzzy model, and to the LS-based algorithm, to obtain an LS fuzzy model. The models are obtained by the global and local approaches as in (45) and (46), respectively. The noise influence is analyzed according to the difference between the outputs of the fuzzy models, obtained from the noisy experimental data, and the output of the plant without noise. The antecedent parameters and the structure of the fuzzy models are the same in all experiments, while the consequent parameters are obtained by the proposed method and by the LS method. Thus, the obtained results are due to these algorithms, and conclusions about the accuracy of the proposed algorithm in the presence of noise can be derived. Two criteria, widely used in experimental data analysis, are applied to evaluate the fit of the obtained fuzzy models: Variance Accounted For (VAF)

$$\mathbf{VAF}(\%) = 100 \times \left[ 1 - \frac{var(\mathbf{Y} - \hat{\mathbf{Y}})}{var(\mathbf{Y})} \right] \tag{62}$$

where **Y** is the nominal plant output, **Ŷ** is the fuzzy model output, and *var* denotes signal variance; and Mean Square Error (MSE)

$$\mathbf{MSE} = \frac{1}{N} \sum\_{k=1}^{N} (y\_k - \hat{y}\_k)^2 \tag{63}$$


where $y_k$ is the nominal plant output, $\hat{y}_k$ is the fuzzy model output, and $N$ is the number of points. Once these values are obtained, a comparative analysis is established between the proposed IV-based algorithm and the LS-based algorithm, according to the approaches presented above. For the evaluation of the TS models obtained off-line according to (45) and (46), the number of points is 500, the proposed algorithm uses $\lambda$ equal to 0.99, the number of rules is 4, the structure is the one presented in (61), and the antecedent parameters are obtained by the ECM method for both algorithms. The proposed algorithm performs better than the LS algorithm for both approaches, as it is more robust to noise. This is due to the chosen instrumental variable matrix, with $d_l = 1$, which satisfies the convergence conditions as well as possible. In the global approach, for low noise variance, both algorithms present similar performance, with VAF and MSE of 99.50% and 0.0071 for the proposed algorithm and of 99.56% and 0.0027 for the LS-based algorithm, respectively. However, when the noise variance increases, the chosen instrumental variable matrix still satisfies the convergence conditions, and as a consequence the proposed algorithm remains more robust to the noise, with VAF and MSE of 98.81% and 0.0375. On the other hand, the LS-based algorithm presents VAF and MSE of 82.61% and 0.4847, respectively, which represents poor performance. A similar analysis applies to the local approach: increasing the noise variance, both algorithms still present good performance, although the error measures increase as well. This is due to the polytope property, whereby the obtained models represent local approximations, giving more flexibility in curve fitting. The proposed algorithm presented VAF and MSE values of 93.70% and 0.1701 for the worst case and of 96.3% and 0.0962 for the best case.
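The two fit criteria of eqs. (62)-(63) are straightforward to compute; a minimal sketch (function names are ours):

```python
import numpy as np

def vaf(y, y_hat):
    """Variance Accounted For, eq. (62), in percent."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return 100.0 * (1.0 - np.var(y - y_hat) / np.var(y))

def mse(y, y_hat):
    """Mean Square Error, eq. (63)."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return float(np.mean((y - y_hat) ** 2))
```

A perfect fit gives VAF = 100% and MSE = 0; VAF degrades toward (and below) 0% as the residual variance approaches the output variance.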
The LS-based algorithm presented VAF and MSE values of 92.4% and 0.2042 for the worst case and of 95.5% and 0.1157 for the best case. The worst-case noisy data set was also used with the algorithm proposed in (Wang & Langari, 1995), where the VAF and MSE values were 92.6452% and 0.1913, and with the algorithm proposed in (Pedrycz, 2006), where the VAF and MSE values were 92.5216% and 0.1910, respectively. These results, considering the local approach, show that those algorithms have an intermediate performance between the method proposed in this chapter and the LS-based algorithm. For the global approach, the VAF and MSE values are 96.5% and 0.09 for the proposed method and 81.4% and 0.52 for the LS-based algorithm, respectively. For the local approach, the VAF and MSE values are 96.0% and 0.109 for the proposed method and 95.5% and 0.1187 for the LS-based algorithm, respectively. To make the results of local and global estimation of the TS fuzzy model clear to the reader, the following conclusions hold. When interpreting TS fuzzy models obtained from data, one has to be aware of the trade-offs between local and global estimation: the TS fuzzy models estimated by the local approach properly describe the local behavior of the nonlinear system, but do not give a good global fit; for the global approach, the opposite holds, as a good global fit is obtained, but the TS fuzzy models are not representative of the local behavior of the nonlinear system. To illustrate the robustness of the FIV algorithm, a numerical experiment based on 300 different realizations of noise was performed. The numerical experiment followed this computational pattern:


• Aggregate the results of the IV and LS algorithms, according to the VAF and MSE criteria, into histograms indicating the number of occurrences (frequency) of each score during the numerical experiment.

Fig. 3. Robustness analysis: Histogram of VAF for the IV and LS based algorithms.

The IV and LS based algorithms were subjected to these different noise conditions at the same time, and their efficiency was observed through the VAF and MSE criteria, according to the histograms shown in Fig. 3 and Fig. 4, respectively. Clearly, the proposed method presented the best results compared with the LS-based algorithm. For the global approach, the VAF and MSE values are 98.60 ± 1.25% and 0.037 ± 0.02 for the proposed method and 84.70 ± 0.65% and 0.38 ± 0.15 for the LS-based algorithm, respectively. For the local approach, the VAF and MSE values are 96.70 ± 0.55% and 0.07 ± 0.015 for the proposed method and 95.30 ± 0.15% and 0.1150 ± 0.005 for the LS-based algorithm, respectively. In general, from the results shown in Tab. 1, it can be concluded that the proposed method compares favorably with existing techniques and has good robustness properties for identification of stochastic nonlinear systems.
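The Monte Carlo procedure above can be sketched as follows. The estimator and data-generator interfaces (`identify_iv`, `identify_ls`, `simulate`) are placeholders standing in for the actual FIV/LS algorithms and the benchmark plant:

```python
import numpy as np

def vaf_score(y, y_hat):
    # Variance Accounted For, eq. (62), in percent
    return 100.0 * (1.0 - np.var(y - y_hat) / np.var(y))

def robustness_experiment(identify_iv, identify_ls, simulate, n_runs=300):
    """Monte Carlo robustness study over independent noise realizations.

    `simulate(seed)` returns (input, noisy output, noise-free output);
    each `identify_*(u, y_noisy)` returns the fitted model output.
    """
    scores = {"IV": [], "LS": []}
    for run in range(n_runs):
        u, y_noisy, y_clean = simulate(seed=run)   # fresh noise realization
        for name, identify in (("IV", identify_iv), ("LS", identify_ls)):
            y_hat = identify(u, y_noisy)           # model fitted on noisy data
            scores[name].append(vaf_score(y_clean, y_hat))
    # summarize as mean/std, as reported in the text; the raw score lists
    # would be histogrammed as in Figs. 3-4
    return {k: (float(np.mean(v)), float(np.std(v))) for k, v in scores.items()}
```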

#### **5.2 Experimental results**

In this section, experimental results on adaptive model-based control of a multivariable (two inputs and one output) nonlinear pH process, commonly found in industrial environments, are presented.

Fig. 4. Robustness analysis: Histogram of MSE for the IV and LS based algorithms.

#### **5.2.1 Fuzzy adaptive black box fuzzy model based control of pH neutralization process**

The input-output experimental data set of the nonlinear plant was obtained from the DAISY<sup>1</sup> (Data Acquisition for Identification of Systems) platform.

This plant presents the following input-output variables:

• *u*1(*t*): acid solution flow (*l*);

• *u*2(*t*): base solution flow;

• *y*(*t*): pH in the tank.


Figure 5 shows the open-loop temporal response of the plant, considering a sampling time of 10 seconds. These data will be used for modeling the process, and the obtained fuzzy model will be used for indirect multivariable adaptive fuzzy control design. The TS fuzzy inference system uses a functional expression of the pH level in the tank. The *i*-th rule (*i* = 1, 2, ..., *l*) of the multivariable TS fuzzy model, where *l* is the number of rules, is given by:

$$\mathbb{R}^{i}: \mathbf{IF}\,\tilde{Y}(z)z^{-1} \text{ is } F\_{j|\tilde{Y}(z)z^{-1}}^{i}\mathbf{THEN}$$

$$Y^{i}(z) = \frac{b\_{1}^{i}}{1 - a\_{1}^{i}z^{-1} - a\_{2}^{i}z^{-2}} U\_{1}(z) + \frac{b\_{2}^{i}}{1 - a\_{1}^{i}z^{-1} - a\_{2}^{i}z^{-2}} U\_{2}(z) \tag{64}$$

<sup>1</sup> Accessed at http://homes.esat.kuleuven.be/~smc/daisy.

Fig. 5. Open loop temporal response of the nonlinear pH process

The fuzzy C-means clustering algorithm was used to estimate the antecedent parameters of the TS fuzzy model, and the fuzzy recursive instrumental variable algorithm based on QR factorization was used to estimate the consequent submodel parameters. For the initial estimation, 100 points were used, the number of rules was *l* = 2, and the fuzzy frequency response validation method was used for fuzzy controller design based on the inverse model (Serra & Ferreira, 2011).

The parameters of the submodels in the consequent proposition of the multivariable TS fuzzy model are shown in Figure 6. It is observed that, in addition to nonlinearity, the pH neutralization process presents uncertain behavior, which compromises any fixed-gain control design.

Fig. 6. TS fuzzy model parameters estimated by the fuzzy instrumental variable algorithm based on QR factorization

The TS multivariable fuzzy model, at the last sample, is given by:


$$R^1: \text{IF } y(k-1) \text{ is } F^1 \text{ THEN}$$

$$y^1(k) = 1.1707y(k-1) - 0.2187y(k-2) + 0.0372u\_1(k) + 0.1562u\_2(k)$$

$$R^2: \text{IF } y(k-1) \text{ is } F^2 \text{ THEN}$$

$$y^2(k) = 1.0919y(k-1) - 0.1861y(k-2) + 0.0304u\_1(k) + 0.4663u\_2(k)\tag{65}$$
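Evaluating the identified two-rule model (65) amounts to a membership-weighted average of the two linear consequents. The sketch below illustrates this inference; the Gaussian centers and width are only illustrative (the actual antecedent sets were obtained by fuzzy C-means clustering):

```python
import numpy as np

# Consequent parameters of the two rules identified in eq. (65)
RULES = [
    dict(a1=1.1707, a2=-0.2187, b1=0.0372, b2=0.1562),
    dict(a1=1.0919, a2=-0.1861, b1=0.0304, b2=0.4663),
]

def ts_output(y1, y2, u1, u2, centers=(4.0, 9.0), sigma=2.0):
    """Weighted-average inference for the two-rule TS model of eq. (65).

    y1 = y(k-1), y2 = y(k-2). The Gaussian membership `centers`/`sigma`
    are assumed values for illustration only.
    """
    # Gaussian membership of the antecedent variable y(k-1) in each rule
    w = np.array([np.exp(-0.5 * ((y1 - c) / sigma) ** 2) for c in centers])
    # Linear consequent of each rule, eq. (65)
    y_rules = np.array([r["a1"] * y1 + r["a2"] * y2 + r["b1"] * u1 + r["b2"] * u2
                        for r in RULES])
    return float(w @ y_rules / w.sum())
```

Because the output is a convex combination of the rule consequents, it always lies between the two local model predictions.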

Fig. 7. Recursive estimation processing for the submodel parameters in the TS multivariable fuzzy model consequent proposition.

The validation of the TS fuzzy model, according to equation (65), via the fuzzy frequency response is shown in Figure 8. The efficiency of the proposed identification algorithm in tracking the output variable of the pH neutralization process can be observed. This result is of fundamental importance for the multivariable adaptive fuzzy controller design step. The region of uncertainty defined by the fuzzy frequency response of the identified model contains the frequency response of the pH process. This means that the fuzzy model represents the dynamic behavior well, considering the uncertainties and nonlinearities of the pH neutralization process. Consequently, the model-based control design presents robust stability characteristics. The adaptive control design methodology adopted in this chapter consists of a control action based on the inverse model. Once the plant model is known precisely through the rules of the multivariable TS fuzzy model, and considering that the submodels are stable, one can develop a strategy to control the flows of acid and base in order to maintain a pH level of 7. Thus, the multivariable fuzzy controller is designed so that the closed-loop control system presents unity gain and the output equals the reference. This yields:

$$\mathcal{G}\_{MF}(z) = \frac{Y(z)}{\mathcal{R}(z)} = \frac{\mathcal{G}\_{c\_1}^{i}\mathcal{G}\_{p\_1}^{i} + \mathcal{G}\_{c\_2}^{i}\mathcal{G}\_{p\_2}^{i}}{1 + \mathcal{G}\_{c\_1}^{i}\mathcal{G}\_{p\_1}^{i} + \mathcal{G}\_{c\_2}^{i}\mathcal{G}\_{p\_2}^{i}} \tag{66}$$

where *G<sup>i</sup> <sup>c</sup>*<sup>1</sup> <sup>e</sup> *<sup>G</sup><sup>i</sup> <sup>c</sup>*<sup>2</sup> are the transfer functions of the controllers in the *<sup>i</sup>*-th rule, as *<sup>G</sup><sup>i</sup> <sup>p</sup>*<sup>1</sup> and *<sup>G</sup><sup>i</sup> p*2 are submodels in the consequent proposition from the output *Y*(*z*) to inputs *U*1(*z*) and *U*2(*z*),

Fig. 8. Validation step of the multivariable TS fuzzy model: (a) - (b) Fuzzy frequency response of the TS fuzzy model (black curve) representing the dynamic behavior of the pH level and the flow of acid solution (red curve), (c) - (d) Fuzzy frequency response of the TS fuzzy model (black curve) representing the dynamic behavior of the pH level and flow of the base (red curve).

*Gi <sup>c</sup>*<sup>1</sup> <sup>=</sup> <sup>1</sup> *Gi p*1

*Gi <sup>c</sup>*<sup>2</sup> <sup>=</sup> <sup>1</sup> *Gi p*2

respectively. Considering

and

results:

$$G\_{MF}(z) = \frac{Y(z)}{\mathcal{R}(z)} = \frac{2}{3} \tag{67}$$

that is,

$$Y(z) = \frac{2}{3}R(z)\tag{68}$$

To compensate for this closed-loop gain of the control system, it is necessary to generate a reference signal so that $Y(z) = R(z)$. Therefore, adopting the new reference signal $R'(z) = \frac{3}{2}R(z)$, it yields:

$$Y(z) = \frac{2}{3}R'(z) \tag{69}$$

$$Y(z) = \frac{2}{3} \frac{3}{2} \mathcal{R}(z) \tag{70}$$

$$Y(z) = \mathcal{R}(z)\tag{71}$$

For the inverse model based indirect multivariable fuzzy control design, one adopts the new reference signal given by $R'(z) = \frac{3}{2}R(z)$. The TS fuzzy multivariable controller presents the following structure:


$$R^i: \text{IF } \tilde{Y}(z)z^{-1} \text{ is } F\_{j|\tilde{Y}(z)z^{-1}}^{i} \text{ THEN}$$

$$G^i\_{c\_1} = \frac{1 - \hat{a}^i\_1 z^{-1} - \hat{a}^i\_2 z^{-2}}{\hat{b}^i\_1} E(z)$$

$$G^i\_{c\_2} = \frac{1 - \hat{a}^i\_1 z^{-1} - \hat{a}^i\_2 z^{-2}}{\hat{b}^i\_2} E(z) \tag{72}$$

The temporal response of the TS fuzzy multivariable adaptive control is shown in Fig. 9. It can be observed that the control system tracks the reference signal, *pH* = 7, because the controller tunes itself based on the identified TS fuzzy multivariable model.

Fig. 9. Performance of the TS fuzzy multivariable adaptive control system.
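The inverse-model control law (72), combined with the reference scaling of eqs. (69)-(71), can be realized in the time domain as below. This is a sketch for a single rule; the class name and the placement of the 3/2 scaling inside the error computation are our assumptions for illustration:

```python
class InverseModelController:
    """Sketch of the inverse-model control law of eq. (72) for one rule.

    With submodel G_p = b / (1 - a1 z^-1 - a2 z^-2), the controller is
    G_c = (1 - a1 z^-1 - a2 z^-2) / b acting on the error E(z); the
    reference is pre-scaled by 3/2 to compensate the 2/3 closed-loop
    gain, eqs. (69)-(71).
    """

    def __init__(self, a1, a2, b1, b2):
        self.a1, self.a2, self.b1, self.b2 = a1, a2, b1, b2
        self.e1 = 0.0   # e(k-1)
        self.e2 = 0.0   # e(k-2)

    def step(self, r, y):
        e = 1.5 * r - y                          # error w.r.t. scaled reference
        num = e - self.a1 * self.e1 - self.a2 * self.e2
        u1, u2 = num / self.b1, num / self.b2    # acid and base flow commands
        self.e2, self.e1 = self.e1, e            # shift error memory
        return u1, u2
```

In the adaptive scheme, the coefficients `a1, a2, b1, b2` would be refreshed at each sample from the recursively identified TS fuzzy model.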

#### **6. References**


Chalam, V.V. (1987). *Adaptive Control Systems: Techniques and Applications*. Marcel Dekker, ISBN 0-82-477650-X, New York, United States of America

Castillo-Toledo, B. and Meda-Campaña, A. The Fuzzy Discrete-Time Robust Regulation Problem: An LMI Approach. *IEEE Transactions on Fuzzy Systems*, Vol. 12, No. 3, June 2004, 360-367, ISSN 1063-6706

Ding, B. Dynamic Output Feedback Predictive Control for Nonlinear Systems Represented by a Takagi-Sugeno Model. *IEEE Transactions on Fuzzy Systems*, Vol. 19, No. 5, October 2011, 831-843, ISSN 1063-6706

Grigorie, L. (2010). *Fuzzy Controllers, Theory and Applications*. Intech, ISBN 978-953-307-543-3

Hellendoorn, H. and Driankov, D. *Fuzzy Model Identification: Selected Approaches*. Springer Verlag, ISBN 978-3540627210, Berlin-Heidelberg

Ioannou, P.A. and Sun, J. (1996). *Robust Adaptive Control*. Prentice Hall, ISBN 978-0134391007, Upper Saddle River, New Jersey

Isermann, R. and Münchhof, M. (2011). *Identification of Dynamic Systems: An Introduction with Applications*. Springer-Verlag, 1st Edition, ISBN 978-3-540-78878-2, Berlin-Heidelberg

Isidori, A. (1995). *Nonlinear Control Systems*. Springer Verlag, 3rd Edition, ISBN 978-3540199168, Berlin-Heidelberg

Kadmiry, B. and Driankov, D. A Fuzzy Gain-Scheduler for the Attitude Control of an Unmanned Helicopter. *IEEE Transactions on Fuzzy Systems*, Vol. 12, No. 3, August 2004, 502-515, ISSN 1063-6706

Khalil, H. *Nonlinear Systems*. Prentice Hall, 3rd Edition, ISBN 0-13-067389-7, Upper Saddle River, New Jersey

Lee, D.H., Park, J.B. and Joo, Y.H. Approaches to extended non-quadratic stability and stabilization conditions for discrete-time Takagi-Sugeno fuzzy systems. *Automatica*, Vol. 47, No. 3, March 2011, 534-538, ISSN 0005-1098

Lewis, F.L. and Syrmos, V.L. (1995). *Optimal Control*. John Wiley & Sons - IEEE, 2nd Edition, ISBN 0-471-03378-2, United States of America

Li, I. and Lee, L.W. Hierarchical Structure of Observer-Based Adaptive Fuzzy-Neural Controller for MIMO systems. *Fuzzy Sets and Systems*, Vol. 185, No. 1, December 2011, 52-82, ISSN 0165-0114

Ljung, L. (1999). *System Identification: Theory for the User*. Prentice Hall, 2nd Edition, ISBN 0-13-656695-2, Upper Saddle River, New Jersey

Sadeghian, M. and Fatehi, A. Identification, prediction and detection of the process fault in a cement rotary kiln by locally linear neuro-fuzzy technique. *Journal of Process Control*, Vol. 21, No. 2, February 2011, 302-308, ISSN 0959-1524

Sastry, S. and Bodson, M. (1989). *Adaptive Control: Stability, Convergence, and Robustness*. Prentice Hall Advanced Reference Series, ISBN 0-13-004326-5, Englewood Cliffs, New Jersey

Vukadinovic, D. (2011). *Fuzzy Control Systems*. Nova Science Publishers, ISBN 978-1-61324-488-3

Wang, J.W., Wu, H.N. and Li, H.X. Distributed Fuzzy Control Design of Nonlinear Hyperbolic PDE Systems With Application to Nonisothermal Plug-Flow Reactor. *IEEE Transactions on Fuzzy Systems*, Vol. 19, No. 3, June 2011, 514-526, ISSN 1063-6706

Wang, L. and Langari, R. Building Sugeno Type Models Using Fuzzy Discretization and Orthogonal Parameter Estimation Techniques. *IEEE Transactions on Fuzzy Systems*, Vol. 3, No. 4, November 1995, 454-458, ISSN 1063-6706

Yoneyama, J. *H*∞ Control for Fuzzy Time-Delay Systems via Descriptor System. *Proceedings of the IEEE International Symposium on Intelligent Control*, pp. 407-412, ISBN 0-7803-8635-3, Taipei, Taiwan, September 2004

Zadeh, L.A. Fuzzy Sets. *Information and Control*, Vol. 8, No. 3, June 1965, 338-353, ISSN 0019-9958

Zadeh, L.A. Outline of a New Approach to the Analysis of Complex Systems and Decision Processes. *IEEE Transactions on Systems, Man and Cybernetics*, Vol. 3, No. 1, January 1973, 28-44, ISSN 0018-9472

Zhu, Y. (2011). *Multivariable System Identification For Process Control*. Elsevier, 1st Edition, ISBN 978-0-08-043985-3

### **Online Adaptive Learning Solution of Multi-Agent Differential Graphical Games**

Kyriakos G. Vamvoudakis1 and Frank L. Lewis2

*1Center for Control, Dynamical-Systems, and Computation (CCDC), University of California, Santa Barbara, 2Automation and Robotics Research Institute, The University of Texas at Arlington, USA* 

#### **1. Introduction**

Distributed networks have received much attention in recent years because of their flexibility and computational performance. The ability to coordinate agents is important in many real-world tasks where agents must exchange information with each other. Synchronization behavior among agents is found in flocking of birds, schooling of fish, and other natural systems. Work has been done to develop cooperative control methods for consensus and synchronization (Fax and Murray, 2004; Jadbabaie, Lin and Morse, 2003; Olfati-Saber and Murray, 2004; Qu, 2009; Ren, Beard, and Atkins, 2005; Ren and Beard, 2005; Ren and Beard, 2008; Tsitsiklis, 1984). See (Olfati-Saber, Fax, and Murray, 2007; Ren, Beard, and Atkins, 2005) for surveys. Leaderless consensus results in all nodes converging to a common value that cannot generally be controlled. We call this the cooperative regulator problem. On the other hand, the cooperative tracking problem requires that all nodes synchronize to a leader or control node (Hong, Hu, and Gao, 2006; Li, Wang, and Chen, 2004; Ren, Moore, and Chen, 2007; Wang and Chen, 2002). This has been called pinning control or control with a virtual leader. Consensus has been studied for systems on communication graphs with fixed or varying topologies and communication delays.

Game theory provides an ideal environment in which to study multi-player decision and control problems, and offers a wide range of challenging and engaging problems. Game theory (Tijs, 2003) has been successful in modeling strategic behavior, where the outcome for each player depends on the actions of that player and all the other players. Each player independently chooses a control to minimize its own performance objective. Multi-player cooperative games rely on solving coupled Hamilton-Jacobi (HJ) equations, which in the linear quadratic case reduce to coupled algebraic Riccati equations (Basar and Olsder, 1999; Freiling, Jank, and Abou-Kandil, 2002; Gajic and Li, 1988). Solution methods are generally offline and generate fixed control policies that are then implemented in online controllers in real time. These coupled equations are difficult to solve.

Reinforcement learning (RL) is a sub-area of machine learning concerned with how to methodically modify the actions of an agent (player) based on observed responses from its environment (Sutton, and Barto, 1998). RL methods have allowed control systems researchers to develop algorithms to learn online in real time the solutions to optimal control problems for dynamic systems that are described by difference or ordinary differential equations. These involve a computational intelligence technique known as Policy Iteration (PI) (Bertsekas, and Tsitsiklis, 1996), which refers to a class of algorithms with two steps, *policy evaluation* and *policy improvement*. PI has primarily been developed for discrete-time systems, and online implementation for control systems has been developed through approximation of the value function (Bertsekas, and Tsitsiklis, 1996; Werbos, 1974; Werbos, 1992). PI provides effective means of learning solutions to HJ equations online. In control theoretic terms, the PI algorithm amounts to learning the solution to a nonlinear Lyapunov equation, and then updating the policy through minimizing a Hamiltonian function. Policy Iteration techniques have been developed for continuous-time systems in (Vrabie, Pastravanu, Lewis, and Abu-Khalaf, 2009).
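The two-step PI scheme described above can be made concrete in the linear quadratic case, where policy evaluation reduces to a Lyapunov equation and policy improvement to a gain update. The function below is an illustrative sketch under those assumptions (the function name and the fixed-point Lyapunov solver are ours, not from the chapter):

```python
import numpy as np

def policy_iteration_lqr(A, B, Q, R, K0, iters=30):
    """Policy Iteration for the discrete-time LQR problem, assuming the
    initial gain K0 is stabilizing.

    Alternates policy evaluation (solving a Lyapunov equation for the
    value of the current gain K) with policy improvement (the gain that
    minimizes the Hamiltonian/Q-function for that value).
    """
    K = K0
    for _ in range(iters):
        # Policy evaluation: solve P = Q + K'RK + (A - BK)' P (A - BK)
        # by fixed-point iteration (contractive when A - BK is stable)
        Acl = A - B @ K
        P = np.zeros_like(Q)
        for _ in range(500):
            P = Q + K.T @ R @ K + Acl.T @ P @ Acl
        # Policy improvement: greedy gain for the evaluated value
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P
```

At convergence, P satisfies the discrete algebraic Riccati equation, mirroring how PI learns the solution of the HJ (here, Riccati) equation through repeated Lyapunov solves.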

RL methods have been used to solve multiplayer games for finite-state systems in (Busoniu, Babuska, and De Schutter, 2008; Littman, 2001). RL methods have been applied to learn online in real-time the solutions for optimal control problems for dynamic systems and differential games in (Dierks, and Jagannathan, 2010; Johnson, Hiramatsu, Fitz-Coy, and Dixon, 2010; Vamvoudakis 2010; Vamvoudakis 2011).

This book chapter brings together cooperative control, reinforcement learning, and game theory to solve multi-player differential games on communication graph topologies. There are four main contributions in this chapter. The first involves the formulation of a *graphical game* for dynamical systems networked by a communication graph. The dynamics and value function of each node depend only on the actions of that node and its neighbors. This graphical game allows for synchronization as well as Nash equilibrium solutions among neighbors. It is shown that standard definitions for Nash equilibrium are not sufficient for graphical games and a new definition of "Interactive Nash Equilibrium" is given. The second contribution is the derivation of coupled Riccati equations for solution of graphical games. The third contribution is a Policy Iteration algorithm for solution of graphical games that relies only on local information from neighbor nodes. It is shown that this algorithm converges to the best response policy of a node if its neighbors have fixed policies, and to the Nash solution if all nodes update their policies. The last contribution is the development of an online adaptive learning algorithm for computing the Nash equilibrium solutions of graphical games.

The book chapter is organized as follows. Section 2 reviews synchronization in graphs and derives an error dynamics for each node that is influenced by its own actions and those of its neighbors. Section 3 introduces differential graphical games and cooperative Nash equilibrium; coupled Riccati equations are developed, and stability and the Nash equilibrium solution are proven. Section 4 proposes a policy iteration algorithm for the solution of graphical games and gives proofs of convergence. Section 5 presents an online adaptive learning solution based on the structure of the policy iteration algorithm of Section 4. Finally, Section 6 presents a simulation example that shows the effectiveness of the proposed algorithms in learning the solutions of graphical games in real time.

30 Frontiers in Advanced Control Systems

Reinforcement learning (RL) is a sub-area of machine learning concerned with how to methodically modify the actions of an agent (player) based on observed responses from its environment (Sutton, and Barto, 1998). RL methods have allowed control systems researchers to develop algorithms to learn online in real time the solutions to optimal control problems for dynamic systems that are described by difference or ordinary differential equations. These involve a computational intelligence technique known as Policy Iteration (PI) (Bertsekas, and Tsitsiklis, 1996), which refers to a class of algorithms with two steps, *policy evaluation* and *policy improvement*. PI has primarily been developed for discrete-time systems, and online implementation for control systems has been developed through approximation of the value function (Bertsekas, and Tsitsiklis, 1996; Werbos, 1974; Werbos, 1992). PI provides effective means of learning solutions to Hamilton-Jacobi (HJ) equations online. In control theoretic terms, the PI algorithm amounts to learning the solution to a nonlinear Lyapunov equation, and then updating the policy through minimizing a Hamiltonian function. Policy Iteration techniques have been developed for continuous-time systems in (Vrabie, Pastravanu, Lewis, and Abu-Khalaf, 2009).

RL methods have been used to solve multiplayer games for finite-state systems in (Busoniu, Babuska, and De Schutter, 2008; Littman, 2001). RL methods have been applied to learn online in real-time the solutions for optimal control problems for dynamic systems and differential games in (Dierks, and Jagannathan, 2010; Johnson, Hiramatsu, Fitz-Coy, and Dixon, 2010; Vamvoudakis 2010; Vamvoudakis 2011).

#### **2. Synchronization and node error dynamics**

#### **2.1 Graphs**

Consider a graph $G = (V,E)$ with a nonempty finite set of $N$ nodes $V = \{v_1,\dots,v_N\}$ and a set of edges or arcs $E \subseteq V \times V$. We assume the graph is simple, e.g. no repeated edges and no self-loops, $(v_i,v_i)\notin E,\ \forall i$. Denote the connectivity matrix as $E = [e_{ij}]$ with $e_{ij}>0$ if $(v_j,v_i)\in E$ and $e_{ij}=0$ otherwise. Note $e_{ii}=0$. The set of neighbors of a node $v_i$ is $N_i = \{v_j : (v_j,v_i)\in E\}$, i.e. the set of nodes with arcs incoming to $v_i$. Define the in-degree matrix as a diagonal matrix $D = \operatorname{diag}(d_i)$ with $d_i = \sum_{j\in N_i} e_{ij}$ the weighted in-degree of node $v_i$ (i.e. the $i$-th row sum of $E$). Define the graph Laplacian matrix as $L = D - E$, which has all row sums equal to zero.

A directed path is a sequence of nodes $v_0, v_1, \dots, v_r$ such that $(v_i,v_{i+1})\in E,\ i\in\{0,1,\dots,r-1\}$. A directed graph is strongly connected if there is a directed path from $v_i$ to $v_j$ for all distinct nodes $v_i, v_j \in V$. A (directed) tree is a connected digraph where every node except one, called the root, has in-degree equal to one. A graph is said to have a spanning tree if a subset of the edges forms a directed tree. A strongly connected digraph contains a spanning tree.
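As a quick numerical illustration of these definitions, the sketch below builds the connectivity matrix $E$, the in-degree matrix $D$, and the Laplacian $L = D - E$ for a hypothetical 3-node directed cycle (the graph and weights are illustrative, not taken from the chapter), and checks that all row sums of $L$ are zero.

```python
import numpy as np

# Hypothetical 3-node digraph: arcs v1->v2, v2->v3, v3->v1 (strongly connected).
# Convention from the text: e_ij > 0 when there is an arc from v_j to v_i, e_ii = 0.
E = np.array([[0.0, 0.0, 1.0],   # v3 -> v1
              [1.0, 0.0, 0.0],   # v1 -> v2
              [0.0, 1.0, 0.0]])  # v2 -> v3

D = np.diag(E.sum(axis=1))       # in-degree matrix, d_i = sum_{j in N_i} e_ij
L = D - E                        # graph Laplacian

print(L.sum(axis=1))             # every row of L sums to zero
```

The zero row sums are exactly the property of $L = D - E$ used later when analyzing $L + G$.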

General directed graphs with fixed topology are considered in this chapter.

#### **2.2 Synchronization and node error dynamics**

Consider the *N* systems or agents distributed on communication graph *G* with node dynamics

$$
\dot{x}_i = A x_i + B_i u_i \tag{1}
$$

where $x_i(t)\in\mathbb{R}^n$ is the state of node $i$ and $u_i(t)\in\mathbb{R}^{m_i}$ its control input. Cooperative team objectives may be prescribed in terms of the *local neighborhood tracking error* $\delta_i\in\mathbb{R}^n$ (Khoo, Xie, and Man, 2009) as

$$\delta_i = \sum_{j\in N_i} e_{ij}\,(x_i - x_j) + g_i\,(x_i - x_0) \tag{2}$$

The pinning gain $g_i \ge 0$ is nonzero for a small number of nodes $i$ that are coupled directly to the leader or control node $x_0$, and $g_i \neq 0$ for at least one $i$ (Li, Wang, and Chen, 2004). We refer to the nodes $i$ for which $g_i \neq 0$ as the pinned or controlled nodes. Note that $\delta_i$ represents the information available to node $i$ for state feedback purposes as dictated by the graph structure.

The state of the control or target node is $x_0(t)\in\mathbb{R}^n$, which satisfies the dynamics

$$
\dot{x}_0 = A x_0 \tag{3}
$$

Note that this is in fact a *command generator* (Lewis, 1992) and we seek to design a cooperative control command generator tracker. Note that the trajectory generator *A* may not be stable.

The synchronization control design problem is to design local control protocols for all the nodes in $G$ to synchronize to the state of the control node, i.e. one requires $x_i(t)\to x_0(t),\ \forall i$.

From (2), the overall error vector for network $G$ is given by

$$\delta = \left((L+G)\otimes I_n\right)\left(x - \underline{x}_0\right) = \left((L+G)\otimes I_n\right)\zeta \tag{4}$$

where the global vectors are $x = [x_1^T\ x_2^T\ \cdots\ x_N^T]^T \in \mathbb{R}^{nN}$, $\delta = [\delta_1^T\ \delta_2^T\ \cdots\ \delta_N^T]^T \in \mathbb{R}^{nN}$, and $\underline{x}_0 = \underline{1}\otimes x_0 \in \mathbb{R}^{nN}$, with $I_n \in \mathbb{R}^{n\times n}$ the identity matrix and $\underline{1}$ the $N$-vector of ones. The Kronecker product is $\otimes$ (Brewer, 1978). $G \in \mathbb{R}^{N\times N}$ is a diagonal matrix with diagonal entries equal to the pinning gains $g_i$. The (global) consensus or synchronization error (e.g. the disagreement vector in (Olfati-Saber, and Murray, 2004)) is

$$
\zeta = \left(x - \underline{x}_0\right) \in \mathbb{R}^{nN} \tag{5}
$$

The communication digraph is assumed to be strongly connected. Then, if $g_i \neq 0$ for at least one $i$, $L+G$ is nonsingular with all eigenvalues having positive real parts (Khoo, Xie, and Man, 2009). The next result therefore follows from (4), the Cauchy-Schwarz inequality, and the properties of the Kronecker product (Brewer, 1978).
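These constructions are easy to check numerically. The sketch below (reusing the hypothetical 3-cycle from before, with node 1 pinned and an assumed state dimension $n=2$) forms $\delta = ((L+G)\otimes I_n)\zeta$ as in (4), confirms that $L+G$ is nonsingular, and verifies the bound of Lemma 1 below on random data.

```python
import numpy as np

# Hypothetical 3-cycle as before, with node 1 pinned to the leader (g_1 = 1)
E = np.array([[0., 0., 1.], [1., 0., 0.], [0., 1., 0.]])
L = np.diag(E.sum(axis=1)) - E
G = np.diag([1.0, 0.0, 0.0])            # pinning gains g_i

n = 2                                    # assumed node state dimension
rng = np.random.default_rng(0)
x  = rng.standard_normal(3 * n)          # stacked node states
x0 = rng.standard_normal(n)              # leader state
zeta = x - np.kron(np.ones(3), x0)       # synchronization error, eq. (5)

delta = np.kron(L + G, np.eye(n)) @ zeta # local neighborhood errors, eq. (4)

# L + G nonsingular: strongly connected digraph with at least one pinned node
assert np.linalg.matrix_rank(L + G) == 3

# Lemma 1 bound: ||zeta|| <= ||delta|| / sigma_min(L + G)
sig_min = np.linalg.svd(L + G, compute_uv=False).min()
assert np.linalg.norm(zeta) <= np.linalg.norm(delta) / sig_min + 1e-12
```

The bound holds because the singular values of $(L+G)\otimes I_n$ are those of $L+G$ repeated $n$ times.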

**Lemma 1.** Let the graph be strongly connected and $G \neq 0$. Then the synchronization error is bounded by

$$\left\|\zeta\right\| \le \left\|\delta\right\| / \underline{\sigma}(L+G) \tag{6}$$

with $\underline{\sigma}(L+G)$ the minimum singular value of $L+G$, and $\delta(t)=0$ if and only if the nodes synchronize, that is

$$x(t) = \underline{1}\otimes x_0(t) \tag{7}$$

■

Our objective now shall be to make small the local neighborhood tracking errors $\delta_i(t)$, which in view of Lemma 1 will guarantee synchronization.

To find the dynamics of the local neighborhood tracking error, write

$$\dot{\delta}_i = A\delta_i + (d_i + g_i)B_i u_i - \sum_{j\in N_i} e_{ij} B_j u_j \tag{8}$$

with $\delta_i \in \mathbb{R}^n$, $u_i \in \mathbb{R}^{m_i},\ \forall i$.
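One can verify numerically that (8) is indeed the time derivative of (2) under the node dynamics (1) and leader dynamics (3). The sketch below (random hypothetical data for $A$, $B_i$, states and inputs) computes $\dot{\delta}_i$ both ways and compares.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, N = 2, 1, 3
E = np.array([[0., 0., 1.], [1., 0., 0.], [0., 1., 0.]])  # hypothetical 3-cycle
g = np.array([1.0, 0.0, 0.0])                              # node 1 pinned
A = rng.standard_normal((n, n))
B = [rng.standard_normal((n, m)) for _ in range(N)]
x = [rng.standard_normal(n) for _ in range(N)]
x0 = rng.standard_normal(n)
u = [rng.standard_normal(m) for _ in range(N)]

i = 0
d_i = E[i].sum()
delta_i = sum(E[i, j] * (x[i] - x[j]) for j in range(N)) + g[i] * (x[i] - x0)

# Right-hand side of (8)
rhs = A @ delta_i + (d_i + g[i]) * (B[i] @ u[i]) \
      - sum(E[i, j] * (B[j] @ u[j]) for j in range(N))

# Differentiate (2) directly, using xdot_i = A x_i + B_i u_i and xdot_0 = A x_0
xdot = [A @ x[j] + B[j] @ u[j] for j in range(N)]
xdot0 = A @ x0
lhs = sum(E[i, j] * (xdot[i] - xdot[j]) for j in range(N)) + g[i] * (xdot[i] - xdot0)

assert np.allclose(lhs, rhs)
```

The agreement confirms the key structural fact used throughout: node $i$'s error dynamics are driven only by its own input and its neighbors' inputs.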


This is a dynamical system with multiple control inputs, from node *i* and all of its neighbors.

#### **3. Cooperative multi-player games on graphs**

We wish to achieve synchronization while simultaneously optimizing some performance specifications on the agents. To capture this, we intend to use the machinery of multi-player games (Basar, Olsder, 1999). Define $u_{G-i} = \{u_j : j\in V,\ j\neq i\}$ as the set of policies of all other nodes in the graph other than node $i$. Define $u_{-i}(t)$ as the vector of the control inputs $\{u_j : j\in N_i\}$ of the neighbors of node $i$.

#### **3.1 Cooperative performance index**

Define the local performance indices

$$J_i(\delta_i(0),u_i,u_{-i}) = \frac{1}{2}\int_0^\infty \Big(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii}u_i + \sum_{j\in N_i} u_j^T R_{ij}u_j\Big)\,dt \equiv \frac{1}{2}\int_0^\infty L_i(\delta_i(t),u_i(t),u_{-i}(t))\,dt \tag{9}$$

where all weighting matrices are constant and symmetric with $Q_{ii}\ge 0$, $R_{ii}>0$, $R_{ij}\ge 0$. Note that the $i$-th performance index includes only information about the inputs of node $i$ and its neighbors.
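For a scalar illustration of (9), the sketch below (with hypothetical stabilizing linear policies for node $i$ and a single neighbor, all numbers illustrative) accumulates the cost integral along an Euler-simulated trajectory of (8).

```python
# Scalar sketch: node i with one neighbor j (e_ij = 1, d_i = 1, g_i = 1).
# The linear gains below are illustrative placeholders, chosen so that the
# closed-loop error dynamics (8) are stable: delta_dot = -0.7 * delta.
A, Bi, Bj = 0.0, 1.0, 1.0
di, gi, eij = 1.0, 1.0, 1.0
Qii, Rii, Rij = 1.0, 1.0, 0.5
Ki, Kj = -0.5, -0.3              # u_i = Ki*delta, u_j = Kj*delta (assumed)

dt, T = 1e-3, 40.0
delta, J = 1.0, 0.0
for _ in range(int(T / dt)):
    ui, uj = Ki * delta, Kj * delta
    J += 0.5 * (Qii * delta**2 + Rii * ui**2 + Rij * uj**2) * dt   # eq. (9)
    delta += (A * delta + (di + gi) * Bi * ui - eij * Bj * uj) * dt  # eq. (8)

print(J)   # close to the analytic value 1.295/2.8 ~ 0.4625
```

For this closed loop, $\delta(t)=e^{-0.7t}$, so the integral can be evaluated in closed form and matches the simulated cost.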

For dynamics (8) with performance objectives (9), introduce the associated Hamiltonians

$$H_i(\delta_i,p_i,u_i,u_{-i}) = p_i^T\Big(A\delta_i + (d_i+g_i)B_iu_i - \sum_{j\in N_i}e_{ij}B_ju_j\Big) + \frac{1}{2}\delta_i^T Q_{ii}\delta_i + \frac{1}{2}u_i^T R_{ii}u_i + \frac{1}{2}\sum_{j\in N_i}u_j^T R_{ij}u_j \tag{10}$$

where *<sup>i</sup> p* is the costate variable. Necessary conditions (Lewis, and Syrmos, 1995) for a minimum of (9) are (1) and

$$-\dot{p}_i = \frac{\partial H_i}{\partial \delta_i} = A^T p_i + Q_{ii}\delta_i \tag{11}$$

$$0 = \frac{\partial H_i}{\partial u_i} \Rightarrow u_i = -(d_i + g_i)R_{ii}^{-1}B_i^T p_i \tag{12}$$
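A simple sanity check of the stationarity condition (12): for scalar hypothetical data, the policy from (12) minimizes the Hamiltonian (10) over $u_i$ with $\delta_i$, $p_i$, and the neighbor input held fixed.

```python
import numpy as np

# Scalar check (all numbers illustrative) that u_i from (12) minimizes H_i of (10).
A, Bi, Bj = 0.2, 1.0, 1.0
di, gi, eij = 1.0, 1.0, 1.0
Qii, Rii, Rij = 1.0, 2.0, 0.5
delta_i, p_i, u_j = 0.7, -0.4, 0.1      # fixed state, costate, neighbor input

def H(u_i):
    # Hamiltonian (10) as a function of u_i only (works on scalars or arrays)
    drift = A * delta_i + (di + gi) * Bi * u_i - eij * Bj * u_j
    return p_i * drift + 0.5 * (Qii * delta_i**2 + Rii * u_i**2 + Rij * u_j**2)

u_star = -(di + gi) / Rii * Bi * p_i    # stationarity condition (12)

grid = np.linspace(-5, 5, 2001)         # brute-force minimization over u_i
assert abs(grid[np.argmin(H(grid))] - u_star) < 1e-2
```

Since $H_i$ is quadratic and strictly convex in $u_i$ (as $R_{ii}>0$), the stationary point is the unique minimizer, which the grid search confirms.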

#### **3.2 Graphical games**

Interpreting the control inputs $u_i,\ u_j$ as state-dependent policies or strategies, the value function for node $i$ corresponding to those policies is

$$V_i(\delta_i(t)) = \frac{1}{2}\int_t^\infty \Big(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii}u_i + \sum_{j\in N_i} u_j^T R_{ij}u_j\Big)\,dt \tag{13}$$

**Definition 1.** Control policies $u_i,\ \forall i$ are defined as admissible if $u_i$ are continuous, $u_i(0)=0$, $u_i$ stabilize systems (8) locally, and values (13) are finite.

When *Vi* is finite, using Leibniz' formula, a differential equivalent to (13) is given in terms of the Hamiltonian function by the Bellman equation

$$H_i\Big(\delta_i,\frac{\partial V_i}{\partial \delta_i},u_i,u_{-i}\Big) \equiv \Big(\frac{\partial V_i}{\partial \delta_i}\Big)^T\Big(A\delta_i + (d_i+g_i)B_iu_i - \sum_{j\in N_i}e_{ij}B_ju_j\Big) + \frac{1}{2}\delta_i^T Q_{ii}\delta_i + \frac{1}{2}u_i^T R_{ii}u_i + \frac{1}{2}\sum_{j\in N_i}u_j^T R_{ij}u_j = 0 \tag{14}$$

with boundary condition $V_i(0)=0$. (The gradient is taken here to be a column vector.) That is, solution of equation (14) serves as an alternative to evaluating the infinite integral (13) for finding the value associated with the current feedback policies. It is shown in the proof of Theorem 2 that (14) is a Lyapunov equation. According to (13) and (10), one equates $p_i = \partial V_i/\partial \delta_i$.

The local dynamics (8) and performance indices (9) only depend for each node *i* on its own control actions and those of its neighbors. We call this a *graphical game*. It depends on the topology of the communication graph *G V* (,) . We assume throughout the chapter that the game is well-formed in the following sense.

**Definition 2.** The graphical game with local dynamics (8) and performance indices (9) is well-formed if $B_j \neq 0$ and $R_{ij} \neq 0$ whenever $e_{ij} \neq 0$.

The control objective of agent *i* in the graphical game is to determine

$$V_i^*(\delta_i(t)) = \min_{u_i}\frac{1}{2}\int_t^\infty \Big(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii}u_i + \sum_{j\in N_i} u_j^T R_{ij}u_j\Big)\,dt \tag{15}$$

Employing the stationarity condition (12) (Lewis, and Syrmos, 1995) one obtains the control policies

$$\mu\_i = \mu\_i(V\_i) \equiv -(d\_i + g\_i) R\_{ii}^{-1} B\_i^T \frac{\partial V\_i}{\partial \mathcal{S}\_i} \equiv -h\_i(p\_i) \tag{16}$$

The game defined in (15) corresponds to Nash equilibrium.

**Definition 3.** (Basar, and Olsder, 1999) **(**Global Nash equilibrium) An *N-tuple* of policies $u_1^*, u_2^*, \dots, u_N^*$ is said to constitute a global Nash equilibrium solution for an $N$-player game if for all $i \in N$

$$J_i^* \triangleq J_i\big(u_i^*,u_{G-i}^*\big) \le J_i\big(u_i,u_{G-i}^*\big) \tag{17}$$

The *N-tuple* of game values $J_1^*, J_2^*, \dots, J_N^*$ is known as a Nash equilibrium outcome of the $N$-player game.


The distributed multiplayer graphical game with local dynamics (8) and local performance indices (9) should be contrasted with standard multiplayer games (Abou-Kandil, Freiling, Ionescu, and Jank, 2003; Basar, and Olsder 1999), which have centralized dynamics

$$\dot{z} = Az + \sum_{i=1}^{N} B_i u_i \tag{18}$$

where $z\in\mathbb{R}^n$ is the state, $u_i(t)\in\mathbb{R}^{m_i}$ is the control input for every player, and where the performance index of each player depends on the control inputs of all other players. In the graphical games, by contrast, each node's dynamics and performance index only depend on its own state, its control, and the controls of its immediate neighbors.

It is desired to study the distributed game on a graph defined by (15) with distributed dynamics (8). It is not clear in this scenario how global Nash equilibrium is to be achieved.

Graphical games have been studied in the computational intelligence community (Kakade, Kearns, Langford, and Ortitz, 2003; Kearns, Littman, and Singh, 2001; Shoham, and Leyton-Brown, 2009). A (nondynamic) graphical game has been defined there as a tuple $(G,U,v)$ with $G=(V,E)$ a graph with $N$ nodes, action set $U = U_1\times\cdots\times U_N$ with $U_i$ the set of actions available to node $i$, and $v = [v_1\ \cdots\ v_N]^T$ a payoff vector, with $v_i(U_i,\{U_j : j\in N_i\})\in\mathbb{R}$ the payoff function of node $i$. It is important to note that *the payoff of node i only depends on its own action and those of its immediate neighbors*. The work on graphical games has focused on developing algorithms to find standard Nash equilibria for payoffs generally given in terms of matrices. Such algorithms are simplified in that they only have complexity on the order of the maximum node degree in the graph, not on the order of the number of players $N$. Undirected graphs are studied, and it is assumed that the graph is connected.

The intention in this chapter is to provide online real-time adaptive methods for solving differential graphical games that are distributed in nature. That is, the control protocols and adaptive algorithms of each node are allowed to depend only on information about itself and its neighbors. Moreover, as the game solution is being learned, all node dynamics are required to remain stable, until finally all the nodes synchronize to the state of the control node. These online methods are discussed in Section 5.

The following notions are needed in the study of differential graphical games.

**Definition 4.** (Shoham, and Leyton-Brown, 2009) Agent $i$'s *best response* to fixed policies $u_{-i}$ of his neighbors is the policy $u_i^*$ such that

$$J_i\big(u_i^*,u_{-i}\big) \le J_i\big(u_i,u_{-i}\big) \tag{19}$$

for all policies $u_i$ of agent $i$.

For centralized multi-agent games, where the dynamics is given by (18) and the performance of each agent depends on the actions of all other agents, an equivalent definition of Nash equilibrium is that each agent is in best response to all other agents. In graphical games, if all agents are in best response to their neighbors, then all agents are in Nash equilibrium, as seen in the proof of Theorem 1.

However, a counterexample shows the problems with the definition of Nash equilibrium in graphical games. Consider the completely disconnected graph with empty edge set, where each node has no neighbors. Then Definition 4 holds if each agent simply chooses his single-player optimal control solution $J_i^* = J_i(u_i^*)$, since, for the disconnected graph case, one has

$$J_i(u_i) = J_i\big(u_i,u_{G-i}\big) = J_i\big(u_i,u'_{G-i}\big),\ \forall i \tag{20}$$

for any choices of the two sets $u_{G-i},\ u'_{G-i}$ of the policies of all the other nodes. That is, the value function of each node does not depend on the policies of any other nodes.

Note, however, that Definition 3 also holds, that is, the nodes are in a global Nash equilibrium. Pathological cases such as this counterexample cannot occur in the standard games with centralized dynamics (18), particularly because stabilizability conditions are usually assumed.

#### **3.3 Interactive Nash equilibrium**

The counterexample in the previous section shows that in pathological cases when the graph is disconnected, agents can be in Nash equilibrium, yet have no influence on each others' games. In such situations, the definition of coalition-proof Nash equilibrium (Shinohara, 2010) may also hold, that is, no set of agents has an incentive to break away from the Nash equilibrium and seek a new Nash solution among themselves.

To rule out such undesirable situations and guarantee that all agents in a graph are involved in the same game, we make the following stronger definition of global Nash equilibrium.

**Definition 5.** (Interactive Global Nash equilibrium) An *N-tuple* of policies $u_1^*, u_2^*, \dots, u_N^*$ is said to constitute an interactive global Nash equilibrium solution for an $N$-player game if, for all $i \in N$, the Nash condition (17) holds and in addition there exists a policy $u'_k$ such that

$$J_i\big(u_k^*,u_{G-k}^*\big) \neq J_i\big(u'_k,u_{G-k}^*\big) \tag{21}$$

for all $i,k \in N$. That is, at equilibrium there exists a policy of every player $k$ that influences the performance of all other players $i$.

If the systems are in Interactive Nash equilibrium, the graphical game is well-defined in the sense that all players are in a single Nash equilibrium with each player affecting the decisions of all other players. Condition (21) means that the reaction curve (Basar, and Olsder, 1999) of any player *i* is not constant with respect to all variations in the policy of any other player *k.*

The next results give conditions under which the local best responses in Definition 4 imply the interactive global Nash of Definition 5.

Consider the systems (8) in closed loop with admissible feedbacks (12), (16), denoted by $u_k = K_k p_k + v_k$ for a single node $k$ and $u_j = K_j p_j,\ j \neq k$. Then

$$\dot{\delta}_i = A\delta_i + (d_i+g_i)B_iK_ip_i - \sum_{j\in N_i}e_{ij}B_jK_jp_j + e_{ik}B_k v_k,\quad k\neq i \tag{22}$$

The global closed-loop dynamics are


$$
\begin{bmatrix} \dot{\delta} \\ \dot{p} \end{bmatrix} = \begin{bmatrix} (I_N \otimes A) & ((L+G)\otimes I_n)\operatorname{diag}(B_iK_i) \\ -\operatorname{diag}(Q_{ii}) & -(I_N \otimes A^T) \end{bmatrix}\begin{bmatrix} \delta \\ p \end{bmatrix} + \begin{bmatrix} ((L+G)\otimes I_n)\underline{B}_k \\ 0 \end{bmatrix}\overline{v}_k \equiv \overline{A}\begin{bmatrix} \delta \\ p \end{bmatrix} + \overline{B}\,\overline{v}_k \tag{23}
$$

with $\underline{B}_k = \operatorname{diag}(B_i)$ and $\overline{v}_k = [0\ \cdots\ v_k^T\ \cdots\ 0]^T$, which has all block entries zero with $v_k$ in block $k$. Consider node $i$ and let $M>0$ be the first integer such that $[(L+G)^M]_{ik}\neq 0$, where $[\cdot]_{ik}$ denotes the element $(i,k)$ of a matrix. That is, $M$ is the length of the shortest directed path from $k$ to $i$. Denote the nodes along this path by $k_0=k,\ k_1,\ \dots,\ k_{M-1},\ k_M=i$. Denote element $(i,k)$ of $L+G$ by $\ell_{ik}$. Then the $n\times m_k$ block element in block row $i$ and block column $k$ of matrix $\overline{A}^{2(M-1)}\overline{B}$ is equal to

$$\Big[\overline{A}^{2(M-1)}\overline{B}\Big]_{ik} = \sum_{k_{M-1},\cdots,k_1}\ell_{i,k_{M-1}}\cdots\ell_{k_1,k}\,B_{k_{M-1}}K_{k_{M-1}}Q_{k_{M-1}}B_{k_{M-2}}\cdots B_{k_1}K_{k_1}Q_{k_1}B_k \equiv \sum_{k_{M-1}} B_{k_{M-1}}\overline{B}_{k_{M-1},k} \tag{24}$$

where $\overline{B}_{k_{M-1},k}\in\mathbb{R}^{m_{k_{M-1}}\times m_k}$, and the subscript $ik$ denotes the position of the block element in the block matrix.

**Assumption 1.**

a. $\overline{B}_{k_{M-1},k}\in\mathbb{R}^{m_{k_{M-1}}\times m_k}$ has rank $m_{k_{M-1}}$.

b. All shortest paths to node $i$ from node $k$ pass through a single neighbor $k_{M-1}$ of $i$.

An example case where Assumption 1a holds is when there is a single shortest path from $k$ to $i$, $m_i = m,\ \forall i$, and $\operatorname{rank}(B_i)=m,\ \forall i$.

**Lemma 2.** Let $(A,B_j)$ be reachable for all $j \in N$ and let Assumption 1 hold. Then the $i$-th closed-loop system (22) is reachable from input $v_k$ if and only if there exists a directed path from node $k$ to node $i$.

#### **Proof:**

*Sufficiency.* If $k \in N_i$ the result is obvious. Otherwise, the reachability matrix from node $k$ to node $i$ has the $n\times m_k$ block element in block row $i$ and block column $k$ given as


$$
\Big[\overline{A}^{2(M-1)}\overline{B}\ \ \overline{A}^{2(M-1)+1}\overline{B}\ \ \overline{A}^{2(M-1)+2}\overline{B}\ \cdots\Big]_{ik} = \Big[\sum_{k_{M-1}}B_{k_{M-1}}\ \ \sum_{k_{M-1}}AB_{k_{M-1}}\ \ \sum_{k_{M-1}}A^2B_{k_{M-1}}\ \cdots\Big]\begin{bmatrix} \overline{B}_{k_{M-1},k} & * & * & \\ 0 & \overline{B}_{k_{M-1},k} & * & \\ \vdots & 0 & \overline{B}_{k_{M-1},k} & \\ 0 & \cdots & 0 & \ddots \end{bmatrix}
$$

where $*$ denotes nonzero entries. Under the assumptions, the matrix on the right has full row rank, and the matrix on the left is written as $\big[B_{k_{M-1}}\ \ AB_{k_{M-1}}\ \ A^2B_{k_{M-1}}\ \cdots\big]$. Moreover, $(A,B_{k_{M-1}})$ is reachable, so this matrix also has full rank.

*Necessity.* If there is no path from node $k$ to node $i$, then the control input of node $k$ cannot influence the state or value of node $i$. ■

**Theorem 1.** Let $(A,B_i)$ be reachable for all $i \in N$. Let every node $i$ be in best response to all his neighbors $j \in N_i$. Let Assumption 1 hold. Then all nodes in the graph are in interactive global Nash equilibrium if and only if the graph is strongly connected.

#### **Proof:**

Let every node $i$ be in best response to all his neighbors $j\in N_i$. Then $J_i(u_i^*,u_{-i}) \le J_i(u_i,u_{-i}),\ \forall i$. Hence, taking $u_{-i} = u_{-i}^*$, one has $J_i(u_i^*,u_{-i}^*) \le J_i(u_i,u_{-i}^*),\ \forall i$. However, according to (9), $J_i(u_i,u_{-i},u_k) = J_i(u_i,u_{-i},u'_k)$ for all $k \notin \{i\}\cup N_i$, so that $J_i(u_i^*,u_{G-i}^*) \le J_i(u_i,u_{G-i}^*),\ \forall i$, and the nodes are in Nash equilibrium.

*Necessity.* If the graph is not strongly connected, then there exist nodes *k* and *i* such that there is no path from node *k* to node *i*. Then, the control input of node *k* cannot influence the state or the value of node *i*. Therefore, the Nash equilibrium is not interactive.

*Sufficiency.* Let $(A, B_i)$ be reachable for all $i \in N$. Then if there is a path from node $k$ to node $i$, the state $\delta_i$ is reachable from $u_k$, and from (9) input $u_k$ can change the value $J_i$. Strong connectivity means there is a path from every node $k$ to every node $i$, and condition (21) holds for all $i, k \in N$.

The reachability condition is sufficient but not necessary for Interactive Nash equilibrium.
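Lemma 2 and Theorem 1 hinge on checking whether the communication graph is strongly connected, that is, whether a directed path exists from every node $k$ to every node $i$. As a minimal sketch (pure Python; the adjacency-list encoding and the example graphs are assumptions of this illustration, not part of the chapter), the check reduces to a breadth-first search from every node:

```python
from collections import deque

def reachable_from(adj, start):
    """Set of nodes reachable from `start` by directed BFS over adjacency dict `adj`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def strongly_connected(adj):
    """True if every node reaches every other node (a path from k to i for all i, k)."""
    nodes = set(adj) | {v for vs in adj.values() for v in vs}
    return all(reachable_from(adj, n) == nodes for n in nodes)

# A directed ring 1 -> 2 -> 3 -> 1 is strongly connected ...
ring = {1: [2], 2: [3], 3: [1]}
# ... while a chain 1 -> 2 -> 3 is not (node 3 cannot reach node 1).
chain = {1: [2], 2: [3], 3: []}
```

On the chain, the control input of node 3 cannot influence nodes 1 and 2, exactly the situation ruled out by the interactive Nash condition.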

According to the results just established, the following assumptions are made.

#### **Assumptions 2.**

a. $(A, B_i)$ is reachable for all $i \in N$.
b. The graph is strongly connected and at least one pinning gain $g_i$ is nonzero. Then $L + G$ is nonsingular.


#### **3.4 Stability and solution of graphical games**

Substituting control policies (16) into (14) yields the coupled cooperative game Hamilton-Jacobi (HJ) equations

$$\frac{\partial V_i}{\partial \delta_i}^T A_i^c + \frac{1}{2}\delta_i^T Q_{ii}\delta_i + \frac{1}{2}(d_i + g_i)^2 \frac{\partial V_i}{\partial \delta_i}^T B_i R_{ii}^{-1} B_i^T \frac{\partial V_i}{\partial \delta_i} + \frac{1}{2}\sum_{j \in N_i} (d_j + g_j)^2 \frac{\partial V_j}{\partial \delta_j}^T B_j R_{jj}^{-1} R_{ij} R_{jj}^{-1} B_j^T \frac{\partial V_j}{\partial \delta_j} = 0, \; i \in N \tag{25}$$

where the closed-loop matrix is

38 Frontiers in Advanced Control Systems


$$A_i^c = A\delta_i - \left(d_i + g_i\right)^2 B_i R_{ii}^{-1} B_i^T \frac{\partial V_i}{\partial \delta_i} + \sum_{j \in N_i} e_{ij}(d_j + g_j) B_j R_{jj}^{-1} B_j^T \frac{\partial V_j}{\partial \delta_j}, \; i \in N \tag{26}$$

For a given $V_i$, define $u_i^* = u_i(V_i)$ as (16) given in terms of $V_i$. Then HJ equations (25) can be written as

$$H_i\Big(\delta_i, \frac{\partial V_i}{\partial \delta_i}, u_i^*, u_{-i}^*\Big) = 0 \tag{27}$$

There is one coupled HJ equation corresponding to each node, so the solution of this *N*-player game problem requires solving *N* coupled partial differential equations. In the next sections we show how to solve this *N*-player cooperative game online in a distributed fashion at each node, requiring only measurements from neighbor nodes, by using techniques from reinforcement learning.

It is now shown that the coupled HJ equations (25) can be written as coupled Riccati equations. For the global state given in (4) we can write the dynamics as

$$\dot{\delta} = (I_N \otimes A)\delta + \left((L+G) \otimes I_n\right)\mathrm{diag}(B_i)\,u \tag{28}$$

where *u* is the control given by

$$u = -\mathrm{diag}(R_{ii}^{-1} B_i^T)\left\{\left((D+G) \otimes I_n\right) p\right\} \tag{29}$$

where $\mathrm{diag}(\cdot)$ denotes a block diagonal matrix of appropriate dimensions. Furthermore, the global costate dynamics are

$$-\dot{p} = \frac{\partial H}{\partial \delta} \equiv \left(I_N \otimes A\right)^T p + \mathrm{diag}\left(Q_{ii}\right)\delta \tag{30}$$

This is a set of coupled dynamic equations reminiscent of standard multi-player games (Basar, and Olsder, 1999) or single agent optimal control (Lewis, and Syrmos, 1995). Therefore the solution can be written without any loss of generality as

$$p = \overline{P}\mathcal{S} \tag{31}$$

for some matrix $\overline{P} \in \mathbb{R}^{nN \times nN}$.
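The global dynamics (28)–(30) are assembled from Kronecker products. The following sketch (pure Python; the small matrix $A$ and the sizes $N = n = 2$ are chosen only for illustration) shows how $I_N \otimes A$ produces the block-diagonal drift term of (28):

```python
def kron(X, Y):
    """Kronecker product of two matrices given as lists of lists."""
    p, q = len(Y), len(Y[0])
    return [[X[i // p][j // q] * Y[i % p][j % q]
             for j in range(len(X[0]) * q)]
            for i in range(len(X) * p)]

def eye(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Two agents (N = 2), each with a 2 x 2 state matrix A (n = 2).
A = [[0.0, 1.0],
     [0.0, -1.0]]
N, n = 2, 2

# Drift block of (28): I_N (x) A is block diagonal with A repeated N times.
IN_kron_A = kron(eye(N), A)
```

The same `kron` helper would assemble $(L+G) \otimes I_n$ for the input term of (28) and $(D+G) \otimes I_n$ for the control (29).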

**Lemma 3.** HJ equations (25) are equivalent to the coupled Riccati equations

$$
\delta^T \overline{P}^T \overline{A}_i \delta - \delta^T \overline{P}^T \overline{B}_i \overline{P} \delta + \frac{1}{2} \delta^T \overline{Q}_i \delta + \frac{1}{2} \delta^T \overline{P}^T \overline{R}_i \overline{P} \delta = 0 \tag{32}
$$

or equivalently, in closed-loop form,

$$\left(\overline{P}^T \overline{A}\_{ic} + \overline{A}\_{ic}\,^T \overline{P} + \overline{Q}\_i + \overline{P}^T \overline{R}\_i \overline{P}\right) = 0\tag{33}$$

where $\overline{P}$ is defined by (31), and

$$
\overline{A}_i = \begin{bmatrix} 0 & & \\ & [A]^{ii} & \\ & & 0 \end{bmatrix}, \quad \overline{B}_i = \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & [(d_i+g_i)I_n]^{ii} \;\; [-a_{ij}I_n]^{ij} & \vdots \\ 0 & \cdots & 0 \end{bmatrix} \mathrm{diag}\{(d_i+g_i)B_i R_{ii}^{-1} B_i^T\}
$$

$$
\overline{A}_{ic} = \overline{A}_i - \overline{B}_i \overline{P}
$$

$$
\overline{Q}_i = \begin{bmatrix} 0 & & \\ & [Q_{ii}]^{ii} & \\ & & 0 \end{bmatrix}, \quad \overline{R}_i = \mathrm{diag}\{(d_i+g_i)B_i R_{ii}^{-1}\} \begin{bmatrix} R_{i1} & & \\ & \ddots & \\ & & R_{iN} \end{bmatrix} \mathrm{diag}\{(d_i+g_i)R_{ii}^{-1} B_i^T\}
$$

with superscripts $ii$ and $ij$ denoting the $(i,i)$ and $(i,j)$ blocks, respectively.

#### **Proof:**

Take (14) and write it with respect to the global state and costate as

$$
H_i \equiv \begin{bmatrix} \frac{\partial V_1}{\partial \delta_1} \\ \vdots \\ \frac{\partial V_N}{\partial \delta_N} \end{bmatrix}^T \begin{bmatrix} 0 & & \\ & [A]^{ii} & \\ & & 0 \end{bmatrix} \delta + \begin{bmatrix} \frac{\partial V_1}{\partial \delta_1} \\ \vdots \\ \frac{\partial V_N}{\partial \delta_N} \end{bmatrix}^T \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & [(d_i+g_i)I_n]^{ii} \;\; [-a_{ij}I_n]^{ij} & \vdots \\ 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} B_1 & & \\ & \ddots & \\ & & B_N \end{bmatrix} \begin{bmatrix} u_1 \\ \vdots \\ u_N \end{bmatrix} \tag{34}
$$

$$
+ \frac{1}{2}\delta^T \begin{bmatrix} 0 & & \\ & [Q_{ii}]^{ii} & \\ & & 0 \end{bmatrix} \delta + \frac{1}{2} \begin{bmatrix} u_1 \\ \vdots \\ u_N \end{bmatrix}^T \begin{bmatrix} R_{i1} & & \\ & \ddots & \\ & & R_{iN} \end{bmatrix} \begin{bmatrix} u_1 \\ \vdots \\ u_N \end{bmatrix} = 0
$$

By definition of the costate one has

$$p = \left[\frac{\partial V\_1}{\partial \delta\_1} \quad \dots \quad \dots \quad \frac{\partial V\_N}{\partial \delta\_N}\right]^\top = \overline{P}\delta \tag{35}$$

From the control policies (16), (34) becomes (32).

It is now shown that if solutions can be found for the coupled design equations (25), they provide the solution to the graphical game problem.

#### **Theorem 2. Stability and Solution for Cooperative Nash Equilibrium.**

Let Assumptions 1 and 2a hold. Let $V_i \ge 0$, $V_i \in C^1$, $i \in N$, be smooth solutions to HJ equations (25), and let control policies $u_i^*$, $i \in N$, be given by (16) in terms of these solutions $V_i$. Then

a. Systems (8) are asymptotically stable so all agents synchronize.

b. $u_1^*, u_2^*, \dots, u_N^*$ are in global Nash equilibrium and the corresponding game values are

$$J_i^*\left(\delta_i(0)\right) = V_i\left(\delta_i(0)\right), \quad i \in N \tag{36}$$

#### **Proof:**


a. If $V_i \ge 0$ satisfies (25), then it also satisfies (14). Take the time derivative to obtain

$$\dot{V}_i = \frac{\partial V_i}{\partial \delta_i}^T \dot{\delta}_i = \frac{\partial V_i}{\partial \delta_i}^T \left( A\delta_i + (d_i + g_i)B_i u_i - \sum_{j \in N_i} e_{ij}B_j u_j \right) = -\frac{1}{2}\left( \delta_i^T Q_{ii}\delta_i + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j \right) \tag{37}$$

which is negative definite since $Q_{ii} > 0$. Therefore $V_i$ is a Lyapunov function for $\delta_i$, and systems (8) are asymptotically stable.

b. According to part a, $\delta_i(t) \to 0$ for the selected control policies. For any smooth functions $V_i(\delta_i)$, $i \in N$, such that $V_i(0) = 0$, setting $V_i(\delta_i(\infty)) = 0$ one can write (9) as

$$\begin{aligned} J_i(\delta_i(0), u_i, u_{-i}) &= \frac{1}{2}\int_0^\infty \Big(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii}u_i + \sum_{j \in N_i} u_j^T R_{ij}u_j\Big)\,dt + V_i(\delta_i(0)) \\ &+ \int_0^\infty \frac{\partial V_i}{\partial \delta_i}^T \left( A\delta_i + (d_i + g_i)B_i u_i - \sum_{j \in N_i} e_{ij}B_j u_j \right) dt \end{aligned}$$


Now let $V_i$ satisfy (25) and $u_i^*, u_{-i}^*$ be the optimal controls given by (16). By completing the squares one has

$$\begin{split} J_i(\delta_i(0), u_i, u_{-i}) &= V_i(\delta_i(0)) + \int_0^\infty \Big( \frac{1}{2}\sum_{j \in N_i} (u_j - u_j^*)^T R_{ij}(u_j - u_j^*) + \frac{1}{2}(u_i - u_i^*)^T R_{ii}(u_i - u_i^*) \\ &- \frac{\partial V_i}{\partial \delta_i}^T \sum_{j \in N_i} e_{ij}B_j(u_j - u_j^*) + \sum_{j \in N_i} u_j^{*T} R_{ij}(u_j - u_j^*) \Big)\,dt \end{split}$$

At the equilibrium point $u_i = u_i^*$ and $u_j = u_j^*$, so

$$J_i^*\left(\delta_i(0), u_i^*, u_{-i}^*\right) = V_i\left(\delta_i(0)\right).$$

Define

$$J_i(u_i, u_{-i}^*) = V_i(\delta_i(0)) + \frac{1}{2}\int_0^\infty (u_i - u_i^*)^T R_{ii}(u_i - u_i^*)\,dt$$

and $J_i^* = V_i(\delta_i(0))$. Then clearly $J_i^*$ and $J_i(u_i, u_{-i}^*)$ satisfy (19). Since this is true for all $i$, Nash condition (17) is satisfied. ■

The next result shows when the systems are in Interactive Nash equilibrium. This means that the graphical game is well defined in the sense that all players are in a single Nash equilibrium with each player affecting the decisions of all other players.

**Corollary 1.** Let the hypotheses of Theorem 2 hold. Let Assumptions 1 and 2 hold so that the graph is strongly connected. Then $u_1^*, u_2^*, \dots, u_N^*$ are in interactive Nash equilibrium and all agents synchronize.

#### **Proof:**

From Theorems 1 and 2.

■


#### **3.5 Global and local performance objectives: Cooperation and competition**

The overall objective of all the nodes is to ensure synchronization of all the states $x_i(t)$ to $x_0(t)$. The multi-player game formulation allows considerable freedom for each agent while achieving this objective. Each agent has a performance objective that can embody team objectives as well as individual node objectives.

The performance objective of each node can be written as

$$J_i = \frac{1}{N_i}\sum_{j \in N_i} J_j + \frac{1}{N_i}\sum_{j \in N_i} (J_i - J_j) \equiv J_{team} + J_i^{conflict}$$

where $J_{team}$ is the overall ('center of gravity') performance objective of the networked team and $J_i^{conflict}$ is the conflict of interest or competitive objective. $J_{team}$ measures how much the players are vested in common goals, and $J_i^{conflict}$ expresses to what extent their objectives differ. The objective functions can be chosen by the individual players, or they may be assigned to yield some desired team behavior.
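The decomposition above is an exact identity: the team term and the conflict term always recombine to the node's own cost. A small numerical sketch (pure Python; the cost values and the graph are hypothetical, chosen only to illustrate the split):

```python
def team_conflict_split(J, neighbors, i):
    """Split node i's cost into the team average and the conflict of interest,
    following J_i = (1/N_i) sum_j J_j + (1/N_i) sum_j (J_i - J_j)."""
    Ni = neighbors[i]
    J_team = sum(J[j] for j in Ni) / len(Ni)
    J_conflict = sum(J[i] - J[j] for j in Ni) / len(Ni)
    return J_team, J_conflict

# Hypothetical costs on a 3-node graph where node 0 neighbors nodes 1 and 2.
J = {0: 4.0, 1: 2.0, 2: 6.0}
neighbors = {0: [1, 2]}
team, conflict = team_conflict_split(J, neighbors, 0)
# The two parts recombine to the node's own cost: team + conflict == J[0].
```

Here the conflict term happens to vanish because node 0's cost equals its neighborhood average; skewing $J_0$ away from that average moves the difference entirely into $J_0^{conflict}$.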

#### **4. Policy iteration algorithms for cooperative multi-player games**

Reinforcement learning (RL) techniques have been used to solve the single-player optimal control problem online using adaptive learning techniques to determine the optimal value function. Especially effective are the approximate dynamic programming (ADP) methods (Werbos, 1974; Werbos, 1992). RL techniques have also been applied for multiplayer games with centralized dynamics (18). See for example (Busoniu, Babuska, and De Schutter, 2008; Vrancx, Verbeeck, and Nowe, 2008). Most applications of RL for solving optimal control problems or games online have been to finite-state systems or discrete-time dynamical systems. In this section is given a policy iteration algorithm for solving continuous-time differential games on graphs. The structure of this algorithm is used in the next section to provide online adaptive solutions for graphical games.

#### **4.1 Best response**


Theorem 2 and Corollary 1 reveal that, under Assumptions 1 and 2, the systems are in interactive Nash equilibrium if, for all $i \in N$, node $i$ selects its best response policy to its neighbors' policies and the graph is strongly connected. Define the best response HJ equation as the Bellman equation (14) with control $u_i = u_i^*$ given by (16) and arbitrary policies $u_{-i} = \{u_j : j \in N_i\}$

$$0 = H_i\Big(\delta_i, \frac{\partial V_i}{\partial \delta_i}, u_i^*, u_{-i}\Big) \equiv \frac{\partial V_i}{\partial \delta_i}^T A_i^c + \frac{1}{2}\delta_i^T Q_{ii}\delta_i + \frac{1}{2}(d_i + g_i)^2 \frac{\partial V_i}{\partial \delta_i}^T B_i R_{ii}^{-1} B_i^T \frac{\partial V_i}{\partial \delta_i} + \frac{1}{2}\sum_{j \in N_i} u_j^T R_{ij} u_j \tag{38}$$

where the closed-loop matrix is

$$A_i^c = A\delta_i - \left(d_i + g_i\right)^2 B_i R_{ii}^{-1} B_i^T \frac{\partial V_i}{\partial \delta_i} - \sum_{j \in N_i} e_{ij}B_j u_j \tag{39}$$

#### **Theorem 3. Solution for Best Response Policy**

Given fixed neighbor policies $u_{-i} = \{u_j : j \in N_i\}$, assume there is an admissible policy $u_i$. Let $V_i \ge 0$, $V_i \in C^1$, be a smooth solution to the best response HJ equation (38), and let control policy $u_i^*$ be given by (16) in terms of this solution $V_i$. Then

a. Systems (8) are asymptotically stable so that all agents synchronize.

b. $u_i^*$ is the best response to the fixed policies $u_{-i}$ of its neighbors.

#### **Proof:**

a. $V_i \ge 0$ satisfies (38). The proof follows Theorem 2, part a.

b. According to part a, $\delta_i(t) \to 0$ for the selected control policies. For any smooth functions $V_i(\delta_i)$, $i \in N$, such that $V_i(0) = 0$, setting $V_i(\delta_i(\infty)) = 0$ one can write (9) as
$$\begin{aligned} J_i(\delta_i(0), u_i, u_{-i}) &= \frac{1}{2}\int_0^\infty \Big(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii}u_i + \sum_{j \in N_i} u_j^T R_{ij}u_j\Big)\,dt + V_i(\delta_i(0)) \\ &+ \int_0^\infty \frac{\partial V_i}{\partial \delta_i}^T \left( A\delta_i + (d_i + g_i)B_i u_i - \sum_{j \in N_i} e_{ij}B_j u_j \right) dt \end{aligned}$$

Now let $V_i$ satisfy (38), $u_i^*$ be the optimal control given by (16), and $u_{-i}$ be arbitrary policies. By completing the squares one has

$$J_i(\delta_i(0), u_i, u_{-i}) = V_i(\delta_i(0)) + \int_0^\infty \frac{1}{2}(u_i - u_i^*)^T R_{ii}(u_i - u_i^*)\,dt$$

The agents are in best response to the fixed policies $u_{-i}$ when $u_i = u_i^*$, so

$$J_i(\delta_i(0), u_i^*, u_{-i}) = V_i(\delta_i(0))$$

Then clearly $J_i(\delta_i(0), u_i, u_{-i})$ and $J_i(\delta_i(0), u_i^*, u_{-i})$ satisfy (19). ■


#### **4.2 Policy iteration solution for graphical games**

The following algorithm for the *N-*player distributed games is motivated by the structure of policy iteration algorithms in reinforcement learning (Bertsekas, and Tsitsiklis, 1996; Sutton, and Barto, 1998) which rely on repeated policy evaluation (e.g. solution of (14)) and policy improvement (solution of (16)). These two steps are repeated until the policy improvement step no longer changes the present policy. If the algorithm converges for every *i* , then it converges to the solution to HJ equations (25), and hence provides the distributed Nash equilibrium. One must note that the costs can be evaluated only in the case of admissible control policies, admissibility being a condition for the control policy which initializes the algorithm.

#### **Algorithm 1. Policy Iteration (PI) Solution for** N**-player distributed games.**

*Step 0*: Start with admissible initial policies $u_i^0$, $\forall i$.

*Step 1*: (Policy Evaluation) Solve for $V_i^k$ using (14)

$$H_i\Big(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_i^k, u_{-i}^k\Big) = 0\,, \forall i = 1, \dots, N \tag{40}$$

*Step 2*: (Policy Improvement) Update the *N*-tuple of control policies using

$$u_i^{k+1} = \underset{u_i}{\arg\min}\, H_i\Big(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_i, u_{-i}^k\Big), \forall i = 1, \dots, N$$

which explicitly is


$$
u_i^{k+1} = -(d_i + g_i) R_{ii}^{-1} B_i^T \frac{\partial V_i^k}{\partial \delta_i}\,, \forall i = 1, \ldots, N. \tag{41}
$$

Go to step 1.

On convergence: End.

■
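For intuition, the evaluation–improvement loop of Algorithm 1 can be illustrated on the scalar single-agent analogue $\dot{x} = ax + bu$ with cost $\frac{1}{2}\int_0^\infty (qx^2 + ru^2)\,dt$ and value $V = \frac{1}{2}px^2$, where policy evaluation has the closed form $p = (q + rk^2)/(2(bk - a))$ for a stabilizing gain $k$, and the fixed point is the algebraic Riccati solution. This is a sketch only, with illustrative parameter values; the graphical game of this chapter couples $N$ such problems through the neighbor terms:

```python
import math

# Scalar analogue of Algorithm 1 for dx/dt = a x + b u,
# cost (1/2) * integral of (q x^2 + r u^2) dt, value V = (1/2) p x^2.
a, b, q, r = 1.0, 1.0, 1.0, 1.0

def policy_evaluation(k):
    """Solve the Bellman equation p (a - b k) + (q + r k^2) / 2 = 0 for p,
    given the stabilizing policy u = -k x (requires b k > a)."""
    return (q + r * k ** 2) / (2.0 * (b * k - a))

def policy_improvement(p):
    """Step 2 of Algorithm 1: u = -(1/r) b p x, i.e. gain k = b p / r."""
    return b * p / r

k = 2.0                      # admissible (stabilizing) initial policy
for _ in range(50):          # repeat evaluation and improvement
    p = policy_evaluation(k)
    k = policy_improvement(p)

# The fixed point of the iteration is the positive root of the
# algebraic Riccati equation 2 a p - b^2 p^2 / r + q = 0.
p_star = r * (a + math.sqrt(a ** 2 + q * b ** 2 / r)) / b ** 2
```

With these values the iterates decrease monotonically ($p$: 2.5, 2.4167, ...) toward $p^* = 1 + \sqrt{2}$, mirroring the monotone convergence $V_i^{k+1} \le V_i^k$ established in Theorem 4.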

The following two theorems prove convergence of the policy iteration algorithm for distributed games in two different cases: i) *only* agent $i$ updates its policy, and ii) all the agents update their policies.

**Theorem 4.** Convergence of the Policy Iteration algorithm when only the $i$th agent updates its policy and all players $u_{-i}$ in its neighborhood do not change. Given fixed neighbor policies $u_{-i}$, assume there exists an admissible policy $u_i$. Assume that agent $i$ performs Algorithm 1 and its neighbors do not update their control policies. Then the algorithm converges to the best response $u_i$ to the policies $u_{-i}$ of the neighbors, and to the solution $V_i$ of the best response HJ equation (38).

#### **Proof:**

It is clear that

$$H_i^0\Big(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_{-i}^k\Big) \equiv \min_{u_i} H_i\Big(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_i, u_{-i}^k\Big) = H_i\Big(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_i^{k+1}, u_{-i}^k\Big) \tag{42}$$

Let $H_i(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_i^k, u_{-i}^k) = 0$ from (40); then according to (42) it is clear that

$$H_i^0\Big(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_{-i}^k\Big) \le 0 \tag{43}$$

Using the next control policy $u_i^{k+1}$ and the current policies $u_{-i}^k$, one has the orbital derivative (Leake and Liu, 1967)

$$\dot{V}_i^k = H_i\Big(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_i^{k+1}, u_{-i}^k\Big) - L_i(\delta_i, u_i^{k+1}, u_{-i}^k)$$

From (42) and (43) one has

$$\dot{V}_i^k = H_i^0\Big(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_{-i}^k\Big) - L_i(\delta_i, u_i^{k+1}, u_{-i}^k) \le -L_i(\delta_i, u_i^{k+1}, u_{-i}^k) \tag{44}$$

Because only agent $i$ updates its control, it is true that $u_{-i}^{k+1} = u_{-i}^k$ and

$$H_i\Big(\delta_i, \frac{\partial V_i^{k+1}}{\partial \delta_i}, u_i^{k+1}, u_{-i}^k\Big) = 0\,.$$

But since $\dot{V}_i^{k+1} = -L_i(\delta_i, u_i^{k+1}, u_{-i}^k)$, from (44) one has

$$\dot{V}_i^k = H_i^0\Big(\delta_i, \frac{\partial V_i^k}{\partial \delta_i}, u_{-i}^k\Big) - L_i(\delta_i, u_i^{k+1}, u_{-i}^k) \le -L_i(\delta_i, u_i^{k+1}, u_{-i}^k) = \dot{V}_i^{k+1} \tag{45}$$

So that $\dot{V}_i^k \le \dot{V}_i^{k+1}$, and by integration it follows that

$$V\_i^{k+1} \le V\_i^k \tag{46}$$

■

Since $V_i^k \ge V_i^*$, the algorithm converges to $V_i^*$, the solution of the best response HJ equation (38).

The next result concerns the case where all nodes update their policies at each step of the algorithm. Define the relative control weighting as $\sigma_{ij} \equiv \overline{\sigma}(R_{jj}^{-1}R_{ij})$, where $\overline{\sigma}(R_{jj}^{-1}R_{ij})$ is the maximum singular value of $R_{jj}^{-1}R_{ij}$.
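The relative control weighting just defined is a maximum singular value, which for a $2 \times 2$ matrix has a closed form via the eigenvalues of $M^T M$. A small sketch (pure Python; the weight matrices below are hypothetical and chosen so the answer is known by inspection):

```python
import math

def max_singular_value_2x2(M):
    """Largest singular value of a 2x2 matrix: square root of the largest
    eigenvalue of M^T M, using the closed-form 2x2 eigenvalue formula."""
    (a, b), (c, d) = M
    # Entries of the symmetric matrix S = M^T M.
    s11 = a * a + c * c
    s12 = a * b + c * d
    s22 = b * b + d * d
    tr, det = s11 + s22, s11 * s22 - s12 * s12
    lam_max = (tr + math.sqrt(max(tr * tr - 4.0 * det, 0.0))) / 2.0
    return math.sqrt(lam_max)

# Hypothetical diagonal weights: R_jj = diag(2, 2), R_ij = diag(1, 4),
# so R_jj^{-1} R_ij = diag(0.5, 2.0) and the weighting equals 2.0.
Rjj_inv_Rij = [[0.5, 0.0],
               [0.0, 2.0]]
sigma = max_singular_value_2x2(Rjj_inv_Rij)
```

Theorem 5 asks that this weighting, together with the edge weights $e_{ij}$, be small enough, so in practice one would evaluate it for every edge of the graph.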

**Theorem 5. Convergence of Policy Iteration algorithm when all agents update their policies.** Assume all nodes $i$ update their policies at each iteration of PI. Then, for small enough edge weights $e_{ij}$ and weightings $\sigma_{ij}$, $u_i$ converges to the global Nash equilibrium for all $i$, and the values converge to the optimal game values $V_i^k \to V_i^*$.

#### **Proof:**

It is clear that

$$\begin{split} &H_i\Big(\delta_i, \frac{\partial V_i^{k+1}}{\partial \delta_i}, u_i^{k+1}, u_{-i}^{k+1}\Big) \equiv H_i^0\Big(\delta_i, \frac{\partial V_i^{k+1}}{\partial \delta_i}, u_{-i}^k\Big) + \frac{1}{2}\sum_{j \in N_i} \left(u_j^{k+1} - u_j^k\right)^T R_{ij}\left(u_j^{k+1} - u_j^k\right) \\ &+ \sum_{j \in N_i} u_j^{kT} R_{ij}\left(u_j^{k+1} - u_j^k\right) + \frac{\partial V_i^{k+1}}{\partial \delta_i}^T \sum_{j \in N_i} e_{ij}B_j\left(u_j^k - u_j^{k+1}\right) \end{split}$$

and so

$$\begin{split} \dot{V}_i^{k+1} &= -L_i(\delta_i, u_i^{k+1}, u_{-i}^{k+1}) = -L_i(\delta_i, u_i^{k+1}, u_{-i}^k) + \frac{1}{2}\sum_{j \in N_i} \left(u_j^{k+1} - u_j^k\right)^T R_{ij}\left(u_j^{k+1} - u_j^k\right) \\ &+ \frac{\partial V_i^{k+1}}{\partial \delta_i}^T \sum_{j \in N_i} e_{ij}B_j\left(u_j^k - u_j^{k+1}\right) + \sum_{j \in N_i} u_j^{kT} R_{ij}\left(u_j^{k+1} - u_j^k\right) \end{split}$$

Therefore,


$$\begin{split} \dot{V}\_{i}^{k} &\leq \dot{V}\_{i}^{k+1} - \frac{1}{2} \sum\_{j \in \mathcal{N}\_{i}} \{\boldsymbol{\mu}\_{j}^{k+1} - \boldsymbol{\mu}\_{j}^{k}\}^{T} \boldsymbol{R}\_{ij} \{\boldsymbol{\mu}\_{j}^{k+1} - \boldsymbol{\mu}\_{j}^{k}\} \\ &+ \frac{\partial \mathcal{V}\_{i}}{\partial \mathcal{S}\_{i}} \sum\_{j \in \mathcal{N}\_{i}} e\_{ij} \boldsymbol{B}\_{j} \{\boldsymbol{\mu}\_{j}^{k+1} - \boldsymbol{\mu}\_{j}^{k}\} - \sum\_{j \in \mathcal{N}\_{i}} \boldsymbol{\mu}\_{j}^{kT} \boldsymbol{R}\_{ij} \{\boldsymbol{\mu}\_{j}^{k+1} - \boldsymbol{\mu}\_{j}^{k}\}. \end{split}$$

A sufficient condition for *k k* <sup>1</sup> *V V i i* is

$$\frac{1}{2}\Delta\boldsymbol{\mu}\_{j}^{T}\boldsymbol{R}\_{i\boldsymbol{j}}\Delta\boldsymbol{\mu}\_{j} - \boldsymbol{e}\_{i\boldsymbol{j}}(\boldsymbol{p}\_{i}^{k+1})^{T}\boldsymbol{B}\_{\boldsymbol{j}}\Delta\boldsymbol{\mu}\_{j} - (\boldsymbol{d}\_{j} + \boldsymbol{\mathsf{g}}\_{j})(\boldsymbol{p}\_{j}^{k-1})\boldsymbol{B}\_{\boldsymbol{j}}^{T}\boldsymbol{R}\_{\boldsymbol{j}\boldsymbol{j}}^{-1}\boldsymbol{R}\_{i\boldsymbol{j}}\Delta\boldsymbol{\mu}\_{j} > 0$$

1 1 1 <sup>2</sup> ( ) ( ) *k k R u ep B d g p B ij j ij <sup>i</sup> j jj ijj j* where <sup>1</sup> ( ) *k k uu u jj j* , *<sup>i</sup> <sup>p</sup>* the costate and ( ) *Rij* is the minimum singular value of *Rij* .

This holds if 0, 0 *ij ij e* . By continuity, it holds for small values of , *ij ij e* .

This proof indicates that for the PI algorithm to converge, the neighbors' controls should not unduly influence the $i$th node dynamics (8), and the $j$th node should weight its own control $\boldsymbol{u}_j$ in its performance index $J_j$ relatively more than node $i$ weights $\boldsymbol{u}_j$ in $J_i$. These requirements are consistent with selecting the weighting matrices to obtain proper performance in the simulation examples. An alternative condition for convergence in Theorem 5 is that the norm $\|B_j\|$ should be small. This is similar to the case of weakly coupled dynamics in multi-player games in (Başar and Olsder, 1999).
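The smallness condition of Theorem 5 is easy to check numerically for candidate weighting matrices. A minimal sketch, where the matrix values are illustrative choices and not taken from the chapter:

```python
import numpy as np

# Hypothetical weighting matrices for one edge (i, j); in the scalar case of
# the simulation section these reduce to numbers, e.g. R_jj = 9, R_ij = 1.
R_jj = np.diag([9.0, 9.0])   # node j's weight on its own control u_j in J_j
R_ij = np.diag([1.0, 1.0])   # node i's weight on u_j in J_i

# Relative control weighting: maximum singular value of R_jj^{-1} R_ij.
sigma_bar = np.linalg.svd(np.linalg.solve(R_jj, R_ij), compute_uv=False).max()

# Theorem 5 asks this (and the edge weights e_ij) to be small, i.e. node j
# should weight u_j in J_j more heavily than node i weights u_j in J_i.
print(sigma_bar)  # 1/9 here, well below 1
```

A value well below 1, as here, indicates the kind of weight selection under which the all-agents-update PI iteration remains monotone.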

#### **5. Online solution of multi-agent cooperative games using neural networks**

In this section an online algorithm for solving cooperative Hamilton-Jacobi equations (25) based on (Vamvoudakis, Lewis 2011) is presented. This algorithm uses the structure in the PI Algorithm 1 to develop an actor/critic adaptive control architecture for approximate online solution of (25). Approximate solutions of (40), (41) are obtained using value function approximation (VFA). The algorithm uses two approximator structures at each node, which are taken here as neural networks (NN) (Abu-Khalaf, and Lewis, 2005; Bertsekas, and Tsitsiklis, 1996; Vamvoudakis, Lewis 2010; Werbos, 1974; Werbos, 1992). One critic NN is used at each node for value function approximation, and one actor NN at each node to approximate the control policy (41). The critic NN seeks to solve Bellman equation (40). We give tuning laws for the actor NN and the critic NN such that equations (40) and (41) are solved simultaneously online for each node. Then, the solutions to the coupled HJ equations (25) are determined. Though these coupled HJ equations are difficult to solve, and may not even have analytic solutions, we show how to tune the NN so that the approximate solutions are learned online. The next assumption is made.

**Assumption 2.** For each admissible control policy the nonlinear Bellman equations (14), (40) have smooth solutions $V_i \ge 0$.

■

In fact, only local smooth solutions are needed. To solve the Bellman equations (40), approximation is required of both the value functions $V_i$ and their gradients $\partial V_i / \partial \delta_i$. This requires approximation in Sobolev space (Abu-Khalaf and Lewis, 2005).

#### **5.1 Critic neural network**

According to the Weierstrass higher-order approximation Theorem (Abu-Khalaf and Lewis, 2005) there are NN weights $W_i$ such that the smooth value functions $V_i$ are approximated using a critic NN as

$$V_i(\delta_i) = W_i^{T}\phi_i(z_i) + \varepsilon_i \tag{47}$$

where $z_i(t)$ is an information vector constructed at node $i$ using locally available measurements, e.g. $\delta_i(t), \{\delta_j(t) : j \in N_i\}$. The vectors $\phi_i(z_i) \in \mathbb{R}^{h}$ are the critic NN activation function vectors, with $h$ the number of neurons in the critic NN hidden layer. According to the Weierstrass Theorem, the NN approximation error $\varepsilon_i$ converges to zero uniformly as $h \to \infty$. Assuming current weight estimates $\hat{W}_i$, the outputs of the critic NN are given by

$$\hat{V}_i = \hat{W}_i^{T}\phi_i \tag{48}$$

Then, the Bellman equation (40) can be approximated at each step *k* as

$$H_i(\delta_i,\hat{W}_i,\boldsymbol{u}_i,\boldsymbol{u}_{-i}) = \delta_i^{T}Q_{ii}\delta_i + \boldsymbol{u}_i^{T}R_{ii}\boldsymbol{u}_i + \sum_{j\in N_i}\boldsymbol{u}_j^{T}R_{ij}\boldsymbol{u}_j + \hat{W}_i^{T}\frac{\partial\phi_i}{\partial\delta_i}\Big(A\delta_i + (d_i+g_i)B_i\boldsymbol{u}_i - \sum_{j\in N_i}e_{ij}B_j\boldsymbol{u}_j\Big) = e_{H_i} \tag{49}$$

It is desired to select $\hat{W}_i$ to minimize the squared residual error

$$E\_1 = \frac{1}{2} e\_{H\_i}^T e\_{H\_i} \tag{50}$$

Then $\hat{W}_i \to W_i$, which solves (49) in a least-squares sense, and $e_{H_i}$ becomes small. Theorem 6 gives a tuning law for the critic weights that achieves this.
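The critic tuning idea behind the law given later in (52) is ordinary normalized gradient descent on the squared Bellman residual (50). A toy sketch with a synthetic linear residual; the regressor `sigma`, the target `r`, and the learning rate are made up for illustration and are not the closed-loop signals:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy critic update: the Bellman residual is linear in the critic weights,
# e_H = sigma^T W_hat + r, and we descend the squared residual with the
# normalized gradient step used in the critic tuning law.
W_true = np.array([1.0, -2.0, 0.5])  # "ideal" weights, for illustration
W_hat = np.zeros(3)
a = 5.0                              # critic learning rate (illustrative)

for _ in range(2000):
    sigma = rng.standard_normal(3)   # random regressor (persistently exciting)
    r = -sigma @ W_true              # so that e_H = sigma^T (W_hat - W_true)
    e_H = sigma @ W_hat + r          # Bellman residual
    # gradient of E = e_H^2 / 2, normalized by (1 + sigma^T sigma)^2
    W_hat -= a * sigma / (1 + sigma @ sigma) ** 2 * e_H

print(np.round(W_hat, 3))  # approaches W_true
```

The normalization keeps the effective step size bounded regardless of the regressor magnitude, which is what makes this update safe to run online.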

#### **5.2 Action neural network and online learning**

Define the control policy in the form of an action neural network which computes the control input (41) in the structured form

$$\hat{\boldsymbol{u}}_i \equiv \hat{\boldsymbol{u}}_{i+N} = -\frac{1}{2}(d_i+g_i)R_{ii}^{-1}B_i^{T}\frac{\partial\phi_i^{T}}{\partial\delta_i}\hat{W}_{i+N} \tag{51}$$

where $\hat{W}_{i+N}$ denotes the current estimated values of the ideal actor NN weights $W_i$. The notation $\hat{\boldsymbol{u}}_{i+N}$ is used to keep indices straight in the proof. Define the critic and actor NN estimation errors as $\tilde{W}_i \equiv W_i - \hat{W}_i$ and $\tilde{W}_{i+N} \equiv W_i - \hat{W}_{i+N}$.
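For the quadratic bases used in the simulation section, (51) is a short matrix expression. A sketch for one 2-state agent; the function name and the numerical values of $d_i$, $g_i$, $R_{ii}$, $B_i$ are illustrative assumptions:

```python
import numpy as np

# Sketch of the actor NN output (51) for an agent with 2-state error delta
# and quadratic basis phi(delta) = [d1^2, d1*d2, d2^2]^T.
def actor_control(delta, W_hat, d_i, g_i, R_ii, B_i):
    d1, d2 = delta
    # Jacobian d(phi)/d(delta), shape (3, 2), of the quadratic basis
    dphi = np.array([[2 * d1, 0.0],
                     [d2, d1],
                     [0.0, 2 * d2]])
    # u_hat = -1/2 (d_i + g_i) R_ii^{-1} B_i^T (dphi)^T W_hat
    return -0.5 * (d_i + g_i) * np.linalg.solve(R_ii, B_i.T @ dphi.T @ W_hat)

u = actor_control(np.array([1.0, -1.0]), np.array([1.0, 0.0, 1.0]),
                  d_i=1.0, g_i=0.0, R_ii=np.array([[4.0]]), B_i=np.array([[2.0], [1.0]]))
print(u)  # [-0.25]
```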

The next results show how to tune the critic NN and actor NN in real time at each node so that equations (40) and (41) are simultaneously solved, while closed-loop system stability is also guaranteed. Simultaneous solution of (40) and (41) guarantees that the coupled HJ equations (25) are solved for each node $i$. System (8) is said to be uniformly ultimately bounded (UUB) if there exists a compact set $S \subset \mathbb{R}^{n}$ so that for all $\delta_i(0) \in S$ there exist a bound $B$ and a time $T(B, \delta_i(0))$ such that $\|\delta_i(t)\| \le B$ for all $t \ge t_0 + T$.

Select the tuning law for the *ith* critic NN as

$$\begin{split} \dot{\hat{W}}_i &= -a_i\frac{\partial E_1}{\partial \hat{W}_i} = -a_i\frac{\sigma_{i+N}}{\big(1+\sigma_{i+N}^{T}\sigma_{i+N}\big)^{2}}\Big[\sigma_{i+N}^{T}\hat{W}_i + \delta_i^{T}Q_{ii}\delta_i + \frac{1}{4}\hat{W}_{i+N}^{T}\bar{D}_i\hat{W}_{i+N} \\ &\quad + \frac{1}{4}\sum_{j\in N_i}(d_j+g_j)^{2}\,\hat{W}_{j+N}^{T}\frac{\partial\phi_j}{\partial\delta_j}B_jR_{jj}^{-T}R_{ij}R_{jj}^{-1}B_j^{T}\frac{\partial\phi_j^{T}}{\partial\delta_j}\hat{W}_{j+N}\Big] \end{split} \tag{52}$$

where $\sigma_{i+N} \equiv \frac{\partial\phi_i}{\partial\delta_i}\big(A\delta_i + (d_i+g_i)B_i\hat{\boldsymbol{u}}_{i+N} - \sum_{j\in N_i}e_{ij}B_j\hat{\boldsymbol{u}}_{j+N}\big)$, and the tuning law for the *i*th actor

NN as

$$\begin{split} \dot{\hat{W}}_{i+N} &= -a_{i+N}\Big\{\big(G_i\hat{W}_{i+N} - F_i\bar{\sigma}_{i+N}^{T}\hat{W}_i\big) - \frac{1}{4}\bar{D}_i\hat{W}_{i+N}\frac{\bar{\sigma}_{i+N}}{m_{s_i}}^{T}\hat{W}_i \\ &\quad - \frac{1}{4}\sum_{j\in N_i}(d_j+g_j)^{2}\frac{\partial\phi_j}{\partial\delta_j}B_jR_{jj}^{-T}R_{ij}R_{jj}^{-1}B_j^{T}\frac{\partial\phi_j^{T}}{\partial\delta_j}\hat{W}_j\frac{\bar{\sigma}_{i+N}}{m_{s_i}}^{T}\hat{W}_{i+N}\Big\} \end{split} \tag{53}$$

where

<sup>1</sup> ( ) *T i i T i i ii i i i D x BR B* , ( 1) *<sup>i</sup> <sup>T</sup> ms iN iN* , /( 1) *<sup>T</sup> iN iN iN iN* , and 0, 0 *a a i iN* and 0, 0, *F G iN i i* are tuning parameters.

#### **Theorem 6. Online Cooperative Games.**

Let the error dynamics be given by (8), and consider the cooperative game formulation in (15). Let the critic NN at each node be given by (48) and the control input be given for each node by actor NN (51). Let the tuning law for the *i*th critic NN be provided by (52) and the tuning law for the *i*th actor NN be provided by (53). Assume $\bar{\sigma}_{i+N} = \sigma_{i+N}/(\sigma_{i+N}^{T}\sigma_{i+N}+1)$ is persistently exciting. Then the closed-loop system states $\delta_i(t)$, the critic NN errors $\tilde{W}_i$, and the actor NN errors $\tilde{W}_{i+N}$ are uniformly ultimately bounded.

#### **Proof:**

The proof is similar to (Vamvoudakis, 2011). ■

**Remark 1.** Theorem 6 provides algorithms for tuning the actor/critic networks of the *N* agents at the same time to guarantee stability and make the system errors $\delta_i(t)$ small and the NN approximation errors bounded. Small errors guarantee synchronization of all the node trajectories.

**Remark 2.** Persistence of excitation is needed for proper identification of the value functions by the critic NNs, and nonstandard tuning algorithms are required for the actor NNs to guarantee stability. It is important to notice that the actor NN tuning law of every agent needs information of the critic weights of all its neighbors, while the critic NN tuning law of every agent needs information of the actor weights of all its neighbors.

**Remark 3.** NN usage suggests starting with random, nonzero control NN weights in (51) in order to converge to the coupled HJ equation solutions. However, extensive simulations show that convergence is more sensitive to the persistence of excitation in the control inputs than to the NN weight initialization. If the proper persistence of excitation is not selected, the control weights may not converge to the correct values.

**Remark 4.** The issue of which inputs ( ) *<sup>i</sup> z t* to use for the critic and actor NNs needs to be addressed. According to the dynamics (8), the value functions (13), and the control inputs (16), the NN inputs at node *i* should consist of its own state, the states of its neighbors, and the costates of its neighbors. However, in view of (31) the costates are functions of the states. In view of the approximation capabilities of NN, it is found in simulations that it is suitable to take as the NN inputs at node *i* its own state and the states of its neighbors.

The next result shows that the tuning laws given in Theorem 6 guarantee approximate solution to the coupled HJ equations (25) and convergence to the Nash equilibrium.

#### **Theorem 7. Convergence to Cooperative Nash Equilibrium.**

Suppose the hypotheses of Theorem 6 hold. Then:

a. $\hat{H}_i(\delta_i,\hat{W}_i,\hat{\boldsymbol{u}}_i,\hat{\boldsymbol{u}}_{-i})$ are uniformly ultimately bounded, where $\hat{\boldsymbol{u}}_i = -\frac{1}{2}(d_i+g_i)R_{ii}^{-1}B_i^{T}\frac{\partial\phi_i^{T}}{\partial\delta_i}\hat{W}_i$. That is, the $\hat{W}_i$ converge to the approximate cooperative coupled HJ-solution.

b. $\hat{\boldsymbol{u}}_{i+N}$ converge to the approximate cooperative Nash equilibrium (Definition 2) for every $i$.

#### **Proof:**

The proof is similar to (Vamvoudakis, 2011) but is done only with respect to the neighbors (local information) of each agent and not with respect to all agents.

Consider the weights $\hat{W}_i$, $\hat{W}_{i+N}$ to be UUB as proved in Theorem 6.

a. The approximate coupled HJ equations are

$$\begin{split} H_i(\delta_i,\hat{W}_i,\hat{\boldsymbol{u}}_i,\hat{\boldsymbol{u}}_{-i}) = H_i(\delta_i,\hat{W}_i,\hat{W}_{-i}) &= \delta_i^{T}Q_{ii}\delta_i + \hat{W}_i^{T}\frac{\partial\phi_i}{\partial\delta_i}A\delta_i - \frac{1}{4}(d_i+g_i)^{2}\hat{W}_i^{T}\frac{\partial\phi_i}{\partial\delta_i}B_iR_{ii}^{-1}B_i^{T}\frac{\partial\phi_i^{T}}{\partial\delta_i}\hat{W}_i \\ &\quad + \frac{1}{4}\sum_{j\in N_i}(d_j+g_j)^{2}\hat{W}_j^{T}\frac{\partial\phi_j}{\partial\delta_j}B_jR_{jj}^{-1}R_{ij}R_{jj}^{-1}B_j^{T}\frac{\partial\phi_j^{T}}{\partial\delta_j}\hat{W}_j \\ &\quad + \frac{1}{2}\sum_{j\in N_i}e_{ij}(d_j+g_j)\hat{W}_i^{T}\frac{\partial\phi_i}{\partial\delta_i}B_jR_{jj}^{-1}B_j^{T}\frac{\partial\phi_j^{T}}{\partial\delta_j}\hat{W}_j \end{split}$$

where, with the exact weights $W_i$, $H_i(\delta_i,W_i,W_{-i}) = \varepsilon_{HJ_i}$ and $\varepsilon_{HJ_i}$ is the residual error due to approximation.

After adding zero we have


$$\begin{split} H_i(\delta_i,\hat{W}_i,\hat{W}_{-i}) = \varepsilon_{HJ_i} &- \tilde{W}_i^{T}\frac{\partial\phi_i}{\partial\delta_i}A\delta_i + \frac{1}{4}(d_i+g_i)^{2}\Big(2W_i^{T}\bar{D}_i\tilde{W}_i - \tilde{W}_i^{T}\bar{D}_i\tilde{W}_i\Big) \\ &+ \frac{1}{4}\sum_{j\in N_i}(d_j+g_j)^{2}\Big(\tilde{W}_j^{T}\bar{D}_{ij}\tilde{W}_j - 2W_j^{T}\bar{D}_{ij}\tilde{W}_j\Big) \\ &+ \frac{1}{2}\sum_{j\in N_i}e_{ij}(d_j+g_j)\Big(\tilde{W}_i^{T}\bar{B}_{ij}\tilde{W}_j - \tilde{W}_i^{T}\bar{B}_{ij}W_j - W_i^{T}\bar{B}_{ij}\tilde{W}_j\Big) \end{split} \tag{54}$$

where $\bar{D}_{ij} \equiv \frac{\partial\phi_j}{\partial\delta_j}B_jR_{jj}^{-1}R_{ij}R_{jj}^{-1}B_j^{T}\frac{\partial\phi_j^{T}}{\partial\delta_j}$ and $\bar{B}_{ij} \equiv \frac{\partial\phi_i}{\partial\delta_i}B_jR_{jj}^{-1}B_j^{T}\frac{\partial\phi_j^{T}}{\partial\delta_j}$.

But

$$\hat{W}_i = -\tilde{W}_i + W_i, \quad \forall i \tag{55}$$

After taking norms in (55) and letting $\|W_i\| \le W_{\max}$ one has

$$\big\|\hat{W}_i\big\| = \big\|-\tilde{W}_i + W_i\big\| \le \big\|\tilde{W}_i\big\| + \big\|W_i\big\| \le \big\|\tilde{W}_i\big\| + W_{\max}$$

Now (54) with $\varepsilon \equiv \sup_i \|\varepsilon_{HJ_i}\|$ becomes

$$\begin{split} \big\|H_i(\delta_i,\hat{W}_i,\hat{W}_{-i})\big\| \le \varepsilon &+ \big\|\tilde{W}_i\big\|\Big\|\frac{\partial\phi_i}{\partial\delta_i}A\delta_i\Big\| + \frac{1}{4}(d_i+g_i)^{2}\big\|B_i\big\|^{2}\,\bar{\sigma}\big(R_{ii}^{-1}\big)\Big\|\frac{\partial\phi_i}{\partial\delta_i}\Big\|^{2}\Big(\big\|\tilde{W}_i\big\|^{2} + 2W_{\max}\big\|\tilde{W}_i\big\|\Big) \\ &+ \frac{1}{4}\sum_{j\in N_i}(d_j+g_j)^{2}\big\|B_j\big\|^{2}\,\bar{\sigma}\big(R_{jj}^{-1}R_{ij}R_{jj}^{-1}\big)\Big\|\frac{\partial\phi_j}{\partial\delta_j}\Big\|^{2}\Big(\big\|\tilde{W}_j\big\|^{2} + 2W_{\max}\big\|\tilde{W}_j\big\|\Big) \\ &+ \frac{1}{2}\sum_{j\in N_i}e_{ij}(d_j+g_j)\big\|B_j\big\|^{2}\,\bar{\sigma}\big(R_{jj}^{-1}\big)\Big\|\frac{\partial\phi_i}{\partial\delta_i}\Big\|\Big\|\frac{\partial\phi_j}{\partial\delta_j}\Big\|\Big(\big\|\tilde{W}_i\big\|\big\|\tilde{W}_j\big\| + W_{\max}\big\|\tilde{W}_i\big\| + W_{\max}\big\|\tilde{W}_j\big\|\Big) \end{split} \tag{56}$$

All the signals on the right hand side of (56) are UUB and convergence to the approximate coupled HJ solution is obtained for every agent.

b. According to Theorem 6, $\hat{W}_i$, $\hat{W}_{i+N}$ are UUB for every $i$. Then it is obvious that $\hat{\boldsymbol{u}}_{i+N}$ give the approximate cooperative Nash equilibrium (Definition 2). ■

#### **6. Simulation results**

This section shows the effectiveness of the online approach described in Theorem 6 for two different cases.

Consider the three-node strongly connected digraph structure shown in Figure 1 with a leader node connected to node 3. The edge weights and the pinning gains are taken equal to 1, so that $d_1 = d_2 = 1$ and $d_3 = 2$.
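The in-degrees follow directly from the adjacency structure. A quick computation, where the adjacency matrix below is inferred from the information vectors $z_i$ given later (node 1 listens to node 3, node 2 to node 1, node 3 to nodes 1 and 2; the leader pins node 3), which is an assumption about the exact digraph:

```python
import numpy as np

# Edge weights e_ij = 1 for each edge of the assumed three-node digraph
E = np.array([[0, 0, 1],   # e_13 = 1: node 1 hears node 3
              [1, 0, 0],   # e_21 = 1: node 2 hears node 1
              [1, 1, 0]])  # e_31 = e_32 = 1: node 3 hears nodes 1 and 2
g = np.array([0, 0, 1])    # pinning gains: leader -> node 3

d = E.sum(axis=1)          # in-degrees d_i = sum_j e_ij
print(d, g)                # d = [1 1 2]
```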

Fig. 1. Three agent communication graph showing the interactions.

Select the weight matrices in (9) as

$$\begin{aligned} Q_{11} = Q_{22} = Q_{33} &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \; R_{11} = 4, \; R_{12} = 1, \; R_{13} = -1, \\ R_{31} = -4, \; R_{22} &= 9, \; R_{23} = 1, \; R_{33} = 9, \; R_{32} = 1, \; R_{21} = 1 \end{aligned}$$

In the examples below, every node is a second-order system. Then, for every agent, $\delta_i = \begin{bmatrix} \delta_{i1} & \delta_{i2} \end{bmatrix}^{T}$.

According to the graph structure, the information vector at each node is

$$\boldsymbol{z}\_1 = \begin{bmatrix} \boldsymbol{\delta}\_1^T & \boldsymbol{\delta}\_3^T \end{bmatrix}^T \text{ , } \boldsymbol{z}\_2 = \begin{bmatrix} \boldsymbol{\delta}\_1^T & \boldsymbol{\delta}\_2^T \end{bmatrix}^T \text{ , } \boldsymbol{z}\_3 = \begin{bmatrix} \boldsymbol{\delta}\_1^T & \boldsymbol{\delta}\_2^T & \boldsymbol{\delta}\_3^T \end{bmatrix}^T$$

Since the value is quadratic, the critic NNs basis sets were selected as the quadratic vector in the agent's components and its neighbors' components. Thus the NN activation functions are

$$
\phi\_1(\delta\_1, 0, \delta\_3) = \begin{bmatrix}
\delta\_{11}^2 & \delta\_{11}\delta\_{12} & \delta\_{12}^2 & 0 & 0 & 0 & \delta\_{31}^2 & \delta\_{31}\delta\_{32} & \delta\_{32}^2
\end{bmatrix}^T
$$

$$
\phi_2(\delta_1, \delta_2, 0) = \begin{bmatrix}
\delta_{11}^2 & \delta_{11}\delta_{12} & \delta_{12}^2 & \delta_{21}^2 & \delta_{21}\delta_{22} & \delta_{22}^2 & 0 & 0 & 0
\end{bmatrix}^T
$$

$$
\phi\_3(\delta\_1, \delta\_2, \delta\_3) = \begin{bmatrix}
\delta\_{11}^2 & \delta\_{11}\delta\_{12} & \delta\_{12}^2 & \delta\_{21}^2 & \delta\_{21}\delta\_{22} & \delta\_{22}^2 & \delta\_{31}^2 & \delta\_{31}\delta\_{32} & \delta\_{32}^2
\end{bmatrix}^T
$$
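These activation vectors can be generated mechanically: each state in $z_i$ contributes its quadratic monomials, with zeros for absent neighbors. A small helper for 2-state agents (the function name and stacking convention are illustrative), shown here for node 3, which sees all three states:

```python
import numpy as np

# Quadratic critic basis: each 2-state block delta = [d1, d2] contributes
# the monomials [d1^2, d1*d2, d2^2], stacked in neighbor order.
def quad_basis(*deltas):
    out = []
    for d in deltas:
        d1, d2 = d
        out.extend([d1 * d1, d1 * d2, d2 * d2])
    return np.array(out)

phi3 = quad_basis([1.0, 2.0], [0.0, 1.0], [3.0, -1.0])
print(phi3)  # entries: 1, 2, 4, 0, 0, 1, 9, -3, 1
```

Absent neighbors (as in $\phi_1$ and $\phi_2$) would be passed as zero blocks so that every node's basis has the same length.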

#### **6.1 Position and velocity regulated to zero**

For the graph structure shown, consider the node dynamics

$$
\dot{\mathbf{x}}_1 = \begin{bmatrix} -2 & 1 \\ -4 & -1 \end{bmatrix} \mathbf{x}_1 + \begin{bmatrix} 2 \\ 1 \end{bmatrix} \boldsymbol{u}_1,\quad
\dot{\mathbf{x}}_2 = \begin{bmatrix} -2 & 1 \\ -4 & -1 \end{bmatrix} \mathbf{x}_2 + \begin{bmatrix} 2 \\ 3 \end{bmatrix} \boldsymbol{u}_2,\quad
\dot{\mathbf{x}}_3 = \begin{bmatrix} -2 & 1 \\ -4 & -1 \end{bmatrix} \mathbf{x}_3 + \begin{bmatrix} 2 \\ 2 \end{bmatrix} \boldsymbol{u}_3,
$$

and the command generator $\dot{\mathbf{x}}_0 = \begin{bmatrix} -2 & 1 \\ -4 & -1 \end{bmatrix} \mathbf{x}_0$.

The graphical game is implemented as in Theorem 6. Persistence of excitation was ensured by adding a small exponentially decreasing probing noise to the control inputs. Figure 2 shows the convergence of the critic parameters for every agent. Figure 3 shows the evolution of the states for the duration of the experiment.
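A typical way to implement such a probing signal is a sum of sinusoids under an exponential envelope; the amplitude, decay rate, and frequencies below are illustrative choices, not taken from the chapter:

```python
import math

# Exponentially decaying probing noise added to the control inputs to keep
# them persistently exciting while the NN weights are being learned.
def probing_noise(t, amp=1.0, decay=0.05, freqs=(1.0, 3.0, 7.0)):
    # sum of sinusoids at distinct frequencies, decayed over time
    return amp * math.exp(-decay * t) * sum(math.sin(w * t) for w in freqs)

# the excitation fades, so the learned weights can settle
print(abs(probing_noise(0.0)), abs(probing_noise(200.0)) < 1e-3)  # 0.0 True
```

Using several distinct frequencies excites more directions of the regressor, which is what the persistence-of-excitation condition of Theorem 6 requires.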

#### **6.2 All the nodes synchronize to the curve behavior of the leader node**

For the graph structure shown above consider the following node dynamics

$$\dot{\mathbf{x}}_1 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \mathbf{x}_1 + \begin{bmatrix} 2 \\ 1 \end{bmatrix} \boldsymbol{u}_1,\quad \dot{\mathbf{x}}_2 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \mathbf{x}_2 + \begin{bmatrix} 2 \\ 3 \end{bmatrix} \boldsymbol{u}_2,\quad \dot{\mathbf{x}}_3 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \mathbf{x}_3 + \begin{bmatrix} 2 \\ 2 \end{bmatrix} \boldsymbol{u}_3.$$

with target generator $\dot{\mathbf{x}}_0 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \mathbf{x}_0$.

Fig. 2. Convergence of the critic parameters.

Fig. 3. Evolution of the system states and regulation.

Fig. 4. Convergence of the critic parameters.


The command generator is marginally stable with poles at $s = \pm j$, so it generates a sinusoidal reference trajectory.
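The leader dynamics can be checked numerically; the short Euler simulation below (step size illustrative) confirms the marginal stability and the circular trajectory behind the Lissajous plot:

```python
import numpy as np

# Command generator x0_dot = [[0, 1], [-1, 0]] x0: eigenvalues +/- j,
# so trajectories are circles (sinusoidal references).
A0 = np.array([[0.0, 1.0],
               [-1.0, 0.0]])
eigs = np.linalg.eigvals(A0)
print(eigs)  # eigenvalues +/- j (zero real parts)

# a short Euler simulation over one period stays close to the unit circle
x = np.array([1.0, 0.0])
dt = 1e-3
for _ in range(int(2 * np.pi / dt)):
    x = x + dt * (A0 @ x)
print(np.linalg.norm(x))  # close to 1 (slight outward Euler drift)
```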

Fig. 5. Synchronization of all the agents to the leader node.

The graphical game is implemented as in Theorem 6. Persistence of excitation was ensured by adding a small exponentially decreasing probing noise to the control inputs. Figure 4 shows the critic parameters converging for every agent. Figure 5 shows the synchronization of all the agents to the leader's behavior as given by the circular Lissajous plot.

#### **7. Conclusion**

This chapter brings together cooperative control, reinforcement learning, and game theory to solve multi-player differential games on communication graph topologies. It formulates graphical games for dynamic systems and provides policy iteration and online learning algorithms along with proof of convergence to the Nash equilibrium or best response. Simulation results show the effectiveness of the proposed algorithms.

#### **8. References**


56 Frontiers in Advanced Control Systems

The graphical game is implemented as in Theorem 6. Persistence of excitation was ensured by adding a small exponentially decreasing probing noise to the control inputs. Figure 4 shows the critic parameters converging for every agent. Figure 5 shows the synchronization of all the agents to the leader's behavior, as given by the circular Lissajous plot. Simulation results show the effectiveness of the proposed algorithms.

**7. Conclusion**

This chapter brings together cooperative control, reinforcement learning, and game theory to solve multi-player differential games on communication graph topologies. It formulates graphical games for dynamic systems and provides policy iteration and online learning algorithms, along with proofs of convergence to the Nash equilibrium or best response.

### **Neural and Genetic Control Approaches in Process Engineering**

Javier Fernandez de Canete, Pablo del Saz-Orozco, Alfonso Garcia-Cerezo and Inmaculada Garcia-Moral *University of Malaga, Spain*

#### **1. Introduction**

58 Frontiers in Advanced Control Systems


Nowadays, advanced control systems are playing a fundamental role in plant operations because they allow for effective plant management. Typically, advanced control systems rely heavily on real-time process modeling, and this puts strong demands on developing effective process models that, as a prime requirement, have to exhibit real-time responses. Because in many instances detailed process modeling is not viable, efforts have been devoted towards the development of approximate dynamic models.

Approximate process models are based either on first principles, and thus require a good understanding of the process physics, or on some sort of black-box modeling. Neural network modeling represents an effective framework for developing models from incomplete knowledge of the process under examination (Haykin, 2008). Because of the simplicity of neural models, they exhibit great potential in all those model-based control applications that require real-time solutions of dynamic process models. The better understanding acquired of neural network modeling has driven its exploitation in many process engineering applications (Hussain, 1999).

Genetic algorithms (GA) are machine learning methodologies that derive their behavior from a metaphor of the processes of evolution in nature and are able to handle complex nonlinear optimization tasks such as non-convex problems, non-continuous objective functions, etc. (Michalewitz, 1992). They are based on an initial random population of solutions and an iterative procedure that improves the characteristics of the population and produces solutions closer to the global optimum. This is achieved by applying a number of genetic operators to the population in order to produce the next generation of solutions. GAs have been used successfully in combination with neural and fuzzy systems (Fleming & Purhouse, 2002).

Distillation remains the most important separation technique in chemical process industries around the world. Therefore, improved distillation control can have a significant impact on reducing energy consumption, improving product quality and protecting environmental resources. However, both distillation modeling and control are difficult tasks, because the process is usually nonlinear, non-stationary, and interactive, and is subject to constraints and disturbances.

In this scenario, most of the contributions on advanced control schemes that have appeared in the literature have been tested on nonlinear simulation models (Himmelblau, 2008), while applications of advanced control algorithms on industrial or pilot plants (Frattini et al, 2000) (Varshney and Panigrahi, 2005) (Escano et al, 2009), or even of classical control (Noorai et al, 1999) (Tellez-Anguiano et al, 2009), are hardly found.

Composition monitoring and composition control play an essential role in distillation control (Skogestad, 1997). In practice, on-line composition analyzers are rarely used because of their cost and measurement delay; composition is therefore often regulated indirectly using the temperature of a tray close to the product withdrawal location. To achieve this control purpose, many control strategies with different combinations of manipulated variables have been proposed (Skogestad, 2004). If a first-principles model describes the dynamics with sufficient accuracy, a model-based soft sensor can be derived, such as an extended Kalman filter or one of its adaptive versions (Venkateswarlu & Avantika, 2001), while inferential models can also be built when process data are available by developing heuristic models (Zamprogna et al, 2005). Artificial neural networks can be considered, from an engineering viewpoint, as nonlinear heuristic models useful for prediction and data classification, and they have also been used as soft sensors for process control (Bahar et al, 2004).

Nevertheless, few results have been reported on the composition control of experimental distillation columns; some results are obtained either by applying direct temperature control (Marchetti et al, 1985), by using the vapor-liquid equilibrium to estimate composition from temperature (Fileti et al, 2007), or by using chromatographs (Fieg, 2002).

In this chapter we describe the application of adaptive neural networks to the estimation of the product compositions in a binary methanol-water continuous distillation column from available temperature measurements. This software sensor is then applied to train a neural network model, so that a GA can perform the search for the optimal dual control law applied to the distillation column. The performance of the developed neural network estimator is further tested by observing the performance of the neural network control system designed for both set-point tracking and disturbance rejection cases.

### **2. Neural networks and genetic algorithms for control**

#### **2.1 Neural networks for identification**

Neural networks offer an alternative approach to modelling process behaviour, as they do not require a priori knowledge of the process phenomena. They learn by extracting pre-existing patterns from a data set that describes the relationship between the inputs and the outputs in any given process phenomenon. When appropriate inputs are applied to the network, the network acquires knowledge from the environment in a process known as learning. As a result, the network assimilates information that can be recalled later. Neural networks are capable of handling complex and nonlinear problems, can process information rapidly, and can reduce the engineering effort required in controller model development (Basheer & Hajmeer, 2000).

Neural networks come in a variety of types, each with distinct architectural features and reasons for its usage. The type of neural network used in this work is known as a feedforward network (Fig. 1) and has been found effective in many applications.


It has been shown that a continuous-valued neural network with a continuously differentiable nonlinear transfer function can approximate any continuous function arbitrarily well on a compact set (Cybenko, 1989).

Fig. 1. Feedforward neural network architecture
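A minimal sketch of the forward pass for an architecture like that of Fig. 1, assuming one hidden layer with a tanh (sigmoidal) activation and a linear output layer; the layer sizes and the random weights are illustrative choices, not those of the network used in this work:

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer feedforward network (cf. Fig. 1)."""
    h = np.tanh(W1 @ x + b1)    # hidden layer with sigmoidal activation
    return W2 @ h + b2          # linear output layer

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 3)), np.zeros(5)   # 3 inputs -> 5 hidden units
W2, b2 = rng.normal(size=(2, 5)), np.zeros(2)   # 5 hidden -> 2 outputs
y = forward(np.array([0.1, -0.2, 0.3]), W1, b1, W2, b2)
```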

There are several different approaches to neural network training, the process of determining an appropriate set of weights. Historically, training has been performed with the backpropagation algorithm, but in practice a number of simple improvements have been used to speed up convergence and to improve its robustness (Hagan & Menhaj, 1994). The learning rule used here is a standard nonlinear optimization, or least-squares, technique. The entire set of weights is adjusted at once instead of sequentially from the output layer to the input layer. The weight adjustment is done at the end of each epoch, and the sum of squared errors over all patterns is used as the objective function for the optimization problem.
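As a concrete sketch of this batch least-squares adjustment (the Levenberg-Marquardt rule developed next), the following toy example fits the two parameters of y = a*exp(b*t) to synthetic data; the model, the data and all numerical settings are hypothetical stand-ins for the network weights and training patterns:

```python
import numpy as np

def residuals_and_jacobian(x, t, y):
    """Residuals e(x) and Jacobian J(x) for the toy model y ~ a*exp(b*t)."""
    a, b = x
    e = a * np.exp(b * t) - y
    J = np.column_stack([np.exp(b * t),             # de/da
                         a * t * np.exp(b * t)])    # de/db
    return e, J

def levenberg_marquardt(x, t, y, mu=1e-3, iters=50):
    """x <- x - (J^T J + mu I)^{-1} J^T e, decreasing mu on success
    (Gauss-Newton-like) and increasing it on failure (gradient-descent-like)."""
    for _ in range(iters):
        e, J = residuals_and_jacobian(x, t, y)
        x_new = x - np.linalg.solve(J.T @ J + mu * np.eye(x.size), J.T @ e)
        e_new, _ = residuals_and_jacobian(x_new, t, y)
        if e_new @ e_new < e @ e:   # error decreased: accept the step
            x, mu = x_new, mu * 0.1
        else:                       # error increased: reject and damp more
            mu *= 10.0
    return x

t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(1.5 * t)                       # synthetic noise-free targets
x = levenberg_marquardt(np.array([1.0, 1.0]), t, y)
```

The adaptive damping term mirrors the role of the learning coefficient described in the text: small damping behaves like Gauss-Newton, large damping like gradient descent.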

In particular we have employed the Levenberg-Marquardt algorithm to train the neural network (Singh et al, 2007), which is a variation of Newton's method designed for minimizing functions that are sums of squares of other nonlinear functions. Newton's method for optimizing a performance index $F(\mathbf{x})$ is given by

$$\mathbf{x}\_{k+1} = \mathbf{x}\_k - A\_k^{-1} g\_k \tag{1}$$

where $A\_k = \nabla^2 F(\mathbf{x})|\_{\mathbf{x}=\mathbf{x}\_k}$ and $g\_k = \nabla F(\mathbf{x})|\_{\mathbf{x}=\mathbf{x}\_k}$ are the Hessian and the gradient of $F(\mathbf{x})$, respectively, and where $\mathbf{x}\_k$ is the set of net parameters at time *k*. In cases where $F(\mathbf{x})$ is the sum of the squared errors $e\_l(\mathbf{x})$ over the $Q$ targets in the training set,

$$F(\mathbf{x}) = \sum\_{l=1}^{Q} e\_l^2(\mathbf{x}) = \mathbf{e}^T(\mathbf{x})\mathbf{e}(\mathbf{x}) \tag{2}$$

then the gradient is given by

$$\nabla F(\mathbf{x}) = 2 J^T(\mathbf{x})\mathbf{e}(\mathbf{x}) \tag{3}$$

where $J(\mathbf{x})$ is the Jacobian matrix formed by the elements $\partial e\_l(\mathbf{x}) / \partial x\_j$. On the other hand, the Hessian can be approximated by

$$\nabla^2 F(\mathbf{x}) \cong 2 J^T(\mathbf{x}) J(\mathbf{x}) \tag{4}$$

Then, substituting (3) and (4) into (1) results in the Gauss-Newton method

$$\mathbf{x}\_{k+1} = \mathbf{x}\_k - \left[J^T(\mathbf{x}\_k) J(\mathbf{x}\_k)\right]^{-1} J^T(\mathbf{x}\_k)\mathbf{e}(\mathbf{x}\_k) \tag{5}$$

Adding a constant term $\mu\_k I$ to $J^T(\mathbf{x}\_k) J(\mathbf{x}\_k)$ leads to the Levenberg-Marquardt training rule, so that

$$\mathbf{x}\_{k+1} = \mathbf{x}\_k - \left[J^T(\mathbf{x}\_k) J(\mathbf{x}\_k) + \mu\_k I\right]^{-1} J^T(\mathbf{x}\_k)\mathbf{e}(\mathbf{x}\_k) \tag{6}$$

where $\mu\_k$ is the learning coefficient, which is set to a small value at the beginning of the training procedure ($\mu\_k$ = 1e-03) and is increased (decreased) by a constant factor according to the increase (decrease) of $F(\mathbf{x})$, in order to provide faster convergence. In fact, when $\mu\_k$ is set to a small value the Levenberg-Marquardt algorithm approaches the Gauss-Newton method; otherwise it behaves as a gradient descent technique. The neural network was configured to stop training after the mean squared error went below 0.05, the minimum gradient went below 1e-10, or the maximum number of epochs was reached (normally a high number is selected so that this is a non-limiting condition).

The identification of the neural network model occurred via a dynamic structure constituted by a feedforward neural network representing the nonlinear relationship between the input and output signals of the system to be modelled. The application of feedforward networks to dynamic systems modelling requires the use of external delay lines involving both the input and output signals (Norgaard et al, 2000).

The network input vector dimension was associated with the time window length selected for each input variable, which was dependent on the distillation column dynamics and is usually chosen according to the expertise of process engineers (Basheer & Hajmeer, 2000). The hidden layer dimension was defined by a trial and error procedure after selecting the input vector, while the net's output vector dimension directly resulted from the selected controlled variables.

Therefore, the neural network identification model $NN\_I$, after selecting the optimal input vector, was given by

$$\hat{\mathbf{x}}(t+1) = NN\_I(\mathbf{x}(t), \mathbf{u}(t)) \tag{7}$$

where $\hat{\mathbf{x}}(t+1)$ stands for the predicted value of the neural network corresponding to the actual net input vector $\mathbf{u}(t)$ and the state vector $\mathbf{x}(t)$.

The resulting identification model was obtained by selecting the best neural network structure among the possible ones after a training process. Finally, a neural network validation process was performed by comparing the network output with additional data that were not included in the training data (validation set).

#### **2.2 Genetic algorithms for optimization and control**

Genetic Algorithms are adaptive methods which can be used to solve optimization problems. They are based on the genetic processes of biological organisms. Over many generations, natural populations evolve according to the principles of natural selection and

survival of the fittest. In nature, individuals with the highest survival rate have relatively a large number of offspring, that is, the genes from the highly adapted or fit individuals spread to an increasing number of individuals in each successive generation. The strong characteristics from different ancestors can sometimes produce super-fit offspring, whose fitness is greater than that of either parent. In this way, species evolve to become better suited to their environment in an iterative way by following selection, recombination and mutation processes starting from an initial population.
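The selection, recombination and mutation cycle described above can be sketched as a minimal real-coded GA; the operator choices (tournament selection, blend crossover, Gaussian mutation), the rates and the quadratic test function are illustrative assumptions, not the exact configuration used in this work:

```python
import random

def genetic_minimize(f, bounds, pop_size=40, generations=60,
                     mutation_rate=0.1, seed=1):
    """Minimal real-coded GA: tournament selection, blend crossover and
    Gaussian mutation, iterated from a random initial population."""
    rng = random.Random(seed)

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if f(a) < f(b) else b  # the fitter of two random individuals

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            p, q = tournament(pop), tournament(pop)
            alpha = rng.random()
            child = [alpha * g + (1 - alpha) * h for g, h in zip(p, q)]  # crossover
            for j, (lo, hi) in enumerate(bounds):                        # mutation
                if rng.random() < mutation_rate:
                    child[j] += rng.gauss(0.0, 0.1)
                child[j] = min(max(child[j], lo), hi)
            offspring.append(child)
        pop = offspring
    return min(pop, key=f)

# Toy objective with its global minimum at (1, -2)
best = genetic_minimize(lambda v: (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2,
                        bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

Over successive generations the population concentrates near the optimum, mirroring how fit individuals spread their genes in the biological metaphor.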

The control scheme proposed here is based on the different strengths that neural networks and genetic algorithms present. One of the most valuable characteristics of neural networks is their capability for identification and generalization, while genetic algorithms excel at optimizing functions.

If an accurate identification model is available, the controller can use the information it provides to select the optimum input that drives the system as close as possible to the goal to be achieved. So one of the main differences between this controller and others is the way it selects the inputs to the system.

Fig. 2. Genetic Algorithm Structure

In this way, the function to minimize in each step is the absolute value of the difference between the predicted output (by means of the neural identification network) and the reference. This difference usually depends on known variables, such as past states of the system and past inputs, and on unknown variables, namely the current inputs to apply. Those inputs will be obtained from the genetic algorithm.

#### **2.3 Neural networks for estimation**

The most popular sensors used in process control are those that measure temperature, pressure and fluid level, owing to their high accuracy, fast response and low cost. On the other hand, some of the most frequently controlled variables, such as composition, present great difficulties in the measurement phase because it must be done off-line in a laboratory, involving both a long delay time and an extra cost due to the use of expensive equipment requiring a high initial investment and maintenance, as occurs with chromatography.

Composition control is crucial in order to achieve the final product specifications during the distillation process. The use of sensors able to infer composition values from secondary variables (values easier to measure) could be a solution to overcome the aforementioned drawbacks, this approach being defined as a software sensor (Brosilow & Joseph, 2002).

In this way, an inferential system has been developed to achieve on-line composition control. As the value of the controlled variable is inferred from other secondary variables, the model should be very accurate, mainly in the operating region. An inferential system based on the first-principles model approach presents the drawback of increasing computing time as the number of variables increases.

A black-box model approach relating the plant outputs to the corresponding sampled inputs has been used instead. Neural networks have proven to be universal approximators (Haykin, 2008), so they will be used to infer the composition from other secondary variables, thus defining the neural soft estimator.

One of the main difficulties in determining the complete structure of the neural estimator is the choice of the secondary variables to be used (both their nature and their location), selected among the ones provided by the set of sensors installed on the experimental pilot plant. Several papers in the literature are dedicated to the selection of variables for composition estimation, and no consensus is reached in terms of the number or position of the secondary sensors (here position is understood as the stage or plate where the variable is measured). In (Quintero-Marmol et al, 1991), the number that assures robust performance is *Nc* + 2, where *Nc* is the number of components. With respect to the location of the most sensitive trays, (Luyben, 2006) develops a very exhaustive study and concludes that the optimal position depends heavily on the plant and on the feed tray. In this way, the neural estimator should take as input the optimum combination of selected secondary variables to determine the product composition accurately.

In order to select the most suitable secondary variables for our control purposes, a multivariate statistical technique based on the principal component analysis (PCA) methodology (Jackson, 1991) has been used, following the same approach described by (Zamprogna et al, 2005). The resulting neural network estimator $NN\_E$ is given by

$$\hat{\mathbf{x}}\_p(t) = NN\_E(\mathbf{x}\_s(t)) \tag{8}$$

where $\hat{\mathbf{x}}\_p(t)$ and $\mathbf{x}\_s(t)$ stand for the estimated primary variables and the selected secondary variables, respectively.
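A simplified sketch of this PCA-based selection of secondary variables, assuming a matrix of candidate sensor readings (e.g. tray temperatures); the 95 % variance threshold, the loading-based ranking and the simulated data are illustrative choices, not the detailed procedure of (Zamprogna et al, 2005):

```python
import numpy as np

def select_secondary_variables(X, n_keep):
    """Rank candidate secondary measurements (columns of X) by their
    loadings on the dominant principal components."""
    Xc = X - X.mean(axis=0)
    Xc = Xc / Xc.std(axis=0)                  # autoscale each sensor
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    expl = s**2 / np.sum(s**2)                # variance explained per PC
    n_pc = int(np.searchsorted(np.cumsum(expl), 0.95) + 1)  # keep 95 % variance
    scores = np.abs(Vt[:n_pc]).T @ expl[:n_pc]   # weighted loading magnitude
    return np.argsort(scores)[::-1][:n_keep]     # best-ranked sensor indices

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))                   # two hidden composition modes
mix = rng.normal(size=(2, 8))
X = latent @ mix + 0.01 * rng.normal(size=(200, 8))  # 8 simulated temperature sensors
idx = select_secondary_variables(X, n_keep=3)
```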

#### **2.4 Neurogenetic control structure**

As an accurate neural network model relating the past states, the current states, and the current control inputs to the future outputs is available, the future output of the system can be predicted as a nonlinear function of the control inputs. In this way, the function to be minimized in each step is a cost function related to the absolute


value of the difference between the predicted output and the desired reference to follow. This difference usually depends on known variables, such as past inputs and past states of the system, and on unknown variables, namely the current control inputs to apply, which will be obtained from the genetic algorithm.

In this way, the optimization problem for controlling the distillation plant can be stated as that of finding the input which minimizes the norm, weighted by a matrix, of the difference between the reference command to follow and the neural network model output, given the past and current states of the system. This procedure can be stated as $\min\_{u \in U} \| W \cdot (r - NN\_I(u, x)) \|$, with $r$ representing the reference command to follow, $NN\_I(u, x)$ the neural network model output, $x$ the past and current states of the system, $u$ the control action, $U$ the universe of possible control actions and $W$ a weighting matrix.

In the present case, the reference command $r$ is given by the desired composition variables together with the desired level variables, while $u$ represents the optimum neurogenetic control action, and the weighting matrix penalizes errors in composition twice as much as errors in level, since composition control is more difficult to achieve than level control. Fig. 3 shows the neurogenetic control strategy used here, together with the neural composition estimator.
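One control step of this scheme can be sketched as follows, with a hypothetical linear `toy_model` standing in for the trained neural network NN_I; the GA settings, the input bounds and the 2:1 composition/level weighting follow the description above, but all numbers are illustrative:

```python
import random

def neurogenetic_step(model, x, ref, weights, bounds,
                      pop_size=60, generations=40, seed=0):
    """One neurogenetic control step: a GA searches the admissible set of
    control actions for the u minimizing ||W (ref - model(x, u))||."""
    rng = random.Random(seed)

    def cost(u):
        y = model(x, u)
        return sum((w * (r - yi)) ** 2
                   for w, r, yi in zip(weights, ref, y)) ** 0.5

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[:pop_size // 2]        # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            a = rng.random()
            children.append([min(max(a * g + (1 - a) * h
                                     + rng.gauss(0.0, 0.02), lo), hi)
                             for g, h, (lo, hi) in zip(p, q, bounds)])
        pop = parents + children
    return min(pop, key=cost)

# Hypothetical 2-input/2-output linear plant standing in for the trained net:
toy_model = lambda x, u: [x[0] + 0.8 * u[0] - 0.2 * u[1],
                          x[1] - 0.3 * u[0] + 0.9 * u[1]]
u_opt = neurogenetic_step(toy_model, x=[0.4, 0.6], ref=[0.95, 0.10],
                          weights=[2.0, 1.0],  # composition errors weighted twice
                          bounds=[(0.0, 1.0), (0.0, 1.0)])
```

At each sampling instant the GA is rerun with the updated state, so the optimization is repeated on-line rather than solved once off-line.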

Fig. 3. Neural Estimation and Neurogenetic Control Structure

#### **3. Application to a pilot distillation column**

#### **3.1 Description of the pilot distillation column**

The pilot distillation column DELTALAB is composed of 9 plates, one condenser, and one boiler (Fig. 4). The instrumentation equipment consists of 12 Pt 100 RTD temperature sensors (T1-T12), 3 flow meters (FI1-FI3), 2 level sensors (LT1-LT2) and 1 differential pressure meter (PD), together with 3 pneumatic valves (LIC1-LIC2-TIC2) and a heating thermo-coil (TIC1), with up to four control loops for plant operation. Additionally, feed temperature and coolant flow control are included, with the corresponding valve (FIC1) and heating resistance (PDC1), both variables being considered as disturbances.

Fig. 4. Pilot distillation plant configuration

The condenser provides the necessary cooling to condense the distilled product, containing cooling water provided by an external pump. The flow of the cooling liquid is regulated through a pneumatic valve with a flow controller, which ultimately depends on the variable water flow supply. Two temperature sensors measure the temperature of the inlet and outlet flows.

Once the top stream is condensed, the liquid is stored in an intermediate reflux drum, endowed with a level meter, a temperature sensor and a recirculation pump for the reflux stream. The reflux-to-distillate ratio is controlled by 2 proportional pneumatic valves, for reflux and distillate respectively, each flow being measured through the corresponding flow meter with display.


The main body of the distillation column is composed of 9 bubble cap plates distributed into 3 sections. Two of them are connected to the feeding device and can function either as feeding or as normal plates, each one selected through a manual valve. Four temperature sensors measure the temperature at each section junction.

The boiler provides the required heat to the distillation column by actuating an electric heating thermo-coil located inside the boiler. A temperature sensor is located inside the boiler, and a level meter measures the liquid stored in an intermediate bottom drum. A differential-pressure sensor indicates the pressure changes throughout the column, which is operated at atmospheric pressure. The bottom flow is controlled by a proportional pneumatic valve, and two temperature sensors measure the temperature of the inlet and outlet flows before cooling, with corresponding flow meters with display.

The feeding ethanol-water mixture is stored in a deposit, whose temperature is controlled by a pre-heating electric thermo-coil. The mixture to be distilled is fed into the column in small doses by a feeding pump with temperature controller (TIC3) and sensors installed to measure the temperature of the inlet and outlet feed flows.

The whole instrumentation of the distillation pilot plant is monitored under the LabVIEW platform and is connected to the neural-based controller designed under the MATLAB platform, through a communication system based on both PCI and USB buses, with up to four control loops. In this experimental set-up, the boiler heat flow *QB*, reflux valve opening *VR*, distillate valve opening *VD* and bottom valve opening *VB* constitute the set of manipulated variables, while the light composition *CD*, bottom composition *CB*, light product level *LD* and heavy product level *LB* define the corresponding set of controlled variables (Fig. 5); the feed flow temperature *TF* is considered as a disturbance.

Fig. 5. Pilot distillation plant configuration

It is important to highlight that a dynamical model has not been derived to represent the pilot column behavior; instead, we have made use of an approximate neural network model to identify the plant dynamics, starting from selected I/O plant operation data.

#### **3.2 Monitoring and control interface system**

The monitoring and control interface system requires a communication system between the sensors and actuators on the one hand and the computer on the other, through I/O modules whose specifications are set by the characteristics of the instrumentation utilized (Tables 1 and 2).

USB and PCI buses have been chosen to manage the I/O signals. On the one hand, the PCI bus enables dynamic configuration of peripheral equipment: during operating system startup, the devices connected to PCI buses communicate with the BIOS and the required resources are computed for each one. On the other hand, the USB bus brings a substantial improvement in 'plug and play' operation, its main objective being to remove the need to acquire different boards for computer ports. In addition, good performance is achieved for the set of devices integrated into the instrumentation system, which can be connected without opening the system.


Table 1. Sensors characteristics for the pilot distillation column

The acquisition system configuration for the monitoring and control of the pilot plant is constituted by the following set of DAQ (Data Acquisition) boards: NI PCI-6220, NI-PCI-6722, NI-USB-6009 and NI-USB-6210 for analog voltage signal acquisition, and NI-PCI-6704 for analog current signal acquisition, all supplied by National Instruments (NI). Measurements obtained from the sensors have been conditioned to operate within the standard operational range, and signal averaging for noise cancelation has been applied using specific LabVIEW toolkits (Bishop, 2004).
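The signal-averaging step can be pictured as a simple moving average over the most recent raw samples; the actual filtering is done by the LabVIEW toolkits, and the window length below is an assumption:

```python
import numpy as np

# Illustrative moving-average filter for noise cancelation on raw sensor
# samples (window length is an assumed parameter, not from the chapter).
def moving_average(samples, window=8):
    kernel = np.ones(window) / window
    return np.convolve(samples, kernel, mode="valid")

# Toy noisy sensor trace around a value of 1.0
raw = np.array([1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.0])
smooth = moving_average(raw, window=4)   # each output averages 4 raw samples
```

Each smoothed point averages the last `window` raw readings, trading a short delay for reduced measurement noise.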

The monitoring and control interface system developed for the pilot plant is configured through the interconnection of the NI data acquisition system with both the LabVIEW monitoring subsystem and the neurogenetic controller implemented in MATLAB (Fig. 6); the two environments are linked through MathScript and run on an Intel Core Duo at 2.49 GHz with 3 GB of RAM.

68 Frontiers in Advanced Control Systems



Table 2. Actuators characteristics for the pilot distillation column

Fig. 6. Monitoring and control interface for pilot distillation plant

The process control scheme developed in each operation cycle implies the execution of five different actions: system initializing, buttons control reading from VI (Virtual Instruments), reading plant data from instruments, control action calculation and writing control data to instruments.
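As a minimal sketch, the five actions of this cycle can be arranged in a loop; the hardware I/O calls below (`initialize`, `read_buttons`, `read_plant`, `write_actuators`) are hypothetical stand-ins for the LabVIEW VI and DAQ interfaces, not the actual API:

```python
# Skeleton of the five-action operation cycle described in the text.
def control_cycle(controller, io, n_steps):
    io.initialize()                     # 1. system initializing
    for _ in range(n_steps):
        buttons = io.read_buttons()     # 2. buttons control reading from VI
        y = io.read_plant()             # 3. reading plant data from instruments
        u = controller(y, buttons)      # 4. control action calculation
        io.write_actuators(u)           # 5. writing control data to instruments

class _FakeIO:
    """Minimal stand-in used here only to exercise the loop."""
    def __init__(self):
        self.log = []
    def initialize(self):
        self.log.append("init")
    def read_buttons(self):
        return {"run": True}
    def read_plant(self):
        return [0.0, 0.0]
    def write_actuators(self, u):
        self.log.append(u)

io = _FakeIO()
control_cycle(lambda y, b: [0.5], io, n_steps=3)
```

In the real system the loop period is fixed by the acquisition sampling, and the controller call is the neurogenetic computation described below.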

#### **3.3 Neural composition estimator and neurogenetic controller**

The complete controlled system is composed of a neural network model of the process and a control scheme based on a genetic algorithm which utilizes both the composition and the level variables to get the quasi-optimal control law, by using the neural composition estimator (Fig. 3) for both determining and monitoring the composition of light and heavy components from secondary variable measurements.

After applying the selection method, the inputs to the neural estimation network turned out to be four secondary variables, namely, three temperatures *T6*, *T5*, *T2* (reflux, top and bottom temperatures, respectively) and the differential pressure drop *DPD*, while the *CD* and *CB*

compositions were the net outputs. This structure is in line with what the literature suggests (Quintero-Marmol et al., 1991; Zamprogna et al., 2005) in terms of both the number of selected measurements and their distribution. This contrasts with the standard approach of selecting two temperatures to estimate two compositions (Mejdell and Skogestad, 1991; Strandberg and Skogestad, 2006). That approach, however, is not feasible when the vapor-liquid equilibrium is strongly nonlinear (Baratti et al., 1998; Oisiovici and Cruz, 2001), since holding the temperature constant then does not imply that the composition will also be constant (Rueda et al., 2006).
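The selection idea can be illustrated with a small sketch that ranks candidate sensors by their loadings on the dominant principal components of operating data; the data matrix, the number of retained components and the "keep four" choice are illustrative assumptions, not the chapter's exact algorithm:

```python
import numpy as np

# PCA-based ranking of candidate secondary measurements (illustrative).
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 10))          # 500 samples x 10 candidate sensors
Xc = X - X.mean(axis=0)                     # center each sensor column

# Principal directions from the SVD of the centered data matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 2                                       # dominant components retained
# Score each sensor by its singular-value-weighted loading magnitude
scores = (np.abs(Vt[:k].T) * s[:k]).sum(axis=1)
selected = np.argsort(scores)[::-1][:4]     # indices of the 4 best-loaded sensors
```

Sensors with the largest loadings carry the most variance of the operating data and are the natural candidates for the estimator inputs.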

The final network structure selected for the neural composition estimator was a 4-25-2 net, trained with the Levenberg-Marquardt algorithm (Hagan et al., 2002), with the hidden layer size selected by trial and error and the input layer determined by the PCA-based algorithm for the selection of secondary variables exposed previously.
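For illustration, the forward pass of such a 4-25-2 net can be sketched as follows; the weights are random placeholders rather than the trained values, and the tanh hidden activation is an assumption:

```python
import numpy as np

# Forward pass of a 4-25-2 feedforward net like the composition estimator.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((25, 4)), np.zeros(25)   # input -> hidden
W2, b2 = rng.standard_normal((2, 25)), np.zeros(2)    # hidden -> output

def estimate_compositions(x):
    # x = [T6, T5, T2, DPD] secondary measurements; returns [CD, CB] estimates.
    h = np.tanh(W1 @ x + b1)        # assumed sigmoidal hidden layer
    return W2 @ h + b2              # linear output layer

y = estimate_compositions(np.array([0.1, 0.2, 0.3, 0.4]))
```

Levenberg-Marquardt training would adjust `W1, b1, W2, b2` to minimize the squared error against the chromatograph reference compositions.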

The training data set used herein consisted of 700 points collected randomly from a whole data set of more than 27000 acquired points, all obtained from several experiments carried out with the pilot distillation column covering the whole range of operation. A different subset of 700 points has also been used for validation. For this purpose we analyzed several samples of the ethanol-water mixture during the separation process using a VARIANT flash chromatograph, and the mean composition error obtained was lower than 1.5%.

The final network structure selected for the neural plant model was a 22-25-6 feedforward architecture, trained with the Levenberg-Marquardt algorithm and validated against the set of I/O experimental data. The hidden layer size was selected with the algorithm stated in the previous section, this time using delayed values of *VR*, *VD*, *VB*, *QB*, *T2*, *T5*, *T6*, *TF*, *LD*, *LB* and *DPD* as inputs, while *T2*, *T5*, *T6*, *DPD*, *LD* and *LB* were the estimated outputs. The net was trained with a different subset of 750 points selected randomly from the whole data set of 27000 points acquired with sampling period T = 2 s, obtained both by using a PID analog control module, changing set-points for each controlled variable within its operating range, and by working in open-loop conditions. The net was also validated with another subset of 750 points by comparing its outputs to the real system's outputs in independent experiments.
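The delayed-value inputs can be assembled into regressor vectors as sketched below; the one-step delay structure and the toy data are assumptions, since the chapter does not state the exact number of delays:

```python
import numpy as np

# Build delayed-value regressors for a plant model such as the 22-25-6 net:
# each row stacks the current sample with its delayed copies.
def build_regressors(signals, delay=1):
    # signals: (T, m) array of sampled I/O variables (T = 2 s sampling).
    T = signals.shape[0]
    rows = [np.hstack([signals[t - d] for d in range(delay + 1)])
            for t in range(delay, T)]
    return np.asarray(rows)      # each row: [x(t), x(t-1), ...]

data = np.arange(20.0).reshape(10, 2)   # 10 samples of 2 signals (toy data)
R = build_regressors(data, delay=1)     # 9 rows of 4 values each
```

With 11 signals and one delay each, such stacking yields the 22-dimensional input vector of the plant model.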

The neurogenetic controller is characterized by a population of 75 individuals, 50 generations and an 8-bit codification. The maximum is accepted if it is invariant for 5 iterations. All these parameters were tuned to achieve a response time lower than 1.3 seconds on the computational system used for controlling the experimental distillation plant.
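A minimal GA loop with these parameters might look as follows; the fitness function is a toy placeholder standing in for the neurogenetic cost, and the selection, crossover and mutation details are assumptions:

```python
import random

# GA with the chapter's parameters: 75 individuals, up to 50 generations,
# 8-bit codification, stop when the best is invariant for 5 iterations.
random.seed(0)
POP, GENS, BITS, PATIENCE = 75, 50, 8, 5

def fitness(chrom):                      # toy cost: prefer large decoded values
    return int(chrom, 2) / (2**BITS - 1)

pop = [format(random.getrandbits(BITS), f"0{BITS}b") for _ in range(POP)]
best, stable = max(pop, key=fitness), 0
for _ in range(GENS):
    # tournament selection, one-point crossover, bit-flip mutation
    parents = [max(random.sample(pop, 3), key=fitness) for _ in range(POP)]
    cut = random.randrange(1, BITS)
    pop = [a[:cut] + b[cut:] for a, b in zip(parents, reversed(parents))]
    pop = ["".join(c if random.random() > 0.02 else str(1 - int(c)) for c in s)
           for s in pop]
    new_best = max(pop + [best], key=fitness)
    stable = stable + 1 if new_best == best else 0
    best = new_best
    if stable >= PATIENCE:               # maximum invariant for 5 iterations
        break
```

In the controller, each chromosome would encode a candidate control action evaluated through the neural plant model over the prediction horizon.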

#### **3.4 Results**

In order to test the validity of the proposed control scheme, the performance of the neurogenetic control strategy is compared against a PID control strategy using four decoupled PID controllers relating the manipulated variables *VR*, *QB*, *VD* and *VB* to the corresponding controlled variables *CD*, *CB*, *LD* and *LB*. For a proper comparison, the PID approach must control the same variables, so that composition is indirectly controlled by following the standard LV configuration (Skogestad, 1997). The PID parameter set for each controlled variable has been heuristically tuned from the analog PID values set by the DELTALAB field expert when the pilot column was supplied.

Several changes in composition set points on top and bottom purity have been made to test the neurogenetic controller performance (Fig. 7). As shown, the system is able to reach


the required references in composition, though it is somewhat slow in its response. The response obtained with the PID approach presents a larger settling time and overshoot and a poorer response to target changes in the coupled variables. In fact, the ISE (integral square error), which characterizes the accuracy of both control schemes during tracking of reference commands, is significantly lower for the neurogenetic control than for the PID control in both controlled variables, with *ISE*PID = 4719.9 (%)<sup>2</sup>·s versus *ISE*NG = 3687.2 (%)<sup>2</sup>·s for top composition, and *ISE*PID = 2427.6 (%)<sup>2</sup>·s versus *ISE*NG = 2071.8 (%)<sup>2</sup>·s for bottom composition. These facts imply a better performance even under changing conditions (variable feed changes), due to the adaptive nature of the neurogenetic controller.
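For reference, the ISE figure of merit can be approximated from sampled tracking errors as ISE ≈ Σ e(k)² · T; the error sequence below is illustrative, with T = 2 s matching the acquisition period:

```python
import numpy as np

# Discrete approximation of ISE = integral of e(t)^2 dt on toy error samples.
T = 2.0                                          # sampling period in seconds
e = np.array([5.0, 3.0, 1.5, 0.5, 0.1, 0.0])    # composition error in %
ise = np.sum(e**2) * T                           # units: (%)^2 * s
```

A faster, less oscillatory response drives the error samples to zero sooner, which is exactly what the lower neurogenetic ISE values reflect.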

Fig. 8 displays the changes in the control actions *VR*, *VP*, *VH* (in % of opening) and *QB* (in % of maximum power) corresponding to the set point changes on top and bottom composition described above for the neurogenetic control scheme. It must be emphasized that all control signals stay within the operating range with minimal saturation effects, mainly due to the mild conditions imposed on the time response profile during the neurogenetic design.

Fig. 7. Response of top and bottom composition for set point changes in ethanol purity in (a) 60-70 % range on top (b) 5-12 % range on bottom for pilot distillation column under decoupled PID and neurogenetic control

Fig. 8. Control actions *VR*, *VP*, *VH* and *QB* (a)-(d) for set point changes in ethanol purity in the 60-70 % range on top and the 5-12 % range on bottom for the pilot distillation column under neurogenetic control.

#### **4. Conclusions**

Adaptive neural networks have been applied to the estimation of product composition from on-line secondary variable measurements, with the optimal net input vector for the estimator selected by a PCA-based algorithm. Genetic algorithms have been used to derive the optimum control law under MATLAB, based both on the neural network model of the pilot column and on the composition estimate. This neurogenetic approach has been applied to the dual control of distillate and bottom composition for a continuous, nonlinear ethanol-water pilot distillation column monitored under LabVIEW.

The proposed method performs as well as or better than other methods such as fuzzy or adaptive control, while using a simpler design based exclusively on knowledge of the pilot distillation column in the form of I/O operational data. It is also worth highlighting the potential benefits of artificial neural networks combined with GAs when applied to the multivariable control of nonlinear plants with unknown first-principles models under an experimental set-up, as demonstrated with the distillation pilot plant.

Future work is directed toward the application of this methodology to industrial plants, and toward the analysis of stability and robustness under the uncertainty generated by neural network identification errors when the plant is approximated.

#### **5. References**



Luyben, W.L. (2006). Evaluation of criteria for selecting temperature control trays in distillation columns. *Journal of Process Control*, Vol. 16 (2006), pp. 115-134.

Marchetti, J.L.; Benallou, A.; Seborg, D.E. & Mellichamp, D.A. (1985). A pilot-scale distillation facility for digital computer control research. *Computers and Chemical Engineering*, Vol. 9, No. 3 (1985), pp. 301-309.

Mejdell, T. & Skogestad, S. (1991). Composition estimator in a pilot plant distillation column using multiple temperatures. *Industrial Engineering Chemical Research*, Vol. 30 (1991), pp. 2555-2564.

Michalewitz, Z. (1992). *Genetic Algorithms + Data Structures = Evolution Programs*. Springer, Berlin, Germany.

Nooraii, A.; Romagnoli, J.A. & Figueroa, J. (1999). Process identification, uncertainty characterization and robustness analysis of a pilot scale distillation column. *Journal of Process Control*, Vol. 9 (1999), pp. 247-264.

Norgaard, O.; Ravn, N.; Poulsen, K. & Hansen, L.K. (2000). *Neural networks for modeling and control of dynamic systems*, Springer Verlag.

Osiovici, R. & Cruz, S.L. (2001). Inferential control of high-purity multicomponent batch distillation columns using an extended Kalman filter. *Industrial Engineering Chemical Research*, Vol. 40 (2001), pp. 2628-2639.

Quintero-Marmol, E.; Luyben, W.L. & Georgakis, C. (1991). Application of an extended Luenberger observer to the control of multicomponent batch distillation. *Industrial Engineering Chemical Research*, Vol. 30 (1991), pp. 1870-1880.

Rueda, L.M.; Edgar, T.F. & Eldridge, R.B. (2006). A novel control methodology for a pilot plant azeotropic distillation column. *Industrial Engineering Chemical Research*, Vol. 45 (2006), pp. 8361-8372.

Singh, V.; Gupta, I. & Gupta, H.O. (2007). ANN-based estimator for distillation using Levenberg-Marquardt approach. *Engineering Applications of Artificial Intelligence*, Vol. 20 (2007), pp. 249-259.

Skogestad, S. (1997). Dynamics and control of distillation columns. A tutorial introduction. *Transactions on Industrial Chemical Engineering*, Vol. 75(A) (1997), pp. 539-562.

Skogestad, S. (2004). Control structure design for complete chemical plants. *Computers and Chemical Engineering*, Vol. 28 (2004), pp. 219-234.

Strandberg, J. & Skogestad, S. (2006). Stabilizing operation of a 4-product integrated Kaibel column. *Proceedings of IFAC Symposium on Advanced Control of Chemical Processes*, pp. 623-628, Gramado, Brazil, 2-5 April 2006.

Tellez-Anguiano, A.; Rivas-Cruz, F.; Astorga-Zaragoza, C.M.; Alcorta-Garcia, E. & Juarez-Romero, D. (2009). Process control interface system for a distillation plant. *Computer Standards and Interfaces*, Vol. 31 (2009), pp. 471-479.

Varshney, K. & Panigrahi, P.K. (2005). Artificial neural network control of a heat exchanger in a closed flow air circuit. *Applied Soft Computing*, Vol. 5 (2005), pp. 441-465.

Venkateswarlu, C. & Avantika, S. (2001). Optimal state estimation of multicomponent batch distillation. *Chemical Engineering Science*, Vol. 56 (2001), pp. 5771-5786.

Zamprogna, E.; Barolo, M. & Seborg, D.E. (2005). Optimal selection of soft sensor inputs for batch distillation columns using principal component analysis. *Journal of Process Control*, Vol. 15 (2005), pp. 39-52.

## **New Techniques for Optimizing the Norm of Robust Controllers of Polytopic Uncertain Linear Systems**

L. F. S. Buzachero, E. Assunção, M. C. M. Teixeira and E. R. P. da Silva *FEIS - School of Electrical Engineering, UNESP, 15385-000, Ilha Solteira, São Paulo, Brazil*

#### **1. Introduction**

The history of linear matrix inequalities (LMIs) in the analysis of dynamical systems goes back more than 100 years. The story begins around 1890, when Lyapunov published the work introducing what is now called Lyapunov theory (Boyd et al., 1994). Research and publications involving Lyapunov theory have grown considerably in recent decades (Chen, 1999), opening a very wide range of approaches such as robust stability analysis of linear systems (Montagner et al., 2009), LMI optimization (Wang et al., 2008), H<sup>2</sup> (Apkarian et al., 2001; Assunção et al., 2007a; Ma & Chen, 2006) or H<sup>∞</sup> (Assunção et al., 2007b; Chilali & Gahinet, 1996; Lee et al., 2004) robust control, design of controllers with state feedback (Montagner et al., 2005), and design of controllers with state-derivative feedback (Cardim et al., 2009). The design of robust controllers can also be applied to nonlinear systems.

In addition to the various current controller design techniques, the design of robust controllers (controller design by quadratic stability) using LMIs stands out for solving problems that previously had no known solution. These designs use specialized computer packages (Gahinet et al., 1995), which has made LMIs important tools in control theory.

Recent publications have identified a certain conservatism in the quadratic stability analysis, which led to a search for solutions that eliminate this conservatism (de Oliveira et al., 1999). Finsler's lemma (Skelton et al., 1997) has been widely used in control theory for stability analysis by LMIs (Montagner et al., 2009; Peaucelle et al., 2000), with better results than quadratic stability: its extra matrices allow a certain relaxation of the stability analysis (here called extended stability) by enlarging the feasibility region. The advantage of applying it to state-feedback design is that the synthesis of the gain *K* becomes decoupled from the Lyapunov matrix *P* (de Oliveira et al., 1999), leaving the Lyapunov matrix free, constrained only to be symmetric and positive definite to meet the initial restrictions.

The reciprocal projection lemma used in the H<sup>2</sup> robust control literature (Apkarian et al., 2001) can also be used for the synthesis of robust controllers, reducing the existing conservatism: as in the extended stability case, it introduces extra matrices that relax the stability analysis (here called projective stability). The synthesis of the controller *K* now depends on an auxiliary matrix *V*, not necessarily symmetric, and in this situation it becomes completely decoupled from the Lyapunov matrix *P*, leaving it free.

Two critical points in the design of robust controllers are explored here. The first is the magnitude of the designed controllers, which is often high, affects their practical implementation and therefore requires minimizing the controller gains to ease implementation (optimization of the norm of *K*). The second is the fact that the system settling time can be larger than the project specifications require, demanding restrictions on the LMIs to limit the decay rate, formulated by including the parameter *γ* in the LMIs.

The main focus of this work is to propose new methods for optimizing the controller's norm, through an approach different from that found in (Chilali & Gahinet, 1996), and to compare it with the optimization method presented in (Assunção et al., 2007c) under the different stability criteria, pointing out the advantages and disadvantages of each method, as well as the inclusion of a decay rate (Boyd et al., 1994) in the LMI formulation.

In (Siljak & Stipanovic, 2000) an optimization of the controllers' norm was proposed for decentralized control, but without the decay rate, so no comparisons were made with this work, given the necessity of inserting this parameter to improve the performance of the system response.

The optimization LMIs used in the new design techniques had to be reformulated, because the controller synthesis matrix no longer depends on a symmetric matrix, a necessary condition in the formulation of the existing LMI optimization. Comparisons are made through a practical implementation on Quanser's 3-DOF helicopter (Quanser, 2002) and a general analysis involving 1000 randomly generated polytopic uncertain systems.

#### **2. Quadratic stability of continuous time linear systems**

Consider the autonomous linear dynamic system without state feedback given in (1). Lyapunov proved that the system

$$
\dot{\mathbf{x}}(t) = A\mathbf{x}(t) \tag{1}
$$

with *x*(*t*) ∈ **R**<sup>*n*</sup> and *A* ∈ **R**<sup>*n*×*n*</sup> a known matrix, is asymptotically stable (i.e., all trajectories converge to zero) if and only if there exists a matrix *P* = *P*′ ∈ **R**<sup>*n*×*n*</sup> such that the LMIs (2) and (3) are met (Boyd et al., 1994).

$$A'P + PA < 0\tag{2}$$

$$P > 0\tag{3}$$
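This test can be sketched numerically: for a Hurwitz *A*, solving the Lyapunov equation *A*′*P* + *PA* = −*Q* with *Q* > 0 yields the required *P*. The system matrix below is a hypothetical example, not taken from the chapter:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical stable (Hurwitz) system matrix with eigenvalues -1 and -2.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

# Solve A'P + PA = -Q with Q = I; a unique P exists because A is Hurwitz.
Q = np.eye(2)
P = solve_continuous_lyapunov(A.T, -Q)
P = (P + P.T) / 2                       # symmetrize against round-off

# P must be positive definite, certifying asymptotic stability (LMI (3)),
# and A'P + PA < 0 holds by construction (LMI (2)).
assert np.all(np.linalg.eigvalsh(P) > 0)
assert np.all(np.linalg.eigvalsh(A.T @ P + P @ A) < 0)
```

For a known *A* the LMI pair (2)-(3) thus reduces to a linear equation solve plus an eigenvalue check.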

Consider in equation (2) that *A* is not precisely known, but belongs to a polytopic bounded uncertainty domain A. In this case, the matrix *A* within the uncertainty domain can be written as a convex combination of the vertexes *Aj*, *j* = 1, ..., *N*, of the convex bounded uncertainty domain (Boyd et al., 1994), i.e. *A*(*α*) ∈ A, with A shown in (4).

$$\mathcal{A} = \{ A(\alpha) \in \mathbb{R}^{n \times n} \; : \; A(\alpha) = \sum\_{j=1}^{N} \alpha\_j A\_j \, , \; \; \sum\_{j=1}^{N} \alpha\_j = 1 \, , \; \; \alpha\_j \ge 0 \, , \; \; j = 1, \ldots, N \} \tag{4}$$


A sufficient condition for stability of the convex bounded uncertainty domain A (from now on called polytope) is given by the existence of a Lyapunov matrix *P* = *P*′ ∈ **R**<sup>*n*×*n*</sup> such that the LMIs (5) and (6)

$$A(\mathfrak{a})^{\prime}P + PA(\mathfrak{a}) < 0\tag{5}$$

$$P > 0\tag{6}$$

are satisfied for every *A*(*α*) ∈ A (Boyd et al., 1994). This stability condition is known as quadratic stability and can be easily verified in practice thanks to the convexity of Lyapunov's inequality, which makes conditions (5) and (6) equivalent to checking the existence of *P* = *P*′ ∈ **R**<sup>*n*×*n*</sup> such that conditions (7) and (8) are met for *j* = 1, ..., *N*.

$$A\_j^\prime \mathcal{P} + \mathcal{P} A\_j < 0 \tag{7}$$

$$P > 0\tag{8}$$

It can be observed that (5) can be obtained by multiplying (7) by *α<sub>j</sub>* ≥ 0 and summing over *j* from *j* = 1 to *j* = *N*.

Since this is only a sufficient condition for stability of the polytope A, it yields conservative results; nevertheless, quadratic stability has been widely used for robust controller synthesis.
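A numerical sketch of this vertex test: compute a Lyapunov matrix *P* for one vertex and check LMIs (7)-(8) at every vertex via eigenvalues. The two vertices are hypothetical examples; if the check fails, no conclusion follows, since the condition is only sufficient:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Two hypothetical polytope vertices (illustrative, not from the chapter).
A1 = np.array([[0.0, 1.0], [-2.0, -3.0]])
A2 = np.array([[0.0, 1.0], [-3.0, -4.0]])

# Candidate common Lyapunov matrix: solve for vertex A1, then test all vertices.
P = solve_continuous_lyapunov(A1.T, -np.eye(2))
P = (P + P.T) / 2

def vertex_ok(Aj, P):
    # Checks A_j' P + P A_j < 0 via the largest eigenvalue (LMI (7)).
    return np.linalg.eigvalsh(Aj.T @ P + P @ Aj).max() < 0

quadratically_stable = all(vertex_ok(Aj, P) for Aj in (A1, A2)) and \
                       np.linalg.eigvalsh(P).min() > 0
```

An LMI solver would instead search over all symmetric *P* simultaneously, which is what makes the feasibility test convex.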

#### **3. Decay rate restriction for closed-loop systems**

Consider a linear time invariant controllable system described in (9)

$$
\dot{\mathbf{x}}(t) = A\mathbf{x}(t) + Bu(t), \quad \mathbf{x}(0) = \mathbf{x}\_0 \tag{9}
$$

with *A* ∈ **R**<sup>*n*×*n*</sup>, *B* ∈ **R**<sup>*n*×*m*</sup> the system input matrix, *x*(*t*) ∈ **R**<sup>*n*</sup> the state vector and *u*(*t*) ∈ **R**<sup>*m*</sup> the input vector. Assuming that all states are available for feedback, the state-feedback control law is given by (10)

$$u(t) = -Kx(t)\tag{10}$$

with $K \in \mathbb{R}^{m \times n}$ a constant matrix. The norm of the controller $K$ is often high, which can saturate amplifiers and make implementation in analog systems difficult. It is therefore desirable to reduce the norm of the controller's entries to facilitate implementation.

Considering the controlled system (9)-(10), the decay rate (related to the largest Lyapunov exponent) is defined as the largest positive constant $\gamma$ such that (11)

$$\lim_{t \to \infty} e^{\gamma t} \|x(t)\| = 0 \tag{11}$$

holds for all trajectories $x(t)$, $t > 0$.
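Definition (11) can be checked numerically for a given closed-loop matrix; the matrix `Ac`, the value of `gamma` and the time horizon below are illustrative assumptions chosen for the sketch:

```python
import numpy as np
from scipy.linalg import expm

# Closed-loop matrix with eigenvalues -1 and -4 (illustrative values); any
# gamma below 1 then satisfies definition (11).
Ac = np.array([[0.0, 1.0], [-4.0, -5.0]])
gamma = 0.9
x0 = np.array([1.0, 0.0])

# Sample the exact trajectory x(t) = e^{Ac t} x0 and weight it by e^{gamma t}.
ts = np.linspace(0.0, 50.0, 51)
norms = [np.exp(gamma * t) * np.linalg.norm(expm(Ac * t) @ x0) for t in ts]

# e^{gamma t} ||x(t)|| decays towards zero along the trajectory.
assert norms[-1] < 0.02 * norms[0]
```

The slowest mode decays like $e^{-t}$, so the weighted norm behaves like $e^{-0.1t}$ and shrinks steadily.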

From the quadratic Lyapunov's function (12),

$$V(\mathbf{x}(t)) = \mathbf{x}(t)^{\prime} P \mathbf{x}(t) \tag{12}$$

a lower bound on the decay rate of (9) can be established by imposing (13)

$$
\dot{V}\left(\mathbf{x}(t)\right) \le -2\gamma V(\mathbf{x}(t))\tag{13}
$$

for all trajectories (Boyd et al., 1994).

From (12) and (9), (14) can be found.

$$\dot{V}(\mathbf{x}(t)) = \dot{\mathbf{x}}\,(t)^{\prime}P\mathbf{x}(t) + \mathbf{x}(t)^{\prime}P\dot{\mathbf{x}}\,(t)$$

$$= \mathbf{x}(t)^{\prime}(A - BK)^{\prime}P\mathbf{x}(t) + \mathbf{x}(t)^{\prime}P(A - BK)\mathbf{x}(t) \tag{14}$$

Adding the decay rate restriction (13) to equation (14) and simplifying appropriately, (15) and (16) are obtained.

$$(A - BK)'P + P(A - BK) < -2\gamma P\tag{15}$$

$$P > 0\tag{16}$$

Since inequality (15) is a bilinear matrix inequality (BMI) in $P$ and $K$, manipulations are needed to recover LMI conditions. Multiplying inequalities (15) and (16) on the left and on the right by $P^{-1}$, and setting $X = P^{-1}$ and $G = KX$, results in (17) and (18):

$$AX - BG + XA' - G'B' + 2\gamma X < 0\tag{17}$$

$$X > 0\tag{18}$$

If the LMIs (17) and (18) are feasible, a controller that stabilizes the closed-loop system with decay rate $\gamma$ can be given by $K = GX^{-1}$.
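A sketch of this change of variables run backwards, for verification (the double-integrator data and the gain `K` are illustrative assumptions; in practice `X` and `G` would come out of an LMI solver rather than be built by hand):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Double integrator with state feedback (illustrative data): the gain K
# places the eigenvalues of A - BK at -2 and -3.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[6.0, 5.0]])
Acl = A - B @ K
gamma = 1.5    # requested decay rate, below min |Re(eigenvalue)| = 2

# (15) holds iff the gamma-shifted closed loop admits a Lyapunov matrix:
# (Acl + gamma I)' P + P (Acl + gamma I) = -I with P > 0.
P = solve_continuous_lyapunov((Acl + gamma * np.eye(2)).T, -np.eye(2))
assert np.min(np.linalg.eigvalsh(P)) > 0                    # condition (16)

lhs = Acl.T @ P + P @ Acl + 2 * gamma * P                   # condition (15)
assert np.max(np.linalg.eigvalsh((lhs + lhs.T) / 2)) < 0

# The change of variables of the text: X = P^{-1} and G = K X indeed
# satisfy the LMIs (17)-(18).
X = np.linalg.inv(P)
G = K @ X
lmi17 = A @ X - B @ G + X @ A.T - G.T @ B.T + 2 * gamma * X
assert np.max(np.linalg.eigvalsh((lmi17 + lmi17.T) / 2)) < 0
```

The last check is exactly (15) multiplied on both sides by $X = P^{-1}$, which is why feasibility of (17)-(18) lets $K = GX^{-1}$ be recovered.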

Consider the linear uncertain time-invariant system (19).

$$
\dot{\mathbf{x}}(t) = A(\boldsymbol{\alpha})\mathbf{x}(t) + B(\boldsymbol{\alpha})u(t) \tag{19}
$$

This system can be described as convex combination of the polytope's vertexes shown in (20).

$$\dot{\mathbf{x}}(t) = \sum\_{j=1}^{r} \alpha\_{j} A\_{j} \mathbf{x}(t) + \sum\_{j=1}^{r} \alpha\_{j} B\_{j} u(t) \tag{20}$$

with A and B belonging to the uncertainty polytope (21)

$$(\mathcal{A}, \mathcal{B}) = \left\{(A, B)(\alpha) : (A, B)(\alpha) = \sum_{j=1}^{r} \alpha_j (A, B)_j,\ \sum_{j=1}^{r} \alpha_j = 1,\ \alpha_j \ge 0,\ j = 1, \dots, r\right\} \tag{21}$$

being *r* the number of vertexes (Boyd et al., 1994).

From the existing theory for uncertain systems, Theorem 3.1 can be stated (Boyd et al., 1994):

**Theorem 3.1.** *A sufficient condition which guarantees the stability of the uncertain system (20) subject to decay rate $\gamma$ is the existence of matrices $X = X' \in \mathbb{R}^{n \times n}$ and $G \in \mathbb{R}^{m \times n}$ such that* (22) *and* (23) *are met.*

$$A\_j X - B\_j G + X A\_j' - G' B\_j' + 2\gamma X < 0\tag{22}$$

$$X > 0\tag{23}$$

*with j* = 1, ...,*r.*

*When the LMIs (22) and (23) are feasible, a state feedback matrix which stabilizes the system can be given by* (24)*.*

$$K = GX^{-1} \tag{24}$$

*Proof.* The proof can be found at (Boyd et al., 1994).
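A small numerical illustration of Theorem 3.1 (the two vertices, the gain `K` and the construction of `X` are assumptions made for this example; a real design would obtain `X` and `G` directly from an LMI solver such as those interfaced by CVXPY or YALMIP):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical two-vertex polytope sharing one input matrix.
A1 = np.array([[0.0, 1.0], [-2.0, -4.0]])
A2 = np.array([[0.0, 1.0], [-2.2, -3.8]])
B = np.array([[0.0], [1.0]])
K = np.array([[10.0, 6.0]])
gamma = 1.0

def max_sym_eig(M):
    return np.max(np.linalg.eigvalsh((M + M.T) / 2))

# Build a candidate X from the first vertex and verify (22)-(23) at BOTH
# vertices, which by Theorem 3.1 certifies the whole polytope.
Acl1 = A1 - B @ K
P = solve_continuous_lyapunov((Acl1 + gamma * np.eye(2)).T, -np.eye(2))
X = np.linalg.inv(P)
G = K @ X

assert np.min(np.linalg.eigvalsh(X)) > 0                      # (23)
for Aj in (A1, A2):
    lmi = Aj @ X - B @ G + X @ Aj.T - G.T @ B.T + 2 * gamma * X
    assert max_sym_eig(lmi) < 0                               # (22)

# Spot-check: the recovered K = G X^{-1} stabilizes interior points with
# the requested decay rate.
for a in (0.0, 0.3, 0.7, 1.0):
    Acl = a * A1 + (1 - a) * A2 - B @ K
    assert np.max(np.linalg.eigvals(Acl).real) < -gamma
```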


Thus, the uncertain system (19) can be fed back, with (22) and (23) being sufficient conditions for asymptotic stability of the polytope, now for a closed-loop system subject to a decay rate.

#### **4. Optimization of the** *K* **matrix norm of the closed-loop system**

In many situations the norm of the state feedback matrix is high, precluding its practical implementation. Thus Theorem 4.1 was proposed in order to limit the norm of *K* (Assunção et al., 2007c; Faria et al., 2010).

**Theorem 4.1.** *Given a fixed constant $\mu_0 > 0$ for which feasible results can be found, a constraint for the norm of the state feedback matrix $K \in \mathbb{R}^{m \times n}$, with $K = GX^{-1}$, $X = X' > 0 \in \mathbb{R}^{n \times n}$ and $G \in \mathbb{R}^{m \times n}$, is obtained by finding the minimum value $\beta$, $\beta > 0$, such that $KK' < \frac{\beta}{\mu_0^2} I_m$. The optimum value of $\beta$ can be found by solving the optimization problem with the LMIs* (25)*,* (26) *and* (27)*.*

$$\min \beta \quad \text{s.t.} \quad \begin{bmatrix} \beta I_m & G \\ G' & I_n \end{bmatrix} > 0 \tag{25}$$

$$X > \mu\_0 I\_n \tag{26}$$

$$A\_j X - B\_j G + X A\_j' - G' B\_j' + 2\gamma X < 0\tag{27}$$

*where Im and In are the identity matrices of m and n order respectively.*

*Proof.* The proof can be found at (Assunção et al., 2007c).
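A numerical sketch of the bound in Theorem 4.1 (the values of `mu0`, `X` and `G` are illustrative assumptions standing in for a solver's output):

```python
import numpy as np

# Illustrative data standing in for the output of an LMI solver.
mu0 = 0.5
X = np.array([[1.0, 0.2], [0.2, 0.8]])      # X > mu0 * I_n, condition (26)
G = np.array([[0.3, -0.6]])                  # m = 1, n = 2

assert np.min(np.linalg.eigvalsh(X - mu0 * np.eye(2))) > 0

# The Schur complement of (25) gives beta I_m - G G' > 0, so the smallest
# feasible beta lies just above lambda_max(G G').
beta = np.max(np.linalg.eigvalsh(G @ G.T)) + 1e-9

# Theorem 4.1's conclusion: K K' < (beta / mu0^2) I_m for K = G X^{-1}.
K = G @ np.linalg.inv(X)
bound = beta / mu0**2
assert np.max(np.linalg.eigvalsh(K @ K.T)) < bound
```

The chain behind the last assertion is $KK' = GX^{-2}G' < \mu_0^{-2} GG' < (\beta/\mu_0^2) I_m$, using $X > \mu_0 I_n$ from (26).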

#### **5. New optimization of the** *K* **matrix norm of the closed-loop system**

It can be verified that the LMIs given in Theorem 4.1 can produce conservative results, so new methodologies are proposed in order to find better results.

Using the theory presented in (Assunção et al., 2007c) for the optimization of the norm of robust controllers subject to failures, an alternative approach to the same problem, grounded in Lemma 5.1, is proposed.

The optimum-norm approach was modified to fit the new LMI structures given in the sequel. In initial tests, this new approach produced better results than the optimization stated in Theorem 4.1 with the set of LMIs (22) and (23).

**Lemma 5.1.** *Consider a given matrix $L \in \mathbb{R}^{n \times m}$ and $\beta \in \mathbb{R}$, $\beta > 0$. The conditions*

*1. $L'L \le \beta I_m$*
*2. $LL' \le \beta I_n$*

*are equivalent.*

*Proof.* Note that if $L = 0$ the lemma conditions are verified. Then consider the case where $L \neq 0$.

Note that in the first statement of the lemma, (28) is met

$$L'L \le \beta I_m \Leftrightarrow x'(L'L)x \le \beta x'x \tag{28}$$

for all *<sup>x</sup>* <sup>∈</sup> **<sup>R</sup>***m*.

Knowing that (29) is true

$$\mathbf{x}'(L'L)\mathbf{x} \le \lambda\_{\max}(L'L)\mathbf{x}'\mathbf{x} \tag{29}$$

with $\lambda_{\max}(L'L)$ the maximum eigenvalue of $L'L$, which is real (every symmetric matrix has only real eigenvalues). Moreover, when $x$ equals the eigenvector of $L'L$ associated with $\lambda_{\max}(L'L)$, $x'(L'L)x = \lambda_{\max}(L'L)x'x$. Thus, from (28) and (29), $\beta \ge \lambda_{\max}(L'L)$.

Similarly, for every *<sup>z</sup>* <sup>∈</sup> **<sup>R</sup>***n*, the second assertion of the lemma results in (30).

$$LL' \le \beta I_n \Leftrightarrow z'(LL')z \le \lambda_{\max}(LL')z'z \le \beta z'z \tag{30}$$

and then, $\beta \ge \lambda_{\max}(LL')$.

Now, note that the condition (31) is true (Chen, 1999).

$$\lambda^{n}\det(\lambda I_m - L'L) = \lambda^{m}\det(\lambda I_n - LL') \tag{31}$$

Consequently, every non-zero eigenvalue of $L'L$ is also an eigenvalue of $LL'$. Therefore, $\lambda_{\max}(L'L) = \lambda_{\max}(LL')$, and from (29) and (30) the lemma is proved.
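The key fact in this proof is easy to check numerically (the rectangular matrix `L` below is an arbitrary random example, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal((4, 2))   # n = 4, m = 2; any rectangular L works

# L'L (m x m) and LL' (n x n) share their non-zero eigenvalues, so the two
# bounds of Lemma 5.1 pick out the same minimal beta = lambda_max.
ev_small = np.sort(np.linalg.eigvalsh(L.T @ L))
ev_big = np.sort(np.linalg.eigvalsh(L @ L.T))

assert np.isclose(ev_small[-1], ev_big[-1])        # same lambda_max
assert np.allclose(ev_big[:2], 0.0, atol=1e-10)    # extra eigenvalues are 0
assert np.allclose(ev_big[2:], ev_small)           # non-zero spectra match
```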

Knowing that *P* = *X*−<sup>1</sup> is the matrix used to define Lyapunov's quadratic function, Theorem 5.1 is proposed.

**Theorem 5.1.** *Given a constant $\mu_0 > 0$, a constraint for the norm of the state feedback matrix $K \in \mathbb{R}^{m \times n}$, with $K = GX^{-1}$, $X = X' > 0$, $X \in \mathbb{R}^{n \times n}$ and $G \in \mathbb{R}^{m \times n}$, is obtained by finding the minimum of $\beta$, $\beta > 0$, such that $K'K < \frac{\beta}{\mu_0} I_n$. The minimum $\beta$ can be obtained by solving the optimization problem with the LMIs* (32)*,* (33) *and* (34)*.*

$$\min \beta \quad \text{s.t.} \quad \begin{bmatrix} X & G' \\ G & \beta I_m \end{bmatrix} > 0 \tag{32}$$

$$X > \mu_0 I_n \tag{33}$$

$$A\_j X - B\_j G + X A\_j' - G' B\_j' + 2\gamma X < 0\tag{34}$$

*where Im and In are the identity matrices of m and n order respectively.*

*Proof.* Applying the Schur complement for the first inequality of (32) results in (35).

$$\beta I_m > 0 \ \ \text{and} \ \ X - G'(\beta I_m)^{-1}G > 0 \tag{35}$$

Thus, from (35), (36) is found.


$$X > \frac{1}{\beta} \mathbf{G}' \mathbf{G} \Rightarrow \mathbf{G}' \mathbf{G} < \beta \mathbf{X} \tag{36}$$

Replacing *G* = *KX* in (36) results in (37)

$$XK'KX < \beta X \Rightarrow K'K < \beta X^{-1} \tag{37}$$

Then, from (33) and (37), (38) is met.

$$K'K < \frac{\beta}{\mu_0} I_n \tag{38}$$

where $K$ is the optimal controller associated with (22).

It follows that minimizing the norm of the matrix is equivalent to minimizing a variable $\beta > 0$ such that $K'K < \frac{\beta}{\mu_0} I_n$, with $\mu_0 > 0$. Note that, compared with the condition used in Theorem 4.1, the position of the transposed matrix has been swapped.
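The whole chain (35)-(38) of the proof can be walked through numerically (the values of `mu0`, `X` and `K` below are illustrative assumptions, not solver output from the chapter):

```python
import numpy as np

# Hypothetical data for Theorem 5.1 (illustrative values).
mu0 = 0.5
X = np.array([[1.0, 0.2], [0.2, 0.8]])    # X > mu0 I_n, condition (33)
K = np.array([[0.5, -0.4]])
G = K @ X                                  # G = K X, as in the proof

# The first inequality of (32) holds for any beta above lambda_max(G X^{-1} G').
beta = np.max(np.linalg.eigvalsh(G @ np.linalg.inv(X) @ G.T)) + 0.01

# Schur complement chain (35) -> (36) -> (37) -> (38):
assert np.min(np.linalg.eigvalsh(X - G.T @ G / beta)) > 0                  # (35)
assert np.min(np.linalg.eigvalsh(beta * X - G.T @ G)) > 0                  # (36)
assert np.min(np.linalg.eigvalsh(beta * np.linalg.inv(X) - K.T @ K)) > 0   # (37)
assert np.min(np.linalg.eigvalsh((beta / mu0) * np.eye(2) - K.T @ K)) > 0  # (38)
```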

A comparison between the optimization methods, using the robust LMIs with decay rate (22) and (23), will be shown in the results section. Since the new method suits the relaxed LMIs listed below, it was used in the comparative analysis of control designs for extended stability and for stability via the reciprocal projection lemma.

Finsler's lemma, stated in Lemma 5.2, can be used to express stability conditions as matrix inequalities, with advantages over the existing Lyapunov theory (Boyd et al., 1994): it introduces new variables, generating new degrees of freedom in the analysis of uncertain systems, with the possibility of eliminating nonlinearities.

**Lemma 5.2** (Finsler)**.** *Consider $w \in \mathbb{R}^{n_x}$, $\mathcal{L} \in \mathbb{R}^{n_x \times n_x}$ and $\mathcal{B} \in \mathbb{R}^{m_x \times n_x}$ with $\mathrm{rank}(\mathcal{B}) < n_x$, and $\mathcal{B}^{\perp}$ a basis for the null space of $\mathcal{B}$ (i.e., $\mathcal{B}\mathcal{B}^{\perp} = 0$). Then the following conditions are equivalent:*

*1. w*� L*w* < 0, ∀ *w* �= 0 : B*w* = 0

$$\text{2. } \mathcal{B}^{\perp'} \mathcal{L} \mathcal{B}^{\perp} < 0.$$


*Proof.* Finsler's lemma proof can be found at (Oliveira & Skelton, 2001; Skelton et al., 1997).
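The point of the lemma is that $\mathcal{L}$ need only be negative on the null space of $\mathcal{B}$, not globally; a small constructed instance (the matrices `Bmat` and `Lsym` below are illustrative assumptions) shows both conditions holding for an indefinite $\mathcal{L}$:

```python
import numpy as np
from scipy.linalg import null_space

# An instance of Lemma 5.2 with n_x = 4, m_x = 2 (illustrative data).
Bmat = np.array([[1.0, 0.0, 1.0, 0.0],
                 [0.0, 1.0, 0.0, 1.0]])
Bperp = null_space(Bmat)          # orthonormal basis of ker(B): B @ Bperp = 0
assert np.allclose(Bmat @ Bperp, 0.0)

# L is indefinite on the whole space, yet negative definite on ker(B):
# exactly the situation Finsler's lemma captures.
Lsym = 10.0 * Bmat.T @ Bmat - np.eye(4)
assert np.max(np.linalg.eigvalsh(Lsym)) > 0      # not globally negative

# Condition 2: Bperp' L Bperp < 0.
reduced = Bperp.T @ Lsym @ Bperp
assert np.max(np.linalg.eigvalsh(reduced)) < 0

# Condition 1: w' L w < 0 for non-zero w with B w = 0; every such w is
# Bperp @ c, and then w' L w = c' reduced c.
rng = np.random.default_rng(1)
for c in rng.standard_normal((20, 2)):
    w = Bperp @ c
    assert w @ Lsym @ w < 0 or np.linalg.norm(c) < 1e-12
```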

#### **5.1 Stability of systems using Finsler's lemma restricted by the decay rate**

Consider the closed-loop system (9). Defining

$$w = \begin{bmatrix} x \\ \dot{x} \end{bmatrix}, \quad \mathcal{B} = \begin{bmatrix} (A - BK) & -I \end{bmatrix}, \quad \mathcal{B}^{\perp} = \begin{bmatrix} I \\ (A - BK) \end{bmatrix}, \quad \mathcal{L} = \begin{bmatrix} 2\gamma P & P \\ P & 0 \end{bmatrix},$$

note that $\mathcal{B}w = 0$ corresponds to (9) and $w'\mathcal{L}w < 0$ corresponds to the stability constraint with decay rate given by (12) and (13). In this case the dimensions of Lemma 5.2's variables are $n_x = 2n$ and $m_x = n$. Considering that $P$ is the matrix used to define the quadratic Lyapunov function (12), properties 1 and 2 of Finsler's lemma can be written as:

1. $\exists P = P' > 0$ such that

$$
\begin{bmatrix} x \\ \dot{x} \end{bmatrix}'
\begin{bmatrix} 2\gamma P & P \\ P & 0 \end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \end{bmatrix} < 0 \quad \forall x, \dot{x} \neq 0 \; : \;
\begin{bmatrix} (A - BK) & -I \end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \end{bmatrix} = 0
$$

2. $\exists P = P' > 0$ such that
$ 

$$
\begin{bmatrix} I \\ (A - BK) \end{bmatrix}'
\begin{bmatrix} 2\gamma P & P \\ P & 0 \end{bmatrix}
\begin{bmatrix} I \\ (A - BK) \end{bmatrix} < 0
$$

which results in the equations of stability, according to Lyapunov, including decay rate:

$$\begin{aligned} \text{1. } &\dot{\mathbf{x}}(t)^{\prime}P\dot{\mathbf{x}}(t) + \dot{\mathbf{x}}(t)^{\prime}P\mathbf{x}(t) + 2\gamma \mathbf{x}(t)^{\prime}P\mathbf{x}(t) < 0 \,\,\forall \mathbf{x}, \dot{\mathbf{x}} \neq 0 \,\,:\, \dot{\mathbf{x}}(t) = (A - BK)\mathbf{x}(t) \\ \text{2. } &P(A - BK) + (A - BK)^{\prime}P + 2\gamma P < 0 \end{aligned}$$

Thus, it is possible to characterize stability through the quadratic Lyapunov function ($V(x(t)) = x(t)'Px(t)$), generating new degrees of freedom for the synthesis of controllers.

From the proof of Finsler's lemma it follows that if properties 1 and 2 are true, then properties 3 and 4 are also true. Thus, the fourth property can be written as (39).

4. $\exists \mathcal{X} \in \mathbb{R}^{2n \times n}$, $P = P' > 0$ such that

$$
\begin{bmatrix} 2\gamma P \ P \\ P & 0 \end{bmatrix} + \mathcal{X} \begin{bmatrix} (A - BK) \ -I \end{bmatrix} + \begin{bmatrix} (A - BK)' \\ -I \end{bmatrix} \mathcal{X}' < 0. \tag{39}
$$

Conveniently choosing the matrix of variables $\mathcal{X} = \begin{bmatrix} Z \\ aZ \end{bmatrix}$, with $Z \in \mathbb{R}^{n \times n}$ invertible and not necessarily symmetric, and $a > 0$ a fixed relaxation constant of the LMI (Pipeleers et al., 2009), developing equation (39) and applying the congruence transformation $\mathrm{diag}(Z^{-1}, Z^{-1})$ on the left and $\mathrm{diag}(Z^{-1}, Z^{-1})'$ on the right yields

$$
\begin{bmatrix}
AZ'^{-1} + Z^{-1}A' - BKZ'^{-1} - Z^{-1}K'B' + 2\gamma Z^{-1}PZ'^{-1} & Z^{-1}PZ'^{-1} + aZ^{-1}A' - aZ^{-1}K'B' - Z'^{-1} \\
Z^{-1}PZ'^{-1} + aAZ'^{-1} - aBKZ'^{-1} - Z^{-1} & -aZ'^{-1} - aZ^{-1}
\end{bmatrix} < 0
$$

Making $Y = Z'^{-1}$, $G = KY$ and $Q = Y'PY$, the LMIs (40) and (41), subject to decay rate $\gamma$, are found.

$$
\begin{bmatrix}
AY + Y'A' - BG - G'B' + 2\gamma Q \ Q + aY'A' - aG'B' - Y \\
Q + aAY - aBG - Y' & -aY - aY' \end{bmatrix} < 0,\tag{40}
$$

$$Q > 0\tag{41}$$

with $Y \in \mathbb{R}^{n \times n}$ not necessarily symmetric, $G \in \mathbb{R}^{m \times n}$ and $Q \in \mathbb{R}^{n \times n}$, $Q = Q' > 0$, for some $a > 0$.

These LMIs meet the restrictions for the asymptotic stability (Feron et al., 1996) of the system described in (9) with state feedback given by (10). It can be checked that the first principal minor of the LMI (40) has the structure of the result found in the theorem of quadratic stability with decay rate (Faria et al., 2009). Nevertheless, as stated in Finsler's lemma, there is also a greater degree of freedom, because the matrix of variables $Y$, responsible for the synthesis of the controller, does not need to be symmetric, and the Lyapunov matrix, now


turned into $Q$, remains restricted to being positive definite but is partially detached from the controller synthesis, since $Q = Y'PY$.

The stability condition derived from Finsler's lemma is commonly called extended stability, and it will be referred to in this way from now on.

#### **5.2 Robust stability of systems using Finsler's lemma restricted by the decay rate**

As discussed for the condition of quadratic stability, the stability analysis can be performed under a robust stability condition, considering the continuous-time linear system as a convex combination of the $r$ vertexes of the polytope described in (20). The advantage of using Finsler's lemma for robust stability analysis is the freedom of the Lyapunov function, now defined as $Q(\alpha) = \sum_{j=1}^{r} \alpha_j Q_j$, $\sum_{j=1}^{r} \alpha_j = 1$, $\alpha_j \ge 0$, $j = 1, \dots, r$, i.e., a Lyapunov function $Q_j$ can be defined for each vertex $j$. As $Q(\alpha)$ depends on $\alpha$, this Lyapunov matrix suits time-invariant polytopic uncertainties, and sufficiently small rates of parameter variation are permitted. To verify this, Theorem 5.2 is proposed.

**Theorem 5.2.** *A sufficient condition which guarantees the stability of the uncertain system (20) with decay rate greater than $\gamma$ is the existence of matrices $Y \in \mathbb{R}^{n \times n}$, $Q_j \in \mathbb{R}^{n \times n}$, $Q_j = Q_j' > 0$, and $G \in \mathbb{R}^{m \times n}$, and a fixed constant $a > 0$, such that the LMIs* (42) *and* (43) *are met.*

$$
\begin{bmatrix}
A\_j Y + Y' A\_j{'} - B\_j G - G' B\_j{'} + 2\gamma Q\_j \ Q\_j + aY' A\_j{'} - aG' B\_j{'} - Y \\
Q\_j + aA\_j{'}Y - aB\_j G - Y' & -aY - aY'
\end{bmatrix} < 0 \tag{42}
$$

$$Q_j > 0\tag{43}$$

*with j* = 1, ...,*r. When the LMIs (42) and (43) are feasible, a state feedback matrix which stabilizes the system can be given by* (44)*.*

$$K = GY^{-1} \tag{44}$$

*Proof.* Multiplying (42) and (43) by $\alpha_j \ge 0$ and summing over $j$, from $j = 1$ to $j = r$, (45) and (46) are found.

$$
\begin{bmatrix}
\Big(\sum_{j=1}^{r} \alpha_j A_j\Big)Y + Y'\Big(\sum_{j=1}^{r} \alpha_j A_j\Big)' - \Big(\sum_{j=1}^{r} \alpha_j B_j\Big)G - G'\Big(\sum_{j=1}^{r} \alpha_j B_j\Big)' + 2\gamma\Big(\sum_{j=1}^{r} \alpha_j Q_j\Big) & \Big(\sum_{j=1}^{r} \alpha_j Q_j\Big) + aY'\Big(\sum_{j=1}^{r} \alpha_j A_j\Big)' - aG'\Big(\sum_{j=1}^{r} \alpha_j B_j\Big)' - Y \\
\Big(\sum_{j=1}^{r} \alpha_j Q_j\Big) + a\Big(\sum_{j=1}^{r} \alpha_j A_j\Big)Y - a\Big(\sum_{j=1}^{r} \alpha_j B_j\Big)G - Y' & -aY - aY'
\end{bmatrix} < 0
$$

$$\sum_{j=1}^{r} \alpha_j Q_j > 0$$

$$\begin{bmatrix} A(a)Y + Y^{\prime}A(a)^{\prime} - B(a)\mathbf{G} - G^{\prime}B(a)^{\prime} + 2\gamma Q(a) \ Q(a) + aY^{\prime}A(a)^{\prime} - aG^{\prime}B(a)^{\prime} - Y\\ Q(a) + aA(a)Y - aB(a)\mathbf{G} - Y^{\prime} & -aY - aY^{\prime} \end{bmatrix} < 0 \tag{45}$$

$$Q(\mathfrak{a}) > 0 \tag{46}$$

with $Q(\alpha) = \sum_{j=1}^{r} \alpha_j Q_j$, $\sum_{j=1}^{r} \alpha_j = 1$, $\alpha_j \ge 0$ and $j = 1, \dots, r$.

Thus, the uncertain system (19) can be fed back, with (45) and (46) being sufficient conditions for the asymptotic stability of the polytope.
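The multiply-by-$\alpha_j$-and-sum argument rests on one linear-algebra fact: a convex combination of negative-definite matrices is negative definite. A short sketch (the matrices below are random illustrative stand-ins for the left-hand sides of (42) at each vertex):

```python
import numpy as np

rng = np.random.default_rng(2)

def random_neg_def(n):
    # M = -(R R' + I) is symmetric negative definite by construction.
    R = rng.standard_normal((n, n))
    return -(R @ R.T + np.eye(n))

# Three symmetric negative-definite matrices standing in for the vertex
# inequalities (42) evaluated at j = 1, 2, 3.
M = [random_neg_def(4) for _ in range(3)]

# Any convex combination of negative-definite matrices is negative definite,
# which is what carries the vertex conditions to the whole polytope.
alpha = np.array([0.2, 0.5, 0.3])
combo = sum(a * Mj for a, Mj in zip(alpha, M))
assert np.isclose(alpha.sum(), 1.0) and np.all(alpha >= 0)
assert np.max(np.linalg.eigvalsh(combo)) < 0
```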

**Observation 1.** *In the LMIs (42) and (43), the constant "a" has to be fixed, the same for all vertexes, so as to satisfy the LMIs, and it can be found through a one-dimensional search.*

#### **5.3 Optimization of the** *K* **matrix norm using Finsler's lemma**

The motivation for studying an alternative optimization of the norm of the state feedback matrix $K$ was the less conservative results obtained with Finsler's lemma. The expectation is to find, in some situations, controllers with lower gains, which are easier to implement than those designed using the existing quadratic stability theory (Faria et al., 2010), avoiding control signal saturation.

Some difficulty was found in applying the existing theorem (Faria et al., 2010) to the new structure of LMIs, as the controller synthesis matrix $Y$ is not symmetric, a condition that was necessary for the development of Theorem 4.1, where the controller synthesis matrix was $X = P^{-1}$. Thus, Theorem 5.3 is proposed.

**Theorem 5.3.** *A constraint for the norm of the state feedback matrix $K \in \mathbb{R}^{m \times n}$ can be obtained, with $K = GY^{-1}$ and $Q_j = Y'P_jY$, being $Y \in \mathbb{R}^{n \times n}$, $G \in \mathbb{R}^{m \times n}$ and $P_j \in \mathbb{R}^{n \times n}$, $P_j = P_j' > 0$, by finding the minimum $\beta$, $\beta > 0$, such that $K'K < \beta P_j$, $j = 1, \dots, r$. The optimal value of $\beta$ can be obtained by solving the optimization problem with the LMIs* (47) *and* (48)*.*

$$\min \beta \quad \text{s.t.} \quad \begin{bmatrix} Q_j & G' \\ G & \beta I_m \end{bmatrix} > 0 \tag{47}$$

$$
\begin{bmatrix}
A_j Y + Y'A_j' - B_j G - G'B_j' + 2\gamma Q_j & Q_j + aY'A_j' - aG'B_j' - Y \\
Q_j + aA_jY - aB_jG - Y' & -aY - aY'
\end{bmatrix} < 0 \tag{48}
$$

*where Im denotes the identity matrix of m order.*

*Proof.* Applying the Schur complement for (47) results in (49).

$$
\beta I\_m > 0 \text{ and } \ Q\_j - G'(\beta I\_m)^{-1} G > 0 \tag{49}
$$

Thus, from (49), (50) is found.

$$Q\_j > \frac{1}{\beta} G' G \Rightarrow G' G < \beta Q\_j \tag{50}$$

Replacing $G = KY$ and $Q_j = Y'P_jY$ in (50), (51) is met.

$$Y'K'KY < \beta Y'P_jY \Rightarrow K'K < \beta P_j \tag{51}$$

where $K$ is the optimal controller associated with (42) and (43).

Thus it was possible to adapt the proposed optimization method, based on the minimization of a scalar $\beta$ with the inequality $K'K < \beta P_j$, $P_j$ being the Lyapunov matrix, to the new relaxed parameters.

$$\square$$

#### **5.4 Stability of systems using reciprocal projection lemma restricted by the decay rate**

Another tool that can be used for stability analysis using LMIs is the reciprocal projection lemma (Apkarian et al., 2001) set out in Lemma 5.3.

**Lemma 5.3** (reciprocal projection lemma)**.** *Consider Y* = *Y*� > 0 *a given matrix. The following statements are equivalent*

$$1. \ \psi + \mathcal{S} + \mathcal{S}' < 0$$


*2. The following LMI is feasible for W*

$$
\begin{bmatrix}
\psi + Y - (W + W') & \mathcal{S}' + W' \\
\mathcal{S} + W & -Y
\end{bmatrix} < 0
$$

*Proof.* The proof of the reciprocal projection lemma can be found in (Apkarian et al., 2001).

Consider the Lyapunov's inequality subject to a decay rate given by (15) and (16), which can be rewritten as (52) and (53).

$$(A - BK)X + X(A - BK)' + 2\gamma X < 0\tag{52}$$

$$X > 0\tag{53}$$

where *X* = *P*−1 and *P* is the Lyapunov's matrix. The original Lyapunov's inequality (15) can be recovered by multiplying the inequality (52) on the left and on the right by *P*.

Assuming *ψ* = 0 and *S*′ = (*A* − *BK*)*X* + *γX*, it is verified that the first claim of the reciprocal projection lemma is exactly Lyapunov's inequality subject to the decay rate described in (52):

$$
\psi + \mathcal{S} + \mathcal{S}' = (A - BK)X + X(A - BK)' + 2\gamma X < 0
$$

From the reciprocal projection lemma, if the first statement is true, then the second one will also be true as (54) shows.

$$
\begin{bmatrix}
Y - (W + W') & (A - BK)X + \gamma X + W' \\
X(A - BK)' + \gamma X + W & -Y
\end{bmatrix} < 0\tag{54}
$$

Multiplying (54) on the left and on the right by diag(*I*, *X*−1), with *P* = *X*−1, results in (55).

$$
\begin{bmatrix}
Y - (W + W') & (A - BK) + \gamma I + W'P \\
(A - BK)' + \gamma I + PW & -PYP
\end{bmatrix} < 0\tag{55}
$$

Multiplying (55) on the left by diag(*W*′−1, *I*) and on the right by diag(*W*−1, *I*), with *V* = *W*−1, (56) is found.

$$
\begin{bmatrix}
V'YV - (V+V') & V'(A-BK) + \gamma V' + P \\
(A-BK)'V + \gamma V + P & -PYP
\end{bmatrix} < 0\tag{56}
$$

Applying the Schur complement to the term *V*′*YV*, (57) is found.

$$
\begin{bmatrix}
-(V+V') & V'(A-BK) + \gamma V' + P & V' \\
(A-BK)'V + \gamma V + P & -PYP & 0 \\
V & 0 & -Y^{-1}
\end{bmatrix} < 0 \tag{57}
$$

Performing the linearizing variable change *Y* = *P*−1 results in (58).

$$
\begin{bmatrix}
-(V+V') & V'(A-BK) + \gamma V' + P & V' \\
(A-BK)'V + \gamma V + P & -P & 0 \\
V & 0 & -P
\end{bmatrix} < 0 \tag{58}
$$

A formulation close to this insertion of the decay rate, but with a different positioning of the decay rate parameter, can be found in the literature (Shen et al., 2006). It is easy to verify that some conservatism was introduced with the choice *Y* = *P*−1, but the state feedback matrix is unrelated to the Lyapunov's matrix *P*, which results in a relaxation of the Lyapunov LMI. Using the dual form (*A* − *BK*) → (*A* − *BK*)′ (Apkarian et al., 2001) results in inequality (59).

$$
\begin{bmatrix}
-(V+V') & V'(A-BK)' + \gamma V' + P & V' \\
(A-BK)V + \gamma V + P & -P & 0 \\
V & 0 & -P
\end{bmatrix} < 0 \tag{59}
$$
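The dual-form step relies on the fact that (*A* − *BK*) and its transpose share the same spectrum, so stability and the decay rate are preserved. A quick numerical illustration (the closed-loop matrix below is an arbitrary assumption, not taken from the chapter):

```python
import numpy as np

# An arbitrary sample closed-loop matrix A - BK (illustrative only)
M = np.array([[0.0, 1.0], [-2.0, -3.0]])

eig_primal = np.sort_complex(np.linalg.eigvals(M))
eig_dual = np.sort_complex(np.linalg.eigvals(M.T))

# (A - BK) and (A - BK)' have identical eigenvalues,
# hence the dual form preserves stability and the decay rate.
assert np.allclose(eig_primal, eig_dual)
```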

Performing the change of variable *Z* = *KV* and inserting the constraint *P* > 0, the LMIs (60) and (61) that guarantee system stability can be found.

$$
\begin{bmatrix}
-(V+V') & V'A' - Z'B' + \gamma V' + P & V' \\
AV - BZ + \gamma V + P & -P & 0 \\
V & 0 & -P
\end{bmatrix} < 0 \tag{60}
$$

$$P > 0\tag{61}$$

The inequalities (60) and (61) are LMIs and, when feasible, yield a state feedback matrix that stabilizes the system (9)-(10), given by (62).

$$K = ZV^{-1} \tag{62}$$

The result of relaxation of LMIs is interesting in the design of robust controllers, proposed below.
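The decay-rate condition (52)-(53) that underlies this derivation can be sanity-checked numerically for a given gain: (*A* − *BK*)*X* + *X*(*A* − *BK*)′ + 2*γX* < 0 holds for some *X* > 0 exactly when *A* − *BK* + *γI* is Hurwitz. A minimal sketch (the system, gain and *γ* below are illustrative assumptions, not the helicopter model):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [0.0, 0.0]])  # double integrator (assumed example)
B = np.array([[0.0], [1.0]])
K = np.array([[2.0, 3.0]])              # places eigenvalues of A - BK at -1, -2
gamma = 0.5                             # prescribed decay rate

# (A-BK)X + X(A-BK)' + 2*gamma*X < 0 for some X > 0  is equivalent to
# (A-BK+gamma*I) X + X (A-BK+gamma*I)' = -Q having a solution X > 0 for Q > 0.
Acl = A - B @ K + gamma * np.eye(2)
X = solve_continuous_lyapunov(Acl, -np.eye(2))

assert np.all(np.linalg.eigvals(A - B @ K).real < -gamma)  # decay rate met
assert np.all(np.linalg.eigvalsh((X + X.T) / 2) > 0)       # X = X' > 0, i.e. (53)
```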

#### **5.5 Robust stability of systems using reciprocal projection lemma restricted by the decay rate**

A robust stability condition can be analyzed by considering the continuous time linear system as a convex combination of the *r* vertexes of the polytope described in (20). As in the extended stability case, the advantage of using the reciprocal projection lemma for robust stability analysis is the Lyapunov's function degree of freedom, now defined as $P(\alpha) = \sum_{j=1}^{r} \alpha_j P_j$, $\sum_{j=1}^{r} \alpha_j = 1$, *αj* ≥ 0 and *j* = 1...*r*, i.e., a Lyapunov's function *Pj* is defined for each vertex *j*. As described before Theorem 5.2, the use of *P*(*α*) fits time-invariant polytopic uncertainties, a sufficiently small rate of variation being permitted. To verify this, Theorem 5.4 is proposed.

**Theorem 5.4.** *A sufficient condition which guarantees the stability of the uncertain system* (20) *is the existence of matrices V* ∈ **R***n*×*n, Pj* = *Pj*′ ∈ **R***n*×*n and Z* ∈ **R***m*×*n, such that the LMIs* (63) *and* (64) *are met.*

$$
\begin{bmatrix}
-(V+V') & V'A_j' - Z'B_j' + \gamma V' + P_j & V' \\
A_jV - B_jZ + \gamma V + P_j & -P_j & 0 \\
V & 0 & -P_j
\end{bmatrix} < 0 \tag{63}
$$

$$P\_j > 0 \tag{64}
$$

*with j* = 1, ...,*r.*


*When the LMIs (63) and (64) are feasible, a state feedback matrix which stabilizes the system can be given by* (65)*.*

$$K = ZV^{-1} \tag{65}$$

*Proof.* Multiplying (63) and (64) by *αj* ≥ 0 and summing in *j*, for *j* = 1 to *j* = *r*, (65) and (66) are found.

$$
\begin{bmatrix}
-(V+V') & V'\Big(\sum_{j=1}^{r}\alpha_j A_j'\Big) - Z'\Big(\sum_{j=1}^{r}\alpha_j B_j'\Big) + \gamma V' + \Big(\sum_{j=1}^{r}\alpha_j P_j\Big) & V' \\
\Big(\sum_{j=1}^{r}\alpha_j A_j\Big)V - \Big(\sum_{j=1}^{r}\alpha_j B_j\Big)Z + \gamma V + \Big(\sum_{j=1}^{r}\alpha_j P_j\Big) & -\Big(\sum_{j=1}^{r}\alpha_j P_j\Big) & 0 \\
V & 0 & -\Big(\sum_{j=1}^{r}\alpha_j P_j\Big)
\end{bmatrix} < 0
$$

$$
\Big(\sum_{j=1}^{r}\alpha_j P_j\Big) > 0
$$

that is, with $A(\alpha) = \sum_{j=1}^{r}\alpha_j A_j$, $B(\alpha) = \sum_{j=1}^{r}\alpha_j B_j$ and $P(\alpha) = \sum_{j=1}^{r}\alpha_j P_j$:

$$
\begin{bmatrix}
-(V+V') & V'A(\alpha)' - Z'B(\alpha)' + \gamma V' + P(\alpha) & V' \\
A(\alpha)V - B(\alpha)Z + \gamma V + P(\alpha) & -P(\alpha) & 0 \\
V & 0 & -P(\alpha)
\end{bmatrix} < 0 \tag{65}
$$

$$P(a) > 0 \tag{66}
$$

with $P(\alpha) = \sum_{j=1}^{r}\alpha_j P_j$, $\sum_{j=1}^{r}\alpha_j = 1$, *αj* ≥ 0 and *j* = 1...*r*. It appears that *K* is unique and there are *r* Lyapunov's matrices *Pj*, generating a relaxation in the LMIs. The same trend was observed in the formulation via Finsler's lemma, in which the variables were the Lyapunov's matrices *Qj*, but in (65) and (66) there is a greater degree of freedom with the inclusion of *V* in the design of the control matrix *K*, *V* being totally disconnected from *Pj*, *j* = 1, ..., *r*.
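The convex-combination step in the proof (multiplying the vertex LMIs by *αj* ≥ 0 and summing) works because any convex combination of negative definite matrices is negative definite. A minimal numerical sketch, with random symmetric matrices as stand-ins for the vertex LMI blocks:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_negdef(n):
    """Random symmetric negative definite matrix (stand-in for a vertex LMI block)."""
    Q = rng.standard_normal((n, n))
    return -(Q @ Q.T + np.eye(n))

M1, M2 = random_negdef(3), random_negdef(3)  # two "vertices"

# Every convex combination alpha*M1 + (1 - alpha)*M2 stays negative definite.
for alpha in np.linspace(0.0, 1.0, 11):
    Mc = alpha * M1 + (1.0 - alpha) * M2
    assert np.all(np.linalg.eigvalsh(Mc) < 0)
```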

#### **5.6 Optimization of the** *K* **matrix norm using reciprocal projection lemma**

A study was carried out to fit the LMIs to the new relaxed parameters once the state feedback matrix *K* is completely detached from the Lyapunov's matrix *P*(*α*). Therefore, relevant changes took place in the optimization proposed in this study to suit the reciprocal projection lemma. This optimization has provided interesting results in practice.

Due to the lack of relations to assemble an LMI able to optimize the norm of *K*, a minimization procedure was proposed, similar to the optimization procedure for redesign presented in (Chang et al., 2002), inserting an extra restriction to the LMIs (63) and (64).

Thus Theorem 5.5 was proposed.

**Theorem 5.5.** *A constraint for the K* ∈ **R***m*×*n matrix norm of state feedback is obtained, with K* = *ZV*−1*, V* ∈ **R***n*×*n and Z* ∈ **R***m*×*n, finding the minimum β, β* > 0*, such that K*′*K* < *βM, being M* = *V*′−1*V*−1 *and therefore M* = *M*′ > 0*. You can get the optimal value of β solving the optimization problem with the LMIs* (67) *and* (68)*.*

$$
\min \beta \quad \text{s.t.} \begin{bmatrix} I_n & Z' \\ Z & \beta I_m \end{bmatrix} > 0 \tag{67}
$$

$$
(\text{Set of LMIs (63) and (64)}) \tag{68}
$$

*where Im and In denote the identity matrices of order m and n, respectively.*

*Proof.* Applying the Schur complement in (67) results in (69).

$$
\beta I_m > 0 \ \text{and} \ I_n - Z'(\beta I_m)^{-1}Z > 0 \tag{69}
$$

Thus, from (69), (70) is found.

$$I_n > \frac{1}{\beta}Z'Z \Rightarrow Z'Z < \beta I_n \tag{70}$$

Replacing *Z* = *KV* in (70) results in (71).

$$V'K'KV < \beta I_n \tag{71}$$

Multiplying (71) on the left and on the right by *V*′−1 and *V*−1 respectively, and naming *V*′−1*V*−1 = *M*, (72) is met.

$$V'K'KV < \beta I_n \Rightarrow K'K < \beta M \tag{72}$$

where *K* is the optimal controller associated with (63) and (64).

Due to *M* being defined as *M* = *V*′−1*V*−1, and so *M* = *M*′ > 0, it is possible to find a relationship that optimizes the matrix *K* by minimizing a scalar *β*, through the minimization inequality *K*′*K* < *βM*.
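The chain (70)-(72) can be checked numerically: pick any invertible *V* and any *Z*, take *β* slightly above the largest eigenvalue of *Z*′*Z*, and verify that *βM* − *K*′*K* is positive definite. A minimal sketch (the random matrices are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2
V = rng.standard_normal((n, n))            # invertible with probability 1
Z = rng.standard_normal((m, n))

K = Z @ np.linalg.inv(V)                   # K = Z V^{-1}
M = np.linalg.inv(V.T) @ np.linalg.inv(V)  # M = V'^{-1} V^{-1} = M' > 0

# Any beta with Z'Z < beta*I_n, cf. (70)
beta = 1.01 * np.max(np.linalg.eigvalsh(Z.T @ Z))

# Congruence by V^{-1} then gives K'K < beta*M, cf. (71)-(72)
gap = beta * M - K.T @ K
assert np.all(np.linalg.eigvalsh((gap + gap.T) / 2) > 0)
```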


#### **6. Practical application in the 3-DOF helicopter**


Consider the schematic model in Figure (2) of the 3-DOF helicopter (Quanser, 2002) shown in Figure (1). Two DC motors are mounted at the two ends of a rectangular frame and drive two propellers. The motor axes are parallel and the thrust vector is normal to the frame. The helicopter frame is suspended from an instrumented joint mounted at the end of a long arm and is free to pitch about its center (Quanser, 2002).

The arm is gimbaled on a 2-DOF instrumented joint and is free to pitch and yaw. The other end of the arm carries a counterweight such that the effective mass of the helicopter is light enough for it to be lifted using the thrust from the motors. A positive voltage applied to the front motor causes a positive pitch, while a positive voltage applied to the back motor causes a negative pitch (pitch angle *ρ*). A positive voltage to either motor also causes an elevation of the body (i.e., elevation (*ε*) of the arm). If the body pitches, the thrust vectors result in a displacement of the system (i.e., travel (*λ*) of the system) as well.

Fig. 1. Quanser's 3-DOF helicopter of UNESP - Campus Ilha Solteira.

The objective of this experiment is to design a control system to track and regulate the elevation and travel of the 3-DOF Helicopter.

The 3-DOF Helicopter can also be fitted with an active mass disturbance system that will not be used in this work.

Fig. 2. Schematic drawing of 3-DOF Helicopter

The state space model that describes the helicopter is (Quanser, 2002) shown in (73).

$$\begin{bmatrix} \dot{\varepsilon} \\ \dot{\rho} \\ \dot{\lambda} \\ \ddot{\varepsilon} \\ \ddot{\rho} \\ \ddot{\lambda} \\ \ddot{\xi} \\ \dot{\gamma} \end{bmatrix} = A \begin{bmatrix} \varepsilon \\ \rho \\ \lambda \\ \dot{\varepsilon} \\ \dot{\rho} \\ \dot{\lambda} \\ \xi \\ \gamma \end{bmatrix} + B \begin{bmatrix} V\_f \\ V\_b \end{bmatrix} \tag{73}$$


The variables *ξ* and *γ* represent the integrals of the angles *ε* of elevation and *λ* of travel, respectively. The matrices *A* and *B* are presented in the sequence.


The values used in the project were those that appear in the MATLAB programs implementing the manufacturer's original design, to maintain fidelity to the parameters. The constants used are described in Table (1).


Table 1. Helicopter parameters

Practical implementations of the controllers were carried out in order to view the controller acting in real physical systems subject to failures.

The trajectory of the helicopter was divided into three stages. The first stage is to elevate the helicopter 27.5°, reaching the elevation angle *ε* = 0°. In the second stage the helicopter travels 120°, keeping the same elevation, i.e., the helicopter reaches *λ* = 120° with reference to the launch point. In the third stage the helicopter performs the landing, recovering the initial angle *ε* = −27.5°.

During the landing stage, more precisely at the instant 22 s, the helicopter loses 30% of the back motor power. The robust controller should maintain the stability of the helicopter, with only a small oscillation in the occurrence of this failure.

To test robustness without any physical change to the plant, a 30% drop in the power of the back motor is forced by inserting a timed switch connected to an amplifier with a gain of 0.7, acting directly on the motor voltage. This constitutes a polytope of two vertexes with an uncertainty in the input matrix of the system, the voltage acting on the back motor varying between 0.7*Vb* and *Vb*. The polytope is described as follows.

Vertex 1 (100% of *Vb*):

$$
A_1 = \begin{bmatrix}
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1.2304 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\quad \text{and} \quad
B_1 = \begin{bmatrix}
0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0.0858 & 0.0858 \\ 0.5810 & -0.5810 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0
\end{bmatrix}
$$

Vertex 2 (70% of *Vb*):

$$
A_2 = \begin{bmatrix}
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1.2304 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\quad \text{and} \quad
B_2 = \begin{bmatrix}
0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0.0858 & 0.0601 \\ 0.5810 & -0.4067 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0
\end{bmatrix}
$$
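The relation between the two vertices can be checked numerically: *B*2 equals *B*1 with the back-motor column scaled by 0.7 (values rounded as printed in the text):

```python
import numpy as np

B1 = np.zeros((8, 2))
B1[3] = [0.0858, 0.0858]   # elevation acceleration row
B1[4] = [0.5810, -0.5810]  # pitch acceleration row

B2 = B1.copy()
B2[:, 1] *= 0.7            # 30% power loss on the back motor

assert np.allclose(B2[3], [0.0858, 0.0601], atol=1e-3)
assert np.allclose(B2[4], [0.5810, -0.4067], atol=1e-3)
```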

Fixing the decay rate at 0.8, the following controllers were designed for the practical implementation: a controller with quadratic stability using the existing optimization (Assunção et al., 2007c), a controller with quadratic stability with the proposed optimization, and controllers with extended stability and projective stability, also with the proposed optimization.

The controller designed by quadratic stability with existing optimization (Theorem 4.1) is shown in (74) (Assunção et al., 2007c).

$$
K = \begin{bmatrix}
-46.4092 & -15.6262 & 21.3173 & -24.7541 & -3.9269 & 23.5800 & -27.4973 & 7.4713 \\
-70.3091 & 13.3795 & -10.1982 & -37.5960 & 4.3357 & -15.1521 & -41.5328 & -2.7935
\end{bmatrix} \tag{74}
$$

where ||*K*|| = 107.83.

This controller was implemented in the helicopter and the results are shown in Figure 3.

In (75), the quadratic stability controller designed with the proposed optimization (Theorem 5.1) follows.

$$
K = \begin{bmatrix}
-18.8245 & -12.2370 & 10.9243 & -13.9612 & -4.4480 & 14.6213 & -9.1334 & 3.2483 \\
-27.9219 & 10.6586 & -7.6096 & -20.1096 & 4.5602 & -11.0774 & -13.7202 & -2.2629
\end{bmatrix} \tag{75}
$$

where ||*K*|| = 44.88.

This controller was implemented in the helicopter and the results are shown in Figure 4.

In (76), the extended stability controller designed with the proposed optimization (Theorem 5.3) follows. For these LMIs, *a* = 10−6 solves the problem. The hypotheses of Theorems 5.3 and 5.5 establish a sufficiently low time variation of *α*; however, for comparison purposes of Theorems 5.3 and 5.5 with Theorem 5.1, the same abrupt loss of power test was done with controllers (76) and (77).

$$
K = \begin{bmatrix}
-23.7152 & -12.9483 & 9.8587 & -18.7322 & -4.9737 & 14.3283 & -10.7730 & 2.6780 \\
-33.8862 & 15.2923 & -11.6132 & -25.4922 & 6.0776 & -16.5503 & -15.8350 & -3.4475
\end{bmatrix} \tag{76}
$$

where ||*K*|| = 56.47.


This controller was implemented in the helicopter and the results are shown in Figure (5).

In (77), the projective stability controller designed with the proposed optimization (Theorem 5.5) follows.

$$
K = \begin{bmatrix}
-50.7121 & -28.7596 & 35.1829 & -29.8247 & -7.9563 & 41.0906 & -28.8974 & 11.7405 \\
-66.5405 & 31.9853 & -34.7642 & -38.3173 & 9.9376 & -42.0298 & -38.3418 & -11.8207
\end{bmatrix} \tag{77}
$$

where ||*K*|| = 110.46.

This controller was implemented in the helicopter and the results are shown in Figure 6.

Fig. 3. Practical implementation of the designed K by quadratic stability with the optimization method presented in (Assunção et al., 2007c).

The graphics of Figures 3, 4, 5 and 6 refer to the actual data of the angles and of the voltages on the front motor (*Vf*) and back motor (*Vb*), measured with the designed controllers acting on the plant during the described trajectory, with a failure at the instant 22 s. The voltages (*Vf*) and (*Vb*) on the motors were multiplied by 10 to match the scales of the two graphics.

Note that the variations of the amplitudes of (*Vf*) and (*Vb*) using the proposed optimized controllers (75) and (76), shown in Figures 4 and 5, are smaller than those obtained with the existing

Fig. 4. Practical implementation of the K designed by quadratic stability with the proposed optimization method.

Fig. 5. Practical implementation of the K designed by extended stability with the proposed optimization method.


Fig. 6. Practical implementation of the K designed by projective stability with the proposed optimization method.

controller from the literature (74), shown in Figure 3. This is due to the fact that the proposed controllers (75) and (76) have lower gains than (74). For this implementation, the controller designed by projective stability with the proposed optimization (77) obtained the worst results, as Figure 6 shows.
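The gain comparison can be reproduced from the printed matrices. A sketch, assuming ||·|| denotes the spectral (largest singular value) norm:

```python
import numpy as np

K74 = np.array([[-46.4092, -15.6262, 21.3173, -24.7541, -3.9269, 23.5800, -27.4973, 7.4713],
                [-70.3091, 13.3795, -10.1982, -37.5960, 4.3357, -15.1521, -41.5328, -2.7935]])
K75 = np.array([[-18.8245, -12.2370, 10.9243, -13.9612, -4.4480, 14.6213, -9.1334, 3.2483],
                [-27.9219, 10.6586, -7.6096, -20.1096, 4.5602, -11.0774, -13.7202, -2.2629]])
K76 = np.array([[-23.7152, -12.9483, 9.8587, -18.7322, -4.9737, 14.3283, -10.7730, 2.6780],
                [-33.8862, 15.2923, -11.6132, -25.4922, 6.0776, -16.5503, -15.8350, -3.4475]])
K77 = np.array([[-50.7121, -28.7596, 35.1829, -29.8247, -7.9563, 41.0906, -28.8974, 11.7405],
                [-66.5405, 31.9853, -34.7642, -38.3173, 9.9376, -42.0298, -38.3418, -11.8207]])

# Spectral norms of the four designed gains
norms = {name: np.linalg.norm(K, 2) for name, K in
         [("(74)", K74), ("(75)", K75), ("(76)", K76), ("(77)", K77)]}

# Ordering reported in the text: (75) < (76) < (74) < (77)
assert norms["(75)"] < norms["(76)"] < norms["(74)"] < norms["(77)"]
```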

It was verified that, if the *γ* used in the implementation of the robust controllers is higher, it forces the system to have a quick and efficient recovery, with small fluctuations.

#### **7. General comparison of the two optimization methods**

In order to obtain more representative results on which would be the best way to optimize the norm of *K*, a more general comparison was made between the two methods, Theorems 4.1 and 5.1.

1000 uncertain polytopes of second-order systems were randomly generated, with only one uncertain parameter (two vertexes), and after that, 1000 uncertain polytopes of fourth-order systems, with two uncertain parameters (four vertexes). The 1000 uncertain polytopes were generated feasible in at least one case of optimization for *γ* = 0.5, and the consequences of increasing *γ* were analyzed and plotted in bar charts showing the number of controllers with lower norm as *γ* increases, shown in Figure 7 for second-order systems and in Figure 8 for fourth-order systems.

The controllers designed with elevated values of *γ* do not have much practical application, because increasing *γ* increases the norm and raises the peaks of the transient oscillation; such values are used here only for the purpose of analyzing feasibility and better results for the norm of *K*. So, the comparisons were closed at *γ* = 100.5, because this *γ* is already considered high.

Fig. 7. Number of controllers with lower norm for 1000 uncertain polytopic systems of second-order randomly generated.

In Figure 8 it can be seen that the proposed optimization method produces better results for all cases analyzed. Due to the complexity of the polytopes used in this case (fourth-order uncertain systems with two uncertainties, i.e., four vertexes), a loss of feasibility with the increase of *γ* is natural, and yet the proposed method shows very good results.

#### **8. General comparison of the new design and optimization methods**

A generic comparison between the three methods of design and optimization of *K* was also carried out: design by quadratic stability with proposed optimization shown in Theorem 5.1, design and proposed optimization with extended stability shown in Theorem 5.3 (using the parameter *a* = 10−<sup>6</sup> in the LMIs) and projective stability design with proposed optimization shown in Theorem 5.5.

Initially 1000 polytopes of second-order uncertain systems were randomly generated, with only one uncertain parameter (two vertexes), and after that, fourth-order uncertain systems, with two uncertain parameters (four vertexes). The 1000 polytopes were generated feasible in at least one case of optimization for *γ* = 0.5 and the consequences of increasing *γ* were analyzed. For the fourth-order uncertain systems, the 1000 polytopes were generated feasible in at least one case of optimization for *γ* = 0.2 and then the consequences of increasing *γ* in steps of 0.2 were analyzed. This comparison was carried out with the intention of examining feasibility and better results for the norm of *K*. So, bar graphs showing the number of controllers with lower norm as *γ* increases were plotted, shown in Figures 9 and 10.


Fig. 8. Number of controllers with lower norm for 1000 uncertain polytopic systems of fourth-order randomly generated.

Fig. 9. Number of controllers with lower norm for 1000 uncertain politopic systems of second-order randomly generated. All these methods are proposed in this work.

Fig. 10. Number of controllers with lower norm for 1000 uncertain polytopic systems of fourth-order randomly generated. All these methods are proposed in this work.

Figures 9 and 10 both show that the proposed optimization method using quadratic stability produced the best results for the controller norm as *γ* increases, because with this optimization the method no longer depends on the matrices that guarantee system stability, as can be seen in equation (22). In contrast, the proposed optimizations with extended stability and projective stability still depend on the matrices that guarantee system stability, as seen in equations (51) and (72), and this is the obstacle to finding better results with these methods.

#### **9. Conclusions**

In the 3-DOF helicopter practical application, the controllers designed with the proposed optimization showed lower values of the controller norm than those designed by the existing optimization with quadratic stability; the exception was the design by projective stability, which had the worst norm value for this case. This shows the advantage of the proposed method regarding implementation cost and the required effort on the motors. These characteristics of optimality and robustness make our design methodology attractive from the standpoint of practical applications for systems subject to structural failure, guaranteeing robust stability and small oscillations in the occurrence of faults.

It is clear that the design of *K* via the optimization proposed here achieved better results than the existing optimization of *K* (Assunção et al., 2007c) using LMI quadratic stability for second-order polytopes with one uncertainty. The proposed optimization continued to show better results even when the existing optimization became totally infeasible for fourth-order polytopes with two uncertainties.

Comparing the three optimal design methods proposed here (quadratic stability, extended stability, and projective stability), it can be concluded that the design using quadratic stability performed best in both analyses: the 1000 second-order polytopes with one uncertainty and the 1000 fourth-order polytopes with two uncertainties. This shows that the proposed optimization ensures the best results when used with quadratic stability.

#### **10. References**



Lee, K. H., Lee, J. H. & Kwon, W. H. (2004). Sufficient LMI conditions for the *H*∞ output feedback stabilization of linear discrete-time systems, *Decision and Control, 2004. CDC. 43rd IEEE Conference on*, Vol. 2, pp. 1742–1747.

Ma, M. & Chen, H. (2006). *Constrained H2 control of active suspensions using LMI optimization*, IEEE.

Montagner, V. F., Oliveira, R. C. L. F., Peres, P. L. D. & Bliman, P.-A. (2009). Stability analysis and gain-scheduled state feedback control for continuous-time systems with bounded parameter variations, *International Journal of Control* 82(6): 1045–1059.

Montagner, V., Oliveira, R., Leite, V. & Peres, P. (2005). LMI approach for *H*∞ linear parameter-varying state feedback control, *Control Theory and Applications, IEE Proceedings* 152(2): 195–201.

Oliveira, M. C., Bernussou, J. & Geromel, J. C. (1999). A new discrete-time robust stability condition, *Systems & Control Letters* 37(4): 261–265.

Oliveira, M. C. & Skelton, R. E. (2001). *Perspectives in robust control*, 1st edn, Springer Berlin/Heidelberg, chapter Stability tests for constrained linear systems, pp. 241–257.

Peaucelle, D., Arzelier, D., Bachelier, O. & Bernussou, J. (2000). A new robust D-stability condition for real convex polytopic uncertainty, *Systems and Control Letters* 40(1): 21–30.

Pipeleers, G., Demeulenaere, B., Swevers, J. & Vandenberghe, L. (2009). Extended LMI characterizations for stability and performance of linear systems, *Systems & Control Letters* 58(7): 510–518.

Quanser (2002). *3-DOF helicopter reference manual*, Quanser Inc., Markham.

Shen, Y., Shen, W. & Gu, J. (2006). A new extended LMIs approach for continuous-time multiobjective controllers synthesis, *Mechatronics and Automation, Proceedings of the 2006 IEEE International Conference on*, pp. 1054–1059.

Siljak, D. D. & Stipanovic, D. M. (2000). Robust stabilization of nonlinear systems: The LMI approach, *Mathematical Problems in Engineering* 6(5): 461–493.

Skelton, R. E., Iwasaki, T. E. & Grigoriadis, K. (1997). *A unified algebraic approach to control design*, Taylor & Francis, Bristol.

Wang, J., Li, X., Ge, Y. & Jia, G. (2008). An LMI optimization approach to Lyapunov stability analysis for linear time-invariant systems, *Control and Decision Conference, 2008. CCDC 2008. Chinese*, pp. 3044–3048.

### **On Control Design of Switched Affine Systems with Application to DC-DC Converters**

E. I. Mainardi Júnior<sup>1</sup>, M. C. M. Teixeira<sup>1</sup>, R. Cardim<sup>1</sup>, M. R. Moreira<sup>1</sup>, E. Assunção<sup>1</sup> and Victor L. Yoshimura<sup>2</sup>
<sup>1</sup>*UNESP - Univ Estadual Paulista*
<sup>2</sup>*IFMT - Federal Institute of Education*
*Brazil*

#### **1. Introduction**

In recent years there has been growing interest among researchers in the theory and applications of switched control systems, which are widely used in the area of power electronics (Cardim et al., 2009), (Deaecto et al., 2010), (Yoshimura et al., 2011), (Batlle et al., 1996), (Mazumder et al., 2002), (He et al., 2010) and (Cardim et al., 2011). Switched systems are characterized by a switching rule which selects, at each instant of time, a dynamic subsystem from a determined number of available subsystems (Liberzon, 2003). In general, the main goal is to design a switching control strategy for the asymptotic stability of a known equilibrium point, with adequate performance guarantees (Decarlo et al., 2000), (Sun & Ge, 2005) and (Liberzon & Morse, 1999). The techniques commonly used to study this class of systems consist of choosing an appropriate Lyapunov function, for instance a quadratic one (Feron, 1996), (Ji et al., 2005) and (Skafidas et al., 1999). However, in switched affine systems it is possible that the modes do not share a common equilibrium point. Therefore, the concept of stability must sometimes be extended, using the ideas contained in (Bolzern & Spinelli, 2004) and (Xu et al., 2008). Problems involving stability analysis can often be reduced to problems described by Linear Matrix Inequalities, also known as LMIs (Boyd et al., 1994), which, when feasible, are easily solved by tools available in the literature of convex programming (Gahinet et al., 1995) and (Peaucelle et al., 2002). LMIs have been increasingly used to solve various types of control problems (Faria et al., 2009), (Teixeira et al., 2003) and (Teixeira et al., 2006). This paper is structured as follows: first, a review of previous results in the literature on the stability of switched affine systems with applications in power electronics is given (Deaecto et al., 2010).
Next, the main goal of this paper is presented: a new theorem whose conditions hold whenever the conditions of either of the two theorems proposed in (Deaecto et al., 2010) hold. Later, in order to obtain a design procedure more general than those available in the literature (Deaecto et al., 2010), a new performance index for this control system is considered: bounds on the output peak in the LMI-based design. The theory developed in this paper is applied to DC-DC converters: Buck, Boost, Buck-Boost and Sepic. It is also the first time that this class of controller is used for controlling a Sepic DC-DC converter. The notation used is described below. For real matrices or vectors, (′) indicates the transpose. The set composed of the first *N* positive integers, 1, ..., *N*, is denoted by IK. The set of all vectors *λ* = (*λ*1, ..., *λN*)′ such that *λi* ≥ 0, *i* = 1, 2, . . . , *N* and *λ*1 + *λ*2 + ... + *λN* = 1 is denoted by Λ. The convex combination of a set of matrices (*A*1, ..., *AN*) is denoted by *Aλ* = ∑<sup>*N*</sup><sub>*i*=1</sub> *λiAi*, where *λ* ∈ Λ. The trace of a matrix *P* is denoted by *Tr*(*P*).
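The notation for Λ, the convex combination *Aλ* and the trace can be illustrated with a short script (plain Python; the vertex matrices below are hypothetical, chosen only for illustration):

```python
# Simplex membership, convex combination A_lambda = sum_i lambda_i * A_i,
# and trace, for matrices stored as lists of lists.
# The vertex matrices A1, A2 are hypothetical, for illustration only.

def in_simplex(lam, tol=1e-12):
    """Check lambda_i >= 0 and sum(lambda_i) == 1 (the set Lambda)."""
    return all(l >= -tol for l in lam) and abs(sum(lam) - 1.0) <= tol

def convex_combination(lam, mats):
    """A_lambda = sum_i lambda_i * A_i, computed element-wise."""
    n = len(mats[0])
    return [[sum(l * M[r][c] for l, M in zip(lam, mats)) for c in range(n)]
            for r in range(n)]

def trace(M):
    """Tr(M) = sum of diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

A1 = [[0.0, 1.0], [-2.0, -3.0]]   # hypothetical vertex matrix
A2 = [[0.0, 1.0], [-4.0, -1.0]]   # hypothetical vertex matrix
lam = (0.25, 0.75)

assert in_simplex(lam)
A_lam = convex_combination(lam, [A1, A2])
```

With these vertices, `A_lam` is `[[0.0, 1.0], [-3.5, -1.5]]`, the element-wise weighted average of `A1` and `A2`.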

#### **2. Switched affine systems**

Consider the switched affine system defined by the following state space realization:

$$\dot{\mathbf{x}} = A\_{\sigma(t)} \mathbf{x} + B\_{\sigma(t)} w, \quad \mathbf{x}(0) = \mathbf{x}\_0 \tag{1}$$

$$\mathbf{y} = C\_{\sigma(t)} \mathbf{x}, \tag{2}$$

as presented in (Deaecto et al., 2010), where *x*(*t*) ∈ IR*<sup>n</sup>* is the state vector, *y*(*t*) ∈ IR*<sup>p</sup>* is the controlled output, *w* ∈ IR*<sup>m</sup>* is the input, supposed to be constant for all *t* ≥ 0, and *σ*(*t*): *t* ≥ 0 → IK is the switching rule, for a known set of matrices *Ai* ∈ IR*<sup>n×n</sup>*, *Bi* ∈ IR*<sup>n×m</sup>* and *Ci* ∈ IR*<sup>p×n</sup>*, *i* = 1, . . . , *N*, such that:

$$A\_{\sigma(t)} \in \{A\_1, A\_2, \dots, A\_N\}, \tag{3}$$

$$B\_{\sigma(t)} \in \{B\_1, B\_2, \dots, B\_N\}, \tag{4}$$

$$C\_{\sigma(t)} \in \{C\_1, C\_2, \dots, C\_N\}, \tag{5}$$

the switching rule *σ*(*t*) selects at each instant of time *t* ≥ 0, a known subsystem among the *N* subsystems available. The control design problem is to determine a function *σ*(*x*(*t*)), for all *t* ≥ 0, such that the switching rule *σ*(*t*), makes a known equilibrium point *x* = *xr* of (1), (2) globally asymptotically stable and the controlled system satisfies a performance index, for instance, a guaranteed cost. The paper (Deaecto et al., 2010) proposed two solutions for these problems, considering a quadratic Lyapunov function and the guaranteed cost:

$$\min\_{\sigma \in \mathbb{K}} \int\_0^\infty (y - C\_{\sigma}\mathbf{x}\_r)'(y - C\_{\sigma}\mathbf{x}\_r) dt = \min\_{\sigma \in \mathbb{K}} \int\_0^\infty (\mathbf{x} - \mathbf{x}\_r)' Q\_{\sigma}(\mathbf{x} - \mathbf{x}\_r) dt, \tag{6}$$

where *Qσ* = *C*′*σCσ* ≥ 0 for all *σ* ∈ IK.

#### **2.1 Previous results**

**Theorem 1.** *(Deaecto et al., 2010) Consider the switched affine system* (1)*,* (2) *with constant input <sup>w</sup>*(*t*) = *w for all t* <sup>≥</sup> <sup>0</sup> *and let the equilibrium point xr* <sup>∈</sup> IR*<sup>n</sup> be given. If there exist <sup>λ</sup>* <sup>∈</sup> <sup>Λ</sup> *and a symmetric positive definite matrix P* <sup>∈</sup> IR*n*×*<sup>n</sup> such that*

$$A'\_{\lambda}P + PA\_{\lambda} + Q\_{\lambda} < 0,\tag{7}$$

$$A\_{\lambda}\mathbf{x}\_r + B\_{\lambda}w = \mathbf{0}, \tag{8}$$

*then the switching strategy*

$$\sigma(\mathbf{x}) = \arg\min\_{i \in \mathbb{K}} \xi'(Q\_i \xi + 2P(A\_i \mathbf{x} + B\_i \mathbf{w})),\tag{9}$$

*where Qi* = *C*′*<sup>i</sup>Ci and ξ* = *x* − *xr, makes the equilibrium point xr* ∈ IR*<sup>n</sup> globally asymptotically stable and from* (6) *the guaranteed cost*

$$J = \int\_0^\infty (y - C\_{\sigma}\mathbf{x}\_r)'(y - C\_{\sigma}\mathbf{x}\_r) dt < (\mathbf{x}\_0 - \mathbf{x}\_r)' P (\mathbf{x}\_0 - \mathbf{x}\_r), \tag{10}$$

*holds.*

*Proof.* See (Deaecto et al., 2010).


Recalling that similar matrices have the same trace, the following minimization problem is obtained (Deaecto et al., 2010):

$$\inf\_{P>0} \left\{ Tr(P) : A\_{\lambda}^{\prime}P + PA\_{\lambda} + Q\_{\lambda} < 0, \lambda \in \Lambda \right\}. \tag{11}$$

The next theorem provides another strategy of switching, more conservative, but easier and simpler to implement.

**Theorem 2.** *(Deaecto et al., 2010) Consider the switched affine system* (1)*,* (2) *with constant input <sup>w</sup>*(*t*) = *w for all t* <sup>≥</sup> <sup>0</sup> *and let the equilibrium point xr* <sup>∈</sup> IR*<sup>n</sup> be given. If there exist <sup>λ</sup>* <sup>∈</sup> <sup>Λ</sup>*, and a symmetric positive definite matrix P* <sup>∈</sup> IR*n*×*<sup>n</sup> such that*

$$A\_i'P + PA\_i + Q\_i < 0,\tag{12}$$

$$A\_{\lambda}\mathbf{x}\_r + B\_{\lambda}w = \mathbf{0}, \tag{13}$$

*for all i* ∈ IK*, then the switching strategy*

$$\sigma(\mathbf{x}) = \arg\min\_{i \in \mathbb{K}} \xi' P(A\_i \mathbf{x}\_r + B\_i \mathbf{w}), \tag{14}$$

*where <sup>ξ</sup>* <sup>=</sup> *<sup>x</sup>* <sup>−</sup> *xr, makes the equilibrium point xr* <sup>∈</sup> IR*<sup>n</sup> globally asymptotically stable and the guaranteed cost* (10) *holds.*

*Proof.* See (Deaecto et al., 2010).

Theorem 2 gives us the following minimization problem (Deaecto et al., 2010):

$$\inf\_{P>0} \left\{ Tr(P) : A\_i' P + PA\_i + Q\_i < 0, i \in \mathbb{K} \right\}. \tag{15}$$

Note that (12) is more restrictive than (7), because it must be satisfied for all *i* ∈ IK. However, the switching strategy (14) proposed in Theorem 2 is simpler to implement than the strategy (9) proposed in Theorem 1, because it uses only the product of *ξ* by constant vectors.
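Because the vectors *vi* = *P*(*Aixr* + *Biw*) in (14) are constant, the online part of the rule reduces to choosing the index that minimizes one inner product per subsystem. A minimal plain-Python sketch (all numeric data below is hypothetical, chosen only to illustrate the mechanics):

```python
# Offline/online split of switching rule (14): the vectors
# v_i = P (A_i x_r + B_i w) are constant, so the online rule is just
# an argmin over inner products xi' v_i.

def matvec(M, v):
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def precompute_vectors(P, As, Bs, xr, w):
    """Offline: constant vectors v_i = P (A_i x_r + B_i w)."""
    vs = []
    for A, Bv in zip(As, Bs):
        Axr = matvec(A, xr)
        vs.append(matvec(P, [Axr[k] + Bv[k] * w for k in range(len(xr))]))
    return vs

def sigma(x, xr, vs):
    """Online: rule (14), one inner product per subsystem."""
    xi = [xk - xrk for xk, xrk in zip(x, xr)]
    return min(range(len(vs)), key=lambda i: dot(xi, vs[i]))

# Hypothetical data: two modes sharing A, differing only in the affine term.
P = [[1.0, 0.0], [0.0, 1.0]]
As = ([[-1.0, 0.0], [0.0, -1.0]],) * 2
Bs = ([1.0, 0.0], [0.0, 0.0])
xr, w = [0.5, 0.0], 1.0

vs = precompute_vectors(P, As, Bs, xr, w)
```

For a state below the reference, `sigma` selects mode 0 (the "on" mode), and above the reference it selects mode 1, which is exactly the intuition behind (14).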

#### **2.2 Main results**

The new theorem, proposed in this paper, is presented below.

**Theorem 3.** *Consider the switched affine system* (1)*,* (2) *with constant input w*(*t*) = *w for all t* ≥ 0 *and let xr* <sup>∈</sup> IR*<sup>n</sup> be given. If there exist <sup>λ</sup>* <sup>∈</sup> <sup>Λ</sup>*, symmetric matrices Ni, i* <sup>∈</sup> IK *and a symmetric positive definite matrix P* <sup>∈</sup> IR*n*×*<sup>n</sup> such that*

$$A\_i'P + PA\_i + Q\_i - N\_i < 0,\tag{16}$$

$$A\_{\lambda} \mathbf{x}\_{\mathbf{r}} + B\_{\lambda} \mathbf{w} = \mathbf{0},\tag{17}$$

$$N\_{\lambda} = 0, \tag{18}$$

*for all i* ∈ IK*, where Qi* = *Q*′*<sup>i</sup>, then the switching strategy*

$$\sigma(\mathbf{x}) = \arg\min\_{i \in \mathbb{K}} \xi'(N\_i \xi + 2P(A\_i \mathbf{x}\_r + B\_i \mathbf{w})),\tag{19}$$

*where ξ* = *x* − *xr, makes the equilibrium point xr* ∈ IR*<sup>n</sup> globally asymptotically stable and from* (10)*, the guaranteed cost J* < (*x*<sup>0</sup> − *xr*)′*P*(*x*<sup>0</sup> − *xr*) *holds.*

*Proof.* Adopting the quadratic Lyapunov candidate function *V*(*ξ*) = *ξ*′*Pξ* and from (1), (16), (17) and (18), note that for *ξ* ≠ 0:

$$\begin{split} \dot{V}(\xi) &= \dot{\mathbf{x}}' P \xi + \xi' P \dot{\mathbf{x}} = 2\xi' P(A\_{\sigma}\mathbf{x} + B\_{\sigma}w) = \xi'(A\_{\sigma}'P + PA\_{\sigma})\xi + 2\xi' P(A\_{\sigma}\mathbf{x}\_r + B\_{\sigma}w) \\ &< \xi'(-Q\_{\sigma} + N\_{\sigma})\xi + 2\xi' P(A\_{\sigma}\mathbf{x}\_r + B\_{\sigma}w) = \xi'(N\_{\sigma}\xi + 2P(A\_{\sigma}\mathbf{x}\_r + B\_{\sigma}w)) - \xi' Q\_{\sigma}\xi \\ &= \min\_{i \in \mathbb{K}} \left\{ \xi'(N\_i\xi + 2P(A\_i\mathbf{x}\_r + B\_i w)) \right\} - \xi' Q\_{\sigma}\xi \\ &= \min\_{\lambda \in \Lambda} \left\{ \xi'(N\_{\lambda}\xi + 2P(A\_{\lambda}\mathbf{x}\_r + B\_{\lambda}w)) \right\} - \xi' Q\_{\sigma}\xi \\ &\leq -\xi' Q\_{\sigma}\xi \leq 0. \end{split} \tag{20}$$

Since *V*˙(*ξ*) < 0 for all *ξ* ≠ 0 ∈ IR*<sup>n</sup>*, and *V*˙(0) = 0, then *xr* ∈ IR*<sup>n</sup>* is a globally asymptotically stable equilibrium point. Now, integrating (20) from zero to infinity and taking into account that *V*(*ξ*(∞)) = 0, we obtain (10). The proof is concluded.

Theorem 3 gives us the following minimization problem:

$$\inf\_{P>0} \left\{ Tr(P) : A\_i' P + PA\_i + Q\_i - N\_i < 0, \quad N\_{\lambda} = 0, \quad i \in \mathbb{K} \right\}. \tag{21}$$

The next theorem compares the conditions of Theorems 1, 2 and 3.

**Theorem 4.** *The following statements hold:*

*(i) if the conditions of Theorem 1 are feasible, then the conditions of Theorem 3 are also feasible; (ii) if the conditions of Theorem 2 are feasible, then the conditions of Theorem 3 are also feasible.*

*Proof.* (*i*) Consider the symmetric matrices *Ni*, *i* ∈ IK, as described below:

$$N\_i = (A\_i'P + PA\_i + Q\_i) - (A\_{\lambda}'P + PA\_{\lambda} + Q\_{\lambda}). \tag{22}$$

Then, multiplying (22) by *λ<sup>i</sup>* and taking the sum from 1 to *N* it follows that

$$N\_{\lambda} = \sum\_{i=1}^{N} \lambda\_i N\_i = \sum\_{i=1}^{N} \lambda\_i (A\_i^{\prime} P + P A\_i + Q\_i) - \sum\_{i=1}^{N} \lambda\_i (A\_{\lambda}^{\prime} P + P A\_{\lambda} + Q\_{\lambda})$$

$$= (A\_{\lambda}^{\prime} P + P A\_{\lambda} + Q\_{\lambda}) - (A\_{\lambda}^{\prime} P + P A\_{\lambda} + Q\_{\lambda}) = 0. \tag{23}$$

Now, from (16), (18) and (22) observe that

$$A\_i^\prime P + PA\_i + Q\_i - N\_i = A\_i^\prime P + PA\_i + Q\_i - \left( (A\_i^\prime P + PA\_i + Q\_i) - (A\_\lambda^\prime P + PA\_\lambda + Q\_\lambda) \right)$$

$$= A\_\lambda^\prime P + PA\_\lambda + Q\_\lambda < 0, \ \forall i \in \mathbb{K}. \tag{24}$$

(*ii*) It follows considering that *Ni* = 0 in (16):

$$A\_i^\prime P + PA\_i + Q\_i - N\_i = A\_i^\prime P + PA\_i + Q\_i < 0, \ \forall i \in \mathbb{K}.\tag{25}$$

Thus, the proof of Theorem 4 is completed.

$$\square$$
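The algebraic step (23), namely that the choice (22) forces *Nλ* = 0 for any data, can be checked numerically. A sketch in plain Python (the matrices are arbitrary illustrative data, not taken from the chapter):

```python
# Verify step (23): building N_i from (22) always yields
# N_lambda = sum_i lambda_i N_i = 0, for any A_i, Q_i, P and lambda.

def mat_mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def mat_add(*Ms):
    return [[sum(M[r][c] for M in Ms) for c in range(len(Ms[0][0]))]
            for r in range(len(Ms[0]))]

def mat_scale(s, M):
    return [[s * x for x in row] for row in M]

def transpose(M):
    return [list(row) for row in zip(*M)]

def lyap_expr(A, P, Q):
    """A' P + P A + Q"""
    return mat_add(mat_mul(transpose(A), P), mat_mul(P, A), Q)

lam = (0.3, 0.7)
As = ([[0.0, 1.0], [-2.0, -1.0]], [[0.0, 1.0], [-5.0, -4.0]])
Qs = ([[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 0.5]])
P = [[2.0, 0.5], [0.5, 1.0]]

A_lam = mat_add(*[mat_scale(l, A) for l, A in zip(lam, As)])
Q_lam = mat_add(*[mat_scale(l, Q) for l, Q in zip(lam, Qs)])
base = lyap_expr(A_lam, P, Q_lam)

# N_i per equation (22), then N_lambda per equation (23)
Ns = [mat_add(lyap_expr(A, P, Q), mat_scale(-1.0, base))
      for A, Q in zip(As, Qs)]
N_lam = mat_add(*[mat_scale(l, N) for l, N in zip(lam, Ns)])

max_abs = max(abs(x) for row in N_lam for x in row)
```

Up to floating-point rounding, `max_abs` is zero, mirroring the cancellation in (23).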

#### **2.3 Bounds on output peak**


Considering the limitations imposed by practical applications of control systems, constraints must often be taken into account in the design. Consider the signal:

$$s = H\xi, \tag{26}$$

where *<sup>H</sup>* <sup>∈</sup> IR*q*×*<sup>n</sup>* is a known constant matrix, and the following constraint:

$$\max\_{t \geq 0} \|s(t)\| \leq \psi\_o, \tag{27}$$

where ‖*s*(*t*)‖ = √(*s*(*t*)′*s*(*t*)) and *ψo* is a known positive constant, for a given initial condition *ξ*(0). In (Boyd et al., 1994), two LMIs were presented for the specification of these restrictions under an arbitrary control law, supposing that there exists a quadratic Lyapunov function *V*(*ξ*) = *ξ*′*Pξ* with negative derivative for all *ξ* ≠ 0. For the particular case where *s*(*t*) = *y*(*t*), with *y*(*t*) ∈ IR*<sup>p</sup>* defined in (2), the following lemma is proposed:

**Lemma 1.** *For a given constant ψ<sup>o</sup>* > 0*, if there exist λ* ∈ Λ*, and a symmetric positive definite matrix <sup>P</sup>* <sup>∈</sup> IR*n*×*n, solution of the following optimization problem, for all i* <sup>∈</sup> IK*:*

$$
\begin{bmatrix} P & C\_i' \\ C\_i & \psi\_o^2 I\_p \end{bmatrix} > 0, \tag{28}
$$

$$
\begin{bmatrix} 1 & \xi(0)'P \\ P\xi(0) & P \end{bmatrix} > 0, \tag{29}
$$

$$(\text{Set of LMIs}), \tag{30}$$

*where* (*Set of LMIs*) *can be equal to* (7)*-*(8)*,* (12)*-*(13) *or* (16)*-*(18)*, then the equilibrium point ξ* = *x* − *xr* = 0 *is globally asymptotically stable, and the guaranteed cost* (10) *and the constraint* (27) *hold.*

*Proof.* It follows from Theorems 1, 2 and the condition for bounds on output peak given in (Boyd et al., 1994).
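By a Schur complement argument, and assuming *P* > 0, the block LMI (28) with a scalar output is equivalent to the scalar condition *ψo*² − *CiP*⁻¹*Ci*′ > 0. A quick numeric illustration in plain Python (the values of *P*, *Ci* and *ψo* are hypothetical, chosen only to exhibit the equivalence):

```python
# Schur-complement view of LMI (28) for p = 1: with P > 0,
#   [[P, C'], [C, psi^2]] > 0   <=>   psi^2 - C P^{-1} C' > 0.

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def det3(M):
    a, b, c = M[0]
    d, e, f = M[1]
    g, h, i = M[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_pd_sylvester(M):
    """Sylvester's criterion for a symmetric 2x2 or 3x3 matrix."""
    if len(M) == 2:
        return M[0][0] > 0 and det2(M) > 0
    return (M[0][0] > 0
            and det2([row[:2] for row in M[:2]]) > 0
            and det3(M) > 0)

P = [[2.0, 0.0], [0.0, 3.0]]   # hypothetical symmetric P > 0
C = [1.0, 1.0]                 # hypothetical output row C_i
psi = 2.0

# Block matrix of LMI (28) for a scalar output
M = [[P[0][0], P[0][1], C[0]],
     [P[1][0], P[1][1], C[1]],
     [C[0],    C[1],    psi ** 2]]

# Schur complement: psi^2 - C P^{-1} C'
Pinv = [[P[1][1] / det2(P), -P[0][1] / det2(P)],
        [-P[1][0] / det2(P), P[0][0] / det2(P)]]
schur = psi ** 2 - sum(C[r] * sum(Pinv[r][c] * C[c] for c in range(2))
                       for r in range(2))

both_pd = is_pd_sylvester(M) and is_pd_sylvester(P)
```

Here both sides agree: the block matrix is positive definite and the Schur complement is positive, as the equivalence predicts.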

The next section presents applications of Theorem 3 in the control design of three DC-DC converters: Buck, Boost and Buck-Boost.

#### **3. DC-DC converters**

Consider that *iL*(*t*) denotes the inductor current and *Vc*(*t*) the capacitor voltage, which were adopted as the state variables of the system:

$$\mathbf{x}(t) = [\mathbf{x}\_1(t), \mathbf{x}\_2(t)]' = [i\_L(t), V\_c(t)]'. \tag{31}$$

Define the operating point *xr* = [*x*1*<sup>r</sup> x*2*r*]′ = [*iLr Vcr*]′. Consider the DC-DC power converters Buck, Boost and Buck-Boost, illustrated in Figures 1, 3 and 5, respectively. The DC-DC converters operate in continuous conduction mode. For the theoretical analysis of DC-DC converters, no limit is imposed on the switching frequency, because the trajectory of the system evolves on a sliding surface with infinite switching frequency. Simulation results are presented below. The solver used was LMILab from MATLAB, interfaced by YALMIP (Lofberg, 2004) (Yet Another LMI Parser). Consider the following design parameters (Deaecto et al., 2010): *Vg* = 100[*V*], *R* = 50[Ω], *rL* = 2[Ω], *L* = 500[*μH*], *C* = 470[*μF*] and

$$Q\_i = Q = \begin{bmatrix} \rho\_1 r\_L & 0 \\ 0 & \rho\_2 / R \end{bmatrix},$$

is the performance index matrix associated with the guaranteed cost:

$$\int\_0^\infty \left( \rho\_2 R^{-1} (V\_c - V\_{cr})^2 + \rho\_1 r\_L (i\_L - i\_{Lr})^2 \right) dt,$$

where *ρ*<sup>1</sup> and *ρ*<sup>2</sup> ∈ IR<sup>+</sup> are design parameters. Note that *ρ<sup>i</sup>* ∈ IR<sup>+</sup> plays an important role with regard to the value of peak current and duration of the transient voltage. Adopt *ρ*<sup>1</sup> = 0 and *ρ*<sup>2</sup> = 1.

#### **3.1 Buck converter**

#### Fig. 1. Buck DC-DC converter.

Figure 1 shows the structure of the Buck converter, which allows only output voltage of magnitude smaller than the input voltage. The converter is modeled with a parasitic resistor in series with the inductor. The switched system state-space (1) is defined by the following matrices (Deaecto et al., 2010):

$$A\_1 = \begin{bmatrix} -r\_L/L & -1/L \\ 1/C & -1/RC \end{bmatrix}, \quad A\_2 = \begin{bmatrix} -r\_L/L & -1/L \\ 1/C & -1/RC \end{bmatrix},$$

$$B\_1 = \begin{bmatrix} 1/L \\ 0 \end{bmatrix}, \quad B\_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{32}$$

In this example, adopt *λ*<sup>1</sup> = 0.52 and *λ*<sup>2</sup> = 0.48. Using the minimization problems (11) and (15), corresponding to Theorems 1 and 2, respectively, we obtain the following matrix quadratic Lyapunov function

$$P = 1 \times 10^{-4} \begin{bmatrix} 0.0253 & 0.0476 \\ 0.0476 & 0.1142 \end{bmatrix},$$

needed for the implementation of the switching strategies (9) and (14). Maintaining the same parameters, from the minimization problem of Theorem 3 we found the matrices below as a solution, and from (10) the guaranteed cost *J* < (*x*<sup>0</sup> − *xr*)′*P*(*x*<sup>0</sup> − *xr*) = 0.029:

$$P = 1 \times 10^{-4} \begin{bmatrix} 0.0253 & 0.0476 \\ 0.0476 & 0.1142 \end{bmatrix},$$

$$N\_1 = -1 \times 10^{-6} \begin{bmatrix} 0.2134 & 0.0693 \\ 0.0693 & 0.0685 \end{bmatrix}, \quad N\_2 = 1 \times 10^{-6} \begin{bmatrix} 0.2312 & 0.0751 \\ 0.0751 & 0.0742 \end{bmatrix}.$$

The results are illustrated in Figure 2. The initial condition was the origin *x* = [*iL Vc*]′ = [0 0]′ and the equilibrium point is *xr* = [1 50]′.

Fig. 2. Buck dynamic.
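As a rough cross-check of the Buck design, the switching rule can be simulated with a simple forward-Euler loop. This is only a sketch: plain Python, fixed step and finite switching frequency; it uses the circuit parameters and the matrix *P* reported above, while the step size and horizon are arbitrary choices:

```python
# Forward-Euler simulation of the Buck converter under switching rule (14).
# For the Buck, A1 = A2, so rules (9), (14) and (19) coincide here.
Vg, R, rL, L, C = 100.0, 50.0, 2.0, 500e-6, 470e-6

A = [[-rL / L, -1.0 / L], [1.0 / C, -1.0 / (R * C)]]   # A1 = A2
B = ([1.0 / L, 0.0], [0.0, 0.0])                        # B1, B2
P = [[0.0253e-4, 0.0476e-4], [0.0476e-4, 0.1142e-4]]    # P from the text
xr = [1.0, 50.0]                                        # equilibrium point

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

# Constant vectors v_i = P (A x_r + B_i Vg) of rule (14)
Axr = matvec(A, xr)
vs = [matvec(P, [Axr[0] + Bi[0] * Vg, Axr[1] + Bi[1] * Vg]) for Bi in B]

x, dt = [0.0, 0.0], 1e-6
for _ in range(20000):                                  # 20 ms horizon
    xi = [x[0] - xr[0], x[1] - xr[1]]
    i = min((0, 1), key=lambda k: xi[0] * vs[k][0] + xi[1] * vs[k][1])
    dx = matvec(A, x)
    x = [x[0] + dt * (dx[0] + B[i][0] * Vg),
         x[1] + dt * (dx[1] + B[i][1] * Vg)]
```

With these choices the state settles into a small chattering band around *xr* = (1, 50), the band width being set by the finite step size.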


Observe that Theorem 3 presented the same convergence rate and cost obtained by applying Theorems 1 and 2. This effect is due to the fact that, for this particular converter, the gradient of the switching surface does not depend on the equilibrium point (Deaecto et al., 2010). Table 1 presents the obtained results.

#### **3.2 Boost converter**

| | *Overshoot* [*A*] | *Time* [*ms*] | *Cost* (6) |
|---|---|---|---|
| Theo. 1 | 36.5 | 2 | 0.029 |
| Theo. 2 | 36.5 | 2 | 0.029 |
| Theo. 3 | 36.5 | 2 | 0.029 |

Table 1. Buck results.

Fig. 3. Boost DC-DC converter.

In order to compare the results of the previous theorems, designs and simulations are also carried out for a Boost DC-DC converter. The converter is modeled with a parasitic resistor in series with the inductor. The switched system state-space (1) is defined by the following matrices (Deaecto et al., 2010):

$$A\_1 = \begin{bmatrix} -r\_L/L & 0\\ 0 & -1/RC \end{bmatrix}, \quad A\_2 = \begin{bmatrix} -r\_L/L & -1/L\\ 1/C & -1/RC \end{bmatrix},$$

$$B\_1 = \begin{bmatrix} 1/L\\ 0 \end{bmatrix}, \quad B\_2 = \begin{bmatrix} 1/L\\ 0 \end{bmatrix}.\tag{33}$$


In this example, *λ*<sup>1</sup> = 0.4 and *λ*<sup>2</sup> = 0.6. Using the minimization problems (11) of Theorem 1 and (15) of Theorem 2, the matrices of the quadratic Lyapunov functions are

$$P = 1 \times 10^{-4} \begin{bmatrix} 0.0237 & 0.0742 \\ 0.0742 & 0.2573 \end{bmatrix}, \quad P = 1 \times 10^{-3} \begin{bmatrix} 0.1450 & 0.0088 \\ 0.0088 & 0.2478 \end{bmatrix},$$

respectively. Now, from the minimization problem of Theorem 3, we found the matrices below as a solution, and from (10) the guaranteed cost is *J* < (*x*<sup>0</sup> − *xr*)′ *P*(*x*<sup>0</sup> − *xr*) = 0.59:

$$P = 1 \times 10^{-4} \begin{bmatrix} 0.0237 & 0.0742 \\ 0.0742 & 0.2573 \end{bmatrix},$$

$$N\_1 = \begin{bmatrix} -0.018 & -0.030 \\ -0.030 & 0.0178 \end{bmatrix}, \quad N\_2 = \begin{bmatrix} 0.012 & 0.020 \\ 0.020 & -0.012 \end{bmatrix}.$$

The initial condition is the origin and the equilibrium point is *xr* = [5 150]′. The results are illustrated in Figure 4 and Table 2 presents the obtained results.

Fig. 4. Boost dynamic.
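The adopted λ values and the guaranteed-cost bound of 0.59 can be cross-checked numerically. The script below is an illustrative sketch; the affine term *b* = [*Vg*/*L*, 0]′ in both modes is an assumption consistent with the Boost model, not a formula quoted from the chapter.

```python
import numpy as np

# Boost converter parameters from the text (Deaecto et al., 2010)
Vg, R, rL, L, C = 100.0, 50.0, 2.0, 500e-6, 470e-6
A1 = np.array([[-rL / L, 0.0], [0.0, -1.0 / (R * C)]])
A2 = np.array([[-rL / L, -1.0 / L], [1.0 / C, -1.0 / (R * C)]])
b = np.array([Vg / L, 0.0])        # same affine term in both modes (assumed)
xr = np.array([5.0, 150.0])        # target equilibrium [iL, Vc]

# lam must satisfy (lam1*A1 + lam2*A2) xr + b = 0; solve the first row for lam2
lam2 = (Vg - rL * xr[0]) / xr[1]   # -> 0.6, as adopted in the text
Alam = (1.0 - lam2) * A1 + lam2 * A2
print(np.allclose(Alam @ xr + b, 0.0, atol=1e-6))   # True

# Guaranteed-cost bound (10) for x0 = 0
P = 1e-4 * np.array([[0.0237, 0.0742], [0.0742, 0.2573]])
bound = xr @ P @ xr
print(f"cost bound = {bound:.2f}")                  # 0.59
```

Both the duty value λ<sub>2</sub> = 0.6 and the bound 0.59 agree with the values adopted in the text.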


Table 2. Boost results (*Overshoot* [*A*], *Time* [*ms*], *Cost* (6)).

#### **3.3 Buck-Boost converter**

Figure 5 shows the structure of the Buck-Boost converter. The switched system state-space (1) is defined by the following matrices (Deaecto et al., 2010):

$$A\_1 = \begin{bmatrix} -r\_L/L & 0\\ 0 & -1/RC \end{bmatrix}, \quad A\_2 = \begin{bmatrix} -r\_L/L & -1/L\\ 1/C & -1/RC \end{bmatrix},$$

$$B\_1 = \begin{bmatrix} 1/L\\ 0 \end{bmatrix}, \quad B\_2 = \begin{bmatrix} 0\\ 0 \end{bmatrix}.\tag{34}$$

Fig. 5. Buck-Boost DC-DC converter.

The initial condition was the origin *x* = [*iL Vc*]′ = [0 0]′, *λ*<sup>1</sup> = 0.6, *λ*<sup>2</sup> = 0.4, and the equilibrium point is equal to *xr* = [6 120]′. Moreover, the optimal solutions of the minimization problems (11) of Theorem 1 and (15) of Theorem 2 are

$$P = 1 \times 10^{-4} \begin{bmatrix} 0.0211 & 0.0989 \\ 0.0989 & 0.4898 \end{bmatrix}, \quad P = 1 \times 10^{-3} \begin{bmatrix} 0.1450 & 0.0088 \\ 0.0088 & 0.2478 \end{bmatrix},$$

respectively. Maintaining the same parameters, the optimal solutions of minimization problem (21) are the matrices below, and from (10) the guaranteed cost is *J* < (*x*<sup>0</sup> − *xr*)′ *P*(*x*<sup>0</sup> − *xr*) = 0.72:

$$P = 1 \times 10^{-4} \begin{bmatrix} 0.0211 & 0.0990 \\ 0.0990 & 0.4898 \end{bmatrix},$$

$$N\_1 = \begin{bmatrix} -0.0168 & -0.0400 \\ -0.0400 & 0.0158 \end{bmatrix}, \quad N\_2 = \begin{bmatrix} 0.0253 & 0.0600 \\ 0.0600 & -0.0237 \end{bmatrix}.$$

The results are illustrated in Figure 6 and Table 3 presents the obtained results.

| | *Overshoot* [*A*] | *Time* [*ms*] | *Cost* (6) |
|---|---|---|---|
| Theo. 1 | 37.5 | 10 | 0.72 |
| Theo. 2 | 7.5 | 70 | 3.59 |
| Theo. 3 | 37.5 | 10 | 0.72 |

Table 3. Buck-Boost results.

The next section is devoted to extending the theoretical results obtained in Theorems 1 and 2 (Deaecto et al., 2010) to the Sepic DC-DC converter model.
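Before moving on, the Buck-Boost equilibrium and the bound 0.72 can also be verified in a few lines. As before, this is an illustrative sketch; the affine terms *b*<sub>1</sub> = [*Vg*/*L*, 0]′ and *b*<sub>2</sub> = 0 (matching *B*<sub>2</sub> = 0 in (34)) are assumptions consistent with the model.

```python
import numpy as np

# Buck-Boost parameters as in the previous examples (Deaecto et al., 2010)
Vg, R, rL, L, C = 100.0, 50.0, 2.0, 500e-6, 470e-6
A1 = np.array([[-rL / L, 0.0], [0.0, -1.0 / (R * C)]])
A2 = np.array([[-rL / L, -1.0 / L], [1.0 / C, -1.0 / (R * C)]])
b1 = np.array([Vg / L, 0.0])       # source connected only in mode 1 (B2 = 0)
b2 = np.zeros(2)
xr = np.array([6.0, 120.0])
lam1, lam2 = 0.6, 0.4

# xr must be an equilibrium of the averaged dynamics
Alam = lam1 * A1 + lam2 * A2
blam = lam1 * b1 + lam2 * b2
print(np.allclose(Alam @ xr + blam, 0.0, atol=1e-6))   # True

# Guaranteed-cost bound (10) for x0 = 0
P = 1e-4 * np.array([[0.0211, 0.0990], [0.0990, 0.4898]])
bound = xr @ P @ xr
print(f"cost bound = {bound:.2f}")                     # 0.72
```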

#### **4. Sepic DC-DC converter**

A Sepic converter (Single-Ended Primary Inductor Converter) is characterized by being able to operate as a step-up or step-down converter, without suffering from the problem of polarity reversal. The Sepic converter consists of an active power switch, a diode, two inductors and two capacitors, and thus it is a nonlinear fourth-order system. The converter is modeled with parasitic resistances in series with the inductors. The switched system (1) is described by the following matrices:

$$A\_1 = \begin{bmatrix} -r\_{L1}/L\_1 & 0 & 0 & 0 \\ 0 & -r\_{L2}/L\_2 & -1/L\_2 & 0 \\ 0 & 1/C\_1 & 0 & 0 \\ 0 & 0 & 0 & -1/(RC\_2) \end{bmatrix}, \quad B\_1 = \begin{bmatrix} 1/L\_1 \\ 0 \\ 0 \\ 0 \end{bmatrix},$$

Fig. 6. Buck-Boost dynamic.


Fig. 7. Sepic DC-DC converter.

$$A\_{2} = \begin{bmatrix} -r\_{L1}/L\_{1} & 0 & -1/L\_{1} & -1/L\_{1} \\ 0 & -r\_{L2}/L\_{2} & 0 & 1/L\_{2} \\ 1/C\_{1} & 0 & 0 & 0 \\ 1/C\_{2} & -1/C\_{2} & 0 & -1/(RC\_{2}) \end{bmatrix}, \quad B\_{2} = \begin{bmatrix} 1/L\_{1} \\ 0 \\ 0 \\ 0 \end{bmatrix}. \tag{35}$$

For this converter, *iL*1(*t*), *iL*2(*t*) denote the inductor currents and *Vc*1(*t*), *Vc*2(*t*) the capacitor voltages, which again were adopted as the state variables of the system:

$$\mathbf{x}(t) = [\mathbf{x}\_1(t) \ \mathbf{x}\_2(t) \ \mathbf{x}\_3(t) \ \mathbf{x}\_4(t)]' = [i\_{\rm L1}(t) \ i\_{\rm L2}(t) \ V\_{\rm c1}(t) \ V\_{\rm c2}(t)]'.\tag{36}$$

Adopt the following operating point,

$$x\_r = \begin{bmatrix} x\_{1r}(t) \ x\_{2r}(t) \ x\_{3r}(t) \ x\_{4r}(t) \end{bmatrix}' = \begin{bmatrix} i\_{L1r}(t) \ i\_{L2r}(t) \ V\_{c1r}(t) \ V\_{c2r}(t) \end{bmatrix}'.\tag{37}$$

The DC-DC converter operates in continuous conduction mode. The solver used was LMILab from MATLAB, interfaced by YALMIP (Lofberg, 2004). The parameters are the following: *Vg* = 100[*V*], *R* = 50[Ω], *rL*<sup>1</sup> = 2[Ω], *rL*<sup>2</sup> = 3[Ω], *L*<sup>1</sup> = 500[*μH*], *L*<sup>2</sup> = 600[*μH*], *C*<sup>1</sup> = 800[*μF*], *C*<sup>2</sup> = 470[*μF*] and

$$Q\_i = Q = \begin{bmatrix} \rho\_1 r\_{L1} & 0 & 0 & 0 \\ 0 & \rho\_2 r\_{L2} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \rho\_3/R \end{bmatrix} \tag{38}$$

is the performance index matrix associated with the guaranteed cost

$$\int\_0^{\infty} (\rho\_1 r\_{L1} (i\_{L1} - i\_{L1r})^2 + \rho\_2 r\_{L2} (i\_{L2} - i\_{L2r})^2 + \rho\_3 R^{-1} (V\_{c2} - V\_{c2r})^2)\, dt,\tag{39}$$

where *ρ<sup>i</sup>* ∈ IR<sup>+</sup> are design parameters. First of all, the set of all attainable equilibrium points is calculated, considering that

$$x\_r = \{ [i\_{L1r} \ i\_{L2r} \ V\_{c1r} \ V\_{c2r}]' : V\_{c1r} = V\_g, \quad 0 \le V\_{c2r} \le R\, i\_{L2r} \}. \tag{40}$$

The initial condition was the origin *x* = [*iL*<sup>1</sup> *iL*<sup>2</sup> *Vc*<sup>1</sup> *Vc*2]′ = [0 0 0 0]′. Figure 8 shows the phase plane of the Sepic converter corresponding to the following values of the load voltage *Vc*<sup>2</sup>*<sup>r</sup>* = {50, 60, . . . , 150}.

In this case, Theorem 1 presented a voltage settling time smaller than 30[*ms*], with maximum current peaks *iL*<sup>1</sup> = 34[*A*] and *iL*<sup>2</sup> = 9[*A*]. Theorem 2, however, showed a voltage settling time smaller than 80[*ms*], with current peaks *iL*<sup>1</sup> = 34[*A*] and *iL*<sup>2</sup> = 13.5[*A*]. Now, in order to compare the results with the proposed Theorem 3, adopt the origin as initial condition, *λ*<sup>1</sup> = 0.636, *λ*<sup>2</sup> = 0.364 and the equilibrium point equal to *xr* = [5.24 −3 100 150]′. From the optimal solutions of minimization problems (11) and (15), we obtain, respectively,

$$P = 1 \times 10^{-4} \begin{bmatrix} 0.0141 & -0.0105 & 0.0037 & 0.0707 \\ -0.0105 & 0.0078 & -0.0026 & -0.0533 \\ 0.0037 & -0.0026 & 0.0016 & 0.0172 \\ 0.0707 & -0.0533 & 0.0172 & 0.3805 \end{bmatrix},$$

$$P = 1 \times 10^{-3} \begin{bmatrix} 0.0960 & -0.0882 & 0.0016 & 0.0062 \\ -0.0882 & 0.0887 & 0.0184 & -0.0034 \\ 0.0016 & -0.0184 & 0.0940 & 0.0067 \\ 0.0062 & -0.0034 & 0.0067 & 0.2449 \end{bmatrix}.$$

Fig. 8. Sepic DC-DC converter phase plane.


Maintaining the same parameters, the optimal solutions of minimization problem (21) are the matrices below, and from (10) the guaranteed cost is *J* < (*x*<sup>0</sup> − *xr*)′ *P*(*x*<sup>0</sup> − *xr*) = 0.93:

$$\begin{aligned} P &= 1 \times 10^{-4} \begin{bmatrix} 0.0141 & -0.0105 & 0.0037 & 0.0707 \\ -0.0105 & 0.0078 & -0.0026 & -0.0533 \\ 0.0037 & -0.0026 & 0.0016 & 0.0172 \\ 0.0707 & -0.0533 & 0.0172 & 0.3805 \end{bmatrix}, \\\\ N\_1 &= \begin{bmatrix} -0.0113 & 0.0099 & 0.0003 & -0.0286 \\ 0.0099 & -0.0085 & 0.0002 & 0.0290 \\ 0.0003 & 0.0002 & 0.0009 & 0.0088 \\ -0.0286 & 0.0290 & 0.0088 & 0.0168 \end{bmatrix}, \\\\ N\_2 &= \begin{bmatrix} 0.0197 & -0.0173 & -0.0005 & 0.0500 \\ -0.0173 & 0.0148 & -0.0003 & -0.0507 \\ -0.0005 & -0.0003 & -0.0015 & -0.0154 \\ 0.0500 & -0.0507 & -0.0154 & -0.0293 \end{bmatrix}. \end{aligned}$$

The results are illustrated in Figure 9 and Table 4 presents the obtained results from the simulations.

Fig. 9. Sepic dynamic: (a) phase plane; (b) normalized Lyapunov functions *V*(*x*(*t*))/*V*(*x*(0)); (c) voltage *Vc*2(*t*); (d) current *iL*1(*t*); (e) voltage *Vc*1(*t*); (f) current *iL*2(*t*).


| | *Overshoot* [*A*] | *Time* [*ms*] | *Cost* (6) |
|---|---|---|---|
| Theo. 1 | 34 | 30 | 0.93 |
| Theo. 2 | 34 | 80 | 6.66 |
| Theo. 3 | 34 | 30 | 0.93 |

Table 4. Sepic results.
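The quoted bound of 0.93 follows directly from (10) with *x*<sub>0</sub> = 0; a quick check (illustrative, not from the chapter):

```python
import numpy as np

# Guaranteed-cost bound (10) for the Sepic design, x0 = 0, xr from the text
xr = np.array([5.24, -3.0, 100.0, 150.0])
P = 1e-4 * np.array([
    [ 0.0141, -0.0105,  0.0037,  0.0707],
    [-0.0105,  0.0078, -0.0026, -0.0533],
    [ 0.0037, -0.0026,  0.0016,  0.0172],
    [ 0.0707, -0.0533,  0.0172,  0.3805]])
bound = xr @ P @ xr      # (x0 - xr)' P (x0 - xr) with x0 = 0
print(f"J bound = {bound:.2f}")   # 0.93, matching Table 4
```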

**Remark 1.** *From the simulation results, note that the proposed Theorem 3 presented the same results obtained by applying Theorem 1. Theorem 3 is an interesting theoretical result, as described in Theorem 4, and the authors think that it can be useful in the design of more general switched controllers.*

#### **5. Conclusions**


This paper presented a study of stability and control design for switched affine systems. The theorems proposed in (Deaecto et al., 2010), later modified to include bounds on the output peak in the control design, were presented. A new theorem for the design of switched affine control systems, with a flexibility that generalises Theorems 1 and 2 from (Deaecto et al., 2010), was proposed. Finally, simulations involving four types of converters, namely Buck, Boost, Buck-Boost and Sepic, illustrate the simplicity, quality and usefulness of this design methodology. It was also the first time that this class of controller was used for controlling a Sepic converter, which is a fourth-order system and therefore more involved than the switched control design of the second-order Buck, Boost and Buck-Boost converters (Deaecto et al., 2010).

#### **6. Acknowledgement**

The authors gratefully acknowledge the financial support by CAPES, FAPESP and CNPq from Brazil.

#### **7. References**


Feron, E. (1996). Quadratic stabilizability of switched systems via state and output feedback, *Technical report CICSP-468*, MIT.

Gahinet, P., Nemirovski, A., Laub, A. J. & Chilali, M. (1995). *LMI control toolbox - for use with MATLAB*.

He, Y., Xu, W. & Cheng, Y. (2010). A novel scheme for sliding-mode control of DC-DC converters with a constant frequency based on the averaging model, *Journal of Power Electronics*, v. 10, n. 1, p. 1–8.

Ji, Z., Wang, L. & Xie, G. (2005). *Quadratic stabilization of switched systems*, v. 36, 7 edn, Taylor & Francis.

Liberzon, D. (2003). *Switching in Systems and Control*, Systems & Control, Birkhäuser.

Liberzon, D. & Morse, A. S. (1999). *Basic problems in stability and design of switched systems*, v. 19, 5 edn, IEEE Contr. Syst. Mag.

Lofberg, J. (2004). Yalmip: a toolbox for modeling and optimization in MATLAB, *Computer Aided Control Systems Design, 2004 IEEE International Symposium on*, p. 284–289.

Mazumder, S. K., Nayfeh, A. H. & Borojevic, D. (2002). Robust control of parallel DC-DC buck converters by combining integral-variable-structure and multiple-sliding-surface control schemes, *IEEE Trans. on Power Electron.*, v. 17, n. 3, p. 428–437.

Peaucelle, D., Henrion, D., Labit, Y. & Taitz, K. (2002). User's guide for SeDuMi interface 1.04.

Skafidas, E., Evans, R. J., Savkin, A. V. & Petersen, I. R. (1999). Stability results for switched controller systems, *Automatica*, v. 35, p. 553–564.

Sun, Z. & Ge, S. S. (2005). *Switched Linear Systems: Control and Design*, Springer, London.

Teixeira, M. C. M., Assunção, E. & Avellar, R. G. (2003). On relaxed LMI-based designs for fuzzy regulators and fuzzy observers, *IEEE Trans. Fuzzy Syst.*, v. 11, n. 5, p. 613–623.

Teixeira, M. C. M., Covacic, M. R. & Assunção, E. (2006). Design of SPR systems with dynamic compensators and output variable structure control, *Int. Workshop Var. Structure Syst.*, p. 328–333.

Xu, X., Zhai, G. & He, S. (2008). On practical asymptotic stabilizability of switched affine systems, *Nonlinear Analysis: Hybrid Systems*, v. 2, p. 196–208.

Yoshimura, V. L., Assunção, E., Teixeira, M. C. M. & Mainardi Júnior, E. I. (2011). A comparison of performance indexes in DC-DC converters under different stabilizing state-dependent switching laws, *Power Electronics Conference (COBEP), 2011 Brazilian*, p. 1069–1075.

*Frontiers in Advanced Control Systems*

### **PID Controller Tuning Based on the Classification of Stable, Integrating and Unstable Processes in a Parameter Plane**

Tomislav B. Šekara and Miroslav R. Mataušek *University of Belgrade/Faculty of Electrical Engineering, Serbia*

#### **1. Introduction**

Classification of processes and tuning of PID controllers was initiated by Ziegler and Nichols (1942). This methodology, proposed seventy years ago, is still relevant and inspirational. Process dynamics characterization is defined in both the time and frequency domains by two parameters. In the time domain, these parameters are the velocity gain *K*v and dead-time *L* of an Integrator Plus Dead-Time (IPDT) model *G*ZN(*s*)=*K*vexp(-*Ls*)/*s*, defined by the reaction curve obtained from the open-loop step response of a process. In the frequency domain, these parameters are the ultimate gain *k*u and ultimate frequency *ω*u, obtained from oscillations of the process in the loop with the proportional controller *k*=*k*u. The relationship between the parameters in the time and frequency domains is determined by Ziegler and Nichols as

$$L = \frac{\pi}{2\omega\_{\rm u}}, \quad K\_{\rm v} = \varepsilon\frac{\omega\_{\rm u}}{k\_{\rm u}}, \quad \varepsilon = \varepsilon\_{\rm ZN} = \frac{4}{\pi}. \tag{1}$$

However, for the process *G*p(*s*)=*G*ZN(*s*) in the loop with the proportional controller *k*, one obtains from the Nyquist stability criterion the same relationship (1) with *ε*=1. As a consequence, from (1) and the Ziegler-Nichols frequency response PID controller tuning, where the proportional gain is *k*=0.6*k*u, one obtains the step response tuning *k*=0.3*επ*/(*K*v*L*). Thus, for *ε*=εZN one obtains *k*=1.2/(*K*v*L*), as in (Ziegler & Nichols, 1942), while for *ε*=1 one obtains *k*=0.9425/(*K*v*L*), as stated in (Aström & Hägglund, 1995a). According to (1), the same values of the integral time *T*i=*π*/*ω*u and derivative time *T*d=0.25*π*/*ω*u are obtained in both frequency and time domains, in (Ziegler & Nichols, 1942) and from the Nyquist analysis. This will be discussed in more detail in Section 2.
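The gain relations above are pure arithmetic and can be verified in a few lines; the helper `k_step` is hypothetical, introduced only for this check.

```python
import math

# k = 0.6*ku with ku = eps*omega_u/Kv (inverting (1)) and omega_u = pi/(2L)
# gives the step-response tuning k = 0.3*eps*pi/(Kv*L).
def k_step(eps, Kv, L):
    omega_u = math.pi / (2.0 * L)
    ku = eps * omega_u / Kv          # from Kv = eps*omega_u/ku
    return 0.6 * ku

Kv, L = 1.0, 1.0
print(round(k_step(4.0 / math.pi, Kv, L), 4))   # 1.2    (eps = eps_ZN)
print(round(k_step(1.0, Kv, L), 4))             # 0.9425 (eps = 1)
```

The two outputs reproduce the gains 1.2/(*K*v*L*) of (Ziegler & Nichols, 1942) and 0.9425/(*K*v*L*) of (Aström & Hägglund, 1995a).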

Tuning formulae proposed by Ziegler and Nichols were improved in (Hang et al., 1991; Aström & Hägglund, 1995a; 1995b; 2004). Besides the ultimate gain *k*u and ultimate frequency *ω*u of the process *G*p(*s*), the static gain *K*p=*G*p(0), for stable processes, and the velocity gain *K*v = lim*s*→0 *sG*p(*s*), for integrating processes, are used to obtain better process dynamics characterization and a broader classification (Aström et al., 1992). Stable processes are approximated by the First-Order Plus Dead-Time (FOPDT) model *G*FO(*s*)=*K*pexp(-*Ls*)/(*Ts*+1) and classified into four categories, by the normalized gain *κ*1=*K*p*k*u and normalized dead-time *θ*1=*L*/*T*. Integrating processes are approximated by the Integrating First-Order Plus Dead-Time (IFOPDT) model *G*IF(*s*)=*K*vexp(-*Ls*)/(*s*(*T*v*s*+1)) and classified into two categories, by the normalized gain *κ*2=*K*v*k*u/*ω*u and normalized dead-time *θ*2=*L*/*T*v. The idea of the classification proposed in (Aström et al., 1992) was to predict the achievable closed-loop performance and to make possible the performance evaluation of feedback loops under closed-loop operating conditions.

In the present chapter, a more ambitious idea is presented: define in advance the PID controller parameters in a classification plane, for the purpose of obtaining a PID controller guaranteeing the desired performance/robustness tradeoff for a process classified into the desired region of the classification plane. It is based on recent investigations related to: I) the process modeling of a large class of stable processes, processes having oscillatory dynamics, integrating and unstable processes, with the ultimate gain *k*u (Šekara & Mataušek, 2010a; Mataušek & Šekara, 2011), and optimizations of the PID controller under constraints on the sensitivity to measurement noise, robustness, and closed-loop system damping ratio (Šekara & Mataušek, 2009, 2010a; Mataušek & Šekara, 2011), II) the closed-loop estimation of model parameters (Mataušek & Šekara, 2011; Šekara & Mataušek, 2011b, 2011c), and III) the process classification and design of a new Gain Scheduling Control (GSC) in the parameter plane (Šekara & Mataušek, 2011a).

The motive for this research was the fact that the thermodynamic, hydrodynamic, chemical, nuclear, mechanical and electrical processes, in a large number of plants with a large number of operating regimes, constitute practically an infinite set of transfer functions *G*p(*s*) applicable for process dynamics characterization and PID controller tuning. Since all these processes are nonlinear, some GSC must be applied in order to obtain a high closed-loop performance/robustness tradeoff over a large domain of operating regimes. A direct solution, mostly applied in industry, is to perform experiments on the plant in order to define the GSC as look-up tables relating the controller parameters to the chosen operating regimes. The other solution, more elegant but extremely time-consuming, is to define nonlinear models used for accurately predicting the dynamic characteristics of the process in a large domain of operating regimes and to design a continuous GSC (Mataušek et al., 1996). However, both solutions are dedicated to some plant and to some region of operating regimes in the plant. The same applies for the solution defined by a nonlinear controller, for example one based on neural networks (Mataušek et al., 1999).

A real PID controller is defined by Fig. 1, with *C*(*s*) and *C*ff(*s*) given by

$$C(s) = \frac{k\_{\rm d}s^2 + ks + k\_{\rm i}}{s(T\_{\rm f}s + 1)} F\_{\rm C}(s), \quad C\_{\rm ff}(s) = \frac{k\_{\rm ff}s + k\_{\rm i}}{s} F\_{\rm C}(s), \quad k\_{\rm ff} = bk, \quad F\_{\rm C}(s) = 1, \quad b \ge 0. \tag{2}$$

Fig. 1. Process *G*p(*s*) with a two-degree-of-freedom controller.


An effective implementation of the control system (2) is defined by relations

$$U(s) = \left(k\left(bR(s) - Y\_{\rm f}(s)\right) + \frac{k\_{\rm i}}{s}\left(R(s) - Y\_{\rm f}(s)\right) - k\_{\rm d}sY\_{\rm f}(s)\right)F\_{\rm C}(s), \quad Y\_{\rm f}(s) = \frac{Y(s)}{T\_{\rm f}s + 1}, \tag{3}$$

for *F*C(*s*)≡1, as in (Panagopoulos et al., 2002; Mataušek & Šekara, 2011). When the proportional, integral, and derivative gains (*k*, *k*i, *k*d) and the derivative (noise) filter time constant *T*f are determined, parameter *b* can be tuned as proposed in (Panagopoulos et al., 2002). The PID controller (2), *F*C(*s*)≡1, can be implemented in the traditional form, where noise filtering affects the derivative term only, if some conditions are fulfilled (Šekara & Mataušek, 2009). The derivative filter time constant *T*f must be an integral part of the PID optimization and tuning procedures (Isaksson & Graebe, 2002; Šekara & Mataušek, 2009).
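A minimal discrete-time sketch of the law (3) may make the structure concrete; the plant, gains, and sample time below are illustrative assumptions, not values from the chapter.

```python
# Discrete-time sketch of the 2-DoF law (3): the control u is formed from the
# filtered output Yf, with gains (k, ki, kd), filter Tf and setpoint weight b.
def simulate(k=2.0, ki=1.0, kd=0.1, Tf=0.01, b=1.0, r=1.0,
             T_plant=1.0, dt=1e-3, t_end=10.0):
    y = yf = integ = 0.0
    for _ in range(int(t_end / dt)):
        yf_prev, yf = yf, yf + dt * (y - yf) / Tf       # Yf = Y/(Tf s + 1)
        dyf = (yf - yf_prev) / dt                        # filtered derivative
        integ += dt * ki * (r - yf)                      # integral action
        u = k * (b * r - yf) + integ - kd * dyf          # control law (3)
        y += dt * (u - y) / T_plant                      # plant 1/(T s + 1)
    return y

print(round(simulate(), 2))    # settles near the setpoint r = 1
```

Because the integral term acts on *r* − *Y*f, the steady-state output tracks the setpoint, while the derivative acts on the filtered measurement only.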

For *F*C(*s*) given by a second-order filter, one obtains a new implementation of the Modified Smith Predictor (Mataušek & Micić, 1996, 1999). The MSP-PID controller (3) guarantees better performance/robustness tradeoff than the one obtained by the recently proposed Dead-Time Compensators (DTC's), optimized under the same constraints on the sensitivity to measurement noise and robustness (Mataušek & Ribić, 2012).

Robustness is defined here by the maximum sensitivity *M*s and maximum complementary sensitivity *M*p. The sensitivity to measurement noise *M*n, *M*s, and *M*p are given by

$$M\_{\rm s} = \max\_{\omega}\left|\frac{1}{1 + L({\rm i}\omega)}\right|, \quad M\_{\rm p} = \max\_{\omega}\left|\frac{L({\rm i}\omega)}{1 + L({\rm i}\omega)}\right|, \quad M\_{\rm n} = \max\_{\omega}\left|C\_{\rm nu}({\rm i}\omega)\right|, \tag{4}$$

where *L*(*s*) is the loop transfer function and *C*nu(*s*) is the transfer function from the measurement noise to the control signal. In the present chapter, the sensitivity to high-frequency measurement noise is used, *M*n=*M*n<sup>∞</sup>, where *M*n<sup>∞</sup>=|*C*nu(*s*)|*s*→∞.
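The maxima in (4) can be evaluated numerically on a frequency grid. The loop below, a PI controller on 1/(*s*+1)³, is an arbitrary illustrative example, not a process from the chapter.

```python
import numpy as np

# Illustrative stable loop: PI controller C(s) = k + ki/s on Gp(s) = 1/(s+1)^3
k, ki = 1.0, 0.5
w = np.logspace(-2, 2, 20_000)                # frequency grid [rad/s]
s = 1j * w
L_jw = (k + ki / s) / (s + 1.0) ** 3          # loop transfer function L(iw)
Ms = np.max(np.abs(1.0 / (1.0 + L_jw)))       # maximum sensitivity
Mp = np.max(np.abs(L_jw / (1.0 + L_jw)))      # max complementary sensitivity
Mn_inf = k                                    # |C(s)| as s -> inf for a pure PI
print(Ms >= 1.0, Mp > 0.0)
```

For this loop the sensitivity peak exceeds one, as the waterbed effect requires, while the high-frequency noise gain of a pure PI reduces to the proportional gain *k*.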

#### **2. Modeling and classification of stable, integrating, and unstable plants**

A generalization of the Ziegler-Nichols process dynamics characterization, proposed by Šekara and Mataušek (2010a), is defined by the model

$$G_{\rm m}(s) = \frac{A\omega_{\rm u}\exp(-\tau s)}{s^2+\omega_{\rm u}^2 - A\omega_{\rm u}\exp(-\tau s)}\,\frac{1}{k_{\rm u}}\,,\quad \tau=\frac{\varphi}{\omega_{\rm u}}\,,\quad A=\frac{\omega_{\rm u}k_{\rm u}G_{\rm p}(0)}{1+k_{\rm u}G_{\rm p}(0)}\,,\tag{5}$$

where *φ* is the angle of the tangent to the Nyquist curve *G*p(i*ω*) at *ω*u and *G*p(0) is the gain at the frequency equal to zero. Thus, for integrating processes *G*p(0)=∞ and *A*=*ω*u. Adequate approximation of *G*p(*s*) by the model *G*m(*s*) is obtained around the ultimate frequency *ω*u, where arg{*G*p(i*ω*u)}=−*π*. It is demonstrated in (Šekara & Mataušek, 2010a; Mataušek & Šekara, 2011; Šekara & Mataušek, 2011a) that this extension of the Ziegler-Nichols process dynamics characterization, for a large class of stable processes, processes with oscillatory dynamics, integrating and unstable processes, guarantees the desired performance/robustness tradeoff if optimization of the PID controller, for the given maximum sensitivity *M*s and given sensitivity to measurement noise *M*n, is performed by applying the frequency response of the model (5) instead of the exact frequency response *G*p(i*ω*).

Ziegler and Nichols used oscillations, defined by the impulse response of the system

$$G_{\rm p}^{*}(s) = \frac{k_{\rm u}G_{\rm p}(s)}{1+k_{\rm u}G_{\rm p}(s)}\tag{6}$$

to determine *k*u and *ω*u, and to define tuning formulae for adjusting parameters of the P, PI and PID controllers, based on the relationship between the quarter amplitude damping ratio and the proportional gain *k*. Oscillations defined by the impulse response of the system (6) are used in (Šekara & Mataušek, 2010a) to define model (5), obtained from *G*m(*s*)≈*G*p(*s*) and the relation

$$\frac{k_{\rm u}G_{\rm m}(s)}{1+k_{\rm u}G_{\rm m}(s)} = \frac{A\omega_{\rm u}\exp(-\tau s)}{s^2+\omega_{\rm u}^2}\,.\tag{7}$$

Then, by analyzing these oscillations, it is obtained in (Šekara & Mataušek, 2010a) that amplitude *A*=*ω*u*κ*/(1+*κ*), *κ*=*k*u*G*p(0), and dead-time *τ* is defined by *ω*u and a parameter *φ*, given by

$$\varphi = \arg\left(\frac{\partial G_{\rm p}({\rm i}\omega)}{\partial\omega}\right)\bigg|_{\omega=\omega_{\rm u}}.\tag{8}$$

Another interpretation of the amplitude, *A*=*A*0, obtained in (Mataušek & Šekara, 2011), is defined by

$$A_0 = \frac{2}{k_{\rm u}}\left|\frac{\partial G_{\rm p}({\rm i}\omega)}{\partial\omega}\right|_{\omega=\omega_{\rm u}}^{-1}.\tag{9}$$

Amplitudes *A* and *A*0 are not equal, but they are closely related for stable and unstable processes, as demonstrated in (Mataušek & Šekara, 2011) and Appendix. Parameter *A*0 is not used for integrating processes, since for these processes *A*=*ω*u.

The quadruplet {*k*u, *ω*u, *φ*, *A*} is used for classification of stable processes, processes with oscillatory dynamics, integrating and unstable processes in the *ρ*-*φ* parameter plane, defined by the normalized model (5), given by

$$G_{\rm n}(s_{\rm n},\rho,\varphi) = \frac{\rho\exp(-\varphi s_{\rm n})}{s_{\rm n}^2+1-\rho\exp(-\varphi s_{\rm n})}\,,\quad s_{\rm n} = \frac{s}{\omega_{\rm u}}\,,\tag{10}$$

where *ρ*=*A*/*ω*u. From the Nyquist criterion it is obtained that the region of stable processes is defined by 0<*φ*<*π*, 0<*ρ*<1 (Šekara & Mataušek, 2011a). Integrating processes, since *A*=*ω*u, are classified as *ρ*=1, 0<*φ*<*π*/2 processes, while unstable processes are outside these regions. It is demonstrated that a large test batch of stable and integrating processes used in (Aström & Hägglund, 2004) covers a small region in the *ρ*-*φ* plane.
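As a sketch of this characterization, the quadruplet {*k*u, *ω*u, *φ*, *A*} can be computed numerically for the stable process *G*p(*s*)=1/(*s*+1)⁴, which is used later in the chapter as (20). Analytically *ω*u=1, *k*u=4, *φ*=*π*/4 and *ρ*=0.8, so this process falls inside the stable region of the *ρ*-*φ* plane:

```python
# Compute the quadruplet {ku, wu, phi, A} of (5), (8), (9) for the
# stable process Gp(s) = 1/(s+1)^4 and classify it in the rho-phi plane.
import numpy as np

def Gp(s):
    return 1.0 / (s + 1.0)**4

# ultimate point: arg Gp(i*wu) = -pi (phase unwrapped over a dense grid)
w = np.linspace(0.01, 5.0, 100000)
ph = np.unwrap(np.angle(Gp(1j * w)))
wu = np.interp(np.pi, -ph, w)        # phase is monotonically decreasing
ku = 1.0 / abs(Gp(1j * wu))          # ultimate gain

# phi from (8): angle of the tangent to the Nyquist curve at wu
h = 1e-6
dG = (Gp(1j * (wu + h)) - Gp(1j * (wu - h))) / (2.0 * h)
phi = np.angle(dG)

# amplitudes A (from (5)) and A0 (from (9)); rho = A/wu classifies the process
kappa = ku * Gp(0)                   # here Gp(0) = 1 (stable process)
A = wu * kappa / (1.0 + kappa)
A0 = (2.0 / ku) / abs(dG)
rho = A / wu
tau = phi / wu

def Gm(s):                           # model (5) built from the quadruplet
    e = np.exp(-tau * s)
    return (A * wu * e / (s**2 + wu**2 - A * wu * e)) / ku

print(f"ku={ku:.3f} wu={wu:.3f} phi={phi:.4f} rho={rho:.3f} A0={A0:.4f}")
print("stable region 0 < rho < 1:", 0.0 < rho < 1.0)
# by construction Gm(i*wu) = Gp(i*wu) = -1/ku (s^2 + wu^2 vanishes at wu)
```

The last comment illustrates why the model (5) is adequate around *ω*u: at the ultimate frequency it reproduces the process frequency response exactly. Note also that *A*=0.8 and *A*0≈0.707 are close but not equal, as stated above.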

To demonstrate that, besides *k*u and *ω*u, parameters *φ* and *G*p(0) must be used for the classification of processes, Nyquist curves are presented in Fig. 2 for stable, integrating and
unstable processes having the same values *k*u=1 and *ω*u=1. For processes having also the same values of *φ*, the distinction of the Nyquist curves in the broader region around the critical point requires the information about the gain *G*p(0), as demonstrated in Fig. 2-a. On the other hand, the results presented in Fig. 2-b to Fig. 2-d demonstrate that for the same values of *k*u, *ω*u, and *G*p(0) the distinction of the Nyquist curves in the region around the critical point is obtained by applying parameter *φ*. This fact confirms the importance of parameter *φ* in process modeling for controller tuning, taking into account that optimization of the PID controller under constraints on the robustness is performed in the region around *ω*u.

Fig. 2. Nyquist curves of processes with the same values *k*u=1, *ω*u=1: a) *φ*=*π*/4, stable *G*p(0)=1 (dashed), integrating *G*p(0)=∞ (solid), unstable *G*p(0)=–2 (dashed-dotted); b) stable processes with *G*p(0)=1, for *φ*=*π*/4 (dashed), *φ*=*π*/6 (solid), *φ*=*π*/3 (dashed-dotted); c) integrating processes with *φ*=1 (dashed), *φ*=*π*/4 (solid), *φ*=1.2 (dashed-dotted); d) unstable processes with *G*p(0)=–2, for *φ*=*π*/4 (dashed), *φ*=*π*/6 (solid), *φ*=*π*/3 (dashed-dotted).

For the lag dominated process

$$G_{\rm p1}(s) = 1\,/\cosh\sqrt{2s}\tag{11}$$

and the corresponding models, the step and impulse responses, with the Nyquist curves around *ω*u, are presented in Fig. 3. The models are Ziegler-Nichols IPDT model *G*ZN(*s*)=*K*vexp(-*Ls*)/*s* and model (5), with *A*=*ω*u*k*u*G*p(0)/(1+*k*u*G*p(0)) and *A*=*A*0. The set-point and load disturbance step responses of this process, in the loop with the optimal PID controller (Mataušek & Šekara, 2011) and PID controller tuned as proposed by Ziegler and Nichols (1942), are compared in Fig. 4-a. In this case *k*u=11.5919, *ω*u=9.8696 and *K*v=0.9251, *L*=0.1534. The PID controller tuned as proposed by Ziegler and Nichols is implemented in the form
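The quoted ultimate parameters can be verified directly from (11): since √(2i*ω*)=√*ω*(1+i), the phase of *G*p1(i*ω*) first reaches −*π* at *ω*u=*π*²≈9.8696, where *k*u=cosh *π*≈11.5919. A short numerical check:

```python
# Numerical check of the ultimate parameters of Gp1(s) = 1/cosh(sqrt(2s)):
# the phase of Gp1(iw) first reaches -pi at wu = pi^2, where ku = cosh(pi).
import numpy as np

def Gp1(s):
    return 1.0 / np.cosh(np.sqrt(2.0 * s))

w = np.linspace(0.1, 20.0, 400000)
ph = np.unwrap(np.angle(Gp1(1j * w)))
wu = np.interp(np.pi, -ph, w)          # ultimate frequency
ku = 1.0 / abs(Gp1(1j * wu))           # ultimate gain

print(f"wu = {wu:.4f} (pi^2 = {np.pi**2:.4f}), "
      f"ku = {ku:.4f} (cosh pi = {np.cosh(np.pi):.4f})")
```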

$$U(s) = k\left(bR(s) - Y(s)\right) + \frac{k_{\rm i}}{s}\left(R(s) - Y(s)\right) - \frac{k_{\rm d}\,s}{T_{\rm f}\,s + 1}\,Y(s)\,,\quad b=0\,,\ k_{\rm i}=\frac{k}{T_{\rm i}}\,,\ k_{\rm d}=kT_{\rm d}\,,\ T_{\rm f}=\frac{T_{\rm d}}{N_{\rm d}}\,,\tag{12}$$

where *k*=0.6*k*u, *T*i=*π*/*ω*u, *T*d=*π*/(4*ω*u), for the frequency domain ZN tuning (ZN PID1). For the time domain ZN tuning (ZN PID2) the parameters are *k*=1.2/(*K*v*L*), *T*i=2*L*, *T*d=*L*/2, or, as suggested by the earlier mentioned Nyquist analysis, the proportional gain *k* is adjusted to *k*=0.943/(*K*v*L*), denoted as the modified time domain ZN tuning (ZN ModifPID2). Parameter *N*d is adjusted, by using *M*n=(*N*d+1)*k*, to obtain the same value of *M*n=76.37 used in the constrained optimization of the PID in (3), *F*C(*s*)≡1, where *M*n=|*k*d|/*T*f.
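The ZN PID1 row of Table 1 follows directly from these formulas and the measured *k*u, *ω*u; here *N*d is backed out from the required noise sensitivity using *M*n=(*N*d+1)*k*, the high-frequency gain of the form (12):

```python
# Frequency-domain Ziegler-Nichols tuning (ZN PID1) for Gp1(s),
# reproducing the corresponding row of Table 1 from ku and wu.
import math

ku, wu = 11.5919, 9.8696      # ultimate gain and frequency of Gp1(s)
Mn = 76.37                    # required sensitivity to measurement noise

k = 0.6 * ku                  # ZN proportional gain
Ti = math.pi / wu             # ZN integral time
Td = math.pi / (4.0 * wu)     # ZN derivative time

ki = k / Ti                   # ki = k/Ti
kd = k * Td                   # kd = k*Td
Nd = Mn / k - 1.0             # from Mn = (Nd + 1)k for the form (12)
Tf = Td / Nd                  # derivative filter time constant

print(f"k={k:.4f}, ki={ki:.4f}, kd={kd:.4f}, Tf={Tf:.4f}, Nd={Nd:.3f}")
```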

Parameters of the PID controllers and the performance/robustness tradeoff are compared in Table 1. It is impressive that Ziegler and Nichols succeeded in defining, seventy years ago, an excellent experimental tuning for the process *G*p1(*s*), which is an infinite-order system that can be represented in simulation by the high-order model $G_{\rm p1}(s)\approx\exp(-Ls)/\prod_{k=1}^{20}(T_k s+1)$, *L*=0.01013 (Mataušek & Ribić, 2009). It should also be noted that Ziegler and Nichols succeeded in obtaining an excellent tuning with the IPDT model defined by *K*v=0.9251, *L*=0.1534, which is an extremely crude approximation of the real impulse response of the process *G*p1(*s*), as seen in Fig. 3-b.


| method | *k* | *k*i | *k*d | *T*f | *N*d | *IAE* | *M*n | *M*s | *M*p |
|---|---|---|---|---|---|---|---|---|---|
| optPID | 6.5483 | 18.4321 | 0.6345 | 0.0094 | - | 0.0609 | 76.37 | 2.00 | 1.45 |
| ZN PID1 | 6.9551 | 21.8502 | 0.5535 | 0.0080 | 9.980 | 0.0538 | 76.37 | 2.20 | 1.72 |
| ZN PID2 | 8.4560 | 27.5621 | 0.6486 | 0.0096 | 8.031 | 0.0429 | 76.37 | 2.82 | 2.23 |
| ZN ModifPID2 | 6.6450 | 21.6592 | 0.5097 | 0.0073 | 10.49 | 0.0587 | 76.37 | 2.16 | 1.78 |

Table 1. Process *G*p1(*s*): comparison of the optimization (optPID) and the Ziegler-Nichols tuning in the frequency domain (ZN PID1) and time domain (ZN PID2, ZN ModifPID2).

The Nyquist curves of *G*p1(*s*), *G*m1(*s*), and *G*m2(*s*) are almost the same around *ω*u. This is important since the PID controller optimization, based on the experimentally determined frequency response of the process, under constraints on *M*s or on *M*s and *M*p, is performed around the ultimate frequency *ω*u. Amplitudes *A* and *A*0 are closely related for the stable and unstable processes, as demonstrated in (Mataušek & Šekara, 2011) and the Appendix. For integrating processes *A*=*ω*u. This means that the Ziegler–Nichols parameters *k*u and *ω*u, and the Šekara-Mataušek parameters *φ* and *A*=*A*0, for the stable and unstable processes, and *A*=*ω*u, for integrating processes, constitute the minimal set of parameters, measurable in the frequency domain, necessary for obtaining PID controller tuning for the desired performance/robustness tradeoff. This will be demonstrated in the subsequent sections.

Fig. 3. Process *G*p1(*s*), denoted as (*G*p), and models *G*m*j*(*s*), *j*=1,2, *k*u=11.5919, *ω*u=9.8696, *τ*=0.0796 for *A*=9.0858 (*G*m1) and *A*=*A*0=8.9190 (*G*m2), and *G*ZN(*s*)=*K*vexp(-*Ls*)/*s*, *K*v=0.9251, *L*=0.1534 (ZN): a) step responses, b) impulse responses, c) Nyquist curves of *G*p1(*s*) and *G*ZN(*s*), d) Nyquist curves of *G*p1(*s*), *G*m1(*s*) and *G*m2(*s*), which are almost the same around *ω*u.

Fig. 4. Comparison of the optimization and the Ziegler-Nichols (ZN) tuning. Process *G*p1(*s*) in the loop with the optPID or ZN PID, tuned by using the rules: frequency domain (ZN PID1), time domain (ZN PID2), and time domain with the modified proportional gain *k*=0.943/(*K*v*L*) (ZN ModifPID2). In all controllers *b*=0 and *D*(*s*)=-5exp(-2.5*s*)/*s*.

#### **3. Optimization of PI/PID controllers under constraints on the sensitivity to measurement noise, robustness, and closed-loop system damping ratio**

PID controllers are still the most widely used controllers in the majority of industrial applications (Desborough & Miller, 2002) and "it is reasonable to predict that PID control
will continue to be used in the future" (Aström & Hägglund, 2001). They operate mostly as regulators (Aström & Hägglund, 2001), and rejection of the load step disturbance, measured by the Integrated Absolute Error (IAE), is of primary importance for evaluating PID controller performance under constraints on the robustness (Shinskey, 1990). Inadequate tuning and sensitivity to measurement noise are the reasons why derivative action is often excluded in industrial process control. This is the main reason why PI controllers predominate (Yamamoto & Hashimoto, 1991). However, for lag-dominated processes, processes with oscillatory dynamics, and integrating/unstable processes, a PID controller guarantees considerably better performance than a PI controller, if adequate tuning of the PID controller is performed (Mataušek & Šekara, 2011). Moreover, the PID controller is a "prerequisite for successful advanced controller implementation" (Seki & Shigemasa, 2010).

Besides PI/PID controllers, in single or multiple loops (Jevtović & Mataušek, 2010), only Dead-Time Compensators (DTC) are used in the process industry with an acceptable percentage (Yamamoto & Hashimoto, 1991). They are based on the Smith predictor (Smith, 1957; Mataušek & Kvaščev, 2003) or its modifications. However, the area of application of PID controllers overlaps deeply with the application of DTC's, as confirmed by the Modified Smith Predictor, which is a PID controller in series with a second-order filter, applicable to a large class of stable, integrating and unstable processes (Mataušek & Ribić, 2012).

Optimization of the performance may be carried out under constraints on the maximum sensitivity to measurement noise *M*n, the maximum sensitivity *M*s, and the maximum complementary sensitivity *M*p, as done in (Mataušek & Ribić, 2012). In this case it is recommended to use an algorithm for global optimization, such as the Particle Swarm Optimization algorithm (Rapaić, 2008), requiring good estimates of the range of the unknown parameters. Other alternatives, presented here, were recently developed in (Šekara & Mataušek, 2009, 2010a; Mataušek & Šekara, 2011). For the PID controller (3), for *F*C(*s*)≡1 defined by the four parameters *k*, *k*i, *k*d and *T*f, optimization under constraints on *M*n and *M*s is reduced in (Šekara & Mataušek, 2009) to the solution of a system of three algebraic equations with adequate initial values of the unknown parameters. The adopted values of *M*n and *M*s are satisfied exactly for different values of *ζ*z. Thus, by repeating calculations for a few values of the damping ratio of the controller zeros *ζ*z in the range *ζ*z≥0.5, the value of *ζ*z corresponding to the minimum of IAE is obtained. The optimization methods from (Šekara & Mataušek, 2009) are denoted as max(k) and max(ki) methods.

The improvement of the max(k) method is proposed in (Šekara & Mataušek, 2010a). It consists of avoiding repetition of calculations for different values of *ζ*z in order to obtain the minimal value of the IAE for a desired value of *M*s. In this method, denoted here as method optPID, the constrained optimization is based on the frequency response of model (5).

For the PI optimization, an improvement of the performance/robustness tradeoff is obtained by applying the combined performance criterion *J*c=*βk*i+(1−*β*)*ω* (Šekara & Mataušek, 2008). Thus, one obtains

$$\max_{k_{\rm i},\,\omega} J_{\rm c}\,,\tag{13}$$

$$F(\omega,k,k_{\rm i})=0\,,\quad \partial F(\omega,k,k_{\rm i})/\partial\omega=0\,,\tag{14}$$

where 0≤*ω*<∞ and *β* is a free parameter in the range 0<*β*≤1. The calculations are repeated for a few values of *β*, in order to find the value of *β* corresponding to the minimum of IAE. The optimization
in this method, denoted here as opt2, is performed for the desired value of *M*s. For *β*=1 one obtains the same values of parameters *k* and *k*i as obtained by the method proposed in (Aström et al., 1998), denoted here as opt1.

The most general is the new tuning and optimization procedure proposed in (Mataušek & Šekara, 2011). Besides the tuning formulae, an optimization procedure is derived. For the PID and PI controllers it requires only the solution of two nonlinear algebraic equations with adequate initial values of the unknown parameters. PID optimization is performed for the desired closed-loop system damping ratio *ζ* and under constraints on *M*n and *M*s. Thus, for *ζ*=1 the critically damped closed-loop system response is obtained. PI optimization is performed under a constraint on *M*s for the desired value of *ζ*. The procedure proposed in (Mataušek & Šekara, 2011) will be discussed here in more detail, since it is entirely based on the concept of using the oscillators (6)-(7) for dynamics characterization of stable processes, processes having oscillatory dynamics, integrating and unstable processes. The method is derived by defining a complex controller *C*(*s*)=*k*u(1+*C*\*(*s*)), where the controller *C*\*(*s*), given by

$$C^{*}(s) = \frac{s^2+\omega_{\rm u}^2}{A\omega_{\rm u}\Lambda(s)}\,\frac{E(s)/\Lambda(s)}{1-E(s)\exp(-\tau s)/\Lambda(s)^2}\,,\quad E(s)=\eta_2 s^2+\eta_1 s+1\,,\ \Lambda(s)=\lambda^2 s^2+2\zeta\lambda s+1\,,\tag{15}$$

is obtained by supposing that in Fig. 1 the process *G*p(*s*) is defined by the oscillator $G_{\rm p}^{*}(s)$ in (6), approximated by (7). The complex controller *C*(*s*)=*k*u(1+*C*\*(*s*)) is defined by the parameters *k*u, *ω*u, *τ*, *A* and by the two tuning parameters *λ* and *ζ*, with a clear physical interpretation. Parameter *λ* is proportional to the desired closed-loop system time constant. Parameter *ζ* is the desired closed-loop system damping ratio. Then, by applying the Maclaurin series expansion, the possible internal instability of the complex controller *C*(*s*) is avoided and the parameters of the PID controller *C*(*s*) in Fig. 1 are obtained, defined by:

$$T_{\rm f} = \frac{\eta_2-\beta_2(\eta_1-\beta_2)-\beta_3+1/\omega_{\rm u}^2}{\beta_2-\eta_1-(1-M_{\rm n}/|k_{\rm u}|)/\beta_1}\,,\tag{16}$$

$$k = k_{\rm u}\left(\beta_1(T_{\rm f}+\eta_1-\beta_2)+1\right),\quad k_{\rm i} = k_{\rm u}\beta_1\,,\tag{17}$$

$$k_{\rm d} = k_{\rm u}\beta_1\left(\eta_2+(T_{\rm f}-\beta_2)(\eta_1-\beta_2)-\beta_3+1/\omega_{\rm u}^2\right)+k_{\rm u}T_{\rm f}\,.\tag{18}$$

Parameters *η*1, *η*2, *β*1, *β*2 and *β*3, from (Mataušek & Šekara, 2011), depend on *λ*, *ζ* and *k*u, *ω*u, *τ*, *A*. They are given in the Appendix. A generalization of this approach is presented in (Šekara & Trifunović, 2010; Šekara et al., 2011).

For the desired closed-loop damping ratio *ζ*=1, *λ*=1/*ω*u, and for

$$T_{\rm f} = 1/(N\omega_{\rm u})\tag{19}$$

one obtains (Mataušek & Šekara, 2011) the PID tuning that guarantees set-point and load disturbance step responses with negligible overshoot for a large class of stable processes, processes with oscillatory dynamics, integrating and unstable processes. Tuning formulae defined by (17)-(19) are denoted here as method tunλu. Absolute value of the Integrated Error (IE), approximating almost exactly the obtained IAE, is given by |*IE*|=1/(|*k*u|*β*1). Here the value *T*f=1/(10*ω*u) is used, as in (Mataušek & Šekara, 2011).

To demonstrate the relationship between PID controller, tuned by using the method tunλu, and complex controller *C*(*s*)=*k*u(1+*C*\*(*s*)), obtained for *λ*=1/*ω*u and *ζ*=1, the frequency responses of these controllers, tuned for the process

$$G_{\rm p2}(s) = 1/(s+1)^4\tag{20}$$

are presented in Fig. 5-a. For this process, the parameters *k*u, *ω*u, *τ* and *A* are given in the Appendix. The load disturbance unit step responses, obtained for *G*p2(*s*) in the loop with the PID controller and the complex controller *C*(*s*), are presented in Fig. 5-b. Further details about the relationship between these controllers are presented in (Mataušek & Šekara, 2011; Trifunović & Šekara, 2011; Šekara et al., 2011).

Fig. 5. Comparison of the complex controller *C*(*s*)=*k*u(1+*C*\*(*s*)) with the PID controller, both tuned for *G*p2(*s*): a) Bode plots of the controllers and b) the load disturbance unit step responses of *G*p2(*s*) in the loop with these controllers.

By applying tuning formulae (17)-(19), the desired closed-loop damping ratio *ζ*=1 is obtained with the acceptable values of maximum sensitivity *M*s and maximum sensitivity to measurement noise *M*n. However, when a smaller value of *M*n is required for a desired value of *M*s and the desired closed-loop damping ratio *ζ*, the other possibility is to determine the closed-loop time constant *λ* and the corresponding *ω*0, by using (16)-(18) and by solving two algebraic equations:

$$\left|1+C({\rm i}\omega)G_{\rm m}({\rm i}\omega)\right|^2 - 1/M_{\rm s}^2 = 0\,,\tag{21}$$

$$\partial\left(\left|1+C({\rm i}\omega)G_{\rm m}({\rm i}\omega)\right|^2\right)/\partial\omega = 0\,.\tag{22}$$

In this case, the PID controller in (3), *F*C(*s*)≡1, is obtained for the desired critical damping ratio *ζ*=1 of the closed-loop system and the desired values of *M*n and *M*s. This is the unique possibility of the procedure (16)-(18) and (21)-(22) proposed in (Mataušek & Šekara, 2011). Moreover, by repeating the calculations for a few values of *ζ*, the value of *ζ* is obtained
guaranteeing, for desired *M*n and *M*s, almost the same value of the IAE as obtained by the constrained PID optimization based on the exact frequency response *G*p(i*ω*). This PID optimization method is denoted here as the method opt2A, when the quadruplet {*k*u, *ω*u, *φ*, *A*} is used, or opt2A0, when the quadruplet {*k*u, *ω*u, *φ*, *A*0} is used. It should be noted here that, for *k*d=0 and *T*f=0, relations (17) and (21)-(22) yield a new effective constrained PI controller optimization, denoted here as opt3. It is successfully compared (Mataušek & Šekara, 2011) with the procedure proposed in (Aström et al., 1998), opt1.
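Equations (21)-(22) simply locate the frequency *ω*0 at which |1+*C*(i*ω*)*G*m(i*ω*)| is minimal (i.e. the sensitivity |*S*| is maximal) and pin its value to 1/*M*s. A minimal numerical sketch for the process (20), with assumed (not optimized) PID parameters, locates this point on a grid:

```python
# (21)-(22) locate the peak-sensitivity frequency w0, where
# |1 + C(iw)Gm(iw)| is minimal (stationary) and equals 1/Ms.
# The PID parameters below are assumed illustrative values.
import numpy as np

w = np.logspace(-2, 2, 400000)
s = 1j * w

Gm = 1.0 / (s + 1.0)**4                  # process (20)
k, ki, kd, Tf = 1.5, 0.8, 1.0, 0.05      # assumed PID parameters
C = k + ki / s + kd * s / (Tf * s + 1)

mag = np.abs(1.0 + C * Gm)               # |1 + C(iw)Gm(iw)|
i0 = np.argmin(mag)                      # stationary minimum <=> (22)
w0, Ms = w[i0], 1.0 / mag[i0]            # achieved Ms <=> (21)

print(f"w0 = {w0:.3f} rad/s, achieved Ms = {Ms:.3f}")
```

In the actual procedure the roles are reversed: *M*s is prescribed, and (21)-(22) are solved for the closed-loop time constant *λ* and the frequency *ω*0.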

Now, tuning defined by (17)-(19) with *N*=10, *λ*=1/*ω*u and *ζ*=1, method tunλu, will be compared with the optimization defined by (16)-(18), (21)-(22), method opt2A. Both procedures guarantee the desired critical damping *ζ*=1; however, only the second one guarantees the desired values of *M*n and *M*s. Thus, for *ζ*=1 and for the maximum sensitivity *M*s obtained by applying method tunλu, a smaller value of the sensitivity to measurement noise *M*n is obtained by applying the PID optimization method opt2A. The results of this analysis are presented in Table 2 and Fig. 6. As in Table 1, the controller is tuned by using the model *G*m(*s*) in (5) and then applied to the process *G*p3(*s*) to obtain IAE, *M*s and *M*p, where

$$G\_{\rm p3}(s) = \frac{1.507(3.42s+1)(1 - 0.816s)}{(577s+1)(18.1s+1)(0.273s+1)(104.6s^2+15s+1)} \,\text{.}\tag{23}$$

A lower value of the IAE is obtained, for almost the same robustness, by using a higher value of the sensitivity to measurement noise. However, for the lower value of *M*n the controller activity and, as a result, the actuator activity are considerably reduced. Thus, a comparison of the IAE obtained by PID controllers with the same robustness is meaningless if the sensitivity to measurement noise *M*n is not specified, as demonstrated in Fig. 6. This fact is frequently ignored.
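The entries of Table 3 below can be partially cross-checked from the frequency responses alone. The following sketch (NumPy only) evaluates the maximum sensitivity for the *G*p3/opt2A controller of Table 3 applied to (23). Two assumptions are made about definitions given elsewhere in the chapter: the controller transfer function is written in the form implied by (28), and *M*n is taken as the high-frequency controller gain |*k*d|/*T*f, which reproduces the tabulated *M*n:

```python
import numpy as np

w = np.logspace(-4, 2, 100000)   # frequency grid [rad/s]
s = 1j * w

# Process Gp3(s), Eq. (23)
G = (1.507 * (3.42*s + 1) * (1 - 0.816*s)) / (
    (577*s + 1) * (18.1*s + 1) * (0.273*s + 1) * (104.6*s**2 + 15*s + 1))

# Controller values from Table 3, row Gp3/opt2A (zeta = 1)
k, ki, kd, Tf = 17.1994, 0.1788, 316.59, 4.6621
C = (kd*s**2 + k*s + ki) / (s * (Tf*s + 1))   # PID form implied by (28)

Ms = np.max(np.abs(1.0 / (1.0 + C*G)))   # maximum sensitivity
Mn = abs(kd) / Tf                        # noise sensitivity Mn = |kd|/Tf

print(round(Mn, 2))   # 67.91, as reported in Table 3
```

The computed *M*s lands close to the tabulated design value *M*s=2.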


Table 2. Process *G*p3(*s*) in the loop with the PID controllers. Tuning method (17)-(19), tunλu, and optimization (16)-(18), (21)-(22), opt2A, for *ζ*=1.

Concluding this section, the constrained PI/PID controller optimization methods proposed in (Mataušek & Šekara, 2011) are compared with the constrained PID controller optimization method proposed in (Šekara & Mataušek, 2010a), optPID1, and the constrained PI controller optimization method proposed in (Šekara & Mataušek, 2008), opt2. The test batch of stable processes, processes having oscillatory dynamics, integrating and unstable processes used in this analysis is defined by the transfer functions *G*p1(s), *G*p2(s), *G*p3(s) and

$$G\_{\rm p4}(s) = \frac{e^{-5s}}{(s+1)^3},\quad G\_{\rm p5}(s) = \frac{e^{-s}}{9s^2 + 0.24s + 1},\tag{24}$$

$$G\_{\rm p6}(s) = \frac{e^{-5s}}{s(s+1)(0.5s+1)(0.25s+1)(0.125s+1)},\quad G\_{\rm p7}(s) = \frac{2e^{-5s}}{(10s-1)(2s+1)},\tag{25}$$

with parameters *k*u, *ω*u, *φ*, *A*, *A*0 presented in Appendix. Comparison of the methods for PID controller tuning is presented in Table 3. Comparison of the methods for PI controller tuning is presented in Table 4 and Fig. 7.

Fig. 6. Set-point, *R*(*s*)=1/*s*, and load disturbance, *D*(*s*)=-10exp(-400*s*)/*s*, step responses. *G*p3(s) and PID controllers tuned by: a) tunλu, *b*=0.5; b) opt2A, *b*=0.6. Measurement noise is obtained by passing uniform random noise 1 through a low-pass filter *F*(*s*)=0.5/(10*s*+1).


| Process/method | *k* | *k*i | *k*d | *T*f | IAE | *M*n | *M*s | *M*p | *ζ*z | *ζ* |
|---|---|---|---|---|---|---|---|---|---|---|
| *G*p3/max(k) | 17.0778 | 0.2372 | 320.06 | 4.7131 | 4.83 | 67.91 | 2.00 | 1.56 | 0.98 | - |
| *G*p3/optPID | 17.1037 | 0.2303 | 315.14 | 4.6407 | 4.84 | 67.91 | 2.00 | 1.54 | - | - |
| *G*p3/opt2A | 17.1994 | 0.1788 | 316.59 | 4.6621 | 5.62 | 67.91 | 2.00 | 1.41 | - | 1 |
| *G*p3/opt2A0 | 16.9411 | 0.2670 | 312.65 | 4.6040 | 4.87 | 67.91 | 2.00 | 1.69 | - | 0.75 |
| *G*p3/opt2A | 16.8802 | 0.2083 | 268.32 | 3.9513 | 4.92 | 67.91 | 2.00 | 1.59 | - | 0.80 |
| *G*p5/max(ki) | -0.3090 | 0.0654 | 0.8640 | 1.7597 | 21.17 | 0.49 | 2.00 | 1.03 | 0.65 | - |
| *G*p5/optPID | -0.3032 | 0.0651 | 0.8280 | 1.6864 | 21.87 | 0.49 | 2.00 | 1.07 | - | - |
| *G*p5/opt2A | -0.4139 | 0.0336 | 0.9398 | 1.9140 | 30.04 | 0.49 | 2.00 | 1.04 | - | 1 |
| *G*p5/opt2A0 | -0.3369 | 0.0583 | 0.8948 | 1.8223 | 20.29 | 0.49 | 2.00 | 1.02 | - | 0.65 |
| *G*p5/opt2A | -0.3542 | 0.0528 | 0.8860 | 1.8044 | 20.30 | 0.49 | 2.00 | 1.02 | - | 0.70 |
| *G*p6/max(k) | 0.1177 | 0.0063 | 0.3961 | 0.8353 | 207.22 | 0.47 | 2.00 | 1.76 | 1.18 | - |
| *G*p6/optPID | 0.1181 | 0.0054 | 0.3736 | 0.7878 | 208.65 | 0.47 | 2.00 | 1.63 | - | - |
| *G*p6/opt2A | 0.1133 | 0.0043 | 0.2373 | 0.5003 | 234.73 | 0.47 | 2.01 | 1.60 | - | 1 |
| *G*p6/opt2A | 0.1160 | 0.0043 | 0.2709 | 0.5712 | 233.50 | 0.47 | 2.01 | 1.55 | - | 1.05 |
| *G*p7/max(k) | 0.8608 | 0.0158 | 3.3101 | 0.1418 | 73.50 | 23.35 | 3.61 | 3.39 | 1.88 | - |
| *G*p7/optPID | 0.8609 | 0.0150 | 3.2946 | 0.1411 | 75.00 | 23.35 | 3.61 | 3.33 | - | - |
| *G*p7/opt2A | 0.8543 | 0.0106 | 2.9385 | 0.1258 | 93.96 | 23.35 | 3.54 | 3.18 | - | 1.3 |
| *G*p7/opt2A0 | 0.8060 | 0.0093 | 2.3759 | 0.1017 | 107.41 | 23.35 | 3.61 | 3.77 | - | 1.1 |

Table 3. PID controllers, obtained by applying model *G*m(*s*) and tuning methods: max(k), max(ki); (31)-(35) optPID; (16)-(18), (21)-(22) opt2A and opt2A0.

In Table 3 optimization (16)-(18), (21)-(22) is performed for the stable *G*p3(s), *G*p5(s) and unstable *G*p7(s) processes by using *G*m(*s*) with two quadruplets: {*k*u, *ω*u, *φ*, *A*}, denoted as opt2A, and


{*k*u, *ω*u, *φ*, *A*0}, denoted opt2A0. As mentioned previously, for integrating processes *A*=*ω*u. Almost the same performance/robustness tradeoff is obtained for *A* and *A*0, as supposed in Section 2. This result is important since it confirms that an adequate approximation of the frequency response of the stable and unstable processes around *ω*u can be used in the optimization (16)-(18) and (21)-(22), instead of the model *G*m(i*ω*) in (5). Obviously, the same applies for integrating processes. The advantage of the constrained PID controller optimization (16)-(18) and (21)-(22) is that only two nonlinear algebraic equations have to be solved, with very good initial conditions for the unknown parameters *λ* and *ω*0. Moreover, the optimization is performed for the desired values of *M*s, *M*n and for the desired closed-loop system damping ratio *ζ*.

Finally, the results of the PI controller optimization are demonstrated in Table 4 and in Fig. 7. By repeating calculations for a few values of *ζ*, for the same values of *M*s and *M*p, the same (minimal) value of the IAE is obtained by applying method opt3, defined by (17) and (21)-(22), and the method opt2, defined by (13)-(14). As mentioned previously, method opt2 is an improvement of the method proposed in (Aström et al., 1998), denoted here as method opt1.

Fig. 7. Set-point and load disturbance step responses: *y*(*t*) (left) and *u*(*t*) (right). PI controllers from Table 4: opt1 *b*=0, opt2 *b*=0.6, opt3 *b*=0.6. In a) and b) *G*p1(*s*), *D*(*s*)=-exp(-4*s*)/*s*; in c) and d) *G*p4(*s*), *D*(*s*)=-0.5exp(-80*s*)/*s*.


| Process/method | *k* | *k*i | IAE | *M*s | *M*p | *ζ*z | *ζ* |
|---|---|---|---|---|---|---|---|
| *G*p1/opt1 | 2.6707 | 6.4739 | 0.19 | 1.98 | 1.58 | 1 | - |
| *G*p1/opt2 | 3.1874 | 6.1391 | 0.16 | 1.99 | 1.48 | 0.52 | - |
| *G*p1/opt3 | 3.2119 | 6.1083 | 0.16 | 1.99 | 1.48 | - | 0.95 |
| *G*p3/opt1 | 7.4060 | 0.0692 | 15.61 | 1.92 | 1.65 | 1 | - |
| *G*p3/opt2 | 8.1456 | 0.0680 | 14.72 | 1.94 | 1.60 | 0.48 | - |
| *G*p3/opt3 | 8.1355 | 0.0679 | 14.73 | 1.94 | 1.60 | - | 0.90 |
| *G*p4/opt1 | 0.3248 | 0.1259 | 12.04 | 2.16 | 1.35 | 1 | - |
| *G*p4/opt2 | 0.4608 | 0.1137 | 10.23 | 2.11 | 1.18 | 0.69 | - |
| *G*p4/opt3 | 0.4651 | 0.1128 | 10.19 | 2.10 | 1.18 | - | 0.90 |

Table 4. PI controllers, obtained for *M*s=2 by applying model (5) and methods: (Aström et al., 1998) opt1, (13)-(14) opt2, and (17), (21)-(22) opt3.

#### **4. Closed-loop estimation of model parameters**

Approximation of process dynamics, around the operating regime, can be defined by some transfer function *G*p(*s*) obtained from open-loop or closed-loop process identification. One two-step approach (Hjalmarsson, 2005) is based on the application of high-order ARX model identification in the first step. In the second step, to reduce the variance of the obtained estimate of the frequency response of the process, caused by the measurement noise, this ARX model is reduced to a low-order model *G*p(*s*). By applying this procedure an adequate approximation *G*p(i*ω*) of the unknown Nyquist curve can be obtained in the region around the ultimate frequency *ω*u. As demonstrated for the Ziegler-Nichols tuning, in Fig. 3-c and Fig. 4-b, such approximation of the unknown Nyquist curve is of essential importance for designing an adequate PID controller. The same applies for the successful PID optimization under constraints on the desired values of *M*n and *M*s, as demonstrated in Table 5 for the value of *A* defined as in (5) and for *A*=*A*0.
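The two-step idea can be illustrated on a toy discrete-time example: fit an ARX model by linear least squares, then read off its frequency response. The second-order system, orders and signal lengths below are hypothetical (and noise-free, so the fit is exact); they are not data from the cited work:

```python
import numpy as np

rng = np.random.default_rng(0)

# "True" process: y[k] = 1.5 y[k-1] - 0.7 y[k-2] + 0.5 u[k-1] + 0.25 u[k-2]
a_true = [1.5, -0.7]
b_true = [0.5, 0.25]

N = 2000
u = rng.choice([-1.0, 1.0], size=N)      # PRBS-like excitation
y = np.zeros(N)
for k in range(2, N):
    y[k] = (a_true[0]*y[k-1] + a_true[1]*y[k-2]
            + b_true[0]*u[k-1] + b_true[1]*u[k-2])

# ARX least squares with regressors [y[k-1], y[k-2], u[k-1], u[k-2]]
Phi = np.column_stack([y[1:-1], y[:-2], u[1:-1], u[:-2]])
theta, *_ = np.linalg.lstsq(Phi, y[2:], rcond=None)

def arx_freq(theta, w):
    """Frequency response of the fitted ARX model at w rad/sample."""
    z = np.exp(1j * w)
    a1, a2, b1, b2 = theta
    return (b1 * z**-1 + b2 * z**-2) / (1 - a1 * z**-1 - a2 * z**-2)

print(np.round(theta, 4))
```

With noise added, one would fit a deliberately high order and then reduce the model, exactly as in the two-step procedure described above.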

The Closed-Loop (CL) system identification can be performed by using indirect or direct identification methods. In indirect CL system identification methods it is assumed that the controller in operation is linear and a priori known. Direct CL system identification methods are based only on the plant input and output data (Agüero et al., 2011). Finally, the identification can be based on simple tests, as initiated by Ziegler and Nichols (1942), to obtain an IPDT model (1). Later on, this approach was extended to obtain the FOPDT model and the Second-Order Plus Dead-Time (SOPDT) model, and, for integrating processes, the IFOPDT model. The SOPDT model can be obtained from *k*u, *ω*u, *φ*, *A*. In this case it is defined by

$$G\_{\rm SO}(s) = \frac{e^{-Ls}}{as^2 + bs + c}\,,\tag{26}$$

where parameters *a*, *b*, *c* and *L* are functions of *k*u, *ω*u, *φ* and *A*, obtained from the tangent rule (Šekara & Mataušek, 2010a). This model (26) is an adequate SOPDT approximation of the Nyquist curve *G*p(i*ω*) in the region around the ultimate frequency *ω*u, for a large class of stable processes, processes with oscillatory dynamics, integrating and unstable processes.


The recently proposed new Phase-Locked Loop (PLL) estimator (Mataušek & Šekara, 2011), its improvement (Šekara & Mataušek, 2011c), and the new relay SheMa estimator (Šekara & Mataušek, 2011b) make possible the determination of parameters *k*u, *ω*u, *φ* and *A*0 of the model *G*m(*s*) in closed-loop experiments, without breaking the control loop in operation. This property of the proposed PLL and SheMa estimators is important for practice, since breaking of control loops in operation is mainly ignored by plant operators, especially in the case of controlling processes with oscillatory dynamics, integrating or unstable processes. The PLL estimator can be applied in the case when the controller in operation is an unknown linear controller, while the SheMa estimator can be applied when the controller in operation is unknown and nonlinear. In that sense, the SheMa estimator belongs to the direct CL system identification methods, based only on the plant input and output data, as in (Agüero et al., 2011). Both procedures, SheMa and PLL, are based on the parameterization presented in (Šekara & Mataušek, 2010a; Mataušek & Šekara, 2011). Estimates of parameters *k*u−, *k*u, *k*u+ and *ω*u−, *ω*u, *ω*u+, obtained for arg{*G*p(i*ω*)}=−*π*−Δ, −*π* and −*π*+Δ, with Δ=*π*/36, are used for determining *φ* and *A*0, as defined in (Mataušek & Šekara, 2011).

In this section, an improvement of the new PLL estimator from (Mataušek & Šekara, 2011) is presented in Fig. 8. The improvement, proposed by Šekara and Mataušek (2011c), consists of adding two integrators at the input to the PLL estimator from (Mataušek & Šekara, 2011). Inputs to these integrators are defined by outputs of the band-pass filters AF1, used to eliminate the load disturbance. Outputs of these integrators are passed through a cascade of the band-pass filters AF*m*, *m*=2,3,4. All filters AF*m*, *m*=1,2,3,4, are tuned to the ultimate frequency. Such implementation of the PLL estimator eliminates the effects of the high measurement noise and load disturbance. Blocks AF*m*, *m*=1,2,3,4, are implemented as presented in (Mataušek & Šekara, 2011), while implementations of the blocks for determining arg{*G*p(i*ω*)} and |*G*p(i*ω*)| are presented in (Šekara & Mataušek, 2011c).

The PLL estimator from Fig. 8 is applied to processes *G*p8(*s*)=exp(-*s*)/(2*s*+1) and *G*p9(*s*)=4exp(-2*s*)/(4*s*-1) in the loop with the known PID controller. Estimation of parameters *k*u−, *k*u, *k*u+ and *ω*u−, *ω*u, *ω*u+ is presented in Fig. 9. Highly accurate estimates of *k*u−, *k*u, *k*u+ and *ω*u−, *ω*u, *ω*u+ are obtained in the presence of the high measurement noise and load disturbance. Since these parameters are used to determine *φ* and *A*0, this experiment demonstrates that a highly accurate estimate of the quadruplet {*k*u, *ω*u, *φ*, *A*0} can be obtained, in the presence of the high measurement noise and load disturbance, by the PLL estimator from (Šekara & Mataušek, 2011c). In Fig. 10, estimation of the unknown Nyquist curve of the unstable process in the loop with the PID controller is demonstrated.
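The values toward which these estimates should converge can be cross-checked analytically for *G*p8(*s*)=exp(-*s*)/(2*s*+1): the ultimate frequency solves arg *G*p8(i*ω*u) = −*ω*u − arctan(2*ω*u) = −*π*, and *k*u = 1/|*G*p8(i*ω*u)| = √(1+4*ω*u²), since the delay has unit gain. A bisection sketch:

```python
import math

def phase_eq(w):
    # arg Gp8(iw) = -w - atan(2w); ultimate frequency where this equals -pi
    return w + math.atan(2.0 * w) - math.pi

lo, hi = 0.1, 3.0                      # phase_eq changes sign on this interval
for _ in range(60):                    # bisection on the phase condition
    mid = 0.5 * (lo + hi)
    if phase_eq(mid) > 0.0:
        hi = mid
    else:
        lo = mid
wu = 0.5 * (lo + hi)
ku = math.sqrt(1.0 + (2.0 * wu)**2)    # 1/|Gp8(i wu)|
print(round(wu, 3), round(ku, 3))
```

This gives *ω*u ≈ 1.84 rad/s and *k*u ≈ 3.81, the reference values for the PLL experiment of Fig. 9.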

The PLL estimator from (Mataušek & Šekara, 2011; Šekara & Mataušek 2011c) is a further development of the idea firstly proposed in (Crowe & Johnson, 2000) and used in (Clarke & Park, 2003). The SheMa estimator is a further development of the estimator proposed by Aström and Hägglund (1984) as an improvement of the Ziegler-Nichols experiment.

The Ziegler and Nichols (1942) experiment, used to determine *k*u and *ω*u of a process, is performed by setting the integral and derivative gains to zero in the PID controller *C*(*s*) in operation. However, in this approach the amplitude of oscillations is not under control. This drawback is eliminated by Aström and Hägglund (1984). The factors influencing the critical point estimation accuracy in this conventional relay setup are: the use of the describing function method is faced with the fact that higher harmonics are not efficiently filtered out by the process, presence of the load disturbance *d*, and presence of the measurement noise *n*. The first drawback of the conventional relay experiment is eliminated by the modified relay setup (Lee et al., 1995).
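For the conventional relay setup, the critical point is read from the describing function of the ideal relay via the standard Åström-Hägglund relations *k*u ≈ 4*d*/(*πa*) and *ω*u = 2*π*/*T*u, where *d* is the relay amplitude and *a*, *T*u are the amplitude and period of the observed limit cycle. A minimal helper; the numeric readings below are hypothetical:

```python
import math

def relay_estimates(d, a, Tu):
    """Describing-function estimate of the critical point from a relay test.

    d  : relay output amplitude
    a  : amplitude of the process-output limit cycle
    Tu : period of the limit cycle [s]
    """
    ku = 4.0 * d / (math.pi * a)   # ultimate gain estimate
    wu = 2.0 * math.pi / Tu        # ultimate frequency [rad/s]
    return ku, wu

# Hypothetical readings from a relay test
ku, wu = relay_estimates(d=1.0, a=0.35, Tu=4.2)
print(round(ku, 3), round(wu, 3))
```

The neglected higher harmonics are exactly the first accuracy factor listed above; the SheMa setup filters them explicitly.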

Fig. 8. Improved PLL estimator. AF2,3,4 is the cascade of band-pass filters AF*m*, *m*=2,3,4.

Fig. 9. PLL estimates of *k*u−, *k*u, *k*u+ and *ω*u−, *ω*u, *ω*u+, in the presence of the high measurement noise and step load disturbance at *t*=700 s. Process *G*p8(*s*)=exp(-*s*)/(2*s*+1), for: Δ=*π*/36 for 0≤*t*≤300 s, Δ=0 for 300<*t*≤500 s and Δ=−*π*/36 for 500<*t*≤1000 s.


Fig. 10. Estimates (circles) of the Nyquist curve (solid) obtained by the PLL estimator for the desired values *φ*ref of arg{*G*p9(i*ω*)}. Process *G*p9(*s*)=4exp(-2*s*)/(4*s*-1), the noise-free case.

Due to its simplicity, the relay-based setup proposed by Aström and Hägglund (1984) is still a basic part of different methods developed in the area of process dynamics characterization. For example, it is used to generate signals to be applied for determining FOPDT and SOPDT models, using a biased relay (Hang et al., 2002). However, from the viewpoint of the process control system in operation, the estimation based on this setup, and its modifications, is performed in an open-loop configuration: the loop with the controller *C*(*s*) in operation is opened and the process output is connected in feedback with a relay.

In the paper (Šekara & Mataušek, 2011b) a new relay-based setup is developed, with the controller *C*(*s*) in operation. It consists of a cascade of variable band-pass filters AF*m*, from (Clarke & Park, 2003), a new variable band-pass filter *F*mod proposed by Šekara and Mataušek (2011b) and a notch filter *F*NF=1-*F*mod. Center frequencies of the variable band-pass filters AF*m* and *F*mod are at *ω*u.

Highly accurate estimates of *ω*u and *k*u are obtained in the presence of the measurement noise and load disturbance. Also, highly accurate estimates of the Nyquist curve *G*p(i*ω*) at the desired values of arg{*G*p(i*ω*)} are obtained by including into the SheMa the modified relay instead of the ordinary relay. The amplitude *μ* of both relays is equal to *μ*=*πk*u,0*y*ref*ε*0/4, where *k*u,0 is the ultimate gain obtained in the previous activation of the SheMa, *y*ref is the amplitude of the set-point *r* and *ε*0 is a small percent of *y*ref, for example *ε*0=0.1% in the examples presented in (Šekara & Mataušek, 2011b). The proposed closed-loop procedure can be activated or deactivated with small impact on the controlled process output. Further details of the SheMa estimator, including the stability and robustness analyses, and implementation details, are presented in (Šekara & Mataušek, 2011b).
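For example, with a previously estimated *k*u,0 ≈ 3.8, *y*ref = 1 and *ε*0 = 0.1%, the relay amplitude is small compared to the set-point (the numeric values are assumed for illustration only):

```python
import math

k_u0 = 3.8      # ultimate gain from the previous SheMa activation (assumed)
y_ref = 1.0     # set-point amplitude
eps0 = 0.001    # 0.1 % of y_ref, as in (Šekara & Mataušek, 2011b)

mu = math.pi * k_u0 * y_ref * eps0 / 4.0   # relay amplitude
print(round(mu, 5))
```

Such a small amplitude is what allows the SheMa procedure to be activated with small impact on the controlled process output.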

#### **5. Gain scheduling control of stable, integrating, and unstable processes, based on the controller optimization in the classification parameter plane**

For a chosen region in the *ρ*-*φ* classification plane, presented in Fig. 11, the normalized parameters *k*n(*ρ*,*φ*), *k*in(*ρ*,*φ*), *k*dn(*ρ*,*φ*) and *T*fn=|*k*dn(*ρ*,*φ*)|/*m*n of a virtual PIDn controller are calculated in advance by using the process-independent model *G*n(i*ω*n, *ρ*, *φ*) in (10).

Then, parameters *k*, *k*i, *k*d and *T*f of the PID controller (3), *F*C(*s*)≡1, are obtained, for the process classified in the chosen region of the *ρ*-*φ* plane, by using the estimated *k*u, *ω*u, *φ*, *A* and the following relations

$$k = k\_{\rm u}k\_{\rm n}, \quad k\_{\rm i} = k\_{\rm u}\omega\_{\rm u}k\_{\rm in}, \quad k\_{\rm d} = k\_{\rm u}k\_{\rm dn}/\omega\_{\rm u}, \quad T\_{\rm f} = T\_{\rm fn}/\omega\_{\rm u}\,.\tag{27}$$

Depending on the method applied to obtain parameters *k*n, *k*in, *k*dn and *T*fn=|*k*dn|/*m*n of a PIDn controller, parameters *k*, *k*i, *k*d and *T*f of the PID controller (3), *F*C(*s*)≡1, guarantee the desired *M*s and the sensitivity to measurement noise equal to *M*n=|*k*u|*m*n, or guarantee the desired *M*s, *ζ* and *M*n=|*k*u|*m*n. Since parameters *k*n, *k*in, *k*dn and *T*fn=|*k*dn|/*m*n are determined in advance, they can be memorized as look-up tables in the *ρ*-*φ* plane. Besides, this can be done for different values of *M*s, *m*n and *ζ*. These look-up tables define a new Gain Scheduling Control (GSC) concept. An important feature of this GSC is that these look-up tables, obtained for some values of *M*s, *m*n and *ζ* from the model *G*n(i*ω*n, *ρ*, *φ*), are process-independent. This avoids the enormous resources required for performing experiments on the plant in order to define the standard GSC as the look-up tables of PID controller parameters for this plant and the desired region of operating regimes. Thus, the important and exclusive feature of the new GSC is that a desired performance/robustness tradeoff can be obtained for a large region of dynamic characteristics of processes in different plants and different operating regimes, covered by the look-up tables of parameters *k*n, *k*in, *k*dn in the *ρ*-*φ* classification plane.

Fig. 11. Classification *ρ*-*φ* parameter plane, with processes *G*p*j*(*s*), *j*=1,2,...,9. Stable processes are classified in the region 0<*ρ*<1, 0<*φ*<*π*, integrating processes are classified as *ρ*=1, 0<*φ*<*π*/2 processes. Unstable processes are classified outside this region.
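The GSC mechanism thus reduces to: classify the process by (*ρ*, *φ*), look up the normalized gains, and denormalize with (27). A sketch, where the table entries are the three grid points quoted later in the text for *G*p5 from (Šekara & Mataušek, 2011a, Table A4), while *k*u, *ω*u and *m*n are placeholder values, not the chapter's estimates for any process:

```python
# Look-up table of normalized PIDn gains (kn, kin, kdn), indexed by (rho, phi in
# degrees); the entries are the three grid points quoted for Gp5 from
# (Šekara & Mataušek, 2011a, Table A4).
table = {
    (0.15, 20): (-2.4122, 0.5988, 3.9353),
    (0.20, 20): (-1.7022, 0.4125, 2.8783),
    (0.20, 30): (-1.6626, 0.4164, 2.3017),
}

def denormalize(kn, kin, kdn, mn, ku, wu):
    """Relations (27): PID gains from normalized gains and estimated ku, wu."""
    k = ku * kn
    ki = ku * wu * kin
    kd = ku * kdn / wu
    Tf = (abs(kdn) / mn) / wu      # Tf = Tfn/wu with Tfn = |kdn|/mn
    return k, ki, kd, Tf

# Placeholder estimates (illustrative only):
ku, wu, mn = -1.0, 0.5, 40.0
k, ki, kd, Tf = denormalize(*table[(0.20, 30)], mn, ku, wu)
print(round(k, 4), round(ki, 4), round(kd, 4), round(Tf, 6))
```

Since the look-up tables are process-independent, only the quadruplet estimate changes from plant to plant; the denormalization step stays the same.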

Now, this GSC PID controller tuning, performed by using (27), will be demonstrated by the two different procedures applied for obtaining parameters *k*n, *k*in, *k*dn and *T*fn=|*k*dn|/*m*n of the PIDn controller for integrating and stable processes. Stable processes having a weakly damped impulse response are denoted as processes having oscillatory dynamics, while processes with damped impulse response are denoted as stable processes.

For integrating processes, parameters *k*n, *k*in, *k*dn and *T*fn=|*k*dn|/*m*n of the PIDn controller depend only on the angle *φ*, since *ρ*=1. In this case, for desired values of *M*s and *m*n, the PID controller parameters (27) are obtained from tuning formulae for *k*n(*φ*), *k*in(*φ*) and *k*dn(*φ*), given in Appendix as tun1.


For processes having the oscillatory dynamics, look-up tables and tuning formulae are derived in (Šekara & Mataušek, 2011a) for *M*s=2 and *m*n=40, in the region 0.1≤*ρ*≤0.2, 0.1745≤*φ*≤1.0472 of the *ρ*-*φ* classification plane of Fig. 11. These tuning formulae, in Appendix denoted as tun2, are applied to determine parameters *k*, *k*i, *k*d and *T*f for the process having the oscillatory dynamics *G*p5(s), classified as the process *ρ*=0.1971, *φ*=0.3679 (Table 5, *G*p5-tun2). To illustrate the direct application of the look-up tables from (Šekara & Mataušek, 2011a, Table A4) and the interpolation procedure defined in Appendix, Fig. 17, since this process is classified as *ρ*=0.1971, *φ*=21.0791° (0.3679 rad), the following points are determined from (Šekara & Mataušek, 2011a, Table A4) and Appendix, Fig. 17: *ρ*1,1=0.15, *φ*1,1=20°, *ρ*1,2=0.2, *φ*1,2=20° and *ρ*2,2=0.2, *φ*2,2=30°. Parameters (*k*n, *k*in, *k*dn) are defined by: (-2.4122, 0.5988, 3.9353) for *ρ*1,1, *φ*1,1, (-1.7022, 0.4125, 2.8783) for *ρ*1,2, *φ*1,2 and (-1.6626, 0.4164, 2.3017) for *ρ*2,2, *φ*2,2. Then, by using the three-point interpolation from Appendix, upper triangle (*α*ru=0.0578, *β*ru=0.1971), one obtains the parameters in Table 5, *G*p5-GSC: *k*=-0.4220, *k*i=0.0384, *k*d=1.9116, *T*f=1.947.
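A generic three-point interpolation over a triangle of the *ρ*-*φ* grid can be sketched with barycentric weights. The exact weighting convention of Appendix, Fig. 17 may differ, so this is only an illustration of the idea, using the *G*p5 grid points quoted above:

```python
def barycentric_weights(p, v1, v2, v3):
    """Barycentric coordinates of point p inside the triangle (v1, v2, v3)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, v1, v2, v3
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2

def interp3(p, pts, vals):
    """Linear (three-point) interpolation of grid values at point p."""
    return sum(w * v for w, v in zip(barycentric_weights(p, *pts), vals))

# Upper-triangle grid points (rho, phi in degrees) and kn values quoted for Gp5
pts = [(0.15, 20.0), (0.20, 20.0), (0.20, 30.0)]
kn_vals = [-2.4122, -1.7022, -1.6626]
kn = interp3((0.1971, 21.0791), pts, kn_vals)
print(round(kn, 4))
```

Barycentric interpolation is exact for any function that is linear in (*ρ*, *φ*), which is the property that justifies using a sufficiently fine grid in the look-up tables.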

For stable processes, in a large region of the *ρ*-*φ* plane, look-up tables of parameters *k*n, *k*in and *k*dn are defined for *M*s=2 and *m*n=2 (Šekara & Mataušek, 2011a, Tables A1-A3). These look-up tables are applied in the present paper to determine parameters *k*, *k*i, *k*d and *T*f for the stable process *G*p3(s). This process is classified as the process *ρ*=0.9808, *φ*=0.6783 rad (38.8637°). Thus, for *G*p3(*s*) parameters (*k*n, *k*in, *k*dn) can be obtained from the three points in the *ρ*-*φ* classification plane (Appendix, Fig. 17): *ρ*1,1=0.95, *φ*1,1=30°; *ρ*2,1=0.95, *φ*2,1=40° and *ρ*2,2=1, *φ*2,2=40° (0.6981 rad). Two points are used for stable processes, (0.5086, 0.1349, 0.6569) for *ρ*1,1, *φ*1,1 and (0.5013, 0.1261, 0.5332) for *ρ*2,1, *φ*2,1, from the look-up tables (Šekara & Mataušek, 2011a, Tables A1-A3), while the data (0.5036, 0.1109, 0.5332) for *ρ*2,2, *φ*2,2 are obtained from the tuning formulae derived for integrating processes in (Šekara & Mataušek, 2011a), given in Appendix as tun1. Then, by using the three-point interpolation from Appendix, Fig. 17, lower triangle (*α*ll=0.6166, *β*ll=0.1136), one obtains the parameters presented in Table 5, *G*p3-GSC: *k*=17.0973, *k*i=0.2307, *k*d=315.2928 and *T*f=4.6430.


Table 5. PID controllers: stable process *G*p3(*s*), method GSC-Appendix; stable process having oscillatory dynamics *G*p5(s), method tun2 and method GSC-Appendix; integrating process *G*p6(*s*), method tun1.

#### **5.1 Experimental results**

Experimental results, presented in Fig. 12, are obtained by using the laboratory thermal plant. It consists of a thin plate made of aluminum, *L*a=0.1 m long and *h*=0.03 m wide (Mataušek & Ribić, 2012). Temperature *T*(*x*,*t*) is distributed along the plate, from *x*=0 to *x*=*L*a, and measured by precision sensors LM35 (TO-92), at *x*=0 and *x*=*L*a. The plate is heated by a three-terminal adjustable regulator LM317 (TO-220) at position *x*=0. The manipulated variable is the dissipated power of the heater at *x*=0. The input to the heater is the control variable *u*(*t*) (%), defined by the output of the PID controller. The controlled variable is *y*(*t*)=*T*(*L*a,*t*), measured by the sensor at position *x*=*L*a. The temperature sensor at *x*=0 is used in the safety device, to prevent overheating when 70 °C ≤ *T*(0,*t*). The anti-windup implementation of the PID controller (3), *F*C(*s*)≡1, is given by

$$
u\_{\rm C} = T\_{\rm aw} \left( \frac{bks + k\_{\rm i}}{T\_{\rm aw}s + 1}\,r - \frac{k\_{\rm d}s^2 + ks + k\_{\rm i}}{(T\_{\rm aw}s + 1)(T\_{\rm f}s + 1)}\,y \right) + \frac{1}{T\_{\rm aw}s + 1}\,u\,. \tag{28}
$$

The saturation element is defined by the input *u*C(*t*) and output *u*(*t*):

$$u = \begin{cases} l\_{\rm low}, & u\_{\rm C} \le l\_{\rm low} \\ u\_{\rm C}, & l\_{\rm low} < u\_{\rm C} < l\_{\rm high} \\ l\_{\rm high}, & u\_{\rm C} \ge l\_{\rm high} \end{cases}\tag{29}$$

Obviously, in the linear region *l*low<*u*C(*t*)<*l*high of the saturation element, where *u*C(*t*)≡*u*(*t*), one obtains (3), *F*C(*s*)≡1, from (28).
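The combination of (28) and (29) can be mimicked in discrete time. The sketch below is an illustrative Euler discretization using back-calculation anti-windup, not the exact realization (28); the gains and the first-order test process are hypothetical:

```python
def pid_antiwindup_step(state, r, y, params, dt):
    """One Euler step of a PID with set-point weighting b, filtered derivative
    (time constant Tf), back-calculation anti-windup (tracking constant Taw)
    and the saturation (29) with limits l_low, l_high."""
    k, ki, kd, Tf, b, Taw, l_low, l_high = params
    I, D_prev, y_prev = state

    D = (Tf * D_prev + kd * (y_prev - y)) / (Tf + dt)  # filtered derivative of -y
    u_C = k * (b * r - y) + I + D

    u = min(max(u_C, l_low), l_high)                   # saturation element (29)

    # integrate ki*e plus the back-calculation correction (u - u_C)/Taw
    I = I + dt * (ki * (r - y) + (u - u_C) / Taw)
    return (I, D, y), u

# Closed-loop demo on a first-order process dy/dt = -y + u (illustrative values)
params = (2.0, 1.0, 0.5, 0.1, 0.25, 15.0, 0.0, 100.0)  # k, ki, kd, Tf, b, Taw, l_low, l_high
state, y, dt = (0.0, 0.0, 0.0), 0.0, 0.01
for _ in range(2000):
    state, u = pid_antiwindup_step(state, 1.0, y, params, dt)
    y += dt * (-y + u)
print(round(y, 3))
```

While the saturation is inactive the back-calculation term vanishes and the controller reduces to the linear PID, mirroring how (3) is recovered from (28) in the linear region.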

Fig. 12. Experimental results. Set-point and load step (-20% change of the controller output at *t*=1600 s) responses of the real plant, with the PI and PID controller: a) control variable *u*(*t*) and b) controlled variable *y*(*t*). The real plant, with the anti-windup PID controller under the disturbance induced by activating/deactivating the fan: c) *u*(*t*) and d) *y*(*t*).


Transfer function *G*p3(*s*), used for determining the parameters of the PID controller applied in the real-time experiment, was obtained previously in (Mataušek & Ribić, 2012). By applying a Pseudo-Random Binary Sequence to *u*(*t*), the open-loop response *y*(*t*) of the laboratory thermal plant is obtained. From these *u*(*t*) and *y*(*t*), a 100th-order ARX model is determined and then reduced to the 5th-order transfer function *G*p3(*s*) in (Mataušek & Ribić, 2012). This model of the process is used here to determine the quadruplet {*k*u, *ω*u, *φ*, *A*} presented in the Appendix. Thus, the laboratory thermal plant is classified as the process *ρ*=0.9808, *φ*=0.6783. The PID controller applied to the real thermal plant is then determined by using the look-up tables of parameters *k*n(*ρ*,*φ*), *k*in(*ρ*,*φ*), *k*dn(*ρ*,*φ*) for stable processes, and of parameters *k*n(*φ*), *k*in(*φ*), *k*dn(*φ*) for integrating processes, previously determined in (Šekara & Mataušek, 2011a). This procedure, used to obtain the PID in Table 5, row *G*p3-GSC, and the results obtained by this PID controller, presented in Fig. 12, demonstrate that look-up tables of parameters *k*n, *k*in and *k*dn, determined in advance, define a process-independent GSC applicable for obtaining the desired performance/robustness tradeoff for a real plant classified in the *ρ*-*φ* parameter plane. For *T*i=*k*/*k*i and *T*d=*k*d/*k*, parameter *T*aw=15 s is obtained from *T*aw=*pT*i+(1-*p*)*T*d, for *p*=0.2, and *l*low=0, *l*high=100%, *b*=0.25.

The closed-loop experiment in Fig. 12-a and Fig. 12-b is used to demonstrate the advantages of the designed PID controller over the PI controller from Table 4, row *G*p3/opt3, defined by: *k*=8.1355, *k*i=0.0679, and *b*=0.5. This experiment starts from temperature *T*(*L*a,*t*)≈45 °C, as presented in Fig. 12-b. Then, at *t*=1000 s, the set point is changed to *r*=45 °C+*r*0, *r*0=5 °C. At *t*=1600 s a load disturbance is inserted as a step change of the controller output equal to -20%. The improvement in performance obtained by the PID controller is evident. As expected, this is obtained with a greater variation of the control signal *u*PID(*t*) than that of *u*PI(*t*). This is the reason why the PID controller from Table 2, row tunλu, having the greater value *M*n=217.7, is not applied to the real thermal plant.

The closed-loop experiment presented in Fig. 12-c and Fig. 12-d starts from the steady-state temperature *T*(*L*a,*t*)≈50 °C by activating a fan at *t*=400 s. Then, at *t*=600 s, the fan is switched off. The action of the fan induces a strong disturbance, as seen from the control signal *u*(*t*) in Fig. 12-c. It should be observed that the anti-windup action is activated twice, around 410 s and 625 s. The anti-windup action is effective and rejection of the disturbance is fast, as seen from Fig. 12-d.

### **6. Conclusion**

The extension of the Ziegler-Nichols process dynamics characterization, developed in (Šekara & Mataušek, 2010a; Mataušek & Šekara, 2011), is defined by the model (5). Based on this model, a procedure is derived for classifying a large class of stable, integrating and unstable processes into a two-parameter *ρ*-*φ* classification plane (Šekara & Mataušek, 2011a). As a result of this classification, a new GSC concept is developed. In the *ρ*-*φ* classification plane, parameters *g*n(*ρ*,*φ*)={*k*n(*ρ*,*φ*), *k*in(*ρ*,*φ*), *k*dn(*ρ*,*φ*)} and *T*fn(*ρ*,*φ*)=|*k*dn(*ρ*,*φ*)|/*m*n of a virtual PIDn controller can be calculated in advance, to satisfy robustness defined by *M*s and sensitivity to measurement noise defined by *m*n. It is also possible to satisfy *M*s, *m*n and the closed-loop system damping ratio *ζ*. Calculation of parameters *g*n(*ρ*,*φ*) and *T*fn(*ρ*,*φ*) is process-independent. The calculation is performed by using the model *G*n(*s*n,*ρ*,*φ*), defined by the values of *ρ* and *φ*: in the range 0≤*ρ*<1 for stable processes, *ρ*=1 for integrating processes, and by values of *ρ* and *φ* outside these regions for unstable processes.

Parameters *g*n(*ρ*,*φ*), calculated for a given region in the *ρ*-*φ* classification plane, are memorized as process-independent look-up tables. Then, for a process *G*p(*s*) classified into this region of the *ρ*-*φ* classification plane, the parameters of a real PID controller *k*, *k*i, *k*d, *T*f are obtained directly from *g*n(*ρ*,*φ*), *T*fn(*ρ*,*φ*) and the estimated quadruplet {*k*u, *ω*u, *φ*, *A*} or {*k*u, *ω*u, *φ*, *A*0} for stable/unstable processes, or the triplet {*k*u, *ω*u, *φ*} for integrating processes. It is demonstrated by simulations that for the real *M*n equal to *M*n=|*k*u|*m*n, the desired *M*s and *ζ* are obtained when a real PID controller, obtained by the proposed GSC, is applied to the process *G*p(*s*). The desired performance/robustness tradeoff can be accurately predicted. Namely, the performance index IAE and robustness index *M*s obtained on the model *G*m(*s*) in (5) are almost the same as those obtained for the process *G*p(*s*), as confirmed here and by a large test batch considered in (Šekara & Mataušek, 2010a; Mataušek & Šekara, 2011; Šekara & Mataušek, 2011a).

A set of new constrained PID optimization techniques is derived for determining the four parameters *k*, *k*i, *k*d, *T*f of the PID controller. One of them has a unique property: the unknown parameters are obtained as the solution of only two nonlinear algebraic equations, with good initial values of the two unknown parameters, determined to satisfy the desired values of *M*s and *M*n for a given desired value of the closed-loop system damping ratio *ζ*. Thus, the critically damped closed-loop system response is obtained for *ζ*=1. Two extensions of the PLL-based and relay-based procedures are derived in (Mataušek & Šekara, 2011; Šekara & Mataušek, 2010b; 2011c; 2011b) for determining the quadruplet {*k*u, *ω*u, *φ*, *A*0}. These procedures can be applied for closed-loop PID controller tuning/retuning, in the presence of measurement noise and load disturbance, without breaking the loop of the controller in operation.

Process-independent look-up tables of parameters *g*n(*ρ*,*φ*), defining the process-independent GSC, can be applied by using any process dynamics characterization defined by the estimated frequency response of the process around the ultimate frequency. This is demonstrated in the present chapter by applying a model obtained previously by a high-order ARX identification of a laboratory thermal plant, then reduced to the fifth-order *G*p3(*s*), used here to determine the quadruplet {*k*u, *ω*u, *φ*, *A*}. This quadruplet is applied to determine the parameters of the real PID, by using the look-up tables of parameters *g*n(*ρ*,*φ*) calculated previously in (Šekara & Mataušek, 2011a). As confirmed by the experimental results, the proposed process-independent GSC is effective. Finally, it is believed that the material presented in this chapter will initiate further development of the proposed process-independent GSC and its implementation in advanced controllers.

#### **7. Appendix**

Parameters *η*1, *η*2, *β*1, *β*2 and *β*3

$$\eta\_1 = \frac{\alpha\_1 \sin(\omega\_{\rm u} \tau) + \alpha\_2 \cos(\omega\_{\rm u} \tau)}{\omega\_{\rm u}^2}, \quad \eta\_2 = \frac{\alpha\_2 \sin(\omega\_{\rm u} \tau) - \alpha\_1 \cos(\omega\_{\rm u} \tau) + 1}{\omega\_{\rm u}^2},$$

$$\alpha\_1 = \lambda^4 \omega\_{\rm u}^4 - 2\lambda^2 \omega\_{\rm u}^2 (1 + 2\zeta^2) + 1, \quad \alpha\_2 = 4\zeta\lambda \omega\_{\rm u} (1 - \lambda^2 \omega\_{\rm u}^2),$$

$$\beta\_1 = \frac{\omega\_{\rm u}}{A(4\zeta\lambda + \tau - \eta\_1)}, \quad \beta\_2 = \frac{2\lambda^2 (1 + 2\zeta^2) - \tau^2/2 + \eta\_1 \tau - \eta\_2}{4\zeta\lambda + \tau - \eta\_1}, \quad \beta\_3 = \frac{4\zeta\lambda^3 + \tau^3/6 - \eta\_1 \tau^2/2 + \eta\_2 \tau}{4\zeta\lambda + \tau - \eta\_1},$$


are presented here, to make it possible to reproduce the results obtained by the PID optimization from (Mataušek & Šekara, 2011).

Tuning formulae tun1, for integrating processes with *M*s=2 and *m*n=2, are given by

$$
\begin{bmatrix} k\_{\mathbf{n}} \\ k\_{\mathbf{in}} \\ k\_{\mathbf{dn}} \end{bmatrix} = \begin{bmatrix} 0.5904 & -0.2707 & 0.3029 & -0.1554 & 0.0311 \\ 0.1534 & -0.0826 & 0.0409 & -0.0164 & 0.0033 \\ 1.2019 & -1.5227 & 1.0714 & -0.4944 & 0.0916 \end{bmatrix} \begin{bmatrix} 1 \\ \varphi \\ \varphi^2 \\ \varphi^3 \\ \varphi^4 \end{bmatrix},
$$

and tun2, for processes with oscillatory dynamics, with *M*s=40 and *m*n=2, are given by

$$
\begin{bmatrix} k\_{\mathbf{n}} \\ k\_{\mathbf{in}} \\ k\_{\mathbf{dn}} \end{bmatrix} = \begin{bmatrix} -8.9189 & 63.0913 & 0.6494 & -135.2567 & 0.2806 & -3.5564 \\ 2.2218 & -16.5791 & 0.1361 & 37.5733 & 0.0136 & -0.6388 \\ 14.8966 & -82.7969 & -9.0810 & 145.2467 & 0.9056 & 25.1221 \end{bmatrix} \begin{bmatrix} 1 \\ \rho \\ \varphi \\ \rho^2 \\ \varphi^2 \\ \rho\varphi \end{bmatrix},
$$


Both tun1 and tun2 are defined in (Šekara & Mataušek, 2011a). The angle *φ* is in radians.
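The tun1 formula is a plain matrix-vector product and can be evaluated directly. The coefficient matrix below is copied from the text; the regressor [1, *φ*, *φ*², *φ*³, *φ*⁴] is an interpretation (for integrating processes *ρ*=1, so the tables depend on *φ* alone, in radians):

```python
# Coefficient matrix of tun1 (rows: kn, kin, kdn), copied from the text.
TUN1 = [
    [0.5904, -0.2707, 0.3029, -0.1554, 0.0311],
    [0.1534, -0.0826, 0.0409, -0.0164, 0.0033],
    [1.2019, -1.5227, 1.0714, -0.4944, 0.0916],
]

def tun1_gains(phi):
    """Normalized PIDn gains [kn, kin, kdn] for an integrating process.

    The regressor [1, phi, phi**2, phi**3, phi**4] is an assumption made
    here; phi is in radians, as stated in the text.
    """
    x = [phi ** j for j in range(5)]
    return [sum(c * v for c, v in zip(row, x)) for row in TUN1]
```

At *φ*=0 the gains reduce to the first column of the matrix, which is a quick sanity check of the evaluation order.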

Table 6. Parameters of models *G*m*<sup>j</sup>*(*s*) of processes *G*p*<sup>j</sup>*(*s*), *j*=1,2,...,9.

Normalized parameters of the PIDn controller can be obtained by interpolation based on three points in the *ρ*-*φ* look-up tables of the memorized parameters *k*n(*ρi*,*φj*), *k*in(*ρi*,*φj*) and *k*dn(*ρi*,*φj*), *i*=1,2,...,*I*m, *j*=1,2,...,*J*m, determined in advance. In the present chapter the look-up tables from (Šekara & Mataušek, 2011a, Tables 1-4) are used. The four-point mesh in the *ρ*-*φ* look-up tables is presented in Fig. 13. The normalized parameters of the PIDn controller for the lower triangle are given by:

$$k\_{\mathbf{n}} = (1 - \alpha - \beta)k\_{\mathbf{n}2,1} + \alpha k\_{\mathbf{n}2,2} + \beta k\_{\mathbf{n}1,1}, \qquad k\_{\mathbf{in}} = (1 - \alpha - \beta)k\_{\mathbf{in}2,1} + \alpha k\_{\mathbf{in}2,2} + \beta k\_{\mathbf{in}1,1},$$

$$k\_{\mathbf{dn}} = (1 - \alpha - \beta)k\_{\mathbf{dn}2,1} + \alpha k\_{\mathbf{dn}2,2} + \beta k\_{\mathbf{dn}1,1},$$

where *α*=*α*ll, *β*=*β*ll. The normalized parameters of the PIDn controller for the upper triangle are given by:

$$k\_{\mathbf{n}} = (1 - \alpha - \beta)k\_{\mathbf{n}1,2} + \alpha k\_{\mathbf{n}1,1} + \beta k\_{\mathbf{n}2,2}, \qquad k\_{\mathbf{in}} = (1 - \alpha - \beta)k\_{\mathbf{in}1,2} + \alpha k\_{\mathbf{in}1,1} + \beta k\_{\mathbf{in}2,2},$$

$$k\_{\mathbf{dn}} = (1 - \alpha - \beta)k\_{\mathbf{dn}1,2} + \alpha k\_{\mathbf{dn}1,1} + \beta k\_{\mathbf{dn}2,2},$$

where *α*=*α*ru, *β*=*β*ru. In both cases *T*fn=|*k*dn|/*m*n. Then, parameters of the PID controller are obtained from (27).

Fig. 13. The four-point mesh in the *ρ*-*φ* plane in (Šekara & Mataušek, 2011a, Tables 1-4). For the lower triangle *α*ll=(*ρ*est - *ρ*2,1)/(*ρ*2,2 - *ρ*2,1) and *β*ll=(*φ*2,1 - *φ*est)/(*φ*2,1 - *φ*1,1). For the upper triangle *α*ru=(*ρ*1,2 - *ρ*est)/(*ρ*1,2 - *ρ*1,1) and *β*ru=(*φ*est - *φ*1,2)/(*φ*2,2 - *φ*1,2). All angles are in degrees and *φ*1,1≤ *φ*2,1, *φ*1,1≤*φ*est≤ *φ*2,1, *ρ*1,1≤ *ρ*est≤ *ρ*2,1.
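The triangular (barycentric) interpolation above can be sketched as follows. The corner layout follows the Fig. 13 caption, and all numeric values in the example are hypothetical:

```python
def lower_triangle_weights(rho_est, phi_est, rho_21, rho_22, phi_11, phi_21):
    """Weights alpha_ll, beta_ll for the lower triangle (layout per Fig. 13)."""
    alpha = (rho_est - rho_21) / (rho_22 - rho_21)
    beta = (phi_21 - phi_est) / (phi_21 - phi_11)
    return alpha, beta

def interpolate_lower(alpha, beta, g_21, g_22, g_11):
    """g = (1 - alpha - beta)*g_2,1 + alpha*g_2,2 + beta*g_1,1."""
    return (1.0 - alpha - beta) * g_21 + alpha * g_22 + beta * g_11

# Hypothetical mesh corner values: at (rho_est, phi_est) = (0.5, 30 deg), with
# rho_21 = 0.5, rho_22 = 0.6, phi_11 = 20 deg, phi_21 = 40 deg:
a, b = lower_triangle_weights(0.5, 30.0, 0.5, 0.6, 20.0, 40.0)
g = interpolate_lower(a, b, 2.0, 4.0, 6.0)
```

At the corner (*ρ*est, *φ*est)=(*ρ*2,1, *φ*2,1) the weights are (0, 0) and the interpolation returns *g*2,1 exactly; the upper triangle is handled analogously with *α*ru, *β*ru.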

#### **10. Acknowledgement**

The authors gratefully acknowledge discussions with Dr. Aleksandar Ribić and his help in implementing the anti-windup PID controller on the laboratory thermal plant. T.B. Šekara gratefully acknowledges the financial support of the Serbian Ministry of Science and Technology (Project TR33020). M.R. Mataušek gratefully acknowledges the financial support of TERI Engineering, Belgrade, Serbia.

#### **11. References**


Agüero, J.C.; Goodwin, G.C. & Van den Hof, P.M.J. (2011). A virtual closed loop method for closed loop identification. *Automatica*, Vol. 47, pp. 1626-1637.

Aström, K.J. & Hägglund, T. (1984). Automatic tuning of simple regulators with specifications on phase and amplitude margins, *Automatica*, Vol. 20, pp. 645-651.

Aström, K.J.; Hang, C.C., Persson, P. & Ho, W.K. (1992). Towards intelligent PID control, *Automatica*, Vol. 28, pp. 1-9.

Aström, K.J. & Hägglund, T. (1995a). *PID controllers: Theory, design and tuning*, 2nd edition, ISA, ISBN 1-55617-516-7, Research Triangle Park, NC 27709.

Aström, K.J. & Hägglund, T. (1995b). New tuning methods for PID controllers, *Proceedings of the European Control Conference*, Rome, Italy, pp. 2456-2462.

Aström, K.J.; Panagopoulos, H. & Hägglund, T. (1998). Design of PI controllers based on non-convex optimization. *Automatica*, Vol. 34, pp. 585-601.

Mataušek, M.R. & Ribić, A.I. (2012). Control of stable, integrating and unstable processes by the Modified Smith Predictor, *Journal of Process Control*, Vol. 22, pp. 338-343.

Panagopoulos, H.; Aström, K.J. & Hägglund, T. (2002). Design of PID controllers based on constrained optimization. *IEE Control Theory and Applications*, Vol. 149, pp. 32-40.

Rapaić, M.R. (2008). Matlab implementation of the Particle Swarm Optimization (PSO) algorithm, *Matlab Central*, Available from http://www.mathworks.com/matlabcentral/fileexchange/22228-particleswarm-optimization-pso-algorithm

Seki, H. & Shigemasa, T. (2010). Retuning oscillatory PID control loops based on plant operation data. *Journal of Process Control*, Vol. 20, pp. 217-227.

Shinskey, F.G. (1990). How good are our controllers in absolute performance and robustness? *Measurement and Control*, Vol. 23, pp. 114-121.

Smith, O.J. (1959). A controller to overcome dead time, *ISA Journal*, Vol. 6, pp. 28-33.

Šekara, T.B. & Mataušek, M.R. (2008). Optimal and robust tuning of the PI controller based on the maximization of the criterion *J*C defined by the linear combination of the integral gain and the closed-loop system bandwidth. *ELECTRONICS*, Vol. 12, pp. 41-45.

Šekara, T.B. & Mataušek, M.R. (2009). Optimization of PID controller based on maximization of the proportional gain under constraints on robustness and sensitivity to measurement noise. *IEEE Transactions on Automatic Control*, Vol. 54, pp. 184-189.

Šekara, T.B. & Mataušek, M.R. (2010a). Revisiting the Ziegler-Nichols process dynamics characterization. *Journal of Process Control*, Vol. 20, pp. 360-363.

Šekara, T.B. & Mataušek, M.R. (2010b). Comparative analysis of the relay and phase-locked loop experiment used to determine ultimate frequency and ultimate gain. *ELECTRONICS*, Vol. 14, pp. 77-81.

Šekara, T.B. & Mataušek, M.R. (2011a). Classification of dynamic processes and PID controller tuning in a parameter plane. *Journal of Process Control*, Vol. 21, pp. 620-626.

Šekara, T.B. & Mataušek, M.R. (2011b). Relay-based critical point estimation of a process with the PID controller in the loop. *Automatica*, Vol. 47, pp. 1084-1088.

Šekara, T.B. & Mataušek, M.R. (2011c). Robust process identification by using Phase-Locked-Loop (*in Serbian*), *Proceedings of Conference INFOTEH-JAHORINA*, Bosnia and Herzegovina, Vol. 10, pp. 18-21.

Šekara, T.B. & Trifunović, M.B. (2010). Optimization in the frequency domain of the PID controller in series with a lead-lag filter (*in Serbian*), *Proceedings of Conference INDEL*, Bosnia and Herzegovina, Vol. 8, pp. 258-261.

Šekara, T.B.; Trifunović, M.B. & Govedarica, V. (2011). Frequency domain design of a complex controller under constraints on robustness and sensitivity to measurement noise. *ELECTRONICS*, Vol. 15, pp. 40-44.

Trifunović, M.B. & Šekara, T.B. (2011). Tuning formulae for PID/PIDC controllers of processes with the ultimate gain and ultimate frequency (*in Serbian*), *Proceedings of Conference INFOTEH-JAHORINA*, Bosnia and Herzegovina, Vol. 10, pp. 12-17.

Yamamoto, S. & Hashimoto, I. (1991). Present status and future needs: the view from Japanese industry. In: *Chemical Process Control IV (CPC-IV)*, South Padre Island, US, pp. 1-28.

Ziegler, J.G. & Nichols, N.B. (1942). Optimum settings for automatic controllers, *Transactions of the ASME*, Vol. 64, pp. 759-768.

### **A Comparative Study Using Bio-Inspired Optimization Methods Applied to Controllers Tuning**

Davi Leonardo de Souza1, Fran Sérgio Lobato2 and Rubens Gedraite2 <sup>1</sup>*Department of Chemical Engineering and Statistics, Universidade Federal de São João Del-Rei,* <sup>2</sup>*School of Chemical Engineering, Universidade Federal de Uberlândia, Brazil*

#### **1. Introduction**

Process control techniques have evolved significantly in recent years. Even so, in industry, the Proportional-Integral-Derivative (PID) controller is frequently used in closed loops due to its simplicity, applicability, and easy implementation (Aström & Hägglund, 1995; Shinskey, 1998; Desbourough & Miller, 2002). An extensive survey concerning regulatory control loops used in refinery, chemical, and pulp and paper processes reveals that 97% of the applications make use of the classical PID structure; even sophisticated techniques, like advanced control strategies, are based on PID algorithms at the lower hierarchy level (Desbourough & Miller, 2002).

Traditionally, controller tuning is carried out using classical methods, such as Ziegler-Nichols (ZN), Cohen-Coon (CC) and hybridizations of them. These methodologies present quite satisfactory results for first-order processes, but they usually fail to provide acceptable performance for higher-order processes, and especially for nonlinear ones, due to large overshoots and poor regulation under loading (Hang et al., 1991; Mudi et al., 2008). In addition, it can be quite difficult to tune the PID parameters properly during typical plant operation, due to difficulties related to production goals (Coelho & Pessoa, 2011).

Recently, optimization methods that use information about real or synthetic data have been used as an alternative for controller tuning (Lobato & Souza, 2008). Among these strategies, one can cite those based on evolutionary optimization techniques, such as fuzzy logic (Hamid et al., 2010), genetic algorithms (Bandyopadhyay et al., 2001; Pan et al., 2011), the augmented Lagrangian particle swarm optimization algorithm (Sedlaczek & Eberhard, 2006), particle swarm optimization (Kim et al., 2008; Solihin et al., 2011), the differential evolution algorithm (Lobato & Souza, 2008), and differential evolution combined with the chaotic Zaslavskii map (Coelho & Pessoa, 2011). Basically, the interest in the evolutionary approach is due to the following characteristics: easy code building and implementation, no use of gradient information, and the capacity to escape from local optima (Lobato & Souza, 2008; Souza, 2007).

Within this search area, biological systems have contributed significantly to the development of new optimization techniques. These methodologies - known as Bio-inspired Optimization Methods (BiOM) - are based on strategies that seek to mimic the behavior observed in species in nature to update a population of candidate solutions to optimization problems (Lobato et al., 2010). These systems have the capacity to perceive and modify their "atmosphere" in order to seek diversity and convergence. In addition, this capacity makes possible the communication among the agents (individuals of the population) that capture the changes in the "atmosphere" generated by local interactions (Parrich et al., 2002).

Among the most recent bio-inspired strategies, one can cite the Bees Colony Algorithm - BCA (Pham et al., 2006), the Fish Swarm Algorithm - FSA (Li et al., 2002) and the Firefly Colony Algorithm - FCA (Yang, 2008). The classical form of the BCA is based on the behavior of bee colonies in their search for the raw materials of honey production. In each hive, groups of bees (called scouts) are recruited to explore new areas in search of pollen and nectar. On returning to the hive, these bees share the acquired information, so that new bees are assigned to explore the best regions visited, in numbers proportional to the previously reported assessment. Thus, the most promising regions are explored best, and eventually the worst ones end up being discarded. This cycle repeats itself, with new areas being visited by scouts at each iteration (Pham et al., 2006). The FSA is a random search algorithm based on the behaviour of fish swarms, which comprises searching, swarming and chasing behaviours. It first constructs the simple behaviours of artificial fish, and then makes the global optimum emerge from the individuals' local searching behaviours (Li et al., 2002). Finally, the FCA is inspired by the social behaviour of fireflies and their communication through the phenomenon of bioluminescence. This optimization technique assumes that the solution of an optimization problem can be perceived as an agent (firefly) which "glows" proportionally to its quality in the considered problem setting. Consequently, each brighter firefly attracts its partners (regardless of their sex), which makes the search space be explored more efficiently (Yang, 2008).
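The attraction mechanism of the FCA can be sketched in a minimal one-dimensional form. The loop structure follows Yang's (2008) formulation (attractiveness decaying with squared distance, plus a shrinking random step); all parameter values and the test function are illustrative, not taken from the chapter:

```python
import math
import random

def firefly_minimize(f, lo, hi, n=15, iters=60,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=1):
    """Minimal 1-D Firefly Algorithm sketch; brightness = lower cost f."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    for t in range(iters):
        step = alpha * (0.97 ** t)            # shrink the random step over time
        for i in range(n):
            for j in range(n):
                if f(xs[j]) < f(xs[i]):       # firefly j is brighter than i
                    r2 = (xs[i] - xs[j]) ** 2
                    attract = beta0 * math.exp(-gamma * r2)
                    xs[i] += attract * (xs[j] - xs[i]) + step * rng.uniform(-1, 1)
                    xs[i] = min(max(xs[i], lo), hi)   # keep inside the bounds
    return min(xs, key=f)

# Example: locate the minimum of (x - 2)^2 on [-5, 5]
best_x = firefly_minimize(lambda x: (x - 2.0) ** 2, -5.0, 5.0)
```

In a controller-tuning setting, `f` would be a closed-loop performance index (e.g. IAE from a simulation) and `x` a vector of controller gains; the population mechanics stay the same.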

In the present contribution, BiOM are used for controller tuning in chemical engineering problems. To this end, three problems are studied, with emphasis on a realistic application: the control design of heat exchangers at pilot scale. The results obtained with the proposed methodology are compared with those from the classical methods. This chapter is organized as follows. Classical methods for controller tuning are reviewed in Section 2. In Section 3 the main characteristics of BiOM are briefly presented. The results and discussion are described in Section 4. Finally, the conclusions and suggestions for future work complete the chapter.

#### **2. Controllers tuning using classical methods**

As mentioned earlier, about 97% of industrial controllers are of the PID type, and to implement them in practice, or even during their maintenance, several techniques exist for adjusting their parameters. In the literature, there are several classical methods for controller tuning, such as strategies based on the minimization of integral error, and correlation-based methods such as ZN and CC, among others.

The majority of works involving controller design use the ZN and CC methods (Conner & Seborg, 2005; Lobato & Souza, 2008; Solihin et al., 2011; Xi et al., 2007). In this context, the ZN and CC methods are briefly described below.

#### **2.1 Reaction curve method**


The principle of this method is the correlation between the controller parameters (*Kc*, *τ<sup>I</sup>* and *τD*) and the model parameters (*K*, *τ* and *θ*) through the time response of the open-loop system (called the process reaction curve) to a step input. In open loop, a unit step is applied to the input variable to obtain the reaction curve, as in Fig. 1. With the parameters *θ* and *τ*, the controller settings in Table 1 can be obtained.

Fig. 1. Time response *y*(*t*) of the open-loop system to a step input.



Table 1. Controllers tuning with ZN and CC methods through reaction curve (Seborg et al., 1989).

#### **2.2 Continuous cycling method**

This classical method is based on sustained oscillation and is known as the Continuous Cycling Method (Seborg et al., 1989). This procedure is valid only for open-loop stable plants and is conducted with the following steps: (*i*) set the proportional gain to a very small value, (*ii*) increase the gain until an oscillatory response with constant amplitude and period is obtained, (*iii*) register the critical gain (*Ku*) and the critical period (*Pu*) and (*iv*) adjust the parameters as shown in Tab. 2. Although the vast majority of PID controllers are tuned by the ZN and CC methods, some difficulties can be observed, such as the need for knowledge of the process dynamics in open loop and, in the Continuous Cycling method, the need to work near the stability limit of the system (Seborg et al., 1989).


| Controller | *Kc* | *τI* | *τD* |
|---|---|---|---|
| P | 0.5 *Ku* | - | - |
| PI | 0.45 *Ku* | *Pu*/1.2 | - |
| PID | 0.6 *Ku* | 0.5 *Pu* | *Pu*/8 |

Table 2. Controllers tuning with Continuous Cycling (Seborg et al., 1989).
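The rules of Tab. 2 can be applied directly. The sketch below is a minimal illustration (the function name and the dictionary layout are our own, not from the source):

```python
def continuous_cycling_tuning(Ku, Pu, controller="PID"):
    """Map the critical gain Ku and critical period Pu to the
    controller settings of Tab. 2 (Seborg et al., 1989)."""
    if controller == "P":
        return {"Kc": 0.5 * Ku}
    if controller == "PI":
        return {"Kc": 0.45 * Ku, "tau_I": Pu / 1.2}
    if controller == "PID":
        return {"Kc": 0.6 * Ku, "tau_I": 0.5 * Pu, "tau_D": Pu / 8}
    raise ValueError("controller must be 'P', 'PI' or 'PID'")

# e.g., Ku = 10 and Pu = 2 min give Kc = 6.0, tau_I = 1.0 min, tau_D = 0.25 min
settings = continuous_cycling_tuning(10.0, 2.0)
```
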

#### **3. Bio-inspired optimization methods**

In the last decades, nature has inspired the development of various optimization methods. These techniques try to imitate behaviors of species found in nature, such as ants, birds, bees, fireflies, bacteria, among others, to extract information that can be used to promote the development of simple and robust strategies.

This section presents briefly three bio-inspired algorithms in nature: the Bee Colony Algorithm, the Firefly Colony Algorithm and the Fish Swarm Algorithm.

#### **3.1 Bee colony algorithm - BCA**

The algorithm proposed by Pham et al. (2006) and described in this section is based on the following characteristics observed in nature (von Frisch, 1976): (*i*) a bees' colony can extend itself over long distances (more than 10 km) and in multiple directions simultaneously to exploit a large number of food sources, and (*ii*) capacity of memorization, learning and transmission of information in colony, so forming the swarm intelligence.

In a colony the foraging process begins by scout bees being sent to search randomly for promising flower patches. When they return to the hive, those scout bees that found a patch which is rated above a certain quality threshold (measured as a combination of some constituents, such as sugar content) deposit their nectar or pollen and go to the "waggle dance".

This dance is responsible for the transmission (colony communication) of information regarding a flower patch: the direction in which it is found, its distance from the hive and its quality rating (or fitness) (von Frisch, 1976). This dance enables the colony to evaluate the relative merit of different patches according to both the quality of the food they provide and the amount of energy needed to harvest it (Camazine et al., 2003). Mathematically, this dance can be represented by the following expression:

$$x = x - ngh + 2 \, ngh \times rand \tag{1}$$

where *x* is the new position, *ngh* is the patch radius for the neighbourhood search and *rand* is a random number generator.

After waggle dancing on the dance floor, the dancer (scout bee) goes back to the flower patch with follower bees that were waiting inside the hive. More follower bees are sent to more promising patches. This allows the colony to gather food quickly and efficiently. While harvesting from a patch, the bees monitor its food level. This is necessary to decide upon the next waggle dance when they return to the hive (Camazine et al., 2003). If the patch is still good enough as a food source, then it will be advertised in the waggle dance and more bees will be recruited to that source.

In this context, Pham et al. (2006) proposed an optimization algorithm inspired by the natural foraging behavior of honey bees and presented in Tab. 3.



1. Initialise the population with random solutions.
2. Evaluate the fitness of the population.
3. While (stopping criterion not met): form a new population.
4. Select sites for neighbourhood search.
5. Recruit bees for the selected sites (more bees for the best *e* sites) and evaluate their fitnesses.
6. Select the fittest bee from each patch.
7. Assign the remaining bees to search randomly and evaluate their fitnesses.
8. End While.

Table 3. Bees Colony Algorithm (Pham et al., 2006).

The BCA requires a number of parameters to be set, namely: the number of scout bees (*n*), the number of sites selected for neighbourhood search (out of the *n* visited sites) (*m*), the number of top-rated (elite) sites among the *m* selected sites (*e*), the number of bees recruited for the best *e* sites (*nep*), the number of bees recruited for the other (*m*-*e*) selected sites (*nsp*), the initial size of the patches (*ngh*) and the stopping criterion.

The BCA starts with the *n* scout bees being placed randomly in the search space. The fitnesses of the sites visited by the scout bees are evaluated in step 2.

In step 4, bees that have the highest fitnesses are chosen as "selected bees" and sites visited by them are chosen for neighborhood search. Then, in steps 5 and 6, the algorithm conducts searches in the neighborhood of the selected sites, assigning more bees to search near to the best *e* sites. The bees can be chosen directly according to the fitnesses associated with the sites they are visiting.

Alternatively, the fitness values are used to determine the probability of the bees being selected. Searches in the neighborhood of the best *e* sites, which represent more promising solutions, are made more detailed by recruiting more bees to follow them than the other selected bees. Together with scouting, this differential recruitment is a key operation of the BCA.

However, in step 6, for each patch only the bee with the highest fitness will be selected to form the next bee population. In nature, there is no such restriction; it is introduced here to reduce the number of points to be explored. In step 7, the remaining bees in the population are assigned randomly around the search space, scouting for new potential solutions.

In the literature, various applications using this bio-inspired approach can be found, such as: modeling combinatorial optimization transportation engineering problems (Lucic & Teodorovic, 2001), engineering system design (Lobato et al., 2010; Yang, 2005), transport problems (Teodorovic & Dell'Orco, 2005), mathematical function optimization (Pham et al., 2006), dynamic optimization (Chang, 2006), optimal control problems (Afshar et al., 2001), parameter estimation in control problems (Azeem & Saad, 2004), estimation of radiative properties in a one-dimensional participating medium (Ribeiro Neto et al., 2011), among other applications (http://www.bees-algorithm.com/).
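The steps of Tab. 3 and the neighbourhood search of Eq. (1) can be condensed into a short sketch. The version below is a minimal illustration only; the clipping to the search bounds, the parameter defaults and the quadratic test function are our own assumptions:

```python
import random

def bees_algorithm(f, bounds, n=10, m=5, e=2, nep=5, nsp=3, ngh=0.1,
                   generations=50, seed=0):
    """Minimal sketch of the Bees Colony Algorithm (Pham et al., 2006).
    f: objective to minimise; bounds: list of (lo, hi) per dimension."""
    rng = random.Random(seed)
    scout = lambda: [rng.uniform(lo, hi) for lo, hi in bounds]
    pop = [scout() for _ in range(n)]                 # step 1: random scouts
    for _ in range(generations):
        pop.sort(key=f)                               # steps 2-4: rank the sites
        new_pop = []
        for i, site in enumerate(pop[:m]):            # the m selected sites
            recruits = nep if i < e else nsp          # more bees for elite sites
            # steps 5-6: neighbourhood search (Eq. 1), keep the fittest bee
            patch = [site] + [
                [min(max(x - ngh + 2 * ngh * rng.random(), lo), hi)
                 for x, (lo, hi) in zip(site, bounds)]
                for _ in range(recruits)]
            new_pop.append(min(patch, key=f))
        # step 7: the remaining bees scout randomly
        new_pop += [scout() for _ in range(n - m)]
        pop = new_pop
    return min(pop, key=f)

# a point near the minimum of the sphere function at (0, 0)
best = bees_algorithm(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)
```
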

#### **3.2 Firefly colony algorithm - FCA**

The FCA is based on the characteristics of fireflies' bioluminescence, these insects being notorious for their light emission. Although biology does not yet completely explain all the purposes that firefly luminescence may serve, at least three functions have been identified (Lukasik & Zak, 2009; Yang, 2008): (*i*) as a communication tool and an appeal to potential partners in reproduction, (*ii*) as a bait to lure prey for the firefly, and (*iii*) as a warning mechanism for potential predators, reminding them that fireflies have a bitter taste.

Some of the flashing characteristics of fireflies were idealized so as to develop firefly-inspired algorithms. The following three idealized rules were used (Yang, 2008):

• all fireflies are unisex, so that one firefly will be attracted to other fireflies regardless of their sex;

• attractiveness is proportional to their brightness; thus, for any two flashing fireflies, the less bright one will move towards the brighter one. The attractiveness is proportional to the brightness and both decrease as the distance between them increases. If there is no firefly brighter than a particular firefly, it will move randomly;

• the brightness of a firefly is affected or determined by the landscape of the objective function. For a maximization problem, the brightness can simply be proportional to the value of the objective function.
According to Yang (2008), in the firefly algorithm there are two important issues: the variation of light intensity and the formulation of the attractiveness. For simplicity, it is always assumed that the attractiveness of a firefly is determined by its brightness, which in turn is associated with the encoded objective function.

This swarm intelligence optimization technique is based on the assumption that the solution of an optimization problem can be perceived as an agent (firefly) which "glows" proportionally to its quality in a considered problem setting. Consequently, each brighter firefly attracts its partners (regardless of their sex), which allows the search space to be explored more efficiently.

The algorithm makes use of a synergic local search. Each member of the swarm explores the problem space taking into account results obtained by others, still applying its own randomized moves as well. The influence of other solutions is controlled by the value of attractiveness (Lukasik & Zak, 2009).

According to Lukasik & Zak (2009), the FCA is presented as follows. Consider a continuous constrained optimization problem where the task is to minimize the cost function *f*(*x*). Assume that there exists a swarm of *N* agents (fireflies) solving the above-mentioned problem iteratively, where *xi* represents a solution for firefly *i* at the algorithm's iteration *k* and *f*(*xi*) denotes its cost. Initially, all fireflies are dislocated in the search space *S* (randomly or employing some deterministic strategy). Each firefly has a distinctive attractiveness *β*, which determines how strongly it attracts other members of the swarm. As the firefly attractiveness, one should select any monotonically decreasing function of the distance *rij*=*d*(*xi*,*xj*) to the chosen firefly *j*, e.g., the exponential function:

$$\beta = \beta\_0 e^{-\gamma r\_{ij}} \tag{2}$$

where *β*<sup>0</sup> and *γ* are predetermined algorithm parameters: the maximum attractiveness value and the absorption coefficient, respectively. Furthermore, every member of the swarm is characterized by its light intensity, *Ii*, which can be directly expressed as the inverse of the cost function *f*(*xi*). To effectively explore the considered search space *S*, it is assumed that each firefly *i* changes its position iteratively by taking into account two factors: the attractiveness of other swarm members with higher light intensity, i.e., *Ij* > *Ii*, ∀*j*=1, ..., *N*, *j* ≠ *i*, which varies with distance, and a fixed random step vector *ui*. It should be noted as well that if no brighter firefly can be found, only such a randomized step is used.

Thus, moving at a given time step *t* of a firefly *i* toward a better firefly *j* is defined as:

$$\mathbf{x}\_{i}^{t} = \mathbf{x}\_{i}^{t-1} + \beta \left(\mathbf{x}\_{j}^{t-1} - \mathbf{x}\_{i}^{t-1}\right) + \alpha \left(rand - \frac{1}{2}\right) \tag{3}$$

where the second term on the right-hand side of the equation inserts the attractiveness factor *β*, while the third term (governed by the parameter *α*) inserts a certain randomness into the path followed by the firefly; *rand* is a random number between 0 and 1.

In the literature, few works using the FCA can be found. In this context, the application of the technique is emphasized in continuous constrained optimization task (Lukasik & Zak, 2009), multimodal optimization (Yang, 2009), solution of singular optimal control problems (Pfeifer & Lobato, 2010) and load dispatch problem (Apostolopoulos & Vlachos, 2011).
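The moves of Eqs. (2)-(3) can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: cost values are cached, positions are clipped to the bounds, and the best firefly simply stays in place when no brighter one exists:

```python
import math
import random

def firefly_algorithm(f, bounds, n=15, beta0=0.9, gamma=0.9, alpha=0.1,
                      generations=50, seed=0):
    """Minimal sketch of the FCA for minimising f: brighter means lower
    cost, attractiveness follows Eq. (2), moves follow Eq. (3)."""
    rng = random.Random(seed)
    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    cost = [f(x) for x in X]
    for _ in range(generations):
        for i in range(n):
            for j in range(n):
                if cost[j] < cost[i]:                    # firefly j is brighter
                    r = math.dist(X[i], X[j])
                    beta = beta0 * math.exp(-gamma * r)  # Eq. (2)
                    # Eq. (3): attraction towards j plus a small random step
                    X[i] = [min(max(xi + beta * (xj - xi)
                                    + alpha * (rng.random() - 0.5), lo), hi)
                            for xi, xj, (lo, hi) in zip(X[i], X[j], bounds)]
                    cost[i] = f(X[i])
    return min(X, key=f)
```
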

#### **3.3 Fish swarm algorithm - FSA**


In the development of FSA, based on fish swarm and observed in nature, the following characteristics are considered (Madeiro, 2010): (*i*) each fish represents a candidate solution of optimization problem; (*ii*) food density is related to an objective function to be optimized (in an optimization problem, the amount of food in a region is inversely proportional to value of objective function); and (*iii*) the aquarium is the design space where the fish can be found.

As noted earlier, the fish weight at the swarm represents the accumulation of food (e.g., the objective function) received during the evolutionary process. In this case, the weight is an indicator of success (Madeiro, 2010).

Basically, the FSA presents four operators classified into two classes: "food search" and "movement". Details on each of these operators are given as follows.

#### **3.3.1 Individual movement operator**

This operator contributes to the individual and collective movement of the fish in the swarm. Each fish updates its position using Eq. (4):

$$\mathbf{x}\_{i}\left(t+1\right) = \mathbf{x}\_{i}\left(t\right) + rand \times s\_{ind} \tag{4}$$

where *xi* is the position of fish *i* at the current generation, *rand* is a random number generator and *sind* is a weighting parameter.

#### **3.3.2 Food operator**

The weight of each fish is a metaphor used to measure the success of the food search. The higher the weight of a fish, the more likely it is that this fish is in a potentially interesting region of the design space.

According to Madeiro (2010), the amount of food that a fish eats depends on the improvement of its objective function in the current generation and on the greatest improvement observed in the swarm. The weight is updated according to Eq. (5):

$$\mathcal{W}\_{\dot{\imath}}(t+1) = \mathcal{W}\_{\dot{\imath}}(t) + \frac{\Delta f\_{\dot{\imath}}}{\max\left(\Delta f\right)}\tag{5}$$

where *Wi*(*t*) is the weight of fish *i* at generation *t* and Δ*fi* is the difference in the objective function between the current position and the new position of fish *i*. It is important to emphasize that Δ*fi*=0 for fish that remain in the same position.

#### **3.3.3 Instinctive collective movement operator**

This operator accounts for the collective component of the fish movement when Δ*fi* ≠ 0. Only the fish whose individual movement resulted in an improvement of their fitness influence the direction of motion of the school, resulting in the instinctive collective movement. In this case, the resulting direction (*Id*), calculated from the contributions of the directions taken by the fish, and the new position of the *i*-th fish are given by:

$$I\_d\left(t\right) = \frac{\sum\_{i=1}^{N} \Delta x\_i \Delta f\_i}{\sum\_{i=1}^{N} \Delta f\_i} \tag{6}$$

$$\mathbf{x}\_i(t+1) = \mathbf{x}\_i(t) + I\_d\left(t\right) \tag{7}$$

It is important to emphasize that, in the application of this operator, the direction chosen by the fish that located the largest portion of food tends to exert the greatest influence on the swarm. Therefore, the instinctive collective movement operator tends to guide the swarm in the direction of motion chosen by the fish that found the largest portion of food in its individual movement.

#### **3.3.4 Non-Instinctive collective movement operator**

As noted earlier, the fish weight is a good indicator of the success of the search for food. If the swarm weight is increasing, the search process is being successful, and the "radius" of the swarm should decrease so that the promising region can be exploited. Otherwise, if the swarm weight remains constant, the radius should increase to allow the exploration of new regions.

For the swarm contraction, the centroid concept is used. This is obtained by means of an average position of all fish weighted with the respective fish weights, according to Eq. (8):

$$B\left(t\right) = \frac{\sum\_{i=1}^{N} \mathbf{x}\_{i} \mathbf{W}\_{i}\left(t\right)}{\sum\_{i=1}^{N} \mathbf{W}\_{i}\left(t\right)}\tag{8}$$

If the swarm weight has increased in the current iteration, all fish must update their positions using Eq. (9), contracting the swarm towards its barycentre *B*(*t*); otherwise, the sign of the second term is reversed and the swarm expands:

$$\mathbf{x}\left(t+1\right) = \mathbf{x}\left(t\right) - s\_{\text{vol}} \times rand \times \frac{\mathbf{x}\left(t\right) - \mathbf{B}\left(t\right)}{d\left(\mathbf{x}\left(t\right), \mathbf{B}\left(t\right)\right)}\tag{9}$$

where *d* is a function that calculates the Euclidean distance between the centroid and the current position of fish, and *svol* is the step size used to control the displacement of fish.

In the literature, few works using the FSA can be found. Applications include the training of feed-forward neural networks (Wang et al., 2005), parameter estimation in engineering systems (Li et al., 2004), combinatorial optimization problems (Cai, 2010), global optimization (Yang, 2010), an augmented Lagrangian fish swarm based method for global optimization (Rocha et al., 2011), forecasting stock indices using optimized radial basis function neural networks (Shen et al., 2011), and the hybridization of the FSA with the Particle Swarm Algorithm to solve engineering problems (Tsai & Lin, 2011).
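The four operators of Eqs. (4)-(9) can be combined into a single minimal sketch. This is our own illustration, under explicit assumptions: Δ*fi* is taken as the improvement *f*(old) − *f*(new) so that the weight grows on successful moves, positions are clipped to the bounds, and the step sizes are arbitrary choices:

```python
import math
import random

def fish_swarm(f, bounds, n=15, s_ind=0.1, s_vol=0.05, generations=50, seed=0):
    """Minimal sketch of the FSA operators of Eqs. (4)-(9) for minimising f."""
    rng = random.Random(seed)
    dim = len(bounds)
    clip = lambda x: [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]
    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    W = [1.0] * n
    for _ in range(generations):
        dX, dF = [], []
        for i in range(n):                       # individual movement, Eq. (4)
            cand = clip([v + (2 * rng.random() - 1) * s_ind for v in X[i]])
            gain = f(X[i]) - f(cand)             # improvement = Delta f_i
            if gain > 0:
                dX.append([c - v for c, v in zip(cand, X[i])])
                X[i] = cand
            else:
                dX.append([0.0] * dim)
                gain = 0.0
            dF.append(gain)
        w_before = sum(W)
        if max(dF) > 0:                          # feeding, Eq. (5)
            W = [w + g / max(dF) for w, g in zip(W, dF)]
        if sum(dF) > 0:                          # instinctive movement, Eqs. (6)-(7)
            Id = [sum(dx[d] * g for dx, g in zip(dX, dF)) / sum(dF)
                  for d in range(dim)]
            X = [clip([v + Id[d] for d, v in enumerate(x)]) for x in X]
        # non-instinctive movement, Eqs. (8)-(9): contract if the weight grew
        B = [sum(x[d] * w for x, w in zip(X, W)) / sum(W) for d in range(dim)]
        sign = -1.0 if sum(W) > w_before else 1.0
        for i in range(n):
            dist = math.dist(X[i], B) or 1.0
            X[i] = clip([v + sign * s_vol * rng.random() * (v - b) / dist
                         for v, b in zip(X[i], B)])
    return min(X, key=f)
```
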

#### **4. Applications**


For evaluating the methodology proposed in this work for controllers tuning, some practical points should be emphasized:

• the objective function (Sum of Quadratic Errors - *SQE*) considered in all case studies is given by Eq. (10):

$$\min \, SQE = \sum\_{k=1}^{np} Error\_k = \sum\_{k=1}^{np} \left( X^{setpoint} - X^{calculated} \right)^2 \tag{10}$$

where *Xsetpoint* and *Xcalculated* are the values of the considered variable at the *setpoint* and calculated using the mathematical model, respectively, and *np* is the number of points used to formulate this objective function (*np* equal to 1000).
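Eq. (10) reduces to a one-line sum. The sketch below is illustrative only; the function name and the short 4-point example are our own:

```python
def sqe(setpoints, calculated):
    """Sum of quadratic errors, Eq. (10): squared deviation between the
    setpoint trajectory and the model response, accumulated over np points."""
    return sum((sp - xc) ** 2 for sp, xc in zip(setpoints, calculated))

# a constant setpoint of 0.99 tracked over np = 4 sampling points
error = sqe([0.99] * 4, [0.95, 0.97, 0.98, 0.99])  # ~0.0021
```
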


#### **4.1 Distillation column**

This first case study, proposed by Skogestad & Morari (1987), considers a high-purity distillation column consisting of 25 plates, a condenser and a reboiler. The reflux ratio and the distillate composition are the system input and output, respectively. The dynamic model that describes this system is given by the following transfer function (Skogestad & Morari, 1987):

$$G(z) = \frac{-0.75448z + 0.149199}{z - 0.6386913} \tag{11}$$

The objective is to maintain the distillate composition at 0.99 by manipulating the reflux ratio, which has a nominal value of 1.477 kmol/min. In this case, the following ranges are considered for the controller tuning: 0 ≤ *Kc* ≤ 150, 0 ≤ *τ<sup>I</sup>* ≤ 50 and 0 ≤ *τ<sup>D</sup>* ≤ 50.
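The discrete model of Eq. (11) is equivalent to the difference equation *y*(*k*) = 0.6386913 *y*(*k*−1) − 0.75448 *u*(*k*) + 0.149199 *u*(*k*−1), which can be simulated directly. The sketch below (our own, for an open-loop unit step) illustrates this:

```python
def step_response_eq11(n=25):
    """Open-loop unit step response of the distillation model, Eq. (11):
    y[k] = 0.6386913*y[k-1] - 0.75448*u[k] + 0.149199*u[k-1]."""
    y_prev = u_prev = 0.0
    response = []
    for _ in range(n):
        u = 1.0                                          # unit step input
        y = 0.6386913 * y_prev - 0.75448 * u + 0.149199 * u_prev
        response.append(y)
        y_prev, u_prev = y, u
    return response

resp = step_response_eq11()
# the response settles at the steady-state gain
# G(1) = (-0.75448 + 0.149199) / (1 - 0.6386913), approximately -1.675
```
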

Table 5 presents the best value and standard deviation for the distillation column case study.

In this table it can be observed that all the algorithms presented good estimates for the unknown parameters. When the results are analyzed in terms of the objective function (*OF*), it is clear that the combinations of control parameters lead to very close values, as also seen in the standard deviations presented.

| Method | Parameter | Value |
|---|---|---|
| BCA | Number of scout bees | 10 |
| | Number of bees recruited for the best *e* sites | 5 |
| | Number of bees recruited for the other selected sites | 5 |
| | Number of sites selected for neighbourhood search | 5 |
| | Number of top-rated (elite) sites among *m* selected sites | 5 |
| | Neighbourhood search | 10−6 |
| | Generation number | 50 |
| FCA | Number of fireflies | 15 |
| | Maximum attractiveness value | 0.9 |
| | Absorption coefficient | 0.9 |
| | Generation number | 50 |
| FSA | Number of fishes | 15 |
| | Weighted parameter (*sind*) | 0.01 |
| | Weighted parameter (*svol*) | 1 |
| | Generation number | 50 |

Table 4. Parameters used by the BiOM.

| Method | *Kc* | *τI* (min−1) | *τD* (min−1) | *OF* (Eq. 10) |
|---|---|---|---|---|
| ZN-SL | 67.2000 | 12.500 | 3.1250 | 8.10×10−3 |
| ZN-RC | 2.6578 | 2.0000 | 0.5000 | 1.24×10−2 |
| CC-RC | 3.2890 | 2.1154 | 0.3364 | 1.09×10−2 |
| BCA | 24.282 (36.265) | 0.008 (12.12) | 43.103 (12.710) | 8.102×10−3 (3.026×10−6) |
| FCA | 77.412 (37.324) | 0.003 (0.037) | 26.955 (15.926) | 8.100×10−3 (6.834×10−8) |
| FSA | 128.009 (28.260) | 0.009 (0.237) | 7.176 (18.779) | 8.101×10−3 (3.145×10−8) |

Table 5. Results obtained by the BiOM (standard deviations in parentheses) - distillation column case study.

Figure 2 presents the distillate composition and the control action (reflux profile), respectively, using the classical methods and the BiOM. The behaviour observed in this simple case study is practically the same for all the strategies used.

Fig. 2. Distillation top profile (a) and reflux profile (b) using the classical methods and the BiOM.

#### **4.2 Heat exchanger**

Consider a counter-current shell-and-tube heat exchanger, as illustrated in Fig. 3 (Garcia, 2005). In this figure, *Qc*,*<sup>i</sup>* and *Tc*,*<sup>i</sup>* represent the flow rate and inlet temperature of the hot fluid, respectively, and *Qt*,*<sup>e</sup>* and *Tt*,*<sup>e</sup>* the flow rate and inlet temperature of the cold fluid, respectively. *Tc* is the fluid temperature on the shell side and *Tt* is the fluid temperature on the tube side. The objective of this system is to heat a water stream from 40 ◦C to 41 ◦C by manipulating a hot water stream (*Qt*,*<sup>e</sup>*) with a nominal flow rate of 0.0004 m3/s. The following thermal exchanges are considered: heat transfer to the fluid circulating between the tubes and the shell, heat transfer to the fluid circulating between the shell and its walls, and transport of energy (enthalpy) due to the fluid flow in the tubes and the shell. More information about the design and the assumptions adopted can be found in Garcia (2005).

Fig. 3. Schematic heat exchanger.


The dynamic model that describes this system is given by the following transfer function:

$$G(s) = \frac{0.0189}{10s^3 + 5.114s^2 + 0.825s + 0.041} \tag{12}$$

In this case, the following ranges are considered for the controller tuning: 0 ≤ *Kc* ≤ 150, 0 ≤ *τ<sup>I</sup>* ≤ 50 and 0 ≤ *τ<sup>D</sup>* ≤ 50.
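The open-loop behaviour of Eq. (12) follows from integrating the corresponding third-order ODE, 10 *y*''' + 5.114 *y*'' + 0.825 *y*' + 0.041 *y* = 0.0189 *u*. The explicit Euler sketch below (the step size and horizon are our own choices) recovers the steady-state gain *G*(0) = 0.0189/0.041:

```python
def step_response_eq12(t_end=400.0, dt=0.01):
    """Open-loop unit step response of Eq. (12) by explicit Euler
    integration of 10*y''' + 5.114*y'' + 0.825*y' + 0.041*y = 0.0189*u."""
    y = dy = d2y = 0.0
    for _ in range(int(t_end / dt)):
        d3y = (0.0189 * 1.0 - 5.114 * d2y - 0.825 * dy - 0.041 * y) / 10.0
        y += dt * dy
        dy += dt * d2y
        d2y += dt * d3y
    return y

final = step_response_eq12()  # approaches the steady-state gain G(0) = 0.0189/0.041
```
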

Table 6 presents the best value and standard deviation for the heat exchanger case study.

As in the previous case, all the algorithms presented good estimates for the unknown parameters, but the best results were obtained by the BiOM. The fluctuation of the control parameters can be observed in the standard deviations presented.

Figure 4 presents the temperature and flow profiles obtained using the classical methods and the BiOM. The oscillatory behaviour observed with the application of the BiOM should be emphasized, even though it lasts only for a short period of time (see Fig. 4(a)).


Table 6. Results obtained by the BiOM - heat exchanger case study.

Fig. 4. Temperature and flow (control action) profiles.

#### **4.3 Shell and tube heat exchanger**

Finally, consider the real system shown in Fig. 5 for analysis and application of the previously studied concepts. This system consists essentially of: (*i*) the main tank (P5) in stainless steel with a capacity of approximately 0.250 m3, (*ii*) stainless steel shell and tube heat exchanger (P1), (*iii*) positive displacement pump for movement of food products (P3), (*iv*) centrifugal pump for the heating agent movement (P2) and (*v*) vertical cylindrical storage tank water heater (P4) (Gedraite et al., 2011).

The Vettore-Manghi heat exchanger is responsible for heating the liquid food product that flows inside the tube bundle, considering four passes. The displacement of the process fluid inside the tubes of the heat exchanger is driven by a Robuschi model RE50-110 positive displacement pump (pump 2). The heating of the heat exchanger is done by hot water flowing through the shell side of the exchanger. The hot water is transported by a Robuschi model RE50-160 centrifugal pump (pump 3) and is heated at the expense of saturated steam produced in the H. BREMER steam generator, installed in a suitable and safe environment. The temperature of the process fluid is controlled by manipulating the flow of steam fed to the system, whose setting is done by the Fluxotrol model PK2117 control valve (P4), with reverse action. The hot water removed from the shell of the heat exchanger returns to the vertical tank, which is equipped with a safety valve. For cooling the product, the procedure is reversed, i.e., the

Fig. 5. Process and instrumentation diagram for the system studied (Gedraite et al., 2011).

flow control valve used to manipulate the heating steam flow is gradually closed. In this process, the response time is slower when compared with the heating time. Cooling occurs only as a result of the heat exchange between the body of the heat exchanger, the process fluid and the environment.

The data acquisition and control system is composed of a PC-based DAS (Data Acquisition System), which also works as a control computer. This system consists of the following items: (*i*) a PC microcomputer for the collection and storage of process data, (*ii*) a LabVIEW version 2009 application to perform monitoring, data acquisition and process control in real time, (*iii*) a National Instruments (NI) PCI-6259 data acquisition board, with 4 analog output channels and 32 analog input channels, both with an operating range of -10 V to +10 V and a resolution of 16 bits, and 48 programmable digital input/output channels, (*iv*) a set of NI model SHC68-68-EPM cables for the acquisition board, (*v*) an NI model CB-68LP connection terminal, (*vi*) INCON model CS01-1360 signal conditioners to match the signals from the temperature sensing elements, (*vii*) IOPE model 49312 type Pt 100 temperature sensors, (*viii*) a METROVAL flow meter model OI-2-SMRX/FS, (*ix*) an ENGINSTREL model 621-IPB electrical current to pressure signal converter, (*x*) a Fluxotrol model PK2117 pneumatic control valve and (*xi*) a Micronal model B474 pH meter.

#### **4.3.1 Approximate system model**


| Method | *Kc* | *τI* (s−1) | *τD* (s−1) | *OF* (Eq. 10) |
|---|---|---|---|---|
| ZN-SL | 5.8800 | 30.0000 | 7.5000 | 0.6206 |
| ZN-RC | 3.3600 | 27.0000 | 6.7500 | 2.1733 |
| CC-RC | 4.4300 | 26.0200 | 4.2500 | 0.9025 |
| BCA | 26.289 (10.949) | 10.086 (5.845) | 17.120 (4.467) | 0.0219 (0.0202) |
| FCA | 48.638 (13.425) | 6.577 (5.146) | 24.354 (4.898) | 0.0228 (0.033) |
| FSA | 47.274 (9.751) | 9.012 (7.209) | 17.508 (5.409) | 0.0238 (0.0157) |

Values in parentheses are the standard deviations.


The non-parametric identification process basically employs the response curves of the system when it is excited by input signals such as steps, impulses or sinusoids. From these curves, one can extract approximate low-order models that describe the dynamic behavior of the process (Aguirre, 2007). These models are reasonably accurate and can be assumed good enough to represent the system studied. In this work, they were used to perform the pre-tuning of the PID controllers and to mathematically model the dynamic behavior of pH versus time.

The input most commonly used for non-parametric identification of a process dynamics is the step (Aguirre, 2007). These tests can usually generate, by means of graphical representation, empirical dynamic models consisting of low-order transfer functions (1st or 2nd order, possibly including a dead time) with a maximum of four parameters to be determined experimentally.

Astrom & Hagglund (1995) state that many processes can be represented, in an approximate way, by combining four elements typically found in industrial processes, namely: (*i*) gain, (*ii*) transport delay, (*iii*) transfer delay and (*iv*) integrating element. The approximation of overdamped systems of order 2 or higher by a transfer delay plus a dead time (transport delay) can be represented by the transfer function shown in Eq. (13) (Aguirre, 2007):

$$G(s) = \frac{K\,e^{-\theta s}}{1 + \tau s} \tag{13}$$

where *K* is the gain, *τ* is the transfer delay and *θ* is the dead time (or transport delay).
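Eq. (13) has a closed-form unit-step response: the output stays at zero during the dead time *θ* and then rises as a first-order exponential. A small sketch, using the flow parameters that appear later in Eq. (14) (*K* = 1.3, *τ* = 14 s, *θ* = 2 s) purely as example values:

```python
import math

def fopdt_step(t, K, tau, theta, du=1.0):
    """Step response of G(s) = K*exp(-theta*s)/(1 + tau*s) (Eq. 13) to a step
    of amplitude du: zero during the dead time, then exponential rise to K*du."""
    if t < theta:
        return 0.0
    return K * du * (1.0 - math.exp(-(t - theta) / tau))

# At t = theta + tau the response has covered 63.2% of its final value K*du
y = fopdt_step(t=2.0 + 14.0, K=1.3, tau=14.0, theta=2.0)
```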

#### **4.4 Plant reaction curves**

Tests were made to obtain the process parameters related to the plant response to changes in flow and temperature. In this test, the equipment was put into operation with a steady flow of 7 L min−1 and a positive step of 3 L min−1 was applied at time 32 s, waiting for the system to stabilize. In the sequence, a negative step of 3 L min−1 was applied at time 203 s. The first step (7 to 10 L min−1) was adopted to obtain the process parameters, whose results are presented below. Figure 6 shows the system behavior in the situation examined.

Fig. 6. Step test at flow of process fluid.

In the test performed for the temperature, whose response is illustrated in Fig. 7, a constant flow of 9 L min−1 was used. The outlet temperature of the process fluid was set to 60 ◦C and a step in the stem position of the control valve installed at the steam line was applied at time 50 s, going from the fully closed condition to 50% opening. Then, at instant 1430 s, a second step of amplitude equal to 10% was applied. The analysis to obtain the process parameters considered the first step (50%).

The process parameters *K* (process gain), *τ* (process time constant) and *θ* (process dead time) were calculated using the method proposed by Aguirre (2007). The transfer functions obtained are presented in Eqs. (14) and (15):

*i*) flow: *K*=1.3 Lmin−1V−1, *τ*=14 s and *θ*=2 s

$$G(s) = \frac{1.3 \exp(-2s)}{1 + 14s} \tag{14}$$

Fig. 7. Step test for temperature (valve stem position at the steam line).

*ii*) temperature: *K*=7.22 ◦C, *τ*=378 s and *θ*=78 s


$$G\left(\text{s}\right) = \frac{7.22\exp(-78\text{s})}{1 + 378\text{s}}\tag{15}$$
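The reaction-curve parameter extraction described above can be sketched as follows. The exact procedure of Aguirre (2007) is not reproduced here; instead a common two-point (Smith) variant is used, based on the 28.3% and 63.2% crossing times of the step response. It is applied to synthetic data generated from the flow model of Eq. (14), so the known parameters should be recovered:

```python
import math

def identify_fopdt(t, y, du):
    """Two-point (Smith) reaction-curve fit for a FOPDT model: a common
    alternative to the method used in the text. Uses the times at which the
    response reaches 28.3% and 63.2% of its final value."""
    y_inf = y[-1]
    K = y_inf / du
    def crossing(frac):
        target = frac * y_inf
        for ti, yi in zip(t, y):
            if yi >= target:
                return ti
        return t[-1]
    t1, t2 = crossing(0.283), crossing(0.632)
    tau = 1.5 * (t2 - t1)
    theta = t2 - tau
    return K, tau, theta

# Synthetic step data from the identified flow model (Eq. 14): K=1.3, tau=14 s, theta=2 s
dt = 0.01
ts = [i * dt for i in range(int(120 / dt))]
ys = [0.0 if ti < 2.0 else 1.3 * (1 - math.exp(-(ti - 2.0) / 14.0)) for ti in ts]
K, tau, theta = identify_fopdt(ts, ys, du=1.0)
```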

In these simulations, the following ranges for the design parameters are considered: 0 ≤ *Kc* ≤ 50, 0 ≤ *τ<sup>I</sup>* ≤ 248 and 0 ≤ *τ<sup>D</sup>* ≤ 50.

Tables 7 and 8 present the average and standard deviation for the flow and the temperature case studies.


Table 7. Results obtained by BiOM - Flow case study.

In these tables it is possible to observe that both algorithms provided good estimates for the controller tuning, but the best results were obtained by the BiOM (a reduction of approximately 97% in comparison to the ZN-SL method). In addition, it is important to comment that if a larger range for the design variables were used, the value of the objective function would be further reduced. However, in spite of this reduction, the design found might not be physically viable, i.e., it can represent an infeasible condition in an industrial context, as illustrated in Fig. 11(a) for the classical methods.
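The tuning-as-optimization formulation can be sketched with a plain random search over the ranges above, using the temperature model of Eq. (15) as the plant. This is a deliberately simple stand-in for the bio-inspired algorithms (which use population-based updates instead), and the objective here is the plain integral of squared error, which may differ from the OF of Eq. (10):

```python
import random
from collections import deque

def ise(Kc, tau_i, tau_d, t_end=2500.0, dt=2.0):
    """Integral of squared error for a unit setpoint step on the temperature
    model of Eq. (15): G(s) = 7.22*exp(-78s)/(1 + 378s). Forward Euler with a
    FIFO buffer for the dead time; diverging candidates return inf."""
    K, tau, theta = 7.22, 378.0, 78.0
    n_delay = max(1, int(round(theta / dt)))
    delay = deque([0.0] * n_delay, maxlen=n_delay)
    y = integ = y_prev = of = 0.0
    for _ in range(int(t_end / dt)):
        e = 1.0 - y
        integ += e * dt
        deriv = -(y - y_prev) / dt
        u = Kc * (e + integ / max(tau_i, 1e-3) + tau_d * deriv)
        y_prev = y
        delay.append(u)          # u enters the dead-time buffer...
        u_del = delay[0]         # ...and emerges ~theta seconds later
        y += dt * (-y + K * u_del) / tau
        of += e * e * dt
        if abs(y) > 1e6:
            return float("inf")  # unstable tuning, discard
    return of

# Random search over the ranges from the text: 0<=Kc<=50, 0<=tauI<=248, 0<=tauD<=50
random.seed(1)
best_of, best = float("inf"), None
for _ in range(400):
    cand = (random.uniform(0, 50), random.uniform(0, 248), random.uniform(0, 50))
    of = ise(*cand)
    if of < best_of:
        best_of, best = of, cand
```

Most random candidates are unstable for this dead-time-dominant plant and are rejected; the surviving low-gain tunings illustrate why a guided search such as the BiOM is preferable to blind sampling.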

Figures 8 and 9 present the flow and temperature profiles using the classical methods and the BiOM. The control actions can also be observed in these figures (motor pump signal in Fig. 8(a) and steam valve signal in Fig. 9(b)).


Table 8. Results obtained by BiOM - Temperature case study.

Fig. 8. Flow profile and control action (motor pump signal).

Fig. 9. Temperature profile and control action (valve steam signal).

#### **5. Conclusions**


| Method | *Kc* | *τI* (s−1) | *τD* (s−1) | *OF* (Eq. 10) |
|---|---|---|---|---|
| ZN-SL | 0.8880 | 112.5 | 28.125 | 87303 |
| ZN-RC | 0.7453 | 124.0000 | 31.0000 | 64056 |
| CC-RC | 0.8758 | 142.8046 | 21.9605 | 77922 |
| BCA | 1.3713 (0.0010) | 248.0000 (0) | 38.1768 (0.0307) | 1929.1287 (0.2111) |
| FCA | 1.3708 (0.0018) | 248.0000 (0.0008) | 38.1905 (0.0518) | 1929.1073 (0.1990) |
| FSA | 1.3725 (0.0015) | 248.0000 (0.0023) | 38.1428 (0.0438) | 1929.1481 (0.1501) |

Values in parentheses are the standard deviations.


In the present contribution, the effectiveness of using the BiOM for controller tuning, through the formulation of an optimization problem, was analyzed.

In this sense, three cases were studied, and it was possible to conclude that both bio-inspired algorithms led to good results for an acceptable number of generations (1510) when compared to the classical methods. It should be pointed out that the quality of the solution obtained depends on the design space considered, i.e., if other ranges were used, other results could be found. Besides, it can also be observed that different combinations of the control parameters can lead to close values in terms of the objective function.

It is important to emphasize that the use of the BiOM does not intend to substitute the classical techniques for controller tuning, but rather to represent an interesting alternative for this purpose.

#### **6. Acknowledgment**

The authors acknowledge the financial support provided by CNPq, Conselho Nacional de Desenvolvimento Científico e Tecnológico and FAPEMIG, Fundação de Amparo à Pesquisa do Estado de Minas Gerais.

#### **7. References**



Conner, J. S. & Seborg, D. E. (2005). Assessing the Need for Process Re-identification. *Industrial & Engineering Chemistry Research*, Vol. 44, 2767–2775.

Desbourough, L. & Miller, R. (2002). Increasing Customer Value of Industrial Control Performance Monitoring - Honeywell's Experience. *Proc. Int. Conference on Chemical Process Control*, AIChE Symposium Series, N. 326, Vol. 98.

Garcia, C. (2005). *Modeling and Simulation*. EdUSP, São Paulo (in Portuguese), 120 pages.

Gedraite, R.; Lobato, F. S.; Neiro, S. M. da S.; Melero Jr., V.; Augusto, S. R. & Kunigk, L. (2011). CIP System Kinetics Mathematical Modeling: Dynamic Behavior of Residuals Removal Kinetics. *XXXII Iberian Latin-American Congress on Computational Methods in Engineering*.

Hamid, B.; Mohamed, T.; Pierre-Yves, G. & Salim, L. (2010). Tuning Fuzzy PD and PI Controllers using Reinforcement Learning. *ISA Transactions*, Vol. 49 (4), 543–551.

Hang, C. C.; Åström, K. J. & Ho, W. K. (1991). Refinements of the Ziegler-Nichols Tuning Formula. *IEE Proc-D*, Vol. 138 (2), 111–118.

Kim, T-H.; Maruta, I. & Sugie, T. (2008). Robust PID Controller Tuning based on the Constrained Particle Swarm Optimization. *Automatica*, Vol. 44, 1104–1110.

Li, X. L.; Shao, Z. J. & Qian, J. X. (2002). An Optimizing Method based on Autonomous Animate: Fish Swarm Algorithm. *System Engineering Theory and Practice*, Vol. 22 (11), 32–38.

Li, X. L.; Xue, Y. C.; Lu, F. & Tian, G. H. (2004). Parameter Estimation Method based on Artificial Fish School Algorithm. *Journal of Shan Dong University* (Engineering Science), Vol. 34 (3), 84–87.

Lobato, F. S. & Souza, D. L. (2008). Adaptive Differential Evolution Method Applied to Controllers Tuning. *7th Brazilian Conference on Dynamics, Control and Applications*.

Lobato, F. S.; Sousa, J. A.; Hori, C. E. & Steffen Jr, V. (2010). Improved Bees Colony Algorithm Applied to Chemical Engineering System Design. *International Review of Chemical Engineering* (Rapid Communications), Vol. 6, 1–7.

Lucic, P. & Teodorovic, D. (2001). Bee System: Modeling Combinatorial Optimization Transportation Engineering Problems by Swarm Intelligence. *TRISTAN IV Triennial Symposium on Transportation Analysis*, 441–445.

Lukasik, S. & Zak, S. (2009). Firefly Algorithm for Continuous Constrained Optimization Task. *ICCCI 2009, Lecture Notes in Artificial Intelligence* (Eds. N. T. Ngugen, R. Kowalczyk, S. M. Chen), Vol. 5796, 97–100.

Madeiro, S. S. (2010). *Modal Search for Swarm based on Density*. Dissertation, Universidade de Pernambuco (in Portuguese).

Mudi, R. K.; Dey, C. & Lee, T. T. (2008). An Improved Auto-Tuning Scheme for PI Controllers. *ISA Transactions*, Vol. 47, 45–52.

Pan, I.; Das, S. & Gupta, A. (2011). Tuning of an Optimal Fuzzy PID Controller with Stochastic Algorithms for Networked Control Systems with Random Time Delay. *ISA Transactions*, Vol. 50, 28–36.

Parrich, J.; Viscido, S. & Grunbaum, D. (2002). Self-organized Fish Schools: An Examination of Emergent Properties. *Biological Bulletin*, Vol. 202 (3), 296–305.

Pfeifer, A. A. & Lobato, F. S. (2010). Solution of Singular Optimal Control Problems using the Firefly Algorithm. *Proceedings of VI Congreso Argentino de Ingeniería Química - CAIQ2010*.

Wang, C. R.; Zhou, C. L. & Ma, J. W. (2005). An Improved Artificial Fish-Swarm Algorithm and Its Application in Feedforward Neural Networks. *Proc. of the Fourth Int. Conf. on Machine Learning and Cybernetics*, 2890–2894.

### **Adaptive Coordinated Cooperative Control of Multi-Mobile Manipulators**

Víctor H. Andaluz1, Paulo Leica2, Flavio Roberti2, Marcos Toibero2 and Ricardo Carelli2

*1Universidad Técnica de Ambato, Facultad de Ingeniería en Sistemas, Electrónica e Industrial, Ecuador*
*2Universidad Nacional de San Juan, Instituto de Automática, Argentina*

#### **1. Introduction**


162 Frontiers in Advanced Control Systems


A coordinated group of robots can execute certain tasks, *e.g.* surveillance of large areas (Hougen et al., 2000), search and rescue (Jennings et al., 1997), and large-object transportation (Stouten and De Graaf, 2004), more efficiently than a single specialized robot (Cao et al., 1997). Other tasks are simply not accomplishable by a single mobile robot and demand a group of coordinated robots, like the problem of sensor and actuator positioning (Bicchi et al., 2008) and the entrapment/escorting mission (Antonelli et al., 2008). In such a context, the term formation control arises, which can be defined as the problem of controlling the relative postures of the robots of a platoon that moves as a single structure (Consolini et al., 2007).

Mobile manipulator is nowadays a widespread term that refers to robots built by mounting a robotic arm on a mobile platform. This kind of system, which is usually characterized by a high degree of redundancy, combines the manipulability of a fixed-base manipulator with the mobility of a wheeled platform. Such systems allow the most usual missions of robotic systems which require both locomotion and manipulation abilities. The coordinated control of multiple mobile manipulators has attracted the attention of many researchers (Khatib et al., 1996; Fujii et al., 2007; Tanner et al., 2003; Yasuhisa et al., 2003). The interest in such systems stems from their capability for carrying out complex and dexterous tasks which cannot simply be performed by a single robot. Moreover, from a safety point of view, multiple small mobile manipulators are also more appropriate than a large and heavy mobile manipulator for carrying out several tasks in human environments.

Main coordination schemes for multiple mobile manipulators that can be found in the literature are:

1. Leader–follower control for mobile manipulators, where one mobile manipulator (or a group of them) plays the role of a leader, which tracks a preplanned trajectory, and the rest of the mobile manipulators form the follower group, which moves in conjunction with the leader mobile manipulators (Fujii et al., 2007; Hirata et al., 2004; Thomas et al., 2002). In Xin and Yangmin, 2006, a leader-follower type formation control is designed for a group of mobile manipulators. To overcome parameter uncertainty in the model of the robot, a decentralized control law is applied to individual robots, in which an adaptive NN is used to model the robot dynamics online.

2. Hybrid position–force control by a decentralized/centralized scheme, where the position of the object is controlled in a certain direction of the workspace and the internal force of the object is controlled in a small range around the origin (Khatib et al., 1996; Tanner et al., 2003; Yamamoto et al., 2004). In Zhijun et al., 2008, robust adaptive controllers of multiple mobile manipulators carrying a common object in a cooperative manner are investigated under unknown inertia parameters and disturbances. At first, a concise dynamics, consisting of the dynamics of the mobile manipulators and the geometrical constraints between the end-effectors and the object, is developed for coordinated multiple mobile manipulators. In Zhijun et al., 2009, coupled dynamics are presented for two cooperating mobile manipulators manipulating an object with relative motion in the presence of uncertainties and external disturbances. Centralized robust adaptive controllers are introduced to guarantee the motion and force trajectories of the constrained object. A simulation study of the decentralized dynamic control of a robot collective consisting of nonholonomic wheeled mobile manipulators is performed in Hao and Venkat, 2008, by tracking the trajectories of the load, where two reference signals are used for each robot, one for the mobile platform and another for the end-effector of the manipulating arm.

To reduce performance degradation, on-line parameter adaptation is relevant in applications where the mobile manipulator dynamic parameters may vary, such as load transportation. It is also useful when the knowledge of the dynamic parameters is limited. As an example, the trajectory tracking task can be severely affected by the change imposed to the robot dynamics when it is carrying an object, as shown in (Martins et al., 2008). Hence, some formation control architectures already proposed in the literature have considered the dynamics of the mobile robots (Zhijun et al., 2008; Zhijun et al., 2009).

In this Chapter, a novel method for centralized-decentralized coordinated cooperative control of multiple wheeled mobile manipulators is proposed. It is also worth noting that, differently from the work in Hao and Venkat, 2008, we use a single reference for the end-effector of each mobile manipulator robot.

Although centralized control approaches present intrinsic problems, like the difficulty to sustain the communication between the robots and the limited scalability, they have technical advantages when applied to control a group of robots with defined geometric formations. Therefore, there still exists significant interest in their use. As an example, in Antonelli et al., 2008, a centralized multi-robot system is proposed for an entrapment/escorting mission, where the escorted agent is kept in the centroid of a polygon of n sides, surrounded by n robots positioned in the vertices of the polygon. Another task for which it is important to keep a formation during navigation is large-objects transportation, since the load has a fixed geometric form. Another recent work dealing with centralized formation control is Mas et al., 2008, where a control approach based on a virtual structure, called Cluster Space Control, is presented. There, the positioning control is carried out considering the centroid of a geometric structure corresponding to a three-robot formation.



In this Chapter, the proposed strategy conceptualizes the mobile manipulators system (with *n* ≥ 3) as a single group, and the desired motions are specified as a function of cluster attributes, such as position, orientation, and geometry. These attributes guide the selection of a set of independent system state variables suitable for specification, control, and monitoring. The control is based on a virtual 3-dimensional structure, where the position control (or tracking control) is carried out considering the centroid of the upper side of a geometric structure (shaped as a prism) corresponding to a three-mobile-manipulator formation. It is worth noting that the control problem is first formulated for three mobile manipulator robots and then generalized to *n* mobile manipulator robots.

The proposed multi-layer control scheme is mainly divided into five modules: 1) the upper module is responsible for planning the trajectory to be followed by the team of mobile manipulators; 2) the next module controls the formation, whose shape is determined by the distance and angle between the end-effector of one mobile manipulator and the two other ones; 3) another module is responsible for generating the control signals for the end-effectors of the mobile manipulators, through the inverse kinematics of each robot. As a mobile manipulator is usually a redundant system, this redundancy can be used for the achievement of additional performances. In this layer two secondary objectives are considered: the avoidance of obstacles by the mobile platforms and the prevention of singular configurations through the control of the system's manipulability, introduced by Yoshikawa (1985); 4) the adaptive dynamic compensation module compensates the dynamics of each mobile manipulator to reduce the velocity tracking error. It is worth noting that this controller has been designed based on a dynamic model having reference velocities as input signals. Also, it uses a robust updating law, which makes the dynamic compensation system robust to parameter variations and guarantees that no parameter drift will occur; 5) finally, the robots module represents the mobile manipulators.
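Yoshikawa's manipulability measure mentioned in module 3 is $w(\mathbf{q}) = \sqrt{\det(\mathbf{J}\mathbf{J}^T)}$, which vanishes at singular configurations. A minimal sketch for a planar 2-link arm follows (the link lengths are illustrative values, not taken from this chapter); for a square Jacobian the measure reduces to $|\det \mathbf{J}|$:

```python
import math

def manipulability(q1, q2, l1=0.5, l2=0.4):
    """Yoshikawa (1985) manipulability for a planar 2-link arm. Since the
    position Jacobian is square here, sqrt(det(J J^T)) = |det(J)|, which
    analytically equals l1*l2*|sin(q2)|: zero when the arm is stretched."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    # Planar position Jacobian of the end-effector
    J = [[-l1 * s1 - l2 * s12, -l2 * s12],
         [ l1 * c1 + l2 * c12,  l2 * c12]]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return abs(det)

w_stretched = manipulability(0.3, 0.0)          # singular: arm fully stretched
w_bent = manipulability(0.3, math.pi / 2)       # well-conditioned posture
```

A singularity-prevention objective can then keep *w* above a threshold while the redundancy is resolved.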

It is worth noting that we propose a methodology to avoid obstacles in the trajectory of any mobile manipulator, based on the concept of mechanical impedance of the robot-environment interaction, without deforming the virtual structure or abandoning its desired trajectory. It is considered that the obstacle is placed at a height such that it does not interfere with the workspace, so that the arm of the mobile manipulator can follow the desired trajectory even while the platform is avoiding the obstacle.

This Chapter is organized as follows. Section 2 shows the kinematic and dynamic models of the mobile manipulator. Section 3 presents the proposed multi-layer control scheme for the coordinated and cooperative control of mobile manipulators, while the forward and inverse kinematics transformations necessary for the control scheme are presented in Section 4. Section 5 describes the scalability of the coordinated cooperative control of mobile manipulators. In turn, Section 6 presents the design of the controller, and the analysis of the system's stability is developed. Next, simulation results are presented and discussed in Section 7, and finally the Chapter conclusions are given in Section 8.

#### **2. Mobile manipulator models**

The mobile manipulator configuration is defined by a vector **q** of *n* independent coordinates, called *generalized coordinates of the mobile manipulator*, where $\mathbf{q} = [q_1 \; q_2 \; \dots \; q_n]^T = [\mathbf{q}_p^T \; \mathbf{q}_a^T]^T$, in which $\mathbf{q}_a$ represents the generalized coordinates of the arm and $\mathbf{q}_p$ the generalized coordinates of the mobile platform. We notice that $n = n_a + n_p$, where $n_a$ and $n_p$ are respectively the dimensions of the generalized spaces associated to the robotic arm and to the mobile platform. The configuration **q** is an element of the mobile manipulator *configuration space*, denoted by $\mathcal{N}$. The location of the end-effector of the mobile manipulator is given by the *m*-dimensional vector $\mathbf{h} = [h_1 \; h_2 \; \dots \; h_m]^T$, which defines the position and the orientation of the end-effector of the mobile manipulator in $\mathcal{R}$. Its *m* coordinates are the *operational coordinates of the mobile manipulator*. The set of all locations constitutes the *mobile manipulator operational space*, denoted by $\mathcal{M}$.

The location of the mobile manipulator end-effector can be defined in different ways according to the task, *i.e.*, it can be considered only the position of the end-effector or both its position and its orientation*.*

#### **2.1 Mobile manipulator kinematic model**

The *kinematic model of a mobile manipulator* gives the location of the end-effector **h** as a function of the robotic arm configuration and the platform location (or its operational coordinates as functions of the robotic arm generalized coordinates and the mobile platform operational coordinates).

$$\begin{aligned} f: \mathcal{N}\_a \times \mathcal{M}\_p &\to \mathcal{M} \\\\ \left(\mathbf{q}\_a, \mathbf{q}\_p\right) &\mapsto \quad \mathbf{h} = f\left(\mathbf{q}\_a, \mathbf{q}\_p\right) \end{aligned}$$

where $\mathcal{N}\_a$ is the *configuration space* of the robotic arm and $\mathcal{M}\_p$ is the *operational space of the mobile platform*.

The *instantaneous kinematic model of a mobile manipulator* gives the derivative of its end-effector location as a function of the derivatives of both the robotic arm configuration and the location of the mobile platform,

$$\dot{\mathbf{h}} = \frac{\partial f}{\partial \mathbf{q}}(\mathbf{q}\_a, \mathbf{q}\_p)\,\dot{\mathbf{q}}$$

where $\dot{\mathbf{h}} = [\dot{h}\_1 \; \dot{h}\_2 \; \dots \; \dot{h}\_m]^T$ is the vector of the end-effector velocity and $\mathbf{v} = [v\_1 \; v\_2 \; \dots \; v\_n]^T = [\mathbf{v}\_p^T \; \mathbf{v}\_a^T]^T$ is the control vector of mobility of the mobile manipulator. Its dimension is $n = n\_p + n\_a$, where $n\_p$ and $n\_a$ are respectively the dimensions of the control vector of mobility associated to the mobile platform and to the robotic arm. Now, replacing $\dot{\mathbf{q}} = \mathbf{T}(\mathbf{q})\mathbf{v}$ and defining $\mathbf{J}(\mathbf{q}) = \frac{\partial f}{\partial \mathbf{q}}(\mathbf{q}\_a, \mathbf{q}\_p)\,\mathbf{T}(\mathbf{q})$ in the above equation, we obtain

$$\dot{\mathbf{h}}(t) = \mathbf{J}(\mathbf{q})\mathbf{v}(t) \tag{1}$$

where $\mathbf{J}(\mathbf{q})$ is the Jacobian matrix that defines a linear mapping between the vector of mobile manipulator velocities $\mathbf{v}(t)$ and the vector of end-effector velocity $\dot{\mathbf{h}}(t)$, and $\mathbf{T}(\mathbf{q})$ is the transformation matrix that relates the joint velocities $\dot{\mathbf{q}}(t)$ and the mobile manipulator velocities $\mathbf{v}(t)$, such that $\dot{\mathbf{q}}(t) = \mathbf{T}(\mathbf{q})\mathbf{v}(t)$.

*Remark 1:* The transformation matrix $\mathbf{T}(\mathbf{q})$ includes the non-holonomic constraints of the mobile platform.

The Jacobian matrix is, in general, a function of the configuration $\mathbf{q}$; those configurations at which $\mathbf{J}(\mathbf{q})$ is rank-deficient are termed *singular kinematic configurations*. It is fundamental to notice that, in general, the dimension of the operational space $m$ is less than the degree of mobility of the mobile manipulator; therefore, the system is redundant.
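As an illustration of (1), the matrices $\mathbf{T}(\mathbf{q})$ and $\mathbf{J}(\mathbf{q})$ can be worked out for a minimal, hypothetical system: a unicycle-like platform $(x, y, \phi)$ carrying a one-link planar arm with joint $\theta$, with the planar end-effector position taken as $\mathbf{h}$. This is a hedged sketch only; the platform model, the arm geometry and all names (`f`, `T`, `J`, `L`) are assumptions for illustration, not the chapter's robots.

```python
import numpy as np

# Illustrative sketch (not from the chapter): a unicycle mobile platform
# q_p = [x, y, phi] carrying a one-link planar arm q_a = [theta].
# v = [u, w, dtheta] is the mobility control vector; T(q) embeds the
# non-holonomic constraint of the platform (Remark 1).
L = 0.5  # assumed arm length [m]

def f(q):
    """End-effector position h = f(q) in the plane."""
    x, y, phi, theta = q
    return np.array([x + L * np.cos(phi + theta),
                     y + L * np.sin(phi + theta)])

def T(q):
    """Transformation matrix: dq = T(q) v, with v = [u, w, dtheta]."""
    _, _, phi, _ = q
    return np.array([[np.cos(phi), 0.0, 0.0],
                     [np.sin(phi), 0.0, 0.0],
                     [0.0,         1.0, 0.0],
                     [0.0,         0.0, 1.0]])

def J(q):
    """Jacobian of eq. (1): J(q) = (df/dq) T(q)."""
    _, _, phi, theta = q
    s, c = L * np.sin(phi + theta), L * np.cos(phi + theta)
    dfdq = np.array([[1.0, 0.0, -s, -s],
                     [0.0, 1.0,  c,  c]])
    return dfdq @ T(q)

# Check dh = J(q) v against a finite difference of f along dq = T(q) v.
q = np.array([1.0, 2.0, 0.3, 0.7])
v = np.array([0.4, -0.2, 0.1])
eps = 1e-6
fd = (f(q + eps * (T(q) @ v)) - f(q)) / eps
assert np.allclose(J(q) @ v, fd, atol=1e-5)
```

The finite-difference check mirrors the definition $\mathbf{J}(\mathbf{q}) = \frac{\partial f}{\partial \mathbf{q}}\mathbf{T}(\mathbf{q})$: perturbing $\mathbf{q}$ along $\mathbf{T}(\mathbf{q})\mathbf{v}$ must reproduce $\mathbf{J}(\mathbf{q})\mathbf{v}$.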

#### **2.2 Mobile manipulator dynamic model**

166 Frontiers in Advanced Control Systems

The mathematical model that represents the dynamics of a mobile manipulator can be obtained from Lagrange's dynamic equations, which are based on the difference between the kinetic and the potential energy of each of the joints of the robot (energy balance) (Spong and Vidyasagar, 1989; Yoshikawa, 1990; Sciavicco and Siciliano, 2000). The dynamic equation of the mobile manipulator can be represented as follows,

$$
\overline{\mathbf{M}}(\mathbf{q})\dot{\mathbf{v}} + \overline{\mathbf{C}}(\mathbf{q}, \mathbf{v})\mathbf{v} + \overline{\mathbf{G}}(\mathbf{q}) + \overline{\mathbf{d}} = \overline{\mathbf{B}}(\mathbf{q})\boldsymbol{\tau} \tag{2}
$$

where $\mathbf{q} = [q\_1, \dots, q\_n]^T$ is the vector of generalized coordinates of the mobile manipulator, $\mathbf{v} = [v\_1, \dots, v\_n]^T$ is the velocity vector of the mobile manipulator, $\overline{\mathbf{M}}(\mathbf{q}) \in \mathbb{R}^{n \times n}$ is a symmetric positive definite matrix that represents the system's inertia, $\overline{\mathbf{C}}(\mathbf{q}, \mathbf{v})\mathbf{v} \in \mathbb{R}^{n}$ represents the components of the centripetal and Coriolis forces, $\overline{\mathbf{G}}(\mathbf{q}) \in \mathbb{R}^{n}$ represents the gravitational forces, $\overline{\mathbf{d}}$ denotes bounded unknown disturbances including the unmodeled dynamics, $\boldsymbol{\tau} \in \mathbb{R}^{n}$ is the torque input vector, and $\overline{\mathbf{B}}(\mathbf{q}) \in \mathbb{R}^{n \times n}$ is the transformation matrix of the control actions.

Most of the commercially available robots have low-level PID controllers to follow the reference velocity inputs, and do not allow the motor voltages to be driven directly. Therefore, it becomes useful to express the dynamic model of the mobile manipulator in a more appropriate way, taking the rotational and longitudinal reference velocities as the control signals. To do so, the velocity servo controller dynamics are included in the model. The dynamic model of the mobile manipulator, having as control signals the reference velocities of the system, can be represented as follows,

$$\mathbf{M}(\mathbf{q})\dot{\mathbf{v}} + \mathbf{C}(\mathbf{q}, \mathbf{v})\mathbf{v} + \mathbf{G}(\mathbf{q}) + \mathbf{d} = \mathbf{v}\_{\text{ref}} \tag{3}$$

where $\mathbf{M}(\mathbf{q}) = \mathbf{H}^{-1}\overline{\mathbf{M}} + \mathbf{D}$, $\mathbf{C}(\mathbf{q}, \mathbf{v}) = \mathbf{H}^{-1}\overline{\mathbf{C}} + \mathbf{P}$, $\mathbf{G}(\mathbf{q}) = \mathbf{H}^{-1}\overline{\mathbf{G}}(\mathbf{q})$ and $\mathbf{d} = \mathbf{H}^{-1}\overline{\mathbf{d}}$. Thus, $\mathbf{M}(\mathbf{q}) \in \mathbb{R}^{n \times n}$ is a positive definite matrix, $\mathbf{C}(\mathbf{q}, \mathbf{v})\mathbf{v} \in \mathbb{R}^{n}$, $\mathbf{G}(\mathbf{q}) \in \mathbb{R}^{n}$, $\mathbf{d} \in \mathbb{R}^{n}$, and $\mathbf{v}\_{\text{ref}} \in \mathbb{R}^{n}$ is the vector of velocity control signals, while $\mathbf{H} \in \mathbb{R}^{n \times n}$, $\mathbf{D} \in \mathbb{R}^{n \times n}$ and $\mathbf{P} \in \mathbb{R}^{n \times n}$ are positive definite constant diagonal matrices containing the physical parameters of the mobile manipulator, motors, and velocity controllers of both the mobile platform and the manipulator. Since $\mathbf{H}$, $\mathbf{D}$ and $\mathbf{P}$ are positive definite constant diagonal matrices, the following properties of the dynamic model (3), with reference velocities as control signals, are obtained from the properties of the dynamic model (2):

*Property 1*. Matrix $\mathbf{M}(\mathbf{q})$ is positive definite; additionally, it is known that

$$\|\mathbf{M}(\mathbf{q})\| < k\_M$$

*Property 2.* Furthermore, the following inequalities are also satisfied

$$\|\mathbf{C}(\mathbf{q}, \mathbf{v})\| < k\_c \|\mathbf{v}\|$$

*Property 3.* The vectors $\mathbf{G}(\mathbf{q})$ and $\mathbf{d}$ are bounded:

$$\|\mathbf{G}(\mathbf{q})\| < k\_G \quad ; \qquad \|\mathbf{d}\| < k\_d$$

where $k\_c$, $k\_M$, $k\_G$ and $k\_d$ denote some positive constants.

*Property 4.* The dynamic model of the mobile manipulator can be represented by

$$\mathbf{M}(\mathbf{q})\dot{\mathbf{v}} + \mathbf{C}(\mathbf{q}, \mathbf{v})\mathbf{v} + \mathbf{G}(\mathbf{q}) + \mathbf{d} = \boldsymbol{\Phi}(\mathbf{q}, \mathbf{v})\boldsymbol{\chi}$$

where $\boldsymbol{\Phi}(\mathbf{q}, \mathbf{v}) \in \mathbb{R}^{n \times l}$ and $\boldsymbol{\chi} = [\chi\_1 \; \chi\_2 \; \dots \; \chi\_l]^T$ is the vector of $l$ unknown parameters of the mobile manipulator, *i.e.*, the mass of the mobile robot, the mass of the robotic arm, the physical parameters of the motors and velocity controllers, etc.
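Property 4 (linearity in the unknown parameters) can be made concrete on a toy scalar system. The model, the regressor and all numbers below are assumptions for illustration only, not the chapter's dynamics; they merely show why the linear-in-parameters form is useful for estimating $\boldsymbol{\chi}$.

```python
import numpy as np

# Hedged toy illustration of Property 4 (not the chapter's model): a scalar
# dynamics m*dv + c*v + (g + d) = tau is linear in the unknown parameters
# chi = [m, c, g + d], with regressor Phi(q, v) = [dv, v, 1].
chi_true = np.array([2.0, 0.5, 0.3])   # assumed "unknown" parameters

def phi(v, dv):
    """Regressor row Phi(q, v) for the scalar toy model."""
    return np.array([dv, v, 1.0])

# Sample torques over random motions, then recover chi by least squares;
# this is the mechanism an adaptive compensation scheme can exploit.
rng = np.random.default_rng(0)
V, DV = rng.normal(size=50), rng.normal(size=50)
Phi = np.stack([phi(v, dv) for v, dv in zip(V, DV)])
Tau = Phi @ chi_true                    # Phi(q, v) chi
chi_hat = np.linalg.lstsq(Phi, Tau, rcond=None)[0]
assert np.allclose(chi_hat, chi_true, atol=1e-8)
```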

For the sake of simplicity, from now on it will be written $\mathbf{M} \equiv \mathbf{M}(\mathbf{q})$, $\mathbf{C} \equiv \mathbf{C}(\mathbf{q}, \mathbf{v})$, and $\mathbf{G} \equiv \mathbf{G}(\mathbf{q})$.

Hence, the full mathematical model of the mobile manipulator robot is represented by (1), the *instantaneous kinematic model* and (3), the *dynamic model*, taking the reference velocities of the system as input signals.
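A minimal numerical sketch of model (3), under assumed scalar values, shows the role of the velocity reference as control signal: the closed velocity loop drives $v$ toward a constant near $v\_{ref}$, offset by the gravity and disturbance terms. All constants are illustrative assumptions, not identified parameters.

```python
# Hedged scalar sketch of model (3) (illustrative values, not from the
# chapter): M*dv + C*v + G + d = v_ref, integrated with forward Euler.
M, C, G, d = 1.5, 0.8, 0.2, 0.05      # assumed positive constants
v_ref, v, dt = 1.0, 0.0, 1e-3
for _ in range(20000):                # 20 s of simulated time
    dv = (v_ref - C * v - G - d) / M  # solve (3) for dv
    v += dt * dv
# Steady state of (3) with constant v_ref: v_ss = (v_ref - G - d) / C
assert abs(v - (v_ref - G - d) / C) < 1e-3
```

The residual offset between $v$ and $v\_{ref}$ at steady state is exactly what the Adaptive Dynamic Compensation layer of Section 3 is meant to remove.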

#### **3. Multi-layer control scheme**

Figure 1 shows the multi-layer control scheme for the coordinated cooperative control of mobile manipulators that is considered in this Chapter.

Each layer works as an independent module, dealing with a specific part of the problem of coordinated cooperative control, and such control scheme includes a basic structure defined by the formation control layer, the kinematic control layer, the robots layer and the environment layer.

Fig. 1. Multi-layer control scheme


One of the main advantages of the proposed scheme is the independence of each layer, *i.e.*, changes within a layer do not cause structural changes in the other layers. As an example, several kinematic controllers or dynamic compensation approaches can be tested using the same formation control strategy, and vice-versa. It is worth mentioning that a simpler structure can be obtained from the presented scheme, that is, some layers can be eliminated whenever the basic structure is maintained and the absence of the eliminated layers does not affect the remaining layers. For example, the On-line Planning layer could be discarded in the case of trajectory tracking or path following by a multi-robot formation in a known environment free of obstacles, because the entire task accomplishment is controlled by the Formation Control layer. Also, the Adaptive Dynamic Compensation layer can be suppressed for applications demanding low velocities and light load transportation.

On the other hand, it is important to stress that some additional blocks are necessary to complete the multi-layer scheme, such as $\mathbf{J}\_{\mathbf{F}}^{-1}(\mathbf{r})$ and $f(\mathbf{x})$, which represent the inverse formation Jacobian matrix and the forward kinematic transformation function for the formation, respectively.

*Remark 2:* The mobile manipulators can be different, *i.e*., each mobile manipulator can be built from different types of mobile platforms and/or different types of robotic arms. Thus, each mobile manipulator has its own configuration.

*Remark 3:* A mobile manipulator is defined as a redundant system because it has more degrees of freedom than those required to achieve the desired end-effector motion. Hence, the redundancy of such systems can be effectively used to achieve additional performance objectives.

#### **4. Kinematic transformation**

The proposed coordinated cooperative control method considers three or more mobile manipulators. In a first step, only three mobile manipulators are considered. In this case, the control method is based on creating a regular or irregular prism defined by the positions of the end-effectors of the mobile manipulators. The location of the upper side of the prism in the plane *X-Y* of the global framework is defined by $\mathbf{P\_F} = [x\_F \; y\_F \; \psi\_F]$, where $(x\_F, y\_F)$ represents the position of its centroid and $\psi\_F$ represents its orientation with respect to the global *Y*-axis. The structure shape of the prism (regular or irregular) is defined by $\mathbf{S\_F} = [p\_F \; q\_F \; \beta\_F \; z\_{1F} \; z\_{2F} \; z\_{3F}]$, where $p\_F$ represents the distance between $\mathbf{h}\_1$ and $\mathbf{h}\_2$, $q\_F$ the distance between $\mathbf{h}\_1$ and $\mathbf{h}\_3$, $\beta\_F$ the angle $\widehat{\mathbf{h}\_2\mathbf{h}\_1\mathbf{h}\_3}$ formed at $\mathbf{h}\_1$, and $z\_{1F}, z\_{2F}, z\_{3F}$ the heights of the upper side of the prism. This situation is illustrated in Figure 2.

*Remark 4:* $\mathbf{h}\_i$ represents the position of the end-effector of the $i$-th mobile manipulator.

The relationship between the prism pose-orientation-shape and the end-effector positions of the mobile manipulators is given by the forward and inverse kinematics transformation, *i.e.*,

$$\mathbf{r} = f\left(\mathbf{x}\right) \text{ and } \mathbf{x} = f^{-1}\left(\mathbf{r}\right) \text{, where } \mathbf{r} = \begin{bmatrix} \mathbf{P}\_{\mathbf{F}} & \mathbf{S}\_{\mathbf{F}} \end{bmatrix}^{T} \text{ and } \mathbf{x} = \begin{bmatrix} \mathbf{h}\_{1}^{T} & \mathbf{h}\_{2}^{T} & \mathbf{h}\_{3}^{T} \end{bmatrix}^{T}.$$

Fig. 2. Structure variables


The forward kinematic transformation $f(.)$, as shown in Figure 2, is given by

$$\mathbf{P\_{F}} = \begin{bmatrix} \frac{x\_{1} + x\_{2} + x\_{3}}{3} \\\\ \frac{y\_{1} + y\_{2} + y\_{3}}{3} \\\\ \arctan\frac{\frac{2}{3}x\_{1} - \frac{1}{3}\left(x\_{2} + x\_{3}\right)}{\frac{2}{3}y\_{1} - \frac{1}{3}\left(y\_{2} + y\_{3}\right)} \end{bmatrix}; \quad \mathbf{S\_{F}} = \begin{bmatrix} \sqrt{\left(x\_{1} - x\_{2}\right)^{2} + \left(y\_{1} - y\_{2}\right)^{2}} \\\\ \sqrt{\left(x\_{1} - x\_{3}\right)^{2} + \left(y\_{1} - y\_{3}\right)^{2}} \\\\ \arccos\frac{p\_{F}^{2} + q\_{F}^{2} - r\_{F}^{2}}{2p\_{F}q\_{F}} \\\\ z\_{1F} \\\\ z\_{2F} \\\\ z\_{3F} \end{bmatrix}$$

where $r\_F = \sqrt{(x\_2 - x\_3)^2 + (y\_2 - y\_3)^2}$. In turn, for the inverse kinematic transformation $f^{-1}(.)$, two representations are possible, depending on the disposition of the mobile manipulators in the prism shape (clockwise or counter-clockwise). Such disposition can be referred to as the $R\_1R\_2R\_3$ or the $R\_1R\_3R\_2$ sequence ($R\_i$ represents the $i$-th mobile manipulator robot). Considering the first possibility, $\mathbf{x} = f^{-1}\_{R\_1R\_2R\_3}(\mathbf{r})$ is given by,

$$\mathbf{x} = \begin{bmatrix} \mathbf{h}\_{1} \\\\ \mathbf{h}\_{2} \\\\ \mathbf{h}\_{3} \end{bmatrix} = \begin{bmatrix} x\_{F} + \frac{2}{3}h\_{F}\sin\psi\_{F} \\\\ y\_{F} + \frac{2}{3}h\_{F}\cos\psi\_{F} \\\\ z\_{1F} \\\\ x\_{F} + \frac{2}{3}h\_{F}\sin\psi\_{F} - p\_{F}\sin\left(\alpha + \psi\_{F}\right) \\\\ y\_{F} + \frac{2}{3}h\_{F}\cos\psi\_{F} - p\_{F}\cos\left(\alpha + \psi\_{F}\right) \\\\ z\_{2F} \\\\ x\_{F} + \frac{2}{3}h\_{F}\sin\psi\_{F} + q\_{F}\sin\left(\beta\_{F} - \alpha - \psi\_{F}\right) \\\\ y\_{F} + \frac{2}{3}h\_{F}\cos\psi\_{F} - q\_{F}\cos\left(\beta\_{F} - \alpha - \psi\_{F}\right) \\\\ z\_{3F} \end{bmatrix}$$

where $h\_F = \frac{1}{2}\sqrt{2p\_F^2 + 2q\_F^2 - r\_F^2}$ represents the distance between the end-effector $\mathbf{h}\_1$ and the point in the middle of the segment $\overline{\mathbf{h}\_2\mathbf{h}\_3}$, passing through $(x\_F, y\_F)$, and $\alpha = \arccos\frac{p\_F^2 + h\_F^2 - \frac{1}{4}r\_F^2}{2p\_F h\_F}$. On the other hand, $\mathbf{x} = f^{-1}\_{R\_1R\_3R\_2}(\mathbf{r})$ is given by

$$\mathbf{x} = \begin{bmatrix} \mathbf{h}\_{1} \\\\ \mathbf{h}\_{2} \\\\ \mathbf{h}\_{3} \end{bmatrix} = \begin{bmatrix} x\_{F} + \frac{2}{3}h\_{F}\sin\psi\_{F} \\\\ y\_{F} + \frac{2}{3}h\_{F}\cos\psi\_{F} \\\\ z\_{1F} \\\\ x\_{F} + \frac{2}{3}h\_{F}\sin\psi\_{F} + p\_{F}\sin\left(\alpha - \psi\_{F}\right) \\\\ y\_{F} + \frac{2}{3}h\_{F}\cos\psi\_{F} - p\_{F}\cos\left(\alpha - \psi\_{F}\right) \\\\ z\_{2F} \\\\ x\_{F} + \frac{2}{3}h\_{F}\sin\psi\_{F} - q\_{F}\sin\left(\beta\_{F} - \alpha + \psi\_{F}\right) \\\\ y\_{F} + \frac{2}{3}h\_{F}\cos\psi\_{F} - q\_{F}\cos\left(\beta\_{F} - \alpha + \psi\_{F}\right) \\\\ z\_{3F} \end{bmatrix}$$
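The forward transformation and the $R\_1R\_2R\_3$ inverse above can be cross-checked numerically. The sketch below is illustrative only (function names and the test configuration are assumptions); it verifies that $f^{-1}(f(\mathbf{x})) = \mathbf{x}$ for one counter-clockwise arrangement of end-effectors.

```python
import numpy as np

# Hedged numeric round-trip check of r = f(x) and x = f^-1(r) (R1R2R3 case).
def forward(h1, h2, h3):
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = h1, h2, h3
    xF, yF = (x1 + x2 + x3) / 3, (y1 + y2 + y3) / 3
    psi = np.arctan2(x1 - xF, y1 - yF)           # orientation w.r.t. Y-axis
    p = np.hypot(x1 - x2, y1 - y2)               # |h1 h2|
    q = np.hypot(x1 - x3, y1 - y3)               # |h1 h3|
    r = np.hypot(x2 - x3, y2 - y3)               # |h2 h3|
    beta = np.arccos((p**2 + q**2 - r**2) / (2 * p * q))
    return np.array([xF, yF, psi, p, q, beta, z1, z2, z3])

def inverse(rv):
    xF, yF, psi, p, q, beta, z1, z2, z3 = rv
    r2 = p**2 + q**2 - 2 * p * q * np.cos(beta)  # law of cosines for r_F^2
    hF = 0.5 * np.sqrt(2 * p**2 + 2 * q**2 - r2)  # median from h1
    alpha = np.arccos((p**2 + hF**2 - r2 / 4) / (2 * p * hF))
    cx = xF + (2 / 3) * hF * np.sin(psi)
    cy = yF + (2 / 3) * hF * np.cos(psi)
    h1 = (cx, cy, z1)
    h2 = (cx - p * np.sin(alpha + psi), cy - p * np.cos(alpha + psi), z2)
    h3 = (cx + q * np.sin(beta - alpha - psi),
          cy - q * np.cos(beta - alpha - psi), z3)
    return h1, h2, h3

# Assumed counter-clockwise configuration of the three end-effectors.
h = ((0.0, 1.0, 0.4), (-1.0, -1.0, 0.5), (1.0, -1.0, 0.6))
rv = forward(*h)
assert np.allclose(inverse(rv), h, atol=1e-9)
```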

Figure 3 shows the control structure proposed in this Chapter for the coordinated cooperative control of mobile manipulators. Taking the time derivative of the forward and the inverse kinematics transformations, we can obtain the relationship between the time variations of $\mathbf{x}(t)$ and $\mathbf{r}(t)$, represented by the Jacobian matrix $\mathbf{J}\_{\mathbf{F}}$, which is given by

$$
\dot{\mathbf{r}} = \mathbf{J}\_F(\mathbf{x}) \dot{\mathbf{x}} \tag{4}
$$

and in the inverse way is given by

$$
\dot{\mathbf{x}} = \mathbf{J}\_{\mathbf{F}}^{-1}(\mathbf{r})\dot{\mathbf{r}}\tag{5}
$$

where,

$$\mathbf{J}\_{\mathbf{F}}(\mathbf{x}) = \frac{\partial \mathbf{r}\_{f \times 1}}{\partial \mathbf{x}\_{e \times 1}} \qquad \text{and} \qquad \mathbf{J}\_{\mathbf{F}}^{-1}(\mathbf{r}) = \frac{\partial \mathbf{x}\_{e \times 1}}{\partial \mathbf{r}\_{f \times 1}} \qquad \text{with } e, f = 1, 2, \dots, 9.$$

Fig. 3. Control system block diagram


### **5. Scalability for the cooperative control of multi-mobile manipulators**

This Section proposes a way to generalize the control system associated to the coordinated cooperation of three mobile manipulators (virtual structure prism) to the coordinated cooperation of $n > 3$ mobile manipulators. Such proposition is based on the decomposition of a virtual 3-dimensional structure of $n$ vertices into simpler components, *i.e*., $n - 2$ prisms. The idea is to take advantage of the control scheme proposed for a virtual prism to implement a coordinated cooperative control of $n > 3$ mobile manipulators using the same kinematic transformations presented in Section 4, thus not demanding any change to the Jacobian (Figure 4).

Fig. 4. Scalability in the multi-layer control

To do that, one should first label the mobile manipulators $R\_i$, $i = 1, 2, 3, \dots, n$, and determine the leader prism of the whole formation ($R\_1R\_2R\_3$ or $R\_1R\_3R\_2$, paying attention to the sequence **ABC** or **ACB**). After that, new prisms are formed with the remaining mobile manipulators, based on a simple algorithm: a new prism is formed with the last two mobile manipulator robots of the last prism already formed and the next mobile manipulator in the list of labelled mobile manipulators (in other words, $R\_jR\_{j+1}R\_{j+2}$ or $R\_jR\_{j+2}R\_{j+1}$, where $j = 1, 2, \dots, n - 2$ represents the current virtual structure prism). Additionally, as in the previous Section, a set of desired virtual structure variables $\mathbf{S}\_{F\_j} = [p\_{F\_j} \; q\_{F\_j} \; \beta\_{F\_j} \; z\_{1F\_j} \; z\_{2F\_j} \; z\_{3F\_j}]$ is assigned to each virtual structure prism. Actually, the number of virtual structure variables per prism is the same, but three of them have their values already defined by the previous formation, because it is assumed that the variables shared with the previous prism (the common base segment and the heights of the two shared end-effectors) are inherited from $\mathbf{S}\_{F\_j}$.
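The labeling rule above amounts to a sliding-window construction over the robot list. The helper below is a hedged sketch (names assumed) that only generates the vertex labels of the $n - 2$ prisms, ignoring the ABC/ACB orientation choice.

```python
# Hedged sketch of the prism-chaining rule described above (labels only):
# the leader prism uses robots 1, 2, 3; each new prism reuses the last two
# robots of the previous prism plus the next robot in the list.
def prisms(n):
    """Return the n - 2 vertex triples (R_j, R_j+1, R_j+2) for n >= 3 robots."""
    return [(j, j + 1, j + 2) for j in range(1, n - 1)]

assert prisms(3) == [(1, 2, 3)]
assert prisms(5) == [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
```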

One point that deserves to be mentioned here is the control signals generated: there will always be a redundancy in the virtual structures with more than three mobile manipulators. For example, the mobile manipulators $R\_2$ and $R\_3$, in a virtual structure of four mobile manipulators, will receive control signals associated to the errors of the two virtual prisms ($R\_1R\_2R\_3$ and $R\_2R\_3R\_4$, for example). In this work, however, the implementation chosen is one in which the mobile manipulators $R\_{j+2}$ will receive control signals only from the controllers associated to the prisms $j \geq 2$, while the mobile manipulators $R\_1$, $R\_2$ and $R\_3$ will receive the signals generated by the controller associated to the leader prism ($j = 1$).

*Remark 5:* The proposed structure is also modular in the horizontal sense, *i.e*., it grows horizontally whenever a new robot is added to the formation.

*Remark 6:* The proposed structure is not centralized, since a controller is associated to each robot, except for the three first robots, which are governed by a single controller.

#### **6. Controllers design**

This Section presents the design of the controllers for the following control layers: Formation Control, Kinematic Control and Adaptive Dynamic Compensation. It is worth remarking that both the kinematic control and the adaptive dynamic compensation are performed separately for each mobile manipulator robot.

#### **6.1 Formation controller**

The Formation Control layer receives from the upper layer the desired formation pose and shape $\mathbf{r}\_d = [\mathbf{P}\_{\mathbf{F}d} \; \mathbf{S}\_{\mathbf{F}d}]^T$ and its desired variations $\dot{\mathbf{r}}\_d = [\dot{\mathbf{P}}\_{\mathbf{F}d} \; \dot{\mathbf{S}}\_{\mathbf{F}d}]^T$. It generates the pose and shape variation references $\dot{\mathbf{r}}\_{ref} = [\dot{\mathbf{P}}\_{\mathbf{F}ref} \; \dot{\mathbf{S}}\_{\mathbf{F}ref}]^T$, where the subscripts $d$ and $ref$ represent the desired and reference signals, respectively. Defining the formation error as $\tilde{\mathbf{r}}(t) = \mathbf{r}\_d(t) - \mathbf{r}(t)$ and taking its first time derivative, the following expression is obtained,

$$
\dot{\tilde{\mathbf{r}}} = \dot{\mathbf{r}}\_d - \dot{\mathbf{r}} \,. \tag{6}
$$


Consider the following candidate Lyapunov function,

$$V\left(\tilde{\mathbf{r}}\right) = \frac{1}{2}\tilde{\mathbf{r}}^{\mathbf{T}}\tilde{\mathbf{r}} > 0\,,$$

taking its first time derivative and replacing (6) and *ref* **<sup>F</sup>** *<sup>d</sup>* **r =J x** , assuming -by now- perfect velocity tracking, *i.e*., *ref* **r r** , one gets

$$\dot{V}\left(\tilde{\mathbf{r}}\right) = \tilde{\mathbf{r}}^{\mathrm{T}}\dot{\tilde{\mathbf{r}}} = \tilde{\mathbf{r}}^{\mathrm{T}}\left(\dot{\mathbf{r}}\_{d} - \mathbf{J}\_{\mathrm{F}}\dot{\mathbf{x}}\_{d}\right) \cdot \mathbf{r}$$

Now, the proposed formation control law is defined as

$$\dot{\mathbf{x}}_d = \mathbf{J}_{\mathbf{F}}^{-1}\left(\dot{\mathbf{r}}_d + \mathbf{K}_1\tanh\left(\mathbf{K}_2\tilde{\mathbf{r}}\right)\right) = \mathbf{J}_{\mathbf{F}}^{-1}\dot{\mathbf{r}}_{ref} \tag{7}$$

where $\mathbf{K}_1$ and $\mathbf{K}_2$ are diagonal positive definite gain matrices. Introducing (7) into the time derivative of $V(\tilde{\mathbf{r}})$, one obtains

$$\dot{V}\left(\tilde{\mathbf{r}}\right) = -\tilde{\mathbf{r}}^{T}\mathbf{K}_1\tanh\left(\mathbf{K}_2\tilde{\mathbf{r}}\right) < 0\,. \tag{8}$$

Thus, the equilibrium point is asymptotically stable, *i.e.*, $\tilde{\mathbf{r}}(t) \to \mathbf{0}$ asymptotically.

*Remark 7:* Equation (7) represents the desired reference velocity vector for each mobile manipulator's end-effector.

Now, relaxing the assumption of perfect velocity tracking, a difference $\boldsymbol{\delta}_{\dot{\mathbf{r}}}(t)$ between the desired and the real formation variations is considered, such that $\boldsymbol{\delta}_{\dot{\mathbf{r}}} = \dot{\mathbf{r}}_{ref} - \dot{\mathbf{r}}$. Then, (8) should be written as

$$\dot{V}\left(\tilde{\mathbf{r}}\right) = \tilde{\mathbf{r}}^{T}\boldsymbol{\delta}_{\dot{\mathbf{r}}} - \tilde{\mathbf{r}}^{T}\mathbf{K}_1\tanh\left(\mathbf{K}_2\tilde{\mathbf{r}}\right). \tag{9}$$

A sufficient condition for $\dot{V}(\tilde{\mathbf{r}})$ to be negative definite is,

$$\left|\tilde{\mathbf{r}}^{T}\mathbf{K}_1\tanh\left(\mathbf{K}_2\tilde{\mathbf{r}}\right)\right| > \left|\tilde{\mathbf{r}}^{T}\boldsymbol{\delta}_{\dot{\mathbf{r}}}\right|. \tag{10}$$

For large values of $\tilde{\mathbf{r}}$, the saturation of the tanh implies $\mathbf{K}_1\tanh(\mathbf{K}_2\tilde{\mathbf{r}}) \approx \mathbf{K}_1$; then $\dot{V}(\tilde{\mathbf{r}})$ will be negative definite only if $\left\|\boldsymbol{\delta}_{\dot{\mathbf{r}}}\right\| < \lambda_{\min}\left(\mathbf{K}_1\right)$, thus making the errors $\tilde{\mathbf{r}}$ decrease. Now, for small values of $\tilde{\mathbf{r}}$, it can be expressed $\mathbf{K}_1\tanh(\mathbf{K}_2\tilde{\mathbf{r}}) \approx \mathbf{K}_1\mathbf{K}_2\tilde{\mathbf{r}}$, and (10) can be written as,

$$\left\|\tilde{\mathbf{r}}\right\| \ge \frac{\left\|\boldsymbol{\delta}_{\dot{\mathbf{r}}}\right\|}{\lambda_{\min}\left(\mathbf{K}_1\right)\lambda_{\min}\left(\mathbf{K}_2\right)}$$

thus implying that the error $\tilde{\mathbf{r}}$ is bounded by,

$$\left\|\tilde{\mathbf{r}}\right\| \leq \frac{\left\|\boldsymbol{\delta}_{\dot{\mathbf{r}}}\right\|}{\zeta\,\lambda_{\min}\left(\mathbf{K}_1\right)\lambda_{\min}\left(\mathbf{K}_2\right)}\,; \qquad \text{with } 0 < \zeta < 1. \tag{11}$$

Hence, even with $\boldsymbol{\delta}_{\dot{\mathbf{r}}}(t) \neq \mathbf{0}$, the formation error $\tilde{\mathbf{r}}(t)$ is ultimately bounded by (11).
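The ultimate bound (11) can be checked on a scalar toy example by integrating the error dynamics $\dot{\tilde r} = \delta_{\dot r} - K_1\tanh(K_2\tilde r)$ under a constant velocity-tracking error; every number below is illustrative, not from the chapter:

```python
# Scalar sanity check of the ultimate bound (11); illustrative numbers only.
import math

K1, K2 = 1.0, 1.0
delta = 0.2        # constant velocity-tracking error ||delta_r_dot||
zeta = 0.9         # 0 < zeta < 1, as in (11)
r = 5.0            # large initial formation error
dt = 1e-3
# Forward-Euler integration of r_dot = delta - K1*tanh(K2*r) to steady state
for _ in range(int(60 / dt)):
    r += dt * (delta - K1 * math.tanh(K2 * r))

bound = delta / (zeta * K1 * K2)        # right-hand side of (11)
assert abs(r) <= bound                   # steady error sits inside the bound
```

The error settles at $\operatorname{atanh}(0.2) \approx 0.203$, comfortably inside the bound $0.2/0.9 \approx 0.222$.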

#### **6.2 Kinematic controller**

This layer receives the desired positions and velocities for each mobile manipulator, $\mathbf{x}_d = \begin{bmatrix} \mathbf{h}_{d1}^T & \mathbf{h}_{d2}^T & \cdots & \mathbf{h}_{di}^T & \cdots & \mathbf{h}_{dn}^T \end{bmatrix}^T$ and $\dot{\mathbf{x}}_d = \begin{bmatrix} \dot{\mathbf{h}}_{d1}^T & \dot{\mathbf{h}}_{d2}^T & \cdots & \dot{\mathbf{h}}_{di}^T & \cdots & \dot{\mathbf{h}}_{dn}^T \end{bmatrix}^T$, respectively, and it generates the desired kinematic velocities $\mathbf{v}_c = \begin{bmatrix} \mathbf{v}_{c1}^T & \mathbf{v}_{c2}^T & \cdots & \mathbf{v}_{ci}^T & \cdots & \mathbf{v}_{cn}^T \end{bmatrix}^T$ for all robots. In other words, the desired operational motion of the $n$ mobile manipulators is an application $\mathbf{x}_d(t)$, $t \in [t_0, t_f]$. Thus, the control problem is to find the maneuverability control vector $\mathbf{v}_c(t)$, $t \in [t_0, t_f]$, that achieves the desired operational motion (7). The corresponding evolution of the whole system is given by the actual generalized motion $\mathbf{q}(t)$, $t \in [t_0, t_f]$.

The design of the kinematic controller is based on the kinematic model of each mobile manipulator robot belonging to the work team. The kinematic model (1) of all the mobile manipulators can be represented by,

$$
\dot{\mathbf{x}}(t) = \mathbf{J}(\mathbf{q})\mathbf{v}(t)
$$

with

$$
\begin{split}
\dot{\mathbf{x}}(t) &= \begin{bmatrix} \dot{\mathbf{h}}_1(t) & \dot{\mathbf{h}}_2(t) & \cdots & \dot{\mathbf{h}}_i(t) & \cdots & \dot{\mathbf{h}}_n(t) \end{bmatrix}^T \in \mathfrak{R}^{3n}, \\
\mathbf{v}(t) &= \begin{bmatrix} \mathbf{v}_1(t) & \mathbf{v}_2(t) & \cdots & \mathbf{v}_i(t) & \cdots & \mathbf{v}_n(t) \end{bmatrix}^T \in \mathfrak{R}^{\delta_n}, \\
\mathbf{q}(t) &= \begin{bmatrix} \mathbf{q}_1(t) & \mathbf{q}_2(t) & \cdots & \mathbf{q}_i(t) & \cdots & \mathbf{q}_n(t) \end{bmatrix}^T \in \mathfrak{R}^{n'}, \text{ and finally} \\
\mathbf{J}(\mathbf{q}) &= \operatorname{diag}\left( \mathbf{J}_1(\mathbf{q}_1), \mathbf{J}_2(\mathbf{q}_2), \ldots, \mathbf{J}_i(\mathbf{q}_i), \ldots, \mathbf{J}_n(\mathbf{q}_n) \right) \in \mathfrak{R}^{3n \times \delta_n}.
\end{split}
$$

where $n'$ represents the dimension of the generalized spaces associated to the robotic arms and to the mobile platforms of all the mobile manipulators, *i.e.*, $n' = n'_1 + n'_2 + \cdots + n'_i + \cdots + n'_n$, and $\delta_n$ is the total dimension of the maneuverability vectors (see *Remark 2*).
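Numerically, the team-level model above is just a block-diagonal stacking of the per-robot models; the sketch below uses random placeholder Jacobians, standing in for the actual $\mathbf{J}_i(\mathbf{q}_i)$ of model (1):

```python
# Block-diagonal assembly of the team Jacobian; per-robot Jacobians here are
# random placeholders, not the chapter's kinematic model (1).
import numpy as np

rng = np.random.default_rng(0)
n = 3                        # robots in the team
dims = [5, 5, 6]             # illustrative per-robot velocity dimensions
J_blocks = [rng.standard_normal((3, d)) for d in dims]  # J_i in R^{3 x dim_i}
v = [rng.standard_normal(d) for d in dims]              # per-robot velocities

# Assemble J(q) in R^{3n x sum(dims)} with one block per robot
J = np.zeros((3 * n, sum(dims)))
col = 0
for i, J_i in enumerate(J_blocks):
    J[3 * i:3 * i + 3, col:col + dims[i]] = J_i
    col += dims[i]

x_dot = J @ np.concatenate(v)   # stacked end-effector velocities
# Block structure: each end-effector motion depends only on its own robot's v_i
for i in range(n):
    assert np.allclose(x_dot[3 * i:3 * i + 3], J_blocks[i] @ v[i])
```

The block-diagonal structure is what lets the kinematic controller be computed separately for each robot, as stated next.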

It is worth noting that the kinematic controller is performed separately for each robot. Hence, to obtain the maneuverability vector $\mathbf{v}_i(t)$ corresponding to the *i*-th mobile manipulator, the right pseudo-inverse of the Jacobian matrix, $\mathbf{J}_i^{\#}(\mathbf{q}_i) = \mathbf{J}_i^T\left(\mathbf{J}_i\mathbf{J}_i^T\right)^{-1}$, is used

$$\mathbf{v}\_i = \mathbf{J}\_i^\# \dot{\mathbf{h}}\_i \tag{12}$$
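A quick numeric check of (12), with a random placeholder standing in for $\mathbf{J}_i(\mathbf{q}_i)$: the right pseudo-inverse picks the minimum-norm velocities that exactly reproduce the commanded end-effector velocity.

```python
# Right pseudo-inverse of (12) on a placeholder 3x5 Jacobian (a redundant
# robot: 5 velocity components, 3-D end-effector task).
import numpy as np

rng = np.random.default_rng(1)
J_i = rng.standard_normal((3, 5))        # stand-in for J_i(q_i)
h_dot_i = np.array([0.1, -0.2, 0.05])    # commanded end-effector velocity

J_pinv = J_i.T @ np.linalg.inv(J_i @ J_i.T)   # J# = J^T (J J^T)^{-1}
v_i = J_pinv @ h_dot_i

assert np.allclose(J_i @ v_i, h_dot_i)   # reproduces the command exactly
```

Since $\mathbf{J}_i$ is wide, infinitely many $\mathbf{v}_i$ satisfy $\mathbf{J}_i\mathbf{v}_i = \dot{\mathbf{h}}_i$; the right pseudo-inverse returns the one of minimum Euclidean norm, matching `np.linalg.pinv`.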


