**Meet the editor**

Dr. Hassan A. Yasser received his B.Sc. and M.Sc. from the College of Science, Baghdad University, Baghdad, Iraq in 1989 and 1995, respectively. He received his Ph.D. from the College of Science, Basrah University, Basrah, Iraq in 2005. His current research interests include digital image processing, mathematical physics, and optical communication. He is currently a professor of mathematical physics at the College of Science, Thi-Qar University, Iraq.

## Contents

#### **Preface XI**


#### Chapter 11 **Efficient Model Transition in Adaptive Multi-Resolution Modeling of Biopolymers 237**  Mohammad Poursina, Imad M. Khan and Kurt S. Anderson

## Preface

The core of linear algebra is essential to every mathematician, and we not only treat this core, but add material that is essential to mathematicians in specific fields. This book is for advanced researchers. We presume you are already familiar with elementary linear algebra and that you know how to multiply matrices, solve linear systems, etc. We do not treat elementary material here, though we occasionally return to elementary material from a more advanced standpoint to show you what it really means. We have written a book that we hope will be broadly useful. In a few places we have succumbed to temptation and included material that is not quite so well known, but which, in our opinion, should be. We hope that you will be enlightened not only by the specific material in the book but also by its style of argument. We also hope this book will serve as a valuable reference throughout your mathematical career.

Chapter 1 reviews the metric Hermitian 3-algebra, which has recently been playing important roles in string theory. It is classified by using a correspondence to a class of super Lie algebras. It also reviews the Lie and Hermitian 3-algebra models of M-theory. Chapter 2 deals with algebraic analysis of Appell polynomials. It presents the determinantal approaches to Appell polynomials and related topics, where many classical and non-classical examples are presented. Chapter 3 reviews a universal relation between combinatorics and the matrix model, and discusses its relation to gauge theory. Chapter 4 covers nonnegative matrices, which have been a source of interesting and challenging mathematical problems. They arise in many applications, such as communications systems, biological systems, economics, ecology, computer sciences, machine learning, and many other engineering systems. Chapter 5 presents the central theory behind realization-based system identification and connects the theory to many tools in linear algebra, including the QR decomposition, the singular value decomposition, and linear least-squares problems. Chapter 6 presents a novel iterative-recursive algorithm for computing the GI of block matrices in the context of wireless MIMO communication systems within RFC. Chapter 7 deals with the development of the theory of operator means. It sets up basic notation and states some background about operator monotone functions, which play important roles in the theory of operator means. Chapter 8 studies a general formulation of Jensen's operator inequality for a continuous field of self-adjoint operators and a field of positive linear mappings. The aim of chapter 9 is to present systems of linear equations and inequalities in max-algebra. Max-algebra is an analogue of linear algebra developed on a pair of operations extended to matrices and vectors.
Chapter 10 covers an efficient algorithm for the coarse-to-fine scale transition in multi-flexible-body systems, with application to biomolecular systems that are modeled as articulated bodies and undergo discontinuous changes in the model definition. Finally, chapter 11 studies the structure of matrices defined over arbitrary fields whose elements are rational functions with no poles at infinity and prescribed finite poles. Complete systems of invariants are provided for each one of these equivalence relations, and the relationship between both systems of invariants is clarified. This result can be seen as an extension of the classical theorem on pole assignment by Rosenbrock.

**Dr. Hassan Abid Yasser**

College of Science, University of Thi-Qar, Thi-Qar, Iraq

**Chapter 1**

## **3-Algebras in String Theory**

Matsuo Sato

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/46480

## **1. Introduction**

In this chapter, we review 3-algebras that appear as fundamental properties of string theory. A 3-algebra is a generalization of a Lie algebra; it is defined by a tri-linear bracket instead of a bi-linear bracket, and satisfies the fundamental identity, which is a generalization of the Jacobi identity [1–3]. We consider 3-algebras equipped with invariant metrics in order to apply them to physics.

It has been expected that there exists M-theory, which unifies string theories. Some structures of 3-algebras were recently found in M-theory. First, it was found that by using the *u*(*N*) ⊕ *u*(*N*) Hermitian 3-algebra, we can describe a low energy effective action of *N* coincident supermembranes [4–8], which are fundamental objects in M-theory.

With this as motivation, 3-algebras with invariant metrics were classified [9–22]. Lie 3-algebras are defined in real vector spaces and their tri-linear brackets are totally anti-symmetric in all three entries. Lie 3-algebras with invariant metrics are classified into the *A*<sub>4</sub> algebra and Lorentzian Lie 3-algebras, which have metrics with indefinite signatures. On the other hand, Hermitian 3-algebras are defined in Hermitian vector spaces and their tri-linear brackets are complex linear and anti-symmetric in the first two entries, whereas complex anti-linear in the third entry. Hermitian 3-algebras with invariant metrics are classified into the *u*(*N*) ⊕ *u*(*M*) and *sp*(2*N*) ⊕ *u*(1) Hermitian 3-algebras.

Moreover, recent studies have indicated that there also exist structures of 3-algebras in the Green-Schwarz supermembrane action, which defines the full perturbative dynamics of a supermembrane. It had not been clear whether the total supermembrane action including fermions has structures of 3-algebras, whereas the bosonic part of the action can be described by using a tri-linear bracket, called the Nambu bracket [23, 24], which is a generalization of the Poisson bracket. If we fix to a light-cone gauge, the total action can be described by using the Poisson bracket; that is, only structures of Lie algebra are left in this gauge [25]. However, it was shown under an approximation that the total action can be described by the Nambu bracket if we fix to a semi-light-cone gauge [26]. In this gauge, the eleven-dimensional space-time of M-theory is manifest in the supermembrane action, whereas only the ten-dimensional part is manifest in the light-cone gauge.

©2012 Sato, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The BFSS matrix theory is conjectured to describe an infinite momentum frame (IMF) limit of M-theory [27], and much evidence was found. The action of the BFSS matrix theory can be obtained by replacing the Poisson bracket with a finite-dimensional Lie algebra's bracket in the supermembrane action in the light-cone gauge. Because of this structure, only variables that represent the ten-dimensional part of the eleven-dimensional space-time are manifest in the BFSS matrix theory. Recently, 3-algebra models of M-theory were proposed [26, 28, 29], by replacing the Nambu bracket with finite-dimensional 3-algebras' brackets in an action that is shown, by using an approximation, to be equivalent to the semi-light-cone supermembrane action. All the variables that represent the eleven-dimensional space-time are manifest in these models. It was shown that if the DLCQ limit of the 3-algebra models of M-theory is taken, they reduce to the BFSS matrix theory [26, 28], as they should [30–35].

### **2. Definition and classification of metric Hermitian 3-algebra**

In this section, we will define and classify the Hermitian 3-algebras equipped with invariant metrics.

#### **2.1. General structure of metric Hermitian 3-algebra**

The metric Hermitian 3-algebra is a map *V* × *V* × *V* → *V* defined by (*x*, *y*, *z*) ↦ [*x*, *y*; *z*], where the 3-bracket is complex linear in the first two entries, whereas complex anti-linear in the last entry, equipped with a metric < *x*, *y* >, satisfying the following properties: the fundamental identity

$$[[\mathbf{x}, \mathbf{y}; \mathbf{z}], \mathbf{v}; \mathbf{w}] = [[\mathbf{x}, \mathbf{v}; \mathbf{w}], \mathbf{y}; \mathbf{z}] + [\mathbf{x}, [\mathbf{y}, \mathbf{v}; \mathbf{w}]; \mathbf{z}] - [\mathbf{x}, \mathbf{y}; [\mathbf{z}, \mathbf{w}; \mathbf{v}]] \tag{1}$$

the metric invariance

$$\langle [x, v; w], y \rangle - \langle x, [y, w; v] \rangle = 0 \tag{2}$$

and the anti-symmetry

$$[\mathbf{x}, \mathbf{y}; \mathbf{z}] = -[\mathbf{y}, \mathbf{x}; \mathbf{z}] \tag{3}$$

for

$$x, y, z, v, w \in V \tag{4}$$

The Hermitian 3-algebra generates a symmetry, whose generators *D*(*x*, *y*) are defined by

$$D(\mathbf{x}, \mathbf{y})z := [z, \mathbf{x}; \mathbf{y}] \tag{5}$$

From (1), one can show that *D*(*x*, *y*) form a Lie algebra,

$$[D(x, y), D(v, w)] = D(D(x, y)v, w) - D(v, D(y, x)w) \tag{6}$$

There is a one-to-one correspondence between the metric Hermitian 3-algebra and a class of metric complex super Lie algebras [19]. Such a class satisfies the following conditions among complex super Lie algebras *S* = *S*<sub>0</sub> ⊕ *S*<sub>1</sub>, where *S*<sub>0</sub> and *S*<sub>1</sub> are the even and odd parts, respectively. *S*<sub>1</sub> is decomposed as *S*<sub>1</sub> = *V* ⊕ *V*¯, where *V* is a unitary representation of *S*<sub>0</sub>: for *a* ∈ *S*<sub>0</sub>, *u*, *v* ∈ *V*,

$$[a, u] \in V \tag{7}$$

and


$$\langle [a, u], v \rangle + \langle u, [a^*, v] \rangle = 0 \tag{8}$$

*v*¯ ∈ *V*¯ is defined by the conjugation

$$v \mapsto \bar{v} \tag{9}$$

The super Lie bracket satisfies

$$[V,V] = 0, \quad [\bar{V}, \bar{V}] = 0 \tag{10}$$

From the metric Hermitian 3-algebra, we obtain the class of the metric complex super Lie algebra in the following way. The elements in *S*<sub>0</sub>, *V*, and *V*¯ are defined by (5), (4), and (9), respectively. The algebra is defined by (6) and

$$\begin{aligned} [D(x, y), z] &:= D(x, y)z = [z, x; y] \\ [D(x, y), \bar{z}] &:= -\overline{D(y, x)z} = -\overline{[z, y; x]} \\ [x, \bar{y}] &:= D(x, y) \\ [x, y] &:= 0 \\ [\bar{x}, \bar{y}] &:= 0 \end{aligned} \tag{11}$$

One can show that this algebra satisfies the super Jacobi identity and (7)-(10) as in [19].

Inversely, from the class of the metric complex super Lie algebra, we obtain the metric Hermitian 3-algebra by

$$[x, y; z] := \alpha [[y, \bar{z}], x] \tag{12}$$

where *α* is an arbitrary constant. One can also show that this algebra satisfies (1)-(3) for (4) as in [19].

#### **2.2. Classification of metric Hermitian 3-algebra**

The classical super Lie algebras satisfying (7)-(10) are *A*(*m* − 1, *n* − 1) and *C*(*n* + 1). The even parts of *A*(*m* − 1, *n* − 1) and *C*(*n* + 1) are *u*(*m*) ⊕ *u*(*n*) and *sp*(2*n*) ⊕ *u*(1), respectively. Because the metric Hermitian 3-algebra corresponds one-to-one to this class of super Lie algebras, the metric Hermitian 3-algebras are classified into the *u*(*m*) ⊕ *u*(*n*) and *sp*(2*n*) ⊕ *u*(1) Hermitian 3-algebras.

First, we will construct the *u*(*m*) ⊕ *u*(*n*) Hermitian 3-algebra from *A*(*m* − 1, *n* − 1), according to the relation in the previous subsection. *A*(*m* − 1, *n* − 1) is simple and is obtained by dividing *sl*(*m*, *n*) by its ideal. That is, *A*(*m* − 1, *n* − 1) = *sl*(*m*, *n*) when *m* ≠ *n* and *A*(*n* − 1, *n* − 1) = *sl*(*n*, *n*)/*λ*1<sub>2*n*</sub>.

Real *sl*(*m*, *n*) is defined by

$$\begin{pmatrix} h_1 & c \\ i c^\dagger & h_2 \end{pmatrix} \tag{13}$$

where *h*<sub>1</sub> and *h*<sub>2</sub> are *m* × *m* and *n* × *n* anti-Hermitian matrices and *c* is an arbitrary *n* × *m* complex matrix. Complex *sl*(*m*, *n*) is a complexification of real *sl*(*m*, *n*), given by

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \tag{14}$$


where *α*, *β*, *γ*, and *δ* are *m* × *m*, *n* × *m*, *m* × *n*, and *n* × *n* complex matrices that satisfy

$$\mathrm{tr}\,\alpha = \mathrm{tr}\,\delta \tag{15}$$

Complex *A*(*m* − 1, *n* − 1) is decomposed as *A*(*m* − 1, *n* − 1) = *S*<sub>0</sub> ⊕ *V* ⊕ *V*¯, where

$$\begin{aligned} \begin{pmatrix} \alpha & 0\\ 0 & \delta \end{pmatrix} & \in \mathcal{S}\_0\\ \begin{pmatrix} 0 & \beta\\ 0 & 0 \end{pmatrix} & \in \mathcal{V} \\ \begin{pmatrix} 0 & 0\\ \gamma & 0 \end{pmatrix} & \in \bar{\mathcal{V}} \end{aligned} \tag{16}$$

(9) is rewritten as the map *V* → *V*¯ defined by

$$B = \begin{pmatrix} 0 & \beta \\ 0 & 0 \end{pmatrix} \mapsto B^\dagger = \begin{pmatrix} 0 & 0 \\ \beta^\dagger & 0 \end{pmatrix} \tag{17}$$

where *B* ∈ *V* and *B*<sup>†</sup> ∈ *V*¯. (12) is rewritten as

$$[X, Y; Z] = \alpha [[Y, Z^\dagger], X] = \alpha \begin{pmatrix} 0 & yz^\dagger x - xz^\dagger y \\ 0 & 0 \end{pmatrix} \tag{18}$$

for

$$X = \begin{pmatrix} 0 & x \\ 0 & 0 \end{pmatrix} \in V, \quad Y = \begin{pmatrix} 0 & y \\ 0 & 0 \end{pmatrix} \in V, \quad Z = \begin{pmatrix} 0 & z \\ 0 & 0 \end{pmatrix} \in V \tag{19}$$

As a result, we obtain the *u*(*m*) ⊕ *u*(*n*) Hermitian 3-algebra,

$$[x, y; z] = \alpha (yz^\dagger x - xz^\dagger y) \tag{20}$$

where *x*, *y*, and *z* are arbitrary *n* × *m* complex matrices. This algebra was originally constructed in [8].
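Since the bracket (20) is completely explicit, its defining properties can be spot-checked numerically on random matrices. The following NumPy sketch verifies the anti-symmetry (3), the fundamental identity (1), and the metric invariance (2); the metric tr(*x*<sup>†</sup>*y*) is our own assumption of a natural invariant metric (with real *α*), not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, alpha = 2, 3, 1.0

def rand_c():
    # random n x m complex matrix, an element of the 3-algebra
    return rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))

def bra(x, y, z):
    # the u(m) (+) u(n) Hermitian 3-bracket of eq. (20): alpha (y z^+ x - x z^+ y)
    return alpha * (y @ z.conj().T @ x - x @ z.conj().T @ y)

def metric(x, y):
    # <x, y> = tr(x^+ y): an assumed invariant metric for the check
    return np.trace(x.conj().T @ y)

x, y, z, v, w = (rand_c() for _ in range(5))

# anti-symmetry in the first two entries, eq. (3)
assert np.allclose(bra(x, y, z), -bra(y, x, z))

# fundamental identity, eq. (1)
lhs = bra(bra(x, y, z), v, w)
rhs = bra(bra(x, v, w), y, z) + bra(x, bra(y, v, w), z) - bra(x, y, bra(z, w, v))
assert np.allclose(lhs, rhs)

# metric invariance, eq. (2)
assert np.isclose(metric(bra(x, v, w), y) - metric(x, bra(y, w, v)), 0)
```

The identities hold exactly (to floating-point precision) for any choice of random matrices, which is what the classification argument requires.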

Inversely, from (20), we can construct complex *A*(*m* − 1, *n* − 1). (5) is rewritten as

$$D(x, y) = (xy^\dagger, y^\dagger x) \in S_0 \tag{21}$$

(6) and (11) are rewritten as

$$\begin{aligned} [(xy^\dagger, y^\dagger x), (x'y'^\dagger, y'^\dagger x')] &= ([xy^\dagger, x'y'^\dagger], [y^\dagger x, y'^\dagger x']) \\ [(xy^\dagger, y^\dagger x), z] &= xy^\dagger z - zy^\dagger x \\ [(xy^\dagger, y^\dagger x), w^\dagger] &= y^\dagger x w^\dagger - w^\dagger x y^\dagger \\ [x, y^\dagger] &= (xy^\dagger, y^\dagger x) \\ [x, y] &= 0 \\ [x^\dagger, y^\dagger] &= 0 \end{aligned} \tag{22}$$

This algebra is summarized as


$$\left[ \begin{pmatrix} xy^\dagger & z \\ w^\dagger & y^\dagger x \end{pmatrix}, \begin{pmatrix} x'y'^\dagger & z' \\ w'^\dagger & y'^\dagger x' \end{pmatrix} \right] \tag{23}$$

which forms complex *A*(*m* − 1, *n* − 1).
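In this realization the generators *D*(*x*, *y*) act by the second line of (22), so the closure relation (6) can also be checked numerically. A minimal NumPy sketch (with *α* = 1, our choice for the check):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 2, 3

def rand_c():
    # random n x m complex matrix, an element of V in the realization (20)
    return rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))

def D(x, y):
    # generator D(x, y): z |-> [z, x; y] = x y^+ z - z y^+ x  (eq. (22), alpha = 1)
    return lambda z: x @ y.conj().T @ z - z @ y.conj().T @ x

x, y, v, w, z = (rand_c() for _ in range(5))

# eq. (6): [D(x, y), D(v, w)] = D(D(x, y)v, w) - D(v, D(y, x)w), applied to z
lhs = D(x, y)(D(v, w)(z)) - D(v, w)(D(x, y)(z))
rhs = D(D(x, y)(v), w)(z) - D(v, D(y, x)(w))(z)
assert np.allclose(lhs, rhs)
```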

Next, we will construct the *sp*(2*n*) ⊕ *u*(1) Hermitian 3-algebra from *C*(*n* + 1). Complex *C*(*n* + 1) is decomposed as *C*(*n* + 1) = *S*<sub>0</sub> ⊕ *V* ⊕ *V*¯. The elements are given by

$$\begin{pmatrix} \alpha & 0 & 0 & 0 \\ 0 & -\alpha & 0 & 0 \\ 0 & 0 & a & b \\ 0 & 0 & c & -a^T \end{pmatrix} \in S_0, \quad \begin{pmatrix} 0 & 0 & x_1 & x_2 \\ 0 & 0 & 0 & 0 \\ 0 & x_2^T & 0 & 0 \\ 0 & -x_1^T & 0 & 0 \end{pmatrix} \in V, \quad \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & y_1 & y_2 \\ y_2^T & 0 & 0 & 0 \\ -y_1^T & 0 & 0 & 0 \end{pmatrix} \in \bar{V} \tag{24}$$

where *α* is a complex number, *a* is an arbitrary *n* × *n* complex matrix, *b* and *c* are *n* × *n* complex symmetric matrices, and *x*<sub>1</sub>, *x*<sub>2</sub>, *y*<sub>1</sub> and *y*<sub>2</sub> are *n* × 1 complex matrices. (9) is rewritten as the map *V* → *V*¯ defined by *B* ↦ *B*¯ = *UB*\**U*<sup>−1</sup>, where *B* ∈ *V*, *B*¯ ∈ *V*¯ and

$$U = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & \mathbf{1} \\ 0 & 0 & -\mathbf{1} & 0 \end{pmatrix} \tag{25}$$

Explicitly,

$$B = \begin{pmatrix} 0 & 0 & x_1 & x_2 \\ 0 & 0 & 0 & 0 \\ 0 & x_2^T & 0 & 0 \\ 0 & -x_1^T & 0 & 0 \end{pmatrix} \mapsto \bar{B} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & x_2^* & -x_1^* \\ -x_1^\dagger & 0 & 0 & 0 \\ -x_2^\dagger & 0 & 0 & 0 \end{pmatrix} \tag{26}$$
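The map (26) can be verified directly by assembling *B* and *U* as block matrices and computing *UB*\**U*<sup>−1</sup>. In the sketch below *x*<sub>1</sub> and *x*<sub>2</sub> are taken as 1 × *n* rows so that the displayed block shapes (of sizes 1, 1, *n*, *n*) fit together; that orientation is our assumption for the check:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3

def row():
    # random 1 x n complex row vector
    return rng.standard_normal((1, n)) + 1j * rng.standard_normal((1, n))

x1, x2 = row(), row()
o11, o1n, on1, Onn = np.zeros((1, 1)), np.zeros((1, n)), np.zeros((n, 1)), np.zeros((n, n))
one, I = np.ones((1, 1)), np.eye(n)

# B in V, following the block pattern displayed in (26)
B = np.block([[o11, o11, x1, x2],
              [o11, o11, o1n, o1n],
              [on1, x2.T, Onn, Onn],
              [on1, -x1.T, Onn, Onn]])

# U of eq. (25), with 1 the n x n identity in the lower blocks
U = np.block([[o11, one, o1n, o1n],
              [one, o11, o1n, o1n],
              [on1, on1, Onn, I],
              [on1, on1, -I, Onn]])

Bbar = U @ B.conj() @ np.linalg.inv(U)

# the stated image in Vbar, right-hand side of (26)
expected = np.block([[o11, o11, o1n, o1n],
                     [o11, o11, x2.conj(), -x1.conj()],
                     [-x1.conj().T, on1, Onn, Onn],
                     [-x2.conj().T, on1, Onn, Onn]])
assert np.allclose(Bbar, expected)
```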

(12) is rewritten as

$$\begin{aligned} [X, Y; Z] &:= \alpha [[Y, \bar{Z}], X] \\ &= \alpha \left[ \left[ \begin{pmatrix} 0 & 0 & y_1 & y_2 \\ 0 & 0 & 0 & 0 \\ 0 & y_2^T & 0 & 0 \\ 0 & -y_1^T & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & z_2^* & -z_1^* \\ -z_1^\dagger & 0 & 0 & 0 \\ -z_2^\dagger & 0 & 0 & 0 \end{pmatrix} \right], \begin{pmatrix} 0 & 0 & x_1 & x_2 \\ 0 & 0 & 0 & 0 \\ 0 & x_2^T & 0 & 0 \\ 0 & -x_1^T & 0 & 0 \end{pmatrix} \right] \\ &= \alpha \begin{pmatrix} 0 & 0 & w_1 & w_2 \\ 0 & 0 & 0 & 0 \\ 0 & w_2^T & 0 & 0 \\ 0 & -w_1^T & 0 & 0 \end{pmatrix} \end{aligned} \tag{27}$$

#### 6 Will-be-set-by-IN-TECH 6 Linear Algebra – Theorems and Applications

for

$$X = \begin{pmatrix} 0 & 0 & x_1 & x_2 \\ 0 & 0 & 0 & 0 \\ 0 & x_2^T & 0 & 0 \\ 0 & -x_1^T & 0 & 0 \end{pmatrix} \in V, \quad Y = \begin{pmatrix} 0 & 0 & y_1 & y_2 \\ 0 & 0 & 0 & 0 \\ 0 & y_2^T & 0 & 0 \\ 0 & -y_1^T & 0 & 0 \end{pmatrix} \in V, \quad Z = \begin{pmatrix} 0 & 0 & z_1 & z_2 \\ 0 & 0 & 0 & 0 \\ 0 & z_2^T & 0 & 0 \\ 0 & -z_1^T & 0 & 0 \end{pmatrix} \in V \tag{28}$$

where *w*<sup>1</sup> and *w*<sup>2</sup> are given by

$$(w_1, w_2) = -(y_1 z_1^\dagger + y_2 z_2^\dagger)(x_1, x_2) + (x_1 z_1^\dagger + x_2 z_2^\dagger)(y_1, y_2) + (x_2 y_1^T - x_1 y_2^T)(z_2^*, -z_1^*) \tag{29}$$

As a result, we obtain the *sp*(2*n*) ⊕ *u*(1) Hermitian 3-algebra,

$$[x, y; z] = \alpha \left( (y \odot \tilde{z})x + (\tilde{z} \odot x)y - (x \odot y)\tilde{z} \right) \tag{30}$$

for *x* = (*x*1, *x*2), *y* = (*y*1, *y*2), *z* = (*z*1, *z*2), where *x*1, *x*2, *y*1, *y*2, *z*1, and *z*<sup>2</sup> are n-vectors and

$$\begin{aligned} \tilde{z} &= (z_2^*, -z_1^*) \\ a \odot b &= a_1 \cdot b_2 - a_2 \cdot b_1 \end{aligned} \tag{31}$$
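The bracket (30) can be cross-checked against the component formula (29), together with the anti-symmetry (3). A short NumPy sketch, with *α* = 1 (our choice for the check) and each element stored as a pair of complex *n*-vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha = 4, 1.0

def pair():
    # x = (x1, x2): a pair of complex n-vectors, stored as a (2, n) array
    return rng.standard_normal((2, n)) + 1j * rng.standard_normal((2, n))

def tilde(z):
    # z~ = (z2^*, -z1^*), eq. (31)
    return np.stack([z[1].conj(), -z[0].conj()])

def odot(a, b):
    # a (.) b = a1 . b2 - a2 . b1, eq. (31)
    return a[0] @ b[1] - a[1] @ b[0]

def bra(x, y, z):
    # the sp(2n) (+) u(1) Hermitian 3-bracket, eq. (30)
    zt = tilde(z)
    return alpha * (odot(y, zt) * x + odot(zt, x) * y - odot(x, y) * zt)

x, y, z = pair(), pair(), pair()

# anti-symmetry in the first two entries, eq. (3)
assert np.allclose(bra(x, y, z), -bra(y, x, z))

# agreement with the component form (w1, w2) of eq. (29)
w = (-(y[0] @ z[0].conj() + y[1] @ z[1].conj()) * x
     + (x[0] @ z[0].conj() + x[1] @ z[1].conj()) * y
     + (x[1] @ y[0] - x[0] @ y[1]) * tilde(z))
assert np.allclose(bra(x, y, z), alpha * w)
```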

#### **3. 3-algebra model of M-theory**

In this section, we review the fact that the supermembrane action in a semi-light-cone gauge can be described by the Nambu bracket, where structures of 3-algebra are manifest. The 3-algebra models of M-theory are defined based on the semi-light-cone supermembrane action. We also review that the models reduce to the BFSS matrix theory in the DLCQ limit.

#### **3.1. Supermembrane and 3-algebra model of M-theory**

The fundamental degrees of freedom in M-theory are supermembranes. The covariant supermembrane action in M-theory [36] is given by

$$S_{M2} = \int d^3\sigma \left( \sqrt{-G} + \frac{i}{4}\epsilon^{\alpha\beta\gamma}\bar{\Psi}\Gamma_{MN}\partial_\alpha\Psi \left( \Pi_\beta{}^M \Pi_\gamma{}^N + \frac{i}{2}\Pi_\beta{}^M \bar{\Psi}\Gamma^N \partial_\gamma\Psi - \frac{1}{12}\bar{\Psi}\Gamma^M\partial_\beta\Psi\,\bar{\Psi}\Gamma^N\partial_\gamma\Psi \right) \right) \tag{32}$$

where *M*, *N* = 0, ⋯, 10, *α*, *β*, *γ* = 0, 1, 2, *G*<sub>αβ</sub> = Π<sub>α</sub><sup>M</sup> Π<sub>βM</sub> and Π<sub>α</sub><sup>M</sup> = *∂*<sub>α</sub>*X*<sup>M</sup> − (*i*/2)Ψ̄Γ<sup>M</sup>*∂*<sub>α</sub>Ψ. Ψ is a *SO*(1, 10) Majorana fermion.

This action is invariant under dynamical supertransformations,

$$\delta \Psi = \epsilon, \qquad \delta X^M = -i \bar{\Psi} \Gamma^M \epsilon \tag{33}$$

These transformations form the N = 1 supersymmetry algebra in eleven dimensions,

$$[\delta_1, \delta_2] X^M = -2i\bar{\epsilon}_1 \Gamma^M \epsilon_2, \qquad [\delta_1, \delta_2] \Psi = 0 \tag{34}$$

The action is also invariant under the *κ*-symmetry transformations,

$$
\begin{aligned}
\delta \Psi &= (1 + \Gamma) \kappa(\sigma) \\
\delta X^M &= i \bar{\Psi} \Gamma^M (1 + \Gamma) \kappa(\sigma)
\end{aligned}
\tag{35}
$$

where


$$\Gamma = \frac{1}{3!\sqrt{-G}} \epsilon^{\alpha\beta\gamma} \Pi_\alpha{}^L \Pi_\beta{}^M \Pi_\gamma{}^N \Gamma_{LMN} \tag{36}$$

If we fix the *κ*-symmetry (35) of the action by taking a semi-light-cone gauge [26]<sup>1</sup>

$$
\Gamma^{012}\Psi = -\Psi\tag{37}
$$

we obtain a semi-light-cone supermembrane action,

$$\begin{split} S_{M2} = \int d^3\sigma \Big( \sqrt{-G} + \frac{i}{4}\epsilon^{\alpha\beta\gamma} \big( \bar{\Psi}\Gamma_{\mu\nu}\partial_\alpha\Psi \left( \Pi_\beta{}^\mu \Pi_\gamma{}^\nu + \frac{i}{2}\Pi_\beta{}^\mu \bar{\Psi}\Gamma^\nu \partial_\gamma\Psi - \frac{1}{12}\bar{\Psi}\Gamma^\mu\partial_\beta\Psi\,\bar{\Psi}\Gamma^\nu\partial_\gamma\Psi \right) \\ + \bar{\Psi}\Gamma_{IJ}\partial_\alpha\Psi\,\partial_\beta X^I \partial_\gamma X^J \big) \Big) \end{split} \tag{38}$$

where *G*<sub>αβ</sub> = *h*<sub>αβ</sub> + Π<sub>α</sub><sup>μ</sup> Π<sub>βμ</sub>, Π<sub>α</sub><sup>μ</sup> = *∂*<sub>α</sub>*X*<sup>μ</sup> − (*i*/2)Ψ̄Γ<sup>μ</sup>*∂*<sub>α</sub>Ψ, and *h*<sub>αβ</sub> = *∂*<sub>α</sub>*X*<sup>I</sup>*∂*<sub>β</sub>*X*<sub>I</sub>.

In [26], it is shown under an approximation up to the quadratic order in *∂αX<sup>μ</sup>* and *∂α*Ψ but exactly in *X<sup>I</sup>* , that this action is equivalent to the continuum action of the 3-algebra model of M-theory,

$$\begin{split} S_{cl} = \int d^3\sigma \sqrt{-g} \Big( & -\frac{1}{12}\{X^I, X^J, X^K\}^2 - \frac{1}{2}(A_{\mu ab}\{\varphi^a, \varphi^b, X^I\})^2 \\ & -\frac{1}{3}E^{\mu\nu\lambda}A_{\mu ab}A_{\nu cd}A_{\lambda ef}\{\varphi^a, \varphi^c, \varphi^d\}\{\varphi^b, \varphi^e, \varphi^f\} + \frac{1}{2}\Lambda \\ & -\frac{i}{2}\bar{\Psi}\Gamma^\mu A_{\mu ab}\{\varphi^a, \varphi^b, \Psi\} + \frac{i}{4}\bar{\Psi}\Gamma_{IJ}\{X^I, X^J, \Psi\} \Big) \end{split} \tag{39}$$

where *I*, *J*, *K* = 3, ⋯, 10 and {*ϕ*<sup>a</sup>, *ϕ*<sup>b</sup>, *ϕ*<sup>c</sup>} = *ε*<sup>αβγ</sup>*∂*<sub>α</sub>*ϕ*<sup>a</sup>*∂*<sub>β</sub>*ϕ*<sup>b</sup>*∂*<sub>γ</sub>*ϕ*<sup>c</sup> is the Nambu-Poisson bracket. An invariant symmetric bilinear form is defined by ∫*d*<sup>3</sup>*σ* √−*g* *ϕ*<sup>a</sup>*ϕ*<sup>b</sup> for a complete basis *ϕ*<sup>a</sup> in three dimensions. Thus, this action is manifestly VPD covariant even when the world-volume metric is flat. *X*<sup>I</sup> is a scalar and Ψ is a *SO*(1, 2) × *SO*(8) Majorana-Weyl fermion

<sup>1</sup> Advantages of a semi-light-cone gauge over a light-cone gauge are shown in [37–39].


satisfying (37). *E*<sup>μνλ</sup> is the Levi-Civita symbol in three dimensions and Λ is a cosmological constant.
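Since the Nambu-Poisson bracket defined above is just the Jacobian determinant of its three entries with respect to (*σ*<sup>0</sup>, *σ*<sup>1</sup>, *σ*<sup>2</sup>), its total anti-symmetry is easy to exhibit numerically. A small NumPy sketch with hand-chosen test functions (our own choice, purely illustrative):

```python
import numpy as np

# Three world-volume scalars, each supplied with its analytic gradient so the
# bracket can be evaluated exactly at a point s = (sigma^0, sigma^1, sigma^2).
def grad_a(s):  # a = (sigma^0)^2 sigma^1
    return np.array([2 * s[0] * s[1], s[0] ** 2, 0.0])

def grad_b(s):  # b = sigma^1 sigma^2
    return np.array([0.0, s[2], s[1]])

def grad_c(s):  # c = (sigma^2)^3
    return np.array([0.0, 0.0, 3 * s[2] ** 2])

def nambu(ga, gb, gc, s):
    # {a, b, c} = eps^{alpha beta gamma} d_alpha a d_beta b d_gamma c,
    # i.e. the determinant of the 3 x 3 matrix of gradients
    return np.linalg.det(np.stack([ga(s), gb(s), gc(s)]))

s = np.array([1.3, -0.7, 2.1])

# total anti-symmetry: swapping any two entries flips the sign
assert np.isclose(nambu(grad_a, grad_b, grad_c, s), -nambu(grad_b, grad_a, grad_c, s))
assert np.isclose(nambu(grad_a, grad_b, grad_c, s), -nambu(grad_a, grad_c, grad_b, s))

# the coordinate functions themselves give {sigma^0, sigma^1, sigma^2} = 1
e = np.eye(3)
assert np.isclose(nambu(lambda s: e[0], lambda s: e[1], lambda s: e[2], s), 1.0)
```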

The continuum action of 3-algebra model of M-theory (39) is invariant under 16 dynamical supersymmetry transformations,

$$\begin{aligned} \delta X^I &= i\bar{\epsilon}\Gamma^I\Psi \\ \delta A_\mu(\sigma, \sigma') &= \frac{i}{2}\bar{\epsilon}\Gamma_\mu\Gamma_I \left( X^I(\sigma)\Psi(\sigma') - X^I(\sigma')\Psi(\sigma) \right) \\ \delta \Psi &= -A_{\mu ab}\{\varphi^a, \varphi^b, X^I\}\Gamma^\mu\Gamma_I \epsilon - \frac{1}{6}\{X^I, X^J, X^K\}\Gamma_{IJK}\epsilon \end{aligned} \tag{40}$$

where $\Gamma^{012}\varepsilon = -\varepsilon$. These supersymmetries close into gauge transformations on-shell,

$$\begin{aligned} [\delta_1, \delta_2] X^I &= \Lambda_{cd} \{\varphi^c, \varphi^d, X^I\} \\ [\delta_1, \delta_2] A_{\mu ab} \{\varphi^a, \varphi^b, \ \} &= \Lambda_{ab} \{\varphi^a, \varphi^b, A_{\mu cd} \{\varphi^c, \varphi^d, \ \}\} \\ &\quad - A_{\mu ab} \{\varphi^a, \varphi^b, \Lambda_{cd} \{\varphi^c, \varphi^d, \ \}\} + 2i\bar{\varepsilon}_2 \Gamma^\nu \varepsilon_1 \mathcal{O}^A_{\mu\nu} \\ [\delta_1, \delta_2] \Psi &= \Lambda_{cd} \{\varphi^c, \varphi^d, \Psi\} + \Big(i\bar{\varepsilon}_2 \Gamma^\mu \varepsilon_1 \Gamma_\mu - \frac{i}{4} \bar{\varepsilon}_2 \Gamma^{KL} \varepsilon_1 \Gamma_{KL}\Big) \mathcal{O}^\Psi \end{aligned} \tag{41}$$

where the gauge parameters are given by $\Lambda_{ab} = 2i\bar{\varepsilon}_2 \Gamma^\mu \varepsilon_1 A_{\mu ab} - i\bar{\varepsilon}_2 \Gamma_{JK} \varepsilon_1 X^J_a X^K_b$. $\mathcal{O}^A_{\mu\nu} = 0$ and $\mathcal{O}^\Psi = 0$ are the equations of motion of $A_{\mu\nu}$ and $\Psi$, respectively, where

$$\begin{aligned} \mathcal{O}^A_{\mu\nu} &= A_{\mu ab} \{\varphi^a, \varphi^b, A_{\nu cd} \{\varphi^c, \varphi^d, \ \}\} - A_{\nu ab} \{\varphi^a, \varphi^b, A_{\mu cd} \{\varphi^c, \varphi^d, \ \}\} \\ &\quad + E_{\mu\nu\lambda} \Big( -\{X^I, A^\lambda_{ab} \{\varphi^a, \varphi^b, X_I\}, \ \} + \frac{i}{2} \{\bar{\Psi}, \Gamma^\lambda \Psi, \ \} \Big) \\ \mathcal{O}^\Psi &= -\Gamma^\mu A_{\mu ab} \{\varphi^a, \varphi^b, \Psi\} + \frac{1}{2} \Gamma_{IJ} \{X^I, X^J, \Psi\} \end{aligned} \tag{42}$$

(41) implies that a commutation relation between the dynamical supersymmetry transformations is

$$
\delta\_2 \delta\_1 - \delta\_1 \delta\_2 = 0 \tag{43}
$$

up to the equations of motion and gauge transformations.

This action is invariant under a translation,

$$
\delta X^I(\sigma) = \eta^I, \qquad \delta A^\mu(\sigma, \sigma') = \eta^\mu(\sigma) - \eta^\mu(\sigma') \tag{44}
$$

where $\eta^I$ are constants.

The action is also invariant under 16 kinematical supersymmetry transformations

$$
\tilde{\delta} \Psi = \tilde{\varepsilon} \tag{45}
$$

and the other fields are not transformed. $\tilde{\varepsilon}$ is a constant and satisfies $\Gamma^{012}\tilde{\varepsilon} = \tilde{\varepsilon}$. $\tilde{\varepsilon}$ and $\varepsilon$ should come from the sixteen components of the thirty-two N = 1 supersymmetry parameters in eleven dimensions, corresponding to the eigenvalues ±1 of $\Gamma^{012}$, respectively. This N = 1 supersymmetry consists of the remaining 16 target-space supersymmetries and the 16 transmuted $\kappa$-symmetries in the semi-light-cone gauge [25, 26, 40].

A commutation relation between the kinematical supersymmetry transformations is given by

$$
\tilde{\delta}\_2 \tilde{\delta}\_1 - \tilde{\delta}\_1 \tilde{\delta}\_2 = 0 \tag{46}
$$

A commutator of dynamical supersymmetry transformations and kinematical ones acts as

$$\begin{aligned} (\tilde{\delta}_2 \delta_1 - \delta_1 \tilde{\delta}_2) X^I(\sigma) &= i\bar{\varepsilon}_1 \Gamma^I \tilde{\varepsilon}_2 \equiv \eta_0^I \\ (\tilde{\delta}_2 \delta_1 - \delta_1 \tilde{\delta}_2) A^\mu(\sigma, \sigma') &= \frac{i}{2} \bar{\varepsilon}_1 \Gamma^\mu \Gamma_I (X^I(\sigma) - X^I(\sigma')) \tilde{\varepsilon}_2 \equiv \eta_0^\mu(\sigma) - \eta_0^\mu(\sigma') \end{aligned} \tag{47}$$

where the commutator that acts on the other fields vanishes. Thus, the commutation relation is given by

$$
\tilde{\delta}_2 \delta_1 - \delta_1 \tilde{\delta}_2 = \delta_\eta \tag{48}
$$

where *δη* is a translation.

If we change a basis of the supersymmetry transformations as

$$\begin{aligned} \delta' &= \delta + \tilde{\delta} \\ \tilde{\delta}' &= i(\delta - \tilde{\delta}) \end{aligned} \tag{49}$$

we obtain


$$\begin{aligned} \delta'_2 \delta'_1 - \delta'_1 \delta'_2 &= \delta_\eta \\ \tilde{\delta}'_2 \tilde{\delta}'_1 - \tilde{\delta}'_1 \tilde{\delta}'_2 &= \delta_\eta \\ \tilde{\delta}'_2 \delta'_1 - \delta'_1 \tilde{\delta}'_2 &= 0 \end{aligned} \tag{50}$$

These thirty-two supersymmetry transformations are summarised as $\Delta = (\delta', \tilde{\delta}')$, and (50) implies the N = 1 supersymmetry algebra in eleven dimensions,

$$
\Delta\_2 \Delta\_1 - \Delta\_1 \Delta\_2 = \delta\_\eta \tag{51}
$$

#### **3.2. Lie 3-algebra models of M-theory**

In this and the next subsection, we perform a second quantization of the continuum action of the 3-algebra model of M-theory: by replacing the Nambu-Poisson bracket in the action (39) with the brackets of finite-dimensional 3-algebras, the Lie and Hermitian 3-algebras, we obtain the Lie and Hermitian 3-algebra models of M-theory [26, 28], respectively. In this subsection, we review the Lie 3-algebra model.

If we replace the Nambu-Poisson bracket in the action (39) with a completely antisymmetric real 3-algebra's bracket [21, 22],

$$\begin{aligned} \int d^3 \sigma \sqrt{-g} &\rightarrow \left< \quad \right> \\ \{\varphi^a, \varphi^b, \varphi^c\} &\rightarrow [T^a, T^b, T^c] \end{aligned} \tag{52}$$

we obtain the Lie 3-algebra model of M-theory [26, 28],

$$\begin{split} S_0 &= \Big< -\frac{1}{12} [X^I, X^J, X^K]^2 - \frac{1}{2} (A_{\mu ab} [T^a, T^b, X^I])^2 \\ &\quad - \frac{1}{3} E^{\mu\nu\lambda} A_{\mu ab} A_{\nu cd} A_{\lambda ef} [T^a, T^c, T^d][T^b, T^e, T^f] \\ &\quad - \frac{i}{2} \bar{\Psi} \Gamma^\mu A_{\mu ab} [T^a, T^b, \Psi] + \frac{i}{4} \bar{\Psi} \Gamma_{IJ} [X^I, X^J, \Psi] \Big> \end{split} \tag{53}$$

We have deleted the cosmological constant Λ, which corresponds to an operator-ordering ambiguity, as in the case of other matrix models [27, 41].

This model can be obtained formally by a dimensional reduction of the N = 8 BLG model [4–6],

$$\begin{split} S_{N=8\,BLG} &= \int d^3x \Big< -\frac{1}{12} [X^I, X^J, X^K]^2 - \frac{1}{2} (D_\mu X^I)^2 - E^{\mu\nu\lambda} \Big( \frac{1}{2} A_{\mu ab} \partial_\nu A_{\lambda cd}\, T^a [T^b, T^c, T^d] \\ &\quad + \frac{1}{3} A_{\mu ab} A_{\nu cd} A_{\lambda ef} [T^a, T^c, T^d][T^b, T^e, T^f] \Big) \\ &\quad + \frac{i}{2} \bar{\Psi} \Gamma^\mu D_\mu \Psi + \frac{i}{4} \bar{\Psi} \Gamma_{IJ} [X^I, X^J, \Psi] \Big> \end{split} \tag{54}$$

The formal relations between the Lie (Hermitian) 3-algebra models of M-theory and the N = 8 (N = 6) BLG models are analogous to the relations among the N = 4 super Yang-Mills theory in four dimensions, the BFSS matrix theory [27], and the IIB matrix model [41]: they are completely different theories, although they are related to each other by dimensional reductions. In the same way, the 3-algebra models of M-theory and the BLG models are completely different theories.

The fields in the action (53) are spanned by the Lie 3-algebra generators $T^a$ as $X^I = X^I_a T^a$, $\Psi = \Psi_a T^a$ and $A_\mu = A_{\mu ab} T^a \otimes T^b$, where $I = 3, \cdots, 10$ and $\mu = 0, 1, 2$. $\left< \ \right>$ represents a metric for the 3-algebra. $\Psi$ is a Majorana spinor of $SO(1,10)$ that satisfies $\Gamma^{012}\Psi = \Psi$. $E^{\mu\nu\lambda}$ is the Levi-Civita symbol in three dimensions.

Finite-dimensional Lie 3-algebras with an invariant metric are classified into the four-dimensional Euclidean $\mathcal{A}_4$ algebra and the Lie 3-algebras with indefinite metrics in [9–11, 21, 22]. We do not choose the $\mathcal{A}_4$ algebra because it has only four degrees of freedom. We need an algebra with arbitrary dimension $N$, which is taken to infinity to define M-theory. Here we choose the simplest indefinite-metric Lie 3-algebra, the so-called Lorentzian Lie 3-algebra associated with the $u(N)$ Lie algebra,

$$\begin{aligned} [T^{-1}, T^a, T^b] &= 0 \\ [T^0, T^i, T^j] &= [T^i, T^j] = f^{ij}{}_k T^k \\ [T^i, T^j, T^k] &= f^{ijk} T^{-1} \end{aligned} \tag{55}$$

where $a = -1, 0, i$ ($i = 1, \cdots, N^2$). $T^i$ are generators of $u(N)$. A metric is defined by a symmetric bilinear form,

$$\left< T^{-1}, T^0 \right> = -1 \tag{56}$$

$$\left< T^i, T^j \right> = h^{ij} \tag{57}$$

and the other components are 0. The action is decomposed as

$$\begin{split} S &= \mathrm{Tr}\Big( -\frac{1}{4}(x_0^K)^2 [x^I, x^J]^2 + \frac{1}{2}(x_0^I [x_I, x^J])^2 - \frac{1}{2}(x_0^I b_\mu + [a_\mu, x^I])^2 - \frac{1}{2} E^{\mu\nu\lambda} b_\mu [a_\nu, a_\lambda] \\ &\quad + i\bar{\psi}_0 \Gamma^\mu b_\mu \psi - \frac{i}{2} \bar{\psi} \Gamma^\mu [a_\mu, \psi] + \frac{i}{2} x_0^I \bar{\psi} \Gamma_{IJ} [x^J, \psi] - \frac{i}{2} \bar{\psi}_0 \Gamma_{IJ} [x^I, x^J] \psi \Big) \end{split} \tag{58}$$

where we have renamed $X^I_0 \rightarrow x^I_0$, $X^I_i T^i \rightarrow x^I$, $\Psi_0 \rightarrow \psi_0$, $\Psi_i T^i \rightarrow \psi$, $2A_{\mu 0 i} T^i \rightarrow a_\mu$, and $A_{\mu ij} [T^i, T^j] \rightarrow b_\mu$. $a_\mu$ correspond to the target-space coordinate matrices $X^\mu$, whereas $b_\mu$ are auxiliary fields.

In this action, the $T^{-1}$ modes $X^I_{-1}$, $\Psi_{-1}$ and $A_{\mu\,-1a}$ do not appear; that is, they are unphysical modes. Therefore, the indefinite part of the metric (56) does not enter the action, and the Lie 3-algebra model of M-theory is ghost-free, like the model in [42]. This action can be obtained by a dimensional reduction of the three-dimensional N = 8 BLG model [4–6] with the same 3-algebra. The BLG model possesses a ghost mode because of its kinetic terms with indefinite signature. On the other hand, the Lie 3-algebra model of M-theory does not possess a kinetic term because it is defined as a zero-dimensional field theory, like the IIB matrix model [41].
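For the bracket (55) to be totally antisymmetric, the raised structure constants $f^{ijk} = f^{ij}{}_l h^{lk}$ of $u(N)$ must be totally antisymmetric. This is easy to confirm numerically for $u(2)$; the sketch below is my own illustration, not part of the chapter, and the basis normalization (Pauli matrices and the identity divided by $\sqrt{2}$, so that $h^{ij} = \delta^{ij}$) is my choice:

```python
# Structure constants f^{ijk} = Tr([T^i, T^j] T^k) for a u(2) basis with
# h^{ij} = Tr(T^i T^j) = delta^{ij}; (55) requires total antisymmetry.
import itertools
import numpy as np

s0 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
T = [m / np.sqrt(2) for m in (s0, sx, sy, sz)]  # generators of u(2)

def comm(a, b):
    return a @ b - b @ a

f = np.array([[[np.trace(comm(T[i], T[j]) @ T[k])
                for k in range(4)] for j in range(4)] for i in range(4)])

for i, j, k in itertools.product(range(4), repeat=3):
    assert abs(f[i, j, k] + f[j, i, k]) < 1e-12  # antisymmetric in i <-> j
    assert abs(f[i, j, k] + f[i, k, j]) < 1e-12  # antisymmetric in j <-> k
```

The identity generator drops out of all commutators, so the only nonvanishing components come from the $su(2)$ part, as expected.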

This action is invariant under the translation


$$
\delta x^I = \eta^I, \qquad \delta a^\mu = \eta^\mu \tag{59}
$$

where $\eta^I$ and $\eta^\mu$ belong to $u(1)$. This implies that the eigenvalues of $x^I$ and $a^\mu$ represent an eleven-dimensional space-time.

The action is also invariant under 16 kinematical supersymmetry transformations

$$
\tilde{\delta} \psi = \tilde{\varepsilon} \tag{60}
$$

and the other fields are not transformed. $\tilde{\varepsilon}$ belongs to $u(1)$ and satisfies $\Gamma^{012}\tilde{\varepsilon} = \tilde{\varepsilon}$. $\tilde{\varepsilon}$ and $\varepsilon$ should come from the sixteen components of the thirty-two N = 1 supersymmetry parameters in eleven dimensions, corresponding to the eigenvalues ±1 of $\Gamma^{012}$, respectively, as in the previous subsection.

A commutation relation between the kinematical supersymmetry transformations is given by

$$
\tilde{\delta}_2 \tilde{\delta}_1 - \tilde{\delta}_1 \tilde{\delta}_2 = 0 \tag{61}
$$

The action is invariant under 16 dynamical supersymmetry transformations,

$$\begin{aligned} \delta X^I &= i\bar{\varepsilon} \Gamma^I \Psi \\ \delta A_{\mu ab} [T^a, T^b, \ ] &= i\bar{\varepsilon} \Gamma_\mu \Gamma_I [X^I, \Psi, \ ] \\ \delta \Psi &= -A_{\mu ab} [T^a, T^b, X^I] \Gamma^\mu \Gamma_I \varepsilon - \frac{1}{6} [X^I, X^J, X^K] \Gamma_{IJK} \varepsilon \end{aligned} \tag{62}$$

where $\Gamma^{012}\varepsilon = -\varepsilon$. These supersymmetries close into gauge transformations on-shell,

$$\begin{aligned} [\delta_1, \delta_2] X^I &= \Lambda_{cd} [T^c, T^d, X^I] \\ [\delta_1, \delta_2] A_{\mu ab} [T^a, T^b, \ ] &= \Lambda_{ab} [T^a, T^b, A_{\mu cd} [T^c, T^d, \ ]] \\ &\quad - A_{\mu ab} [T^a, T^b, \Lambda_{cd} [T^c, T^d, \ ]] + 2i\bar{\varepsilon}_2 \Gamma^\nu \varepsilon_1 \mathcal{O}^A_{\mu\nu} \\ [\delta_1, \delta_2] \Psi &= \Lambda_{cd} [T^c, T^d, \Psi] + \Big(i\bar{\varepsilon}_2 \Gamma^\mu \varepsilon_1 \Gamma_\mu - \frac{i}{4} \bar{\varepsilon}_2 \Gamma^{KL} \varepsilon_1 \Gamma_{KL}\Big) \mathcal{O}^\Psi \end{aligned} \tag{63}$$


where the gauge parameters are given by $\Lambda_{ab} = 2i\bar{\varepsilon}_2 \Gamma^\mu \varepsilon_1 A_{\mu ab} - i\bar{\varepsilon}_2 \Gamma_{JK} \varepsilon_1 X^J_a X^K_b$. $\mathcal{O}^A_{\mu\nu} = 0$ and $\mathcal{O}^\Psi = 0$ are the equations of motion of $A_{\mu\nu}$ and $\Psi$, respectively, where

$$\begin{aligned} \mathcal{O}^A_{\mu\nu} &= A_{\mu ab} [T^a, T^b, A_{\nu cd} [T^c, T^d, \ ]] - A_{\nu ab} [T^a, T^b, A_{\mu cd} [T^c, T^d, \ ]] \\ &\quad + E_{\mu\nu\lambda} \Big( -[X^I, A^\lambda_{ab} [T^a, T^b, X_I], \ ] + \frac{i}{2} [\bar{\Psi}, \Gamma^\lambda \Psi, \ ] \Big) \\ \mathcal{O}^\Psi &= -\Gamma^\mu A_{\mu ab} [T^a, T^b, \Psi] + \frac{1}{2} \Gamma_{IJ} [X^I, X^J, \Psi] \end{aligned} \tag{64}$$

(63) implies that a commutation relation between the dynamical supersymmetry transformations is

$$
\delta\_2 \delta\_1 - \delta\_1 \delta\_2 = 0 \tag{65}
$$

up to the equations of motion and gauge transformations.

The 16 dynamical supersymmetry transformations (62) are decomposed as

$$\begin{aligned} \delta x^I &= i\bar{\varepsilon} \Gamma^I \psi \\ \delta x^I_0 &= i\bar{\varepsilon} \Gamma^I \psi_0 \\ \delta x^I_{-1} &= i\bar{\varepsilon} \Gamma^I \psi_{-1} \\ \delta \psi &= -(b_\mu x^I_0 + [a_\mu, x^I]) \Gamma^\mu \Gamma_I \varepsilon - \frac{1}{2} x^I_0 [x^J, x^K] \Gamma_{IJK} \varepsilon \\ \delta \psi_0 &= 0 \\ \delta \psi_{-1} &= -\mathrm{Tr}(b_\mu x^I) \Gamma^\mu \Gamma_I \varepsilon - \frac{1}{6} \mathrm{Tr}([x^I, x^J] x^K) \Gamma_{IJK} \varepsilon \\ \delta a_\mu &= i\bar{\varepsilon} \Gamma_\mu \Gamma_I (x^I_0 \psi - \psi_0 x^I) \\ \delta b_\mu &= i\bar{\varepsilon} \Gamma_\mu \Gamma_I [x^I, \psi] \\ \delta A_{\mu\,-1i} &= \frac{i}{2} \bar{\varepsilon} \Gamma_\mu \Gamma_I (x^I_{-1} \psi_i - \psi_{-1} x^I_i) \\ \delta A_{\mu\,-10} &= \frac{i}{2} \bar{\varepsilon} \Gamma_\mu \Gamma_I (x^I_{-1} \psi_0 - \psi_{-1} x^I_0) \end{aligned} \tag{66}$$

and thus a commutator of dynamical supersymmetry transformations and kinematical ones acts as

$$\begin{aligned} (\tilde{\delta}_2 \delta_1 - \delta_1 \tilde{\delta}_2) x^I &= i\bar{\varepsilon}_1 \Gamma^I \tilde{\varepsilon}_2 \equiv \eta^I \\ (\tilde{\delta}_2 \delta_1 - \delta_1 \tilde{\delta}_2) a^\mu &= i\bar{\varepsilon}_1 \Gamma^\mu \Gamma_I x^I_0 \tilde{\varepsilon}_2 \equiv \eta^\mu \\ (\tilde{\delta}_2 \delta_1 - \delta_1 \tilde{\delta}_2) A^\mu_{-1i} T^i &= \frac{i}{2} \bar{\varepsilon}_1 \Gamma^\mu \Gamma_I x^I_{-1} \tilde{\varepsilon}_2 \end{aligned} \tag{67}$$

where the commutator that acts on the other fields vanishes. Thus, the commutation relation for physical modes is given by

$$
\tilde{\delta}_2 \delta_1 - \delta_1 \tilde{\delta}_2 = \delta_\eta \tag{68}
$$

where *δη* is a translation.

(61), (65), and (68) imply the N = 1 supersymmetry algebra in eleven dimensions as in the previous subsection.

#### **3.3. Hermitian 3-algebra model of M-theory**


In this subsection, we study the Hermitian 3-algebra models of M-theory [26]. In particular, we mainly study the model with the $u(N) \oplus u(N)$ Hermitian 3-algebra (20).

The continuum action (39) can be rewritten by using the triality of *SO*(8) and the *SU*(4) ×*U*(1) decomposition [8, 43, 44] as

$$\begin{split} S_{cl} &= \int d^3 \sigma \sqrt{-g} \Big( -V - A_{\mu\bar{b}a} \{Z^A, \varphi^a, \varphi^b\} \overline{A^\mu_{\bar{d}c} \{Z_A, \varphi^c, \varphi^d\}} \\ &\quad + \frac{i}{3} E^{\mu\nu\lambda} A_{\mu\bar{b}a} A_{\nu\bar{d}c} A_{\lambda\bar{f}e} \{\varphi^a, \varphi^c, \varphi^d\} \{\varphi^b, \varphi^f, \varphi^e\} \\ &\quad + i\bar{\psi}^A \Gamma^\mu A_{\mu\bar{b}a} \{\psi_A, \varphi^a, \varphi^b\} + \frac{i}{2} E_{ABCD} \bar{\psi}^A \{Z^C, Z^D, \psi^B\} - \frac{i}{2} E^{ABCD} Z_D \{\bar{\psi}_A, \psi_B, Z_C\} \\ &\quad - i\bar{\psi}^A \{\psi_A, Z^B, Z_B\} + 2i\bar{\psi}^A \{\psi_B, Z^B, Z_A\} \Big) \end{split} \tag{69}$$

where fields with a raised $A$ index transform in the $\mathbf{4}$ of $SU(4)$, whereas those with a lowered index transform in the $\bar{\mathbf{4}}$. $A_{\mu\bar{b}a}$ ($\mu = 0, 1, 2$) is an anti-Hermitian gauge field, and $Z^A$ and $Z_A$ are a complex scalar field and its complex conjugate, respectively. $\psi_A$ is a fermion field that satisfies

$$
\Gamma^{012}\psi\_A = -\psi\_A \tag{70}
$$

and $\psi^A$ is its complex conjugate. $E^{\mu\nu\lambda}$ and $E_{ABCD}$ are Levi-Civita symbols in three dimensions and four dimensions, respectively. The potential terms are given by

$$\begin{aligned} V &= \frac{2}{3} Y^{CD}_B \bar{Y}^B_{CD} \\ Y^{CD}_B &= \{Z^C, Z^D, Z_B\} - \frac{1}{2} \delta^C_B \{Z^E, Z^D, Z_E\} + \frac{1}{2} \delta^D_B \{Z^E, Z^C, Z_E\} \end{aligned} \tag{71}$$

If we replace the Nambu-Poisson bracket with a Hermitian 3-algebra's bracket [19, 20],

$$\begin{aligned} \int d^3 \sigma \sqrt{-g} &\rightarrow \left< \quad \right>\\ \{\varphi^a, \varphi^b, \varphi^c\} &\rightarrow [T^a, T^b; \bar{T}^c] \end{aligned} \tag{72}$$

we obtain the Hermitian 3-algebra model of M-theory [26],

$$\begin{split} S &= \Big< -V - A_{\mu\bar{b}a} [Z^A, T^a; \bar{T}^b] \overline{A^\mu_{\bar{d}c} [Z_A, T^c; \bar{T}^d]} + \frac{1}{3} E^{\mu\nu\lambda} A_{\mu\bar{b}a} A_{\nu\bar{d}c} A_{\lambda\bar{f}e} [T^a, T^c; \bar{T}^d] \overline{[T^b, T^f; \bar{T}^e]} \\ &\quad + i\bar{\psi}^A \Gamma^\mu A_{\mu\bar{b}a} [\psi_A, T^a; \bar{T}^b] + \frac{i}{2} E_{ABCD} \bar{\psi}^A [Z^C, Z^D; \bar{\psi}^B] - \frac{i}{2} E^{ABCD} Z_D [\bar{\psi}_A, \psi_B; \bar{Z}_C] \\ &\quad - i\bar{\psi}^A [\psi_A, Z^B; \bar{Z}_B] + 2i\bar{\psi}^A [\psi_B, Z^B; \bar{Z}_A] \Big> \end{split} \tag{73}$$

where the cosmological constant has been deleted for the same reason as before. The potential terms are given by

$$\begin{aligned} V &= \frac{2}{3} Y^{CD}_B \bar{Y}^B_{CD} \\ Y^{CD}_B &= [Z^C, Z^D; \bar{Z}_B] - \frac{1}{2} \delta^C_B [Z^E, Z^D; \bar{Z}_E] + \frac{1}{2} \delta^D_B [Z^E, Z^C; \bar{Z}_E] \end{aligned} \tag{74}$$
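The Hermitian 3-bracket $[\ ,\ ;\ ]$ entering (73) and (74) is antisymmetric only in its first two slots and complex anti-linear in the barred slot. The following sketch is my own illustration, not part of the chapter; it assumes the standard realization of the $u(m) \oplus u(n)$ Hermitian 3-algebra on rectangular $m \times n$ matrices, $[X, Y; Z] = \alpha (X Z^\dagger Y - Y Z^\dagger X)$, and checks these two properties numerically:

```python
# Numerical check of the Hermitian 3-bracket on m x n complex matrices.
import numpy as np

rng = np.random.default_rng(0)
m, n, alpha = 3, 2, 1.0

def bracket(x, y, z):
    """Hermitian 3-bracket [x, y; z] = alpha (x z^dag y - y z^dag x)."""
    return alpha * (x @ z.conj().T @ y - y @ z.conj().T @ x)

def rand():
    return rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))

X, Y, Z = rand(), rand(), rand()
# Antisymmetric in the first two (unbarred) slots:
assert np.allclose(bracket(X, Y, Z), -bracket(Y, X, Z))
# Anti-linear in the third (barred) slot: [X, Y; c Z] = conj(c) [X, Y; Z].
c = 2.0 + 3.0j
assert np.allclose(bracket(X, Y, c * Z), np.conj(c) * bracket(X, Y, Z))
```

The same bracket with $\alpha = 2\pi/k$ is what produces the Chern-Simons-level dependence discussed below.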


This matrix model can be obtained formally by a dimensional reduction of the N = 6 BLG action [8], which is equivalent to the ABJ(M) action [7, 45]<sup>2</sup>,

$$\begin{split} S_{N=6\,BLG} &= \int d^3x \Big< -V - D_\mu Z^A \overline{D^\mu Z_A} + E^{\mu\nu\lambda} \Big( \frac{1}{2} A_{\mu\bar{b}a} \partial_\nu A_{\lambda\bar{d}c}\, \bar{T}^{\bar{d}} [T^a, T^b; \bar{T}^c] \\ &\quad + \frac{1}{3} A_{\mu\bar{b}a} A_{\nu\bar{d}c} A_{\lambda\bar{f}e} [T^a, T^c; \bar{T}^d] \overline{[T^b, T^f; \bar{T}^e]} \Big) \\ &\quad - i\bar{\psi}^A \Gamma^\mu D_\mu \psi_A + \frac{i}{2} E_{ABCD} \bar{\psi}^A [Z^C, Z^D; \bar{\psi}^B] - \frac{i}{2} E^{ABCD} \bar{Z}_D [\bar{\psi}_A, \psi_B; \bar{Z}_C] \\ &\quad - i\bar{\psi}^A [\psi_A, Z^B; \bar{Z}_B] + 2i\bar{\psi}^A [\psi_B, Z^B; \bar{Z}_A] \Big> \end{split} \tag{75}$$

The Hermitian 3-algebra models of M-theory are classified into the models with the $u(m) \oplus u(n)$ Hermitian 3-algebra (20) and the $sp(2n) \oplus u(1)$ Hermitian 3-algebra (30). In the following, we study the $u(N) \oplus u(N)$ Hermitian 3-algebra model. By substituting the $u(N) \oplus u(N)$ Hermitian 3-algebra (20) into the action (73), we obtain

$$\begin{split} S &= \mathrm{Tr}\Big( -\frac{(2\pi)^2}{k^2} V - (Z^A A^R_\mu - A^L_\mu Z^A)(Z^A A^{R\mu} - A^{L\mu} Z^A)^\dagger - \frac{k}{2\pi} \frac{i}{3} E^{\mu\nu\lambda} (A^R_\mu A^R_\nu A^R_\lambda - A^L_\mu A^L_\nu A^L_\lambda) \\ &\quad - \bar{\psi}^A \Gamma^\mu (\psi_A A^R_\mu - A^L_\mu \psi_A) + \frac{2\pi}{k} \big( i E_{ABCD} \bar{\psi}^A Z^C \psi^{\dagger B} Z^D - i E^{ABCD} Z^\dagger_D \bar{\psi}^\dagger{}_A Z^\dagger_C \psi_B \\ &\quad - i \bar{\psi}^A \psi_A Z^\dagger_B Z^B + i \bar{\psi}^A Z^B Z^\dagger_B \psi_A + 2i \bar{\psi}^A \psi_B Z^\dagger_A Z^B - 2i \bar{\psi}^A Z^B Z^\dagger_A \psi_B \big) \Big) \end{split} \tag{76}$$

where $A^R_\mu \equiv -\frac{k}{2\pi} i A_{\mu\bar{b}a} T^{\dagger\bar{b}} T^a$ and $A^L_\mu \equiv -\frac{k}{2\pi} i A_{\mu\bar{b}a} T^a T^{\dagger\bar{b}}$ are $N \times N$ Hermitian matrices. In the algebra, we have set $\alpha = \frac{2\pi}{k}$, where $k$ is an integer representing the Chern-Simons level. We choose $k = 1$ in order to obtain 16 dynamical supersymmetries. $V$ is given by

$$\begin{split} V = &+\frac{1}{3}Z_{A}^{\dagger}Z^{A}Z_{B}^{\dagger}Z^{B}Z_{C}^{\dagger}Z^{C} + \frac{1}{3}Z^{A}Z_{A}^{\dagger}Z^{B}Z_{B}^{\dagger}Z^{C}Z_{C}^{\dagger} + \frac{4}{3}Z_{A}^{\dagger}Z^{B}Z_{C}^{\dagger}Z^{A}Z_{B}^{\dagger}Z^{C} \\ &- Z_{A}^{\dagger}Z^{A}Z_{B}^{\dagger}Z^{C}Z_{C}^{\dagger}Z^{B} - Z^{A}Z_{A}^{\dagger}Z^{B}Z_{C}^{\dagger}Z^{C}Z_{B}^{\dagger} \end{split} \tag{77}$$

By redefining fields as

$$Z^A \rightarrow \left(\frac{k}{2\pi}\right)^{\frac{1}{3}} Z^A$$

$$A^\mu \rightarrow \left(\frac{2\pi}{k}\right)^{\frac{1}{3}} A^\mu$$

$$\psi^A \rightarrow \left(\frac{k}{2\pi}\right)^{\frac{1}{6}} \psi^A \tag{78}$$

we obtain an action that is independent of the Chern-Simons level:

$$\begin{split} S &= \text{Tr}\Big(-V - (Z^A A^R\_\mu - A^L\_\mu Z^A)(Z^A A^{R\mu} - A^{L\mu} Z^A)^\dagger - \frac{i}{3} E^{\mu\nu\lambda} (A^R\_\mu A^R\_\nu A^R\_\lambda - A^L\_\mu A^L\_\nu A^L\_\lambda) \\ &- \bar{\psi}^A \Gamma^\mu (\psi\_A A^R\_\mu - A^L\_\mu \psi\_A) + i E\_{ABCD} \bar{\psi}^A Z^C \psi^{\dagger B} Z^D - i E^{ABCD} Z^\dagger\_D \bar{\psi}^\dagger{}\_A Z^\dagger\_C \psi\_B \\ &- i \bar{\psi}^A \psi\_A Z^\dagger\_B Z^B + i \bar{\psi}^A Z^B Z^\dagger\_B \psi\_A + 2i \bar{\psi}^A \psi\_B Z^\dagger\_A Z^B - 2i \bar{\psi}^A Z^B Z^\dagger\_A \psi\_B \Big) \end{split} \tag{79}$$

<sup>2</sup> The authors of [46–49] studied matrix models that can be obtained by a dimensional reduction of the ABJM and ABJ gauge theories on *S*3. They showed that the models reproduce the original gauge theories on *S*<sup>3</sup> in planar limits.

This matrix model can be obtained formally by a dimensional reduction of the N = 6 BLG action [8], which is equivalent to the ABJ(M) action [7, 45], as opposed to three-dimensional Chern-Simons actions.

If we rewrite the gauge fields in the action as $A^{L}_{\mu} = A_{\mu} + b_{\mu}$ and $A^{R}_{\mu} = A_{\mu} - b_{\mu}$, we obtain

$$\begin{split} S = \text{Tr}\Big( &-V + ([A_{\mu}, Z^{A}] + \{b_{\mu}, Z^{A}\})([A^{\mu}, Z_{A}] - \{b^{\mu}, Z_{A}\}) + iE^{\mu\nu\lambda}\big(\frac{2}{3}b_{\mu}b_{\nu}b_{\lambda} + 2A_{\mu}A_{\nu}b_{\lambda}\big) \\ &+ \bar{\psi}^{A}\Gamma^{\mu}([A_{\mu}, \psi_{A}] + \{b_{\mu}, \psi_{A}\}) + iE_{ABCD}\bar{\psi}^{A}Z^{C}\psi^{\dagger B}Z^{D} - iE^{ABCD}Z_{D}^{\dagger}\bar{\psi}^{\dagger}{}_{A}Z_{C}^{\dagger}\psi_{B} \\ &- i\bar{\psi}^{A}\psi_{A}Z_{B}^{\dagger}Z^{B} + i\bar{\psi}^{A}Z^{B}Z_{B}^{\dagger}\psi_{A} + 2i\bar{\psi}^{A}\psi_{B}Z_{A}^{\dagger}Z^{B} - 2i\bar{\psi}^{A}Z^{B}Z_{A}^{\dagger}\psi_{B} \Big) \end{split} \tag{80}$$

where [ , ] and { , } are the ordinary commutator and anticommutator, respectively. The *u*(1) parts of *A<sup>μ</sup>* decouple because *A<sup>μ</sup>* appears only in commutators in the action. *b<sup>μ</sup>* can be regarded as auxiliary fields, and thus *A<sup>μ</sup>* correspond to the matrices *X<sup>μ</sup>* that represent the three space-time coordinates in M-theory. Among the *N* × *N* arbitrary complex matrices *Z<sup>A</sup>*, we need to identify matrices *X<sup>I</sup>* (*I* = 3, ⋯ , 10) representing the other space coordinates in M-theory, because the model possesses not *SO*(8) but *SU*(4) × *U*(1) symmetry. Our identification is

$$\begin{aligned} Z^A &= iX^{A+2} - X^{A+6}, \\ X^I &= \hat{X}^I - ix^I \mathbf{1} \end{aligned} \tag{81}$$

where $\hat{X}^I$ and $x^I$ are *su*(*N*) Hermitian matrices and real scalars, respectively. This is analogous to the identification made when we compactify the ABJM action, which describes N M2-branes, and obtain the action of N D2-branes [7, 50, 51]. We will see that this identification works also in our case. We should note that while the *su*(*N*) part is Hermitian, the *u*(1) part is anti-Hermitian. That is, an eigenvalue distribution of *X<sup>μ</sup>* and *Z<sup>A</sup>*, and not of *X<sup>I</sup>*, determines the spacetime in the Hermitian model. In order to define light-cone coordinates, we need to perform a Wick rotation: *a*<sup>0</sup> → −*ia*<sup>0</sup>. After the Wick rotation, we obtain

$$A^0 = \hat{A}^0 - ia^0 \mathbf{1} \tag{82}$$

where $\hat{A}^0$ is an *su*(*N*) Hermitian matrix.

#### **3.4. DLCQ Limit of 3-algebra model of M-theory**

It was shown that M-theory in the DLCQ limit reduces to the BFSS matrix theory with matrices of finite size [30–35]. This fact is a strong criterion for a model of M-theory. In [26, 28], it was shown that the Lie and Hermitian 3-algebra models of M-theory reduce to the BFSS matrix theory with matrices of finite size in the DLCQ limit. In this subsection, we outline the mechanism.

The DLCQ limit of M-theory consists of a light-cone compactification, $x^- \approx x^- + 2\pi R$, where $x^{\pm} = \frac{1}{\sqrt{2}}(x^{10} \pm x^{0})$, and a Lorentz boost in the $x^{10}$ direction with an infinite momentum. After appropriate scalings of fields [26, 28], we define light-cone coordinate matrices as

$$X^0 = \frac{1}{\sqrt{2}}(X^+ - X^-)$$

$$X^{10} = \frac{1}{\sqrt{2}}(X^+ + X^-) \tag{83}$$

We integrate out *b<sup>μ</sup>* by using their equations of motion.


A matrix compactification [52] on a circle with a radius R imposes the following conditions on *X*− and the other matrices *Y*:

$$\begin{cases} X^- - (2\pi R)1 = \mathcal{U}^\dagger X^- \mathcal{U} \\ Y = \mathcal{U}^\dagger Y \mathcal{U} \end{cases} \tag{84}$$

where *U* is a unitary matrix. In order to obtain a solution to (84), we need to take *N* → ∞ and consider matrices of infinite size [52]. A solution to (84) is given by *X*<sup>−</sup> = *X*¯ <sup>−</sup> + *X*˜ <sup>−</sup>, *Y* = *Y*˜ and

$$U = \begin{pmatrix} \ddots & \ddots & & & \\ & 0 & 1 & & \\ & & 0 & 1 & \\ & & & 0 & \ddots \\ & & & & \ddots \end{pmatrix} \otimes \mathbf{1}_{n \times n} \in U(N) \tag{85}$$
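As a sanity check (an added illustration, not part of the original derivation), the compactification condition (84) can be verified numerically with a truncated shift matrix. The truncation size, the shift orientation, and the diagonal background below are chosen for convenience; since a finite matrix cannot represent the infinite circle, the condition holds only on interior blocks.

```python
import numpy as np

# Finite-dimensional sketch of the compactification condition (84):
# U^dagger X^- U = X^- - (2 pi R) 1, with U the block shift matrix of (85).
K, n, R = 8, 2, 1.0
shift = np.diag(np.ones(K - 1), k=-1)      # ones on the subdiagonal (orientation convention)
U = np.kron(shift, np.eye(n))              # U = shift ⊗ 1_{n x n}, cf. (85)
s = np.arange(K)
# background of the form -(2 pi R) diag(..., s-1, s, s+1, ...) ⊗ 1_{n x n}, cf. (86), (87)
Xm = -(2 * np.pi * R) * np.kron(np.diag(s), np.eye(n))
lhs = U.T @ Xm @ U                         # U is real here, so U^dagger = U^T
rhs = Xm - (2 * np.pi * R) * np.eye(K * n)
interior = slice(0, (K - 1) * n)           # the last block is lost to truncation
assert np.allclose(lhs[interior, interior], rhs[interior, interior])
print("condition (84) holds on the interior blocks")
```

In the exact statement [52] the matrices are infinite, which is why the boundary block must be excluded in any finite check.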

The backgrounds $\bar{X}^-$ are

$$\bar{X}^- = -T^3\bar{x}^-_0 T^0 - (2\pi R)\,\mathrm{diag}(\cdots, s-1, s, s+1, \cdots) \otimes \mathbf{1}_{n \times n} \tag{86}$$

in the Lie 3-algebra case, whereas

$$\bar{X}^- = -i(T^3\bar{x}^-)\mathbf{1} - i(2\pi R)\,\mathrm{diag}(\cdots, s-1, s, s+1, \cdots) \otimes \mathbf{1}_{n \times n} \tag{87}$$

in the Hermitian 3-algebra case. A fluctuation $\tilde{x}$ that represents the *u*(*N*) parts of $\tilde{X}^-$ and $\tilde{Y}$ is

$$\begin{pmatrix} \ddots & \ddots & \ddots & & & \\ \ddots & \tilde{x}(0) & \tilde{x}(1) & \tilde{x}(2) & & \\ & \tilde{x}(-1) & \tilde{x}(0) & \tilde{x}(1) & \tilde{x}(2) & \\ & \tilde{x}(-2) & \tilde{x}(-1) & \tilde{x}(0) & \tilde{x}(1) & \ddots \\ & & \tilde{x}(-2) & \tilde{x}(-1) & \tilde{x}(0) & \ddots \\ & & & \ddots & \ddots & \ddots \end{pmatrix} \tag{88}$$

Each $\tilde{x}(s)$ is an *n* × *n* matrix, where *s* is an integer. That is, the (*s*, *t*)-th block is given by $\tilde{x}_{s,t} = \tilde{x}(s-t)$.
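The block-Toeplitz structure (88) can be made concrete with a small numerical sketch (added here for illustration; the bandwidth and random modes are arbitrary choices). Multiplying two such matrices convolves their modes, which is the discrete counterpart of the first identity in (90): multiplying $x(\tau)$ and $x'(\tau)$ convolves their Fourier modes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 2, 10                              # block size and truncated number of blocks
bw = 1                                    # nonzero modes x~(s) for |s| <= bw only
x = {m: rng.standard_normal((n, n)) for m in range(-bw, bw + 1)}
y = {m: rng.standard_normal((n, n)) for m in range(-bw, bw + 1)}

def toeplitz(blocks):
    """Assemble the block-Toeplitz matrix (88): block (s, t) = blocks[s - t]."""
    Z = np.zeros((K * n, K * n))
    for s in range(K):
        for t in range(K):
            if s - t in blocks:
                Z[s*n:(s+1)*n, t*n:(t+1)*n] = blocks[s - t]
    return Z

X, Y = toeplitz(x), toeplitz(y)
P = X @ Y
# an interior block of the product equals the convolution of the modes,
# sum_t x(s-t) y(t-u) = sum_m x(m) y((s-u)-m), away from truncation edges
s, u = 5, 4
conv = sum(x[m] @ y[(s - u) - m] for m in x if (s - u) - m in y)
assert np.allclose(P[s*n:(s+1)*n, u*n:(u+1)*n], conv)
print("product block equals the convolution of modes")
```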

We make a Fourier transformation,

$$\tilde{x}(s) = \frac{1}{2\pi\tilde{R}} \int_0^{2\pi\tilde{R}} d\tau\, x(\tau)\, e^{is\frac{\tau}{\tilde{R}}} \tag{89}$$

where $x(\tau)$ is an *n* × *n* matrix in one dimension and $R\tilde{R} = 2\pi$. From (86)-(89), the following identities hold:

$$\sum_{t} \tilde{x}_{s,t}\, \tilde{x}'_{t,u} = \frac{1}{2\pi\tilde{R}} \int_{0}^{2\pi\tilde{R}} d\tau\, x(\tau)\, x'(\tau)\, e^{i(s-u)\frac{\tau}{\tilde{R}}}$$

$$\mathrm{tr}\Big(\sum_{s,t} \tilde{x}_{s,t}\, \tilde{x}'_{t,s}\Big) = V\, \frac{1}{2\pi\tilde{R}} \int_{0}^{2\pi\tilde{R}} d\tau\, \mathrm{tr}\big(x(\tau)\, x'(\tau)\big)$$

$$[\bar{X}^-, \tilde{x}]_{s,t} = \frac{1}{2\pi\tilde{R}} \int_{0}^{2\pi\tilde{R}} d\tau\, \partial_{\tau} x(\tau)\, e^{i(s-t)\frac{\tau}{\tilde{R}}} \tag{90}$$

where tr is a trace over *n* × *n* matrices and $V = \sum_s 1$. Next, we boost the system in the *x*<sup>10</sup> direction:

$$
\tilde{X}^{\prime +} = \frac{1}{T} \tilde{X}^{+}
$$

$$
\tilde{X}^{\prime -} = T \tilde{X}^{-} \tag{91}
$$

The DLCQ limit is achieved when *T* → ∞, where the "novel Higgs mechanism" [51] is realized. In the *T* → ∞ limit, the actions of the 3-algebra models of M-theory reduce to that of the BFSS matrix theory [27] with matrices of finite size,

$$S = \frac{1}{g^2} \int_{-\infty}^{\infty} d\tau\, \mathrm{tr}\Big(\frac{1}{2}(D_0 x^P)^2 - \frac{1}{4}[x^P, x^Q]^2 + \frac{1}{2}\bar{\psi}\Gamma^0 D_0 \psi - \frac{i}{2}\bar{\psi}\Gamma^P[x_P, \psi]\Big) \tag{92}$$

where *P*, *Q* = 1, 2, ··· , 9.

#### **3.5. Supersymmetric deformation of Lie 3-algebra model of M-theory**

A supersymmetric deformation of the Lie 3-algebra Model of M-theory was studied in [53] (see also [54–56]). If we add mass terms and a flux term,

$$S_m = \left\langle -\frac{1}{2}\mu^2 (X^I)^2 - \frac{i}{2}\mu \bar{\Psi}\Gamma_{3456}\Psi + H_{IJKL}[X^I, X^J, X^K]X^L \right\rangle \tag{93}$$

such that


$$H_{IJKL} = \begin{cases} -\frac{\mu}{6}\epsilon_{IJKL} & (I, J, K, L = 3, 4, 5, 6 \text{ or } 7, 8, 9, 10) \\ 0 & (\text{otherwise}) \end{cases} \tag{94}$$

to the action (53), the total action *S*<sup>0</sup> + *S<sub>m</sub>* is invariant under 16 dynamical supersymmetries,

$$\begin{aligned} \delta X^{I} &= i\bar{\varepsilon}\Gamma^{I}\Psi \\ \delta A_{\mu ab}[T^{a}, T^{b}, \;\cdot\;] &= i\bar{\varepsilon}\Gamma_{\mu}\Gamma_{I}[X^{I}, \Psi, \;\cdot\;] \\ \delta\Psi &= -\frac{1}{6}[X^{I}, X^{J}, X^{K}]\Gamma_{IJK}\varepsilon - A_{\mu ab}[T^{a}, T^{b}, X^{I}]\Gamma^{\mu}\Gamma_{I}\varepsilon + \mu\Gamma_{3456}X^{I}\Gamma_{I}\varepsilon \end{aligned} \tag{95}$$

From this action, we obtain various interesting solutions, including fuzzy sphere solutions [53].


## **4. Conclusion**

The metric Hermitian 3-algebra corresponds to a class of the super Lie algebra. By using this relation, the metric Hermitian 3-algebras are classified into *u*(*m*) ⊕ *u*(*n*) and *sp*(2*n*) ⊕ *u*(1) Hermitian 3-algebras.

The Lie and Hermitian 3-algebra models of M-theory are obtained by second quantizations of the supermembrane action in a semi-light-cone gauge. The Lie 3-algebra model possesses manifest N = 1 supersymmetry in eleven dimensions. In the DLCQ limit, both models reduce to the BFSS matrix theory with matrices of finite size, as they should.

## **Acknowledgements**

We would like to thank T. Asakawa, K. Hashimoto, N. Kamiya, H. Kunitomo, T. Matsuo, S. Moriyama, K. Murakami, J. Nishimura, S. Sasa, F. Sugino, T. Tada, S. Terashima, S. Watamura, K. Yoshida, and especially H. Kawai and A. Tsuchiya for valuable discussions.

## **Author details**

Matsuo Sato *Hirosaki University, Japan*

### **5. References**


	- [1] V. T. Filippov, n-Lie algebras, Sib. Mat. Zh. 26, No. 6 (1985) 126-140.
	- [2] N. Kamiya, A structure theory of Freudenthal-Kantor triple systems, J. Algebra 110 (1987) 108.
	- [3] S. Okubo, N. Kamiya, Quasi-classical Lie superalgebras and Lie supertriple systems, Comm. Algebra 30 (2002) no. 8, 3825.
	- [4] J. Bagger, N. Lambert, Modeling Multiple M2's, Phys. Rev. D75 (2007) 045020.
	- [5] A. Gustavsson, Algebraic structures on parallel M2-branes, Nucl. Phys. B811 (2009) 66.
	- [6] J. Bagger, N. Lambert, Gauge Symmetry and Supersymmetry of Multiple M2-Branes, Phys. Rev. D77 (2008) 065008.
	- [7] O. Aharony, O. Bergman, D. L. Jafferis, J. Maldacena, N=6 superconformal Chern-Simons-matter theories, M2-branes and their gravity duals, JHEP 0810 (2008) 091.
	- [8] J. Bagger, N. Lambert, Three-Algebras and N=6 Chern-Simons Gauge Theories, Phys. Rev. D79 (2009) 025002.
	- [9] J. Figueroa-O'Farrill, G. Papadopoulos, Pluecker-type relations for orthogonal planes, J. Geom. Phys. 49 (2004) 294.
	- [10] G. Papadopoulos, M2-branes, 3-Lie Algebras and Plucker relations, JHEP 0805 (2008) 054.
	- [11] J. P. Gauntlett, J. B. Gutowski, Constraining Maximally Supersymmetric Membrane Actions, JHEP 0806 (2008) 053.
	- [12] D. Gaiotto, E. Witten, Janus Configurations, Chern-Simons Couplings, And The Theta-Angle in N=4 Super Yang-Mills Theory, arXiv:0804.2907 [hep-th].
	- [13] K. Hosomichi, K-M. Lee, S. Lee, S. Lee, J. Park, N=5,6 Superconformal Chern-Simons Theories and M2-branes on Orbifolds, JHEP 0809 (2008) 002.

	- [43] H. Nishino, S. Rajpoot, Triality and Bagger-Lambert Theory, Phys. Lett. B671 (2009) 415.
	- [44] A. Gustavsson, S-J. Rey, Enhanced N=8 Supersymmetry of ABJM Theory on R(8) and R(8)/Z(2), arXiv:0906.3568 [hep-th].
	- [45] O. Aharony, O. Bergman, D. L. Jafferis, Fractional M2-branes, JHEP 0811 (2008) 043.
	- [46] M. Hanada, L. Mannelli, Y. Matsuo, Large-N reduced models of supersymmetric quiver, Chern-Simons gauge theories and ABJM, arXiv:0907.4937 [hep-th].
	- [47] G. Ishiki, S. Shimasaki, A. Tsuchiya, Large N reduction for Chern-Simons theory on *S*3, Phys. Rev. D80 (2009) 086004.
	- [48] H. Kawai, S. Shimasaki, A. Tsuchiya, Large N reduction on group manifolds, arXiv:0912.1456 [hep-th].
	- [49] G. Ishiki, S. Shimasaki, A. Tsuchiya, A Novel Large-N Reduction on *S*3: Demonstration in Chern-Simons Theory, arXiv:1001.4917 [hep-th].
	- [50] Y. Pang, T. Wang, From N M2's to N D2's, Phys. Rev. D78 (2008) 125007.
	- [51] S. Mukhi, C. Papageorgakis, M2 to D2, JHEP 0805 (2008) 085.
	- [52] W. Taylor, D-brane field theory on compact spaces, Phys. Lett. B394 (1997) 283.
	- [53] J. DeBellis, C. Saemann, R. J. Szabo, Quantized Nambu-Poisson Manifolds in a 3-Lie Algebra Reduced Model, JHEP 1104 (2011) 075.
	- [54] M. M. Sheikh-Jabbari, Tiny Graviton Matrix Theory: DLCQ of IIB Plane-Wave String Theory, A Conjecture , JHEP 0409 (2004) 017.
	- [55] J. Gomis, A. J. Salim, F. Passerini, Matrix Theory of Type IIB Plane Wave from Membranes, JHEP 0808 (2008) 002.
	- [56] K. Hosomichi, K. Lee, S. Lee, Mass-Deformed Bagger-Lambert Theory and its BPS Objects, Phys.Rev. D78 (2008) 066015.

## **Algebraic Theory of Appell Polynomials with Application to General Linear Interpolation Problem**

Francesco Aldo Costabile and Elisabetta Longo

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/46482

## **1. Introduction**


In 1880 P. E. Appell ([1]) introduced and widely studied sequences of *n*-degree polynomials

$$A\_n\left(\mathbf{x}\right), n = 0, 1, \ldots \tag{1}$$

satisfying the differential relation

$$DA\_{\mathfrak{n}}\left(\mathbf{x}\right) = \mathfrak{n}A\_{\mathfrak{n}-1}(\mathbf{x}), \mathfrak{n} = 1, 2, \dots \tag{2}$$

Sequences of polynomials satisfying (2), nowadays called Appell polynomials, have been well studied because of their remarkable applications not only in different branches of mathematics ([2, 3]) but also in theoretical physics and chemistry ([4, 5]). In 1936 an initial bibliography was provided by Davis ([6, p. 25]). In 1939 Sheffer ([7]) introduced a new class of polynomials which extends the class of Appell polynomials; he called these polynomials of type zero, but nowadays they are called Sheffer polynomials. Sheffer also noticed the similarities between Appell polynomials and the umbral calculus, introduced in the second half of the 19th century through the work of such mathematicians as Sylvester, Cayley and Blissard (for examples, see [8]). The Sheffer theory is mainly based on formal power series. In 1941 Steffensen ([9]) published a theory of Sheffer polynomials, also based on formal power series. However, these theories were not entirely satisfactory, as they did not provide sufficient computational tools. Afterwards Mullin, Roman and Rota ([10–12]), using an operator method, gave a beautiful theory of umbral calculus, including Sheffer polynomials. Recently, Di Bucchianico and Loeb ([13]) summarized and documented more than five hundred old and new findings related to Appell polynomial sequences. In recent years attention has centered on finding novel representations of Appell polynomials. For instance, Lehmer ([14]) illustrated six different approaches to representing the sequence of Bernoulli polynomials, which is a

©2012 Costabile and Longo, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


special case of Appell polynomial sequences. Costabile ([15, 16]) also gave a new form of Bernoulli polynomials, called the determinantal form, and later these ideas were extended to Appell polynomial sequences. In fact, in 2010, Costabile and Longo ([17]) proposed an algebraic and elementary approach to Appell polynomial sequences. At the same time, Yang and Youn ([18]) also gave an algebraic approach, but with different methods. The approach to Appell polynomial sequences via linear algebra is an easily comprehensible mathematical tool, especially for non-specialists; this is valuable because many such polynomials arise in physics, chemistry and engineering. The present work concerns these topics and is organized as follows: in Section 2 we mention the Appell method ([1]); in Section 3 we provide the determinantal approach ([17]) and prove its equivalence with other definitions; in Section 4 classical and non-classical examples are given; in Section 5, by using elementary tools of linear algebra, general properties of Appell polynomials are provided; in Section 6 we mention Appell polynomials of the second kind ([19, 20]) and in Section 7 two classical examples are given; in Section 8 we provide an application to the general linear interpolation problem ([21]), giving, in Section 9, some examples; in Section 10 the Yang and Youn approach ([18]) is sketched; finally, in Section 11 conclusions close the work.

### **2. The Appell approach**

Let {*An*(*x*)}*<sup>n</sup>* be a sequence of *n*-degree polynomials satisfying the differential relation (2). Then we have

**Remark 1.** *There is a one-to-one correspondence between the set of such sequences* {*An*(*x*)}*<sup>n</sup>* *and the set of numerical sequences* {*αn*}*<sup>n</sup>*, *α*<sup>0</sup> ≠ 0, *given by the explicit representation*

$$A_n(x) = \alpha_n + \binom{n}{1}\alpha_{n-1}x + \binom{n}{2}\alpha_{n-2}x^2 + \cdots + \alpha_0 x^n, \quad n = 0, 1, \ldots \tag{3}$$

Equation (3), in particular, shows explicitly that for each *n* ≥ 1 the polynomial *An* (*x*) is completely determined by *An*−<sup>1</sup> (*x*) and by the choice of the constant of integration *αn*.
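As a concrete illustration (added here; the numerical sequence below is chosen only for the example), the representation (3) and the relation (2) can be checked symbolically. Taking the *α<sub>i</sub>* to be the Bernoulli numbers *B*<sub>0</sub>, ..., *B*<sub>4</sub> reproduces the Bernoulli polynomials, a special case discussed later in the chapter.

```python
import sympy as sp

x = sp.symbols('x')
# a sample numerical sequence with alpha_0 != 0: the Bernoulli numbers B_0..B_4
alpha = [sp.Integer(1), sp.Rational(-1, 2), sp.Rational(1, 6),
         sp.Integer(0), sp.Rational(-1, 30)]

def A(n):
    # explicit representation (3): A_n(x) = sum_k C(n,k) alpha_{n-k} x^k
    return sp.expand(sum(sp.binomial(n, k) * alpha[n - k] * x**k
                         for k in range(n + 1)))

# verify the defining differential relation (2): D A_n = n A_{n-1}
for n in range(1, len(alpha)):
    assert sp.expand(sp.diff(A(n), x) - n * A(n - 1)) == 0
print(A(2))   # the Bernoulli polynomial B_2(x) = x**2 - x + 1/6
```

Replacing `alpha` by any other sequence with *α*<sub>0</sub> ≠ 0 yields another Appell sequence, in line with Remark 1.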

**Remark 2.** *Given the formal power series*

$$a(h) = \alpha_0 + \frac{h}{1!}\alpha_1 + \frac{h^2}{2!}\alpha_2 + \cdots + \frac{h^n}{n!}\alpha_n + \cdots, \quad \alpha_0 \neq 0, \tag{4}$$

*with α<sub>i</sub>*, *i* = 0, 1, ..., *real coefficients, the sequence of polynomials A<sub>n</sub>*(*x*) *determined by the power series expansion of the product a*(*h*)*e<sup>hx</sup>, i.e.*

$$a\left(h\right)e^{h\mathbf{x}} = A\_0\left(\mathbf{x}\right) + \frac{h}{1!}A\_1\left(\mathbf{x}\right) + \frac{h^2}{2!}A\_2\left(\mathbf{x}\right) + \dots + \frac{h^n}{n!}A\_n\left(\mathbf{x}\right) + \dots \tag{5}$$

*satisfies (2).*

The function *a*(*h*) is called, by Appell, the 'generating function' of the sequence {*An*(*x*)}*n*.
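The generating relation (5) can also be checked directly with a truncated power series. The sketch below (an added example, using the same illustrative *α<sub>i</sub>* as in the Bernoulli case) extracts *A<sub>n</sub>*(*x*) as *n*! times the coefficient of *h<sup>n</sup>* in *a*(*h*)*e<sup>hx</sup>* and verifies (2).

```python
import sympy as sp

x, h = sp.symbols('x h')
alpha = [sp.Integer(1), sp.Rational(-1, 2), sp.Rational(1, 6), sp.Integer(0)]
N = len(alpha)
# truncation of the formal power series a(h) of (4)
a = sum(alpha[n] * h**n / sp.factorial(n) for n in range(N))
# expansion (5): a(h) e^{hx} = sum_n A_n(x) h^n / n!
prod = sp.series(a * sp.exp(h * x), h, 0, N).removeO()
A = [sp.expand(sp.factorial(n) * prod.coeff(h, n)) for n in range(N)]
# the extracted polynomials satisfy the Appell relation (2)
for n in range(1, N):
    assert sp.expand(sp.diff(A[n], x) - n * A[n - 1]) == 0
print(A[1])   # x - 1/2
```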

Appell also noticed various examples of sequences of polynomials verifying (2).

He also considered ([1]) an application of these polynomial sequences to linear differential equations, which is beyond the scope of this chapter.

#### **3. The determinantal approach**

Let *β<sub>i</sub>* ∈ **R**, *i* = 0, 1, ..., with *β*<sub>0</sub> ≠ 0.

We give the following


**Definition 1.** *The polynomial sequence defined by*

$$\begin{cases} A\_{0}\left(\mathbf{x}\right) = \frac{1}{\beta\_{0}},\\ \begin{bmatrix} 1 & \mathbf{x} & \mathbf{x}^{2} & \cdots & \cdots & \mathbf{x}^{n-1} & \mathbf{x}^{n} \\ \beta\_{0}\left\beta\_{1} & \beta\_{2} & \cdots & \cdots & \beta\_{n-1} & \beta\_{n} \\ 0 & \beta\_{0}\left(\frac{1}{2}\right)\beta\_{1} & \cdots & \cdots & \binom{n-1}{1}\beta\_{n-2}\left(\frac{n}{2}\right)\beta\_{n-1} \\ 0 & 0 & \beta\_{0} & \cdots & \cdots & \binom{n-1}{2}\beta\_{n-3}\left(\frac{n}{2}\right)\beta\_{n-2} \\ \vdots & \ddots & \vdots & \vdots \\ \vdots & & \ddots & \vdots & \vdots \\ 0 & \cdots & \cdots & \cdots & 0 & \beta\_{0} & \binom{n}{n-1}\beta\_{1} \end{bmatrix}, n = 1, 2, \ldots \end{cases} (6)$$

*is called Appell polynomial sequence for βi*.

Then we have

**Theorem 1.** *If An* (*x*) *is the Appell polynomial sequence for β<sup>i</sup> the differential relation (2) holds.*

*Proof.* Using the properties of linearity we can differentiate the determinant (6), expand the resulting determinant with respect to the first column and recognize the factor *An*−<sup>1</sup> (*x*) after multiplying the *i*-th row by *i* − 1, *i* = 2, ..., *n*, and the *j*-th column by 1/*j*, *j* = 1, ..., *n*.
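As an illustrative numerical check, not taken from the chapter, one can build a few *An* (*x*) for a concrete choice of the *βi* and confirm the differential relation (2), *A'n*(*x*) = *nAn*−<sup>1</sup>(*x*). The sketch below uses the recurrence (42) of Theorem 5 (proved in Section 5) with the normalized Hermite data (26); all helper names are ours.

```python
# Illustrative sketch (not from the chapter): verify A_n'(x) = n A_{n-1}(x),
# i.e. relation (2), for the normalized Hermite data (26).
from fractions import Fraction
from math import comb

def appell_polys(b, n):
    """Coefficient lists [c_0, ..., c_m] of A_0..A_n via the recurrence (42)."""
    P = [[Fraction(1) / b[0]]]
    for m in range(1, n + 1):
        c = [Fraction(0)] * m + [Fraction(1)]          # start from x^m
        for k in range(m):
            w = comb(m, k) * b[m - k]
            for j, cj in enumerate(P[k]):
                c[j] -= w * cj                          # subtract C(m,k) b_{m-k} A_k
        P.append([cj / b[0] for cj in c])
    return P

def deriv(c):
    """Coefficient list of the derivative."""
    return [j * cj for j, cj in enumerate(c)][1:] or [Fraction(0)]

def double_fact(m):                                     # m(m-2)...  (1 for m <= 0)
    return 1 if m <= 0 else m * double_fact(m - 2)

# beta_i = 0 for odd i, (i-1)(i-3)...3*1 / 2^(i/2) for even i, cf. (26)
beta = [Fraction(0) if i % 2 else Fraction(double_fact(i - 1), 2 ** (i // 2))
        for i in range(6)]

P = appell_polys(beta, 5)
print(all(deriv(P[n]) == [n * cj for cj in P[n - 1]] for n in range(1, 6)))  # True
```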

**Theorem 2.** *If An* (*x*) *is the Appell polynomial sequence for β<sup>i</sup> we have the equality (3) with*

$$\alpha\_0 = \frac{1}{\beta\_0}, \tag{7}$$

$$\alpha\_i = \frac{\left(-1\right)^i}{\left(\beta\_0\right)^{i+1}} \begin{vmatrix} \beta\_1 & \beta\_2 & \cdots & \beta\_{i-1} & \beta\_i \\ \beta\_0 & \binom{2}{1}\beta\_1 & \cdots & \binom{i-1}{1}\beta\_{i-2} & \binom{i}{1}\beta\_{i-1} \\ 0 & \beta\_0 & \cdots & \binom{i-1}{2}\beta\_{i-3} & \binom{i}{2}\beta\_{i-2} \\ \vdots & & \ddots & \vdots & \vdots \\ 0 & \cdots & 0 & \beta\_0 & \binom{i}{i-1}\beta\_1 \end{vmatrix} = -\frac{1}{\beta\_0} \sum\_{k=0}^{i-1} \binom{i}{k} \beta\_{i-k} \alpha\_k, \qquad i = 1, 2, \ldots, n. \tag{8}$$

#### 4 Will-be-set-by-IN-TECH 24 Linear Algebra – Theorems and Applications

*Proof.* From (6), by expanding the determinant *An* (*x*) with respect to the first row, we obtain (3) with *α<sup>i</sup>* given by (7) and the determinantal form in (8); this is the determinant of an upper Hessenberg matrix of order *i* ([16]). Then, setting $\overline{\alpha}\_i = \left(-1\right)^i \left(\beta\_0\right)^{i+1} \alpha\_i$ for *i* = 1, 2, ..., *n*, we have

$$\overline{\alpha}\_i = \sum\_{k=0}^{i-1} \left(-1\right)^{i-k-1} h\_{k+1,i}\, q\_k\left(i\right) \overline{\alpha}\_k, \tag{9}$$

where:

$$h\_{l,m} = \begin{cases} \beta\_m & \text{for } l = 1, \\ \binom{m}{l-1} \beta\_{m-l+1} & \text{for } 1 < l \le m+1, \\ 0 & \text{for } l > m+1, \end{cases} \qquad l, m = 1, 2, \ldots, i, \tag{10}$$

$$q\_k\left(i\right) = \prod\_{j=k+2}^{i} h\_{j,j-1} = \left(\beta\_0\right)^{i-k-1}, \quad k = 0, 1, \ldots, i-2,\tag{11}$$

$$q\_{i-1}\left(i\right) = 1.\tag{12}$$

By virtue of the previous setting, (9) implies

$$\begin{aligned} \overline{\alpha}\_{i} &= \sum\_{k=0}^{i-2} (-1)^{i-k-1} \binom{i}{k} \beta\_{i-k} \left(\beta\_{0}\right)^{i-k-1} \overline{\alpha}\_{k} + \binom{i}{i-1} \beta\_{1} \overline{\alpha}\_{i-1} = \\ &= (-1)^{i} \left(\beta\_{0}\right)^{i+1} \left(-\frac{1}{\beta\_{0}} \sum\_{k=0}^{i-1} \binom{i}{k} \beta\_{i-k} \alpha\_{k}\right), \end{aligned}$$

and the proof is concluded.

**Remark 3.** *We note that (7) and (8) are equivalent to*

$$\sum\_{k=0}^{i} \binom{i}{k} \beta\_{i-k} \alpha\_k = \begin{cases} 1 & i = 0 \\ 0 & i > 0 \end{cases} \tag{13}$$

*and that for each sequence of Appell polynomials there exist two sequences of numbers α<sup>i</sup> and β<sup>i</sup> related by (13).*

**Corollary 1.** *If An* (*x*) *is the Appell polynomial sequence for β<sup>i</sup> we have*

$$A\_n\left(x\right) = \sum\_{j=0}^{n} \binom{n}{j} A\_{n-j}\left(0\right) x^j, \quad n = 0, 1, \ldots \tag{14}$$

*Proof.* Follows from Theorem 2 being

$$A\_i\left(0\right) = \alpha\_i, \quad i = 0, 1, \ldots, n. \tag{15}$$

**Remark 4.** *For computation we can observe that αn is a determinant of order n of a particular upper Hessenberg form, and it is known that the algorithm of Gaussian elimination without pivoting for computing the determinant of an upper Hessenberg matrix is stable ([22, p. 27]).*
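To make the computation behind Theorem 2 and Remark 4 concrete, here is a small sketch, not part of the chapter, that obtains the *α<sup>i</sup>* from the *β<sup>i</sup>* through the recurrence in (8) (exact rational arithmetic, no determinant evaluation needed) and then assembles *An* (*x*) through (14). For the Bernoulli data *β<sup>i</sup>* = 1/(*i* + 1) of Section 4, the *α<sup>i</sup>* are the Bernoulli numbers.

```python
# Sketch (not from the chapter): alpha_i from beta_i via the recurrence in (8),
# then A_n(x) assembled via (14).  Exact rational arithmetic throughout.
from fractions import Fraction
from math import comb

def alphas(beta, n):
    """alpha_0 = 1/beta_0; alpha_i = -(1/beta_0) sum_{k<i} C(i,k) beta_{i-k} alpha_k."""
    a = [Fraction(1) / beta[0]]
    for i in range(1, n + 1):
        a.append(-sum(comb(i, k) * beta[i - k] * a[k] for k in range(i)) / beta[0])
    return a

def appell_coeffs(beta, n):
    """Coefficients [c_0, ..., c_n] of A_n(x) = sum_j C(n,j) alpha_{n-j} x^j, eq. (14)."""
    a = alphas(beta, n)
    return [comb(n, j) * a[n - j] for j in range(n + 1)]

beta = [Fraction(1, i + 1) for i in range(6)]   # Bernoulli data, cf. (22)
print(alphas(beta, 4))          # the Bernoulli numbers 1, -1/2, 1/6, 0, -1/30
print(appell_coeffs(beta, 2))   # B_2(x) = 1/6 - x + x^2
```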

**Theorem 3.** *If a*(*h*) *is the function defined in (4) and An* (*x*) *is the polynomial sequence defined by (5), setting*

$$\begin{cases} \beta\_0 = \frac{1}{\alpha\_0},\\ \beta\_n = -\frac{1}{\alpha\_0} \left( \sum\_{k=1}^n \binom{n}{k} \alpha\_k \beta\_{n-k} \right), \quad n = 1, 2, \dots \end{cases} \tag{16}$$

*we have that An*(*x*) *satisfies (6), i.e. An*(*x*) *is the Appell polynomial sequence for βi.*

*Proof.* Let


$$b\left(h\right) = \beta\_0 + \frac{h}{1!} \beta\_1 + \frac{h^2}{2!} \beta\_2 + \dots + \frac{h^n}{n!} \beta\_n + \dotsb \tag{17}$$

with *β<sup>n</sup>* as in (16). Then we have *a* (*h*) *b* (*h*) = 1, where the product is taken in the Cauchy sense, i.e.:

$$a\left(h\right)b\left(h\right) = \sum\_{n=0}^{\infty} \sum\_{k=0}^{n} \binom{n}{k} \alpha\_k \beta\_{n-k} \frac{h^n}{n!}$$

Let us multiply both hand sides of equation

$$a(h)e^{hx} = \sum\_{n=0}^{\infty} A\_n\left(x\right) \frac{h^n}{n!} \tag{18}$$

by 1/*a* (*h*) and, in the same equation, replace the functions *e<sup>hx</sup>* and 1/*a* (*h*) by their Taylor series expansions at the origin; then (18) becomes

$$\sum\_{n=0}^{\infty} \frac{x^n h^n}{n!} = \sum\_{n=0}^{\infty} A\_n \left( x \right) \frac{h^n}{n!} \sum\_{n=0}^{\infty} \frac{h^n}{n!} \beta\_n. \tag{19}$$

By multiplying the series on the right hand side of (19) according to the Cauchy product rule, the previous equality leads to the following system of infinite equations in the unknowns *An* (*x*), *n* = 0, 1, ...

$$\begin{cases} A\_0\left(\mathbf{x}\right)\boldsymbol{\beta}\_0 = 1, \\ A\_0\left(\mathbf{x}\right)\boldsymbol{\beta}\_1 + A\_1\left(\mathbf{x}\right)\boldsymbol{\beta}\_0 = \mathbf{x}, \\ A\_0\left(\mathbf{x}\right)\boldsymbol{\beta}\_2 + \binom{2}{1} A\_1\left(\mathbf{x}\right)\boldsymbol{\beta}\_1 + A\_2\left(\mathbf{x}\right)\boldsymbol{\beta}\_0 = \mathbf{x}^2, \\ \vdots \\ A\_0\left(\mathbf{x}\right)\boldsymbol{\beta}\_{\boldsymbol{n}} + \binom{n}{1} A\_1\left(\mathbf{x}\right)\boldsymbol{\beta}\_{\boldsymbol{n}-1} + \dots + A\_n\left(\mathbf{x}\right)\boldsymbol{\beta}\_0 = \mathbf{x}^n, \\ \vdots \end{cases} \tag{20}$$

From the first equation of (20) we obtain the first relation of (6). Moreover, the special (lower triangular) form of the previous system allows us to work out the unknown *An* (*x*) from the first *n* + 1 equations alone, by applying Cramer's rule:


$$A\_{\boldsymbol{n}}\left(\mathbf{x}\right) = \frac{1}{\left(\boldsymbol{\beta}\_{0}\right)^{\boldsymbol{n}+1}} \begin{vmatrix} \boldsymbol{\beta}\_{0} & \boldsymbol{0} & \boldsymbol{0} & \cdots & \boldsymbol{0} & \boldsymbol{1} \\ \boldsymbol{\beta}\_{1} & \boldsymbol{\beta}\_{0} & \boldsymbol{0} & \cdots & \boldsymbol{0} & \boldsymbol{x} \\ \boldsymbol{\beta}\_{2} & \binom{2}{1}\boldsymbol{\beta}\_{1} & \boldsymbol{\beta}\_{0} & \cdots & \boldsymbol{0} & \boldsymbol{x}^{2} \\ \vdots & & \ddots & & \vdots \\ \boldsymbol{\beta}\_{n-1} & \binom{n-1}{1}\boldsymbol{\beta}\_{n-2} & \cdots & \cdots & \boldsymbol{\beta}\_{0} & \boldsymbol{x}^{n-1} \\ \boldsymbol{\beta}\_{n} & \binom{n}{1}\boldsymbol{\beta}\_{n-1} & \cdots & \cdots & \binom{n}{n-1}\boldsymbol{\beta}\_{1} & \boldsymbol{x}^{n} \\ \end{vmatrix}$$

By transposition of the previous, we have

$$A\_n(x) = \frac{1}{\left(\beta\_0\right)^{n+1}} \begin{vmatrix} \beta\_0 & \beta\_1 & \beta\_2 & \cdots & \beta\_{n-1} & \beta\_n \\ 0 & \beta\_0 & \binom{2}{1}\beta\_1 & \cdots & \binom{n-1}{1}\beta\_{n-2} & \binom{n}{1}\beta\_{n-1} \\ 0 & 0 & \beta\_0 & \cdots & \binom{n-1}{2}\beta\_{n-3} & \binom{n}{2}\beta\_{n-2} \\ \vdots & & & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \beta\_0 & \binom{n}{n-1}\beta\_1 \\ 1 & x & x^2 & \cdots & x^{n-1} & x^n \end{vmatrix}, \quad n = 1, 2, \ldots, \tag{21}$$

that is exactly the second one of (6) after *n* circular row exchanges: more precisely, the *i*-th row moves to the (*i* + 1)-th position for *i* = 1, . . . , *n* − 1, the *n*-th row goes to the first position.

**Definition 2.** *The function a* (*h*)*e<sup>hx</sup>, as in (4) and (5), is called the 'generating function' of the Appell polynomial sequence An* (*x*) *for βi.*

Theorems 1, 2 and 3 together establish the validity of the following

**Theorem 4** (Circular)**.** *If An* (*x*) *is the Appell polynomial sequence for β<sup>i</sup> we have*

$$(6) \Rightarrow (2) \Rightarrow (3) \Rightarrow (5) \Rightarrow (6).$$

*Proof.*

**(6)**⇒**(2):** Follows from Theorem 1.

**(2)**⇒**(3):** Follows from Theorem 2, or more simply by direct integration of the differential equation (2).

**(3)**⇒**(5):** Follows by ordering the Cauchy product of the developments *a*(*h*) and *e<sup>hx</sup>* with respect to the powers of *h* and recognizing the polynomials *An*(*x*), expressed in form (3), as the coefficients of *h<sup>n</sup>*/*n*!.

**(5)**⇒**(6):** Follows from Theorem 3.


**Remark 5.** *By virtue of Theorem 4, any one of the relations (2), (3), (5), (6) can be assumed as the definition of an Appell polynomial sequence.*

#### **4. Examples of Appell polynomial sequences**

The following are classical examples of Appell polynomial sequences.

**a)** Bernoulli polynomials ([17, 23]):


$$\beta\_i = \frac{1}{i+1}, \quad i = 0, 1, \ldots \tag{22}$$

$$a(h) = \frac{h}{e^h - 1};\tag{23}$$

**b)** Euler polynomials ([17, 23]):

$$
\beta\_0 = 1, \quad \beta\_i = \frac{1}{2}, \quad i = 1, 2, \dots \tag{24}
$$

$$a(h) = \frac{2}{e^h + 1};\tag{25}$$

**c)** Normalized Hermite polynomials ([17, 24]):

$$\beta\_i = \frac{1}{\sqrt{\pi}} \int\_{-\infty}^{+\infty} e^{-x^2} x^i dx = \begin{cases} 0 & \text{for } i \text{ odd} \\ \frac{(i-1)(i-3)\cdots 3\cdot 1}{2^{\frac{i}{2}}} & \text{for } i \text{ even} \end{cases} \quad i = 0, 1, \ldots \tag{26}$$


$$a(h) = e^{-\frac{h^2}{4}};\tag{27}$$

**d)** Laguerre polynomials ([17, 24]):

$$\beta\_i = \int\_0^{+\infty} e^{-x} x^i dx = \Gamma\left(i + 1\right) = i!, \quad i = 0, 1, \ldots \tag{28}$$

$$a(h) = 1 - h;\tag{29}$$
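As a cross-check of the classical examples, note that for the Laguerre-type data (28) the generating function is *a*(*h*) = 1 − *h*, so (5) gives *a*(*h*)*e<sup>hx</sup>* = Σ*n* (*x<sup>n</sup>* − *nx<sup>n</sup>*−<sup>1</sup>) *h<sup>n</sup>*/*n*!, i.e. *An*(*x*) = *x<sup>n</sup>* − *nx<sup>n</sup>*−<sup>1</sup>. The sketch below, not from the chapter, confirms this closed form against the coefficients produced by (8) and (14).

```python
# Sketch (not from the chapter): for beta_i = i! (eq. (28)) the Appell
# sequence is A_n(x) = x^n - n x^(n-1), since a(h) e^{hx} = (1 - h) e^{hx}.
from fractions import Fraction
from math import comb, factorial

beta = [Fraction(factorial(i)) for i in range(6)]       # beta_i = i!
alpha = [Fraction(1) / beta[0]]                          # recurrence of (8)
for i in range(1, 6):
    alpha.append(-sum(comb(i, k) * beta[i - k] * alpha[k] for k in range(i)) / beta[0])

def A(n, x):
    """A_n(x) = sum_j C(n,j) alpha_{n-j} x^j, eq. (14)."""
    return sum(comb(n, j) * alpha[n - j] * x**j for j in range(n + 1))

x = Fraction(3, 7)
print(alpha[:3])                                        # 1 and -1, then all zeros
print(all(A(n, x) == x**n - n * x**(n - 1) for n in range(1, 6)))  # True
```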

The following are non-classical examples of Appell polynomial sequences.

**e)** Generalized Bernoulli polynomials

• with Jacobi weight ([17]):

$$\beta\_i = \int\_0^1 (1-x)^{\alpha} x^{\beta} x^i dx = \frac{\Gamma(\alpha+1)\,\Gamma\left(\beta+i+1\right)}{\Gamma\left(\alpha+\beta+i+2\right)}, \quad \alpha, \beta > -1, \quad i = 0, 1, \ldots \tag{30}$$

$$a(h) = \frac{1}{\int\_0^1 (1-x)^{\alpha} x^{\beta} e^{hx} dx};\tag{31}$$

• of order *k* ([11]):

$$\beta\_i = \left(\frac{1}{i+1}\right)^k \text{ , } k \text{ integer}, \quad i = 0, 1, \dots \tag{32}$$

$$a(h) = \left(\frac{h}{e^h - 1}\right)^k;\tag{33}$$


**f)** Central Bernoulli polynomials ([25]):

$$\beta\_{2i} = \frac{1}{2i+1}, \quad \beta\_{2i+1} = 0, \quad i = 0, 1, \ldots \tag{34}$$

$$a(h) = \frac{h}{\sinh(h)};\tag{35}$$

**g)** Generalized Euler polynomials ([17]):

$$\beta\_0 = 1,$$

$$\beta\_i = \frac{w\_1}{w\_1 + w\_2}, \quad w\_1, w\_2 > 0, \quad i = 1, 2, \ldots \tag{36}$$

$$a(h) = \frac{w\_1 + w\_2}{w\_1 e^h + w\_2};\tag{37}$$

**h)** Generalized Hermite polynomials ([17]):

$$\beta\_i = \frac{1}{\sqrt{\pi}} \int\_{-\infty}^{+\infty} e^{-|x|^{\alpha}} x^i dx = \begin{cases} 0 & \text{for } i \text{ odd} \\ \frac{2}{\alpha\sqrt{\pi}} \Gamma\left(\frac{i+1}{\alpha}\right) & \text{for } i \text{ even} \end{cases} \quad i = 0, 1, \ldots, \quad \alpha > 0, \tag{38}$$

$$a(h) = \frac{\sqrt{\pi}}{\int\_{-\infty}^{+\infty} e^{-|x|^{\alpha}} e^{hx} dx};\tag{39}$$

**i)** Generalized Laguerre polynomials ([17]):

$$\beta\_i = \int\_0^{+\infty} e^{-\alpha x} x^i dx = \frac{\Gamma\left(i+1\right)}{\alpha^{i+1}} = \frac{i!}{\alpha^{i+1}}, \quad \alpha > 0, \quad i = 0, 1, \ldots \tag{40}$$

$$a(h) = \alpha - h.\tag{41}$$

#### **5. General properties of Appell polynomials**

By elementary tools of linear algebra we can prove the general properties of Appell polynomials.

Let *An* (*x*), *n* = 0, 1, ..., be a polynomial sequence and *β<sup>i</sup>* ∈ **R**, *i* = 0, 1, ..., with *β*<sup>0</sup> ≠ 0.

**Theorem 5** (Recurrence)**.** *An* (*x*) *is the Appell polynomial sequence for β<sup>i</sup> if and only if*

$$A\_{\mathfrak{n}}(\mathbf{x}) = \frac{1}{\beta\_0} \left( \mathbf{x}^n - \sum\_{k=0}^{n-1} \binom{n}{k} \beta\_{n-k} A\_k(\mathbf{x}) \right), \quad n = 1, 2, \ldots \tag{42}$$

*Proof.* Follows by observing that the following holds:


$$A\_n\left(x\right) = \frac{\left(-1\right)^n}{\left(\beta\_0\right)^{n+1}} \begin{vmatrix} 1 & x & x^2 & \cdots & x^{n-1} & x^n \\ \beta\_0 & \beta\_1 & \beta\_2 & \cdots & \beta\_{n-1} & \beta\_n \\ 0 & \beta\_0 & \binom{2}{1}\beta\_1 & \cdots & \binom{n-1}{1}\beta\_{n-2} & \binom{n}{1}\beta\_{n-1} \\ 0 & 0 & \beta\_0 & \cdots & \binom{n-1}{2}\beta\_{n-3} & \binom{n}{2}\beta\_{n-2} \\ \vdots & & & \ddots & \vdots & \vdots \\ 0 & \cdots & \cdots & 0 & \beta\_0 & \binom{n}{n-1}\beta\_1 \end{vmatrix} = \frac{1}{\beta\_0}\left(x^n - \sum\_{k=0}^{n-1} \binom{n}{k} \beta\_{n-k} A\_k\left(x\right)\right), \quad n = 1, 2, \ldots \tag{43}$$

In fact, if *An* (*x*) is the Appell polynomial sequence for *βi*, from (6) we can observe that *An*(*x*) is a determinant of an upper Hessenberg matrix of order *n* + 1 ([16]) and, proceeding as in Theorem 2, we obtain (43).

**Corollary 2.** *If An* (*x*) *is the Appell polynomial sequence for β<sup>i</sup> then*

$$\mathbf{x}^{n} = \sum\_{k=0}^{n} \binom{n}{k} \beta\_{n-k} A\_{k} \left( \mathbf{x} \right), \quad n = 0, 1, \ldots \tag{44}$$

*Proof.* Follows from (42).
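The recurrence and Corollary 2 are easy to exercise numerically. The following sketch, not from the chapter, builds the Euler polynomials from the data (24) via (42), then re-expands *x*<sup>3</sup> in the Appell basis via (44); all helper names are ours.

```python
# Sketch (not from the chapter): A_n via the recurrence (42) for the Euler
# data (24), then Corollary 2 checked: x^3 = sum_k C(3,k) beta_{3-k} A_k(x).
from fractions import Fraction
from math import comb

def appell_by_recurrence(beta, n):
    """[A_0, ..., A_n], each a coefficient list [c_0, ..., c_m]."""
    A = [[Fraction(1) / beta[0]]]
    for m in range(1, n + 1):
        c = [Fraction(0)] * m + [Fraction(1)]
        for k in range(m):
            w = comb(m, k) * beta[m - k]
            for j, cj in enumerate(A[k]):
                c[j] -= w * cj
        A.append([cj / beta[0] for cj in c])
    return A

beta = [Fraction(1)] + [Fraction(1, 2)] * 5             # Euler data (24)
A = appell_by_recurrence(beta, 3)
print(A[3])                                             # E_3(x) = 1/4 - 3/2 x^2 + x^3

acc = [Fraction(0)] * 4                                  # sum_k C(3,k) beta_{3-k} A_k
for k in range(4):
    w = comb(3, k) * beta[3 - k]
    for j, cj in enumerate(A[k]):
        acc[j] += w * cj
print(acc)                                               # the coefficients of x^3
```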

**Corollary 3.** *Let* P*<sup>n</sup> be the space of polynomials of degree* ≤ *n and* {*An*(*x*)}*<sup>n</sup> be an Appell polynomial sequence, then* {*An*(*x*)}*<sup>n</sup> is a basis for* P*n.*

*Proof.* If we have

$$P\_n(x) = \sum\_{k=0}^n a\_{n,k} x^k, \quad a\_{n,k} \in \mathbb{R}, \tag{45}$$

then, by Corollary 2, we get

$$P\_n(\mathbf{x}) = \sum\_{k=0}^n a\_{n,k} \sum\_{j=0}^k \binom{k}{j} \beta\_{k-j} A\_j(\mathbf{x}) = \sum\_{k=0}^n c\_{n,k} A\_k(\mathbf{x}),$$

where

$$c\_{n,k} = \sum\_{j=0}^{n-k} \binom{k+j}{k} a\_{n,k+j} \beta\_j. \tag{46}$$
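Corollary 3 with the coefficients (46) can be tried out directly. The sketch below, not taken from the chapter, expands *P*(*x*) = *x*<sup>3</sup> + 2*x* in the Euler basis (24) and verifies the expansion at a sample point; all names are illustrative.

```python
# Sketch (not from the chapter): expand P(x) = x^3 + 2x in the Euler Appell
# basis via the coefficients (46), then verify at a sample point.
from fractions import Fraction
from math import comb

n = 3
beta = [Fraction(1)] + [Fraction(1, 2)] * n              # Euler data (24)

def appell_values(b, n, x):
    """[A_0(x), ..., A_n(x)] via the recurrence (42)."""
    v = [Fraction(1) / b[0]]
    for m in range(1, n + 1):
        v.append((x**m - sum(comb(m, k) * b[m - k] * v[k] for k in range(m))) / b[0])
    return v

a = [Fraction(0), Fraction(2), Fraction(0), Fraction(1)]  # P(x) = x^3 + 2x
c = [sum(comb(k + j, k) * a[k + j] * beta[j] for j in range(n - k + 1))
     for k in range(n + 1)]                               # eq. (46)

x = Fraction(5, 3)
A = appell_values(beta, n, x)
lhs = sum(ai * x**i for i, ai in enumerate(a))            # P(x)
rhs = sum(c[k] * A[k] for k in range(n + 1))              # expansion in the basis
print(lhs == rhs)   # True
```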

**Remark 6.** *An alternative recurrence relation can be determined from (5) after differentiation with respect to h ([18, 26]).*

Let *βi*, *γ<sup>i</sup>* ∈ **R**, *i* = 0, 1, ..., with *β*0, *γ*<sup>0</sup> ≠ 0.

Let us consider the Appell polynomial sequences *An* (*x*) and *Bn* (*x*), *n* = 0, 1, ..., for *β<sup>i</sup>* and *γi*, respectively, and indicate with (*AB*)*<sup>n</sup>* (*x*) the polynomial that is obtained replacing in *An* (*x*) the powers *x*0, *x*1, ..., *xn*, respectively, with the polynomials *B*<sup>0</sup> (*x*), *B*<sup>1</sup> (*x*), ..., *Bn* (*x*). Then we have

**Theorem 6.** *The sequences*

*i*) *λAn* (*x*) + *μBn* (*x*), *λ*, *μ* ∈ **R**,

*ii*) (*AB*)*<sup>n</sup>* (*x*)
*are sequences of Appell polynomials again.*

*Proof. i*) Follows from the linearity property of the determinant.

*ii*) Expanding the determinant (*AB*)*<sup>n</sup>* (*x*) with respect to the first row we obtain

$$(AB)\_n\left(x\right) = \frac{(-1)^n}{\left(\beta\_0\right)^{n+1}} \sum\_{j=0}^n (-1)^j \left(\beta\_0\right)^j \binom{n}{j} \overline{\alpha}\_{n-j} B\_j\left(x\right) = \sum\_{j=0}^n \frac{(-1)^{n-j}}{\left(\beta\_0\right)^{n-j+1}} \binom{n}{j} \overline{\alpha}\_{n-j} B\_j\left(x\right),\tag{47}$$

where

$$\overline{\alpha}\_0 = 1, \qquad \overline{\alpha}\_i = \begin{vmatrix} \beta\_1 & \beta\_2 & \cdots & \beta\_{i-1} & \beta\_i \\ \beta\_0 & \binom{2}{1}\beta\_1 & \cdots & \binom{i-1}{1}\beta\_{i-2} & \binom{i}{1}\beta\_{i-1} \\ 0 & \beta\_0 & \cdots & \binom{i-1}{2}\beta\_{i-3} & \binom{i}{2}\beta\_{i-2} \\ \vdots & & \ddots & \vdots & \vdots \\ 0 & \cdots & 0 & \beta\_0 & \binom{i}{i-1}\beta\_1 \end{vmatrix}, \quad i = 1, 2, \ldots, n.$$

We observe that

$$A\_i\left(0\right) = \frac{(-1)^i}{\left(\beta\_0\right)^{i+1}}\, \overline{\alpha}\_i, \quad i = 1, 2, \ldots, n,$$

and hence (47) becomes

$$(AB)\_n \begin{pmatrix} \mathbf{x} \end{pmatrix} = \sum\_{j=0}^n \binom{n}{j} A\_{n-j} \begin{pmatrix} \mathbf{0} \end{pmatrix} B\_j \begin{pmatrix} \mathbf{x} \end{pmatrix}.\tag{48}$$

Differentiating both sides of (48) and using that *Bj* (*x*) is a sequence of Appell polynomials, we deduce

$$\left(\left(AB\right)\_{\mathfrak{n}}\left(\mathbf{x}\right)\right)' = \mathfrak{n}\left(AB\right)\_{\mathfrak{n}-1}\left(\mathbf{x}\right).\tag{49}$$
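Part *ii*) can also be checked numerically. Below is a sketch, not from the chapter, that composes the Euler sequence (24) with the Bernoulli sequence (22) as described above and verifies the Appell property (49) of the composed sequence.

```python
# Sketch (not from the chapter): build (AB)_n(x) by substituting B_j(x)
# for x^j in A_n(x), then check ((AB)_n)' = n (AB)_{n-1}, eq. (49).
from fractions import Fraction
from math import comb

def appell_polys(b, n):
    """Coefficient lists of A_0..A_n via the recurrence (42)."""
    P = [[Fraction(1) / b[0]]]
    for m in range(1, n + 1):
        c = [Fraction(0)] * m + [Fraction(1)]
        for k in range(m):
            w = comb(m, k) * b[m - k]
            for j, cj in enumerate(P[k]):
                c[j] -= w * cj
        P.append([cj / b[0] for cj in c])
    return P

n = 4
A = appell_polys([Fraction(1)] + [Fraction(1, 2)] * n, n)          # Euler (24)
B = appell_polys([Fraction(1, i + 1) for i in range(n + 1)], n)    # Bernoulli (22)

def compose(An, B):
    """(AB)_n: replace x^j in A_n by B_j(x); returns a coefficient list."""
    out = [Fraction(0)] * len(An)
    for j, aj in enumerate(An):
        for t, bt in enumerate(B[j]):
            out[t] += aj * bt
    return out

AB = [compose(A[m], B) for m in range(n + 1)]
for m in range(1, n + 1):
    d = [t * c for t, c in enumerate(AB[m])][1:]        # derivative coefficients
    print(d == [m * c for c in AB[m - 1]])              # True, eq. (49)
```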

Let us now introduce the Appell vector.

**Definition 3.** *If An* (*x*) *is the Appell polynomial sequence for βi, the vector of functions An* (*x*) = [*A*0(*x*), ..., *An*(*x*)]*<sup>T</sup> is called the Appell vector for βi.*

Then we have

**Theorem 7** (Matrix form)**.** *Let An* (*x*) *be a vector of polynomial functions. Then An* (*x*) *is the Appell vector for β<sup>i</sup> if and only if, putting*

$$(M)\_{i,j} = \begin{cases} \binom{i}{j} \beta\_{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad i, j = 0, \ldots, n \tag{50}$$

*and X*(*x*) = [1, *x*, ..., *x<sup>n</sup>*]*<sup>T</sup>, the following relation holds*

$$X(x) = M\overline{A}\_n\left(x\right) \tag{51}$$

*or, equivalently,*

$$\overline{A}\_n\left(x\right) = M^{-1}X(x) \tag{52}$$

*where M*−<sup>1</sup> *is the inverse matrix of M.*

*Proof.* If *An* (*x*) is the Appell vector for *β<sup>i</sup>* the result easily follows from Corollary 2.

Vice versa, observing that the matrix *M* defined by (50) is invertible, setting

$$\left(M^{-1}\right)\_{i,j} = \begin{cases} \binom{i}{j} \alpha\_{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad i, j = 0, \ldots, n, \tag{53}$$

we have (52) and therefore (3); since the coefficients *α<sup>k</sup>* and *β<sup>k</sup>* are related by (13), *An*(*x*) is the Appell polynomial sequence for *βi*.
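A quick numerical confirmation of Theorem 7, not part of the chapter: build *M* from (50) and its claimed inverse from (53) for the Bernoulli data, and check that their product is the identity; this is precisely relation (13) in matrix form.

```python
# Sketch (not from the chapter): M from (50) and its inverse from (53) for the
# Bernoulli data; their product is the identity, which is relation (13).
from fractions import Fraction
from math import comb

def pascal_like(c, n):
    """(n+1)x(n+1) lower triangular matrix with (i,j) entry C(i,j) c_{i-j}."""
    return [[comb(i, j) * c[i - j] if i >= j else Fraction(0)
             for j in range(n + 1)] for i in range(n + 1)]

def matmul(P, Q):
    m = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

n = 4
beta = [Fraction(1, i + 1) for i in range(n + 1)]        # Bernoulli data (22)
alpha = [Fraction(1) / beta[0]]                           # alpha_i via (8)
for i in range(1, n + 1):
    alpha.append(-sum(comb(i, k) * beta[i - k] * alpha[k] for k in range(i)) / beta[0])

M, Minv = pascal_like(beta, n), pascal_like(alpha, n)     # (50) and (53)
print(matmul(M, Minv) == pascal_like([Fraction(1)] + [Fraction(0)] * n, n))  # True
```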

**Theorem 8** (Connection constants)**.** *Let An*(*x*) *and Bn*(*x*) *be the Appell vectors for β<sup>i</sup> and γi, respectively. Then*

$$\overline{A}\_n(x) = C\overline{B}\_n(x), \tag{54}$$

*where*

$$(\mathbf{C})\_{i,j} = \begin{cases} \binom{i}{j} c\_{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad i, j = 0, \ldots, n. \tag{55}$$

*with*

$$c\_n = \sum\_{k=0}^n \binom{n}{k} \alpha\_{n-k} \gamma\_k. \tag{56}$$

*Proof.* From Theorem 7 we have

$$X(x) = M\overline{A}\_n(x)$$

with *M* as in (50) or, equivalently,

$$\overline{A}\_n(x) = M^{-1}X(x),$$


with *M*−<sup>1</sup> as in (53).

Again from Theorem 7 we get

$$X(x) = N\overline{B}\_n\left(x\right)$$

with

$$(N)\_{i,j} = \begin{cases} \binom{i}{j} \gamma\_{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad i, j = 0, \ldots, n.$$

Then

$$
\overline{A}\_n\left(\mathbf{x}\right) = M^{-1}N\overline{B}\_n\left(\mathbf{x}\right),
$$

from which, setting *C* = *M*−1*N*, the thesis follows.
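The connection-constants theorem can be exercised as follows. This sketch, not from the chapter, takes *β<sup>i</sup>* as the Euler data (24) and *γ<sup>i</sup>* as the Bernoulli data (22), forms *C* = *M*−1*N*, and verifies *An*(*x*) = *CBn*(*x*) at a sample point; helper names are ours.

```python
# Sketch (not from the chapter): connection constants of Theorem 8 with
# beta_i the Euler data (24) and gamma_i the Bernoulli data (22).
from fractions import Fraction
from math import comb

def appell_values(b, n, x):
    """[A_0(x), ..., A_n(x)] via the recurrence (42) for the data b_i."""
    v = [Fraction(1) / b[0]]
    for m in range(1, n + 1):
        v.append((x**m - sum(comb(m, k) * b[m - k] * v[k] for k in range(m))) / b[0])
    return v

def tri(c, n):
    """Lower triangular matrix with (i,j) entry C(i,j) c_{i-j}, cf. (50)/(53)."""
    return [[comb(i, j) * c[i - j] if i >= j else Fraction(0)
             for j in range(n + 1)] for i in range(n + 1)]

n, x = 4, Fraction(2, 5)
beta = [Fraction(1)] + [Fraction(1, 2)] * n               # Euler, eq. (24)
gamma = [Fraction(1, i + 1) for i in range(n + 1)]        # Bernoulli, eq. (22)

alpha = [Fraction(1) / beta[0]]                           # alpha_i via (8)
for i in range(1, n + 1):
    alpha.append(-sum(comb(i, k) * beta[i - k] * alpha[k] for k in range(i)) / beta[0])

Minv, N = tri(alpha, n), tri(gamma, n)                    # (53), and (50) for gamma
C = [[sum(Minv[i][k] * N[k][j] for k in range(n + 1)) for j in range(n + 1)]
     for i in range(n + 1)]                               # C = M^{-1} N

A = appell_values(beta, n, x)                             # Euler values
B = appell_values(gamma, n, x)                            # Bernoulli values
print(all(A[i] == sum(C[i][j] * B[j] for j in range(n + 1)) for i in range(n + 1)))
```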

**Theorem 9** (Inverse relations)**.** *Let An* (*x*) *be the Appell polynomial sequence for βi. Then the following are inverse relations:*

$$\begin{cases} y\_n = \sum\_{k=0}^n \binom{n}{k} \beta\_{n-k} \mathbf{x}\_k \\ \mathbf{x}\_n = \sum\_{k=0}^n \binom{n}{k} A\_{n-k}(0) y\_k. \end{cases} \tag{57}$$

*Proof.* Let us remember that

*Ak*(0) = *αk*,

where the coefficients *α<sup>k</sup>* and *β<sup>k</sup>* are related by (13).

Moreover, setting *yn* = [*y*0, ..., *yn*] *<sup>T</sup>* and *xn* = [*x*0, ..., *xn*] *<sup>T</sup>*, from (57) we have

$$\begin{cases} \overline{y}\_n = M\_1 \overline{x}\_n \\ \overline{x}\_n = M\_2 \overline{y}\_n \end{cases}$$

with

$$(M\_1)\_{i,j} = \begin{cases} \binom{i}{j} \beta\_{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad i, j = 0, \ldots, n,\tag{58}$$

$$(M\_2)\_{i,j} = \begin{cases} \binom{i}{j} \alpha\_{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad i, j = 0, \ldots, n,\tag{59}$$

and, from (13) we get

$$M\_1 M\_2 = I\_{n+1},$$

i.e. (57) are inverse relations.

**Theorem 10** (Inverse relation between two Appell polynomial sequences)**.** *Let An*(*x*) *and Bn*(*x*) *be the Appell vectors for β<sup>i</sup> and γi, respectively. Then the following are inverse relations:*

$$\begin{cases} \overline{A}\_n(x) = C\overline{B}\_n(x) \\ \overline{B}\_n(x) = \widetilde{C}\,\overline{A}\_n(x) \end{cases} \tag{60}$$

*with*


$$(C)_{i,j} = \begin{cases} \binom{i}{j} c_{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad \left(\widetilde{C}\right)_{i,j} = \begin{cases} \binom{i}{j} \tilde{c}_{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \quad i, j = 0, \ldots, n, \tag{61}$$

$$c_n = \sum_{k=0}^{n} \binom{n}{k} A_{n-k}(0)\, \gamma_k, \qquad \tilde{c}_n = \sum_{k=0}^{n} \binom{n}{k} B_{n-k}(0)\, \beta_k. \tag{62}$$

*Proof.* Follows from Theorem 8, after observing that

$$\sum\_{k=0}^{n} \binom{n}{k} c\_{n-k} \tilde{c}\_k = \begin{cases} 1 & n = 0 \\ 0 & n > 0 \end{cases} \tag{63}$$

and therefore

$$
C\widetilde{C} = I_{n+1}.
$$

**Theorem 11** (Binomial identity)**.** *If An* (*x*) *is the Appell polynomial sequence for β<sup>i</sup> we have*

$$A\_{\boldsymbol{n}}\left(\mathbf{x}+\boldsymbol{y}\right) = \sum\_{i=0}^{n} \binom{n}{i} A\_{i}\left(\mathbf{x}\right) \boldsymbol{y}^{n-i}, \quad \boldsymbol{n} = \mathbf{0}, 1, \ldots \tag{64}$$

*Proof.* Starting from Definition 1 and using the identity

$$(\mathbf{x} + \mathbf{y})^{\dot{i}} = \sum\_{k=0}^{\dot{i}} \binom{\dot{i}}{k} \mathbf{y}^{k} \mathbf{x}^{\dot{i}-k},\tag{65}$$

we infer

$$A_n(x+y) = \frac{(-1)^n}{(\beta_0)^{n+1}} \begin{vmatrix} 1 & (x+y) & \cdots & (x+y)^{n-1} & (x+y)^n \\ \beta_0 & \beta_1 & \cdots & \beta_{n-1} & \beta_n \\ 0 & \beta_0 & \cdots & \binom{n-1}{1}\beta_{n-2} & \binom{n}{1}\beta_{n-1} \\ \vdots & & \ddots & \vdots & \vdots \\ 0 & \cdots & 0 & \beta_0 & \binom{n}{n-1}\beta_1 \end{vmatrix}.$$

Expanding each power $(x+y)^j$ in the first row by (65) and using the linearity of the determinant with respect to its first row, $A_n(x+y)$ splits into a sum of determinants in which the factor $y^i$ multiplies binomially weighted powers of $x$.

#### 14 Will-be-set-by-IN-TECH 34 Linear Algebra – Theorems and Applications

We now divide each *j*-th column, *j* = 2, ..., *n* − *i* + 1, by $\binom{i+j-1}{i}$ and multiply each *h*-th row, *h* = 3, ..., *n* − *i* + 1, by $\binom{i+h-2}{i}$. Thus we finally obtain

$$\begin{aligned} A_n(x+y) &= \sum_{i=0}^{n} \frac{\binom{i+1}{i}\cdots\binom{n}{i}}{\binom{i+1}{i}\cdots\binom{n-1}{i}}\, y^i\, \frac{(-1)^{n-i}}{(\beta_0)^{n-i+1}} \begin{vmatrix} 1 & x & x^2 & \cdots & x^{n-i-1} & x^{n-i} \\ \beta_0 & \beta_1 & \beta_2 & \cdots & \beta_{n-i-1} & \beta_{n-i} \\ 0 & \beta_0 & \binom{2}{1}\beta_1 & \cdots & \binom{n-i-1}{1}\beta_{n-i-2} & \binom{n-i}{1}\beta_{n-i-1} \\ \vdots & & & \ddots & \vdots & \vdots \\ 0 & \cdots & \cdots & 0 & \beta_0 & \binom{n-i}{n-i-1}\beta_1 \end{vmatrix} \\ &= \sum_{i=0}^{n} \binom{n}{i} A_{n-i}(x)\, y^i = \sum_{i=0}^{n} \binom{n}{i} A_i(x)\, y^{n-i}. \end{aligned}$$
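The binomial identity (64) also lends itself to a direct numerical check. The sketch below is a hypothetical example: it takes *β<sub>i</sub>* = 1/(*i* + 1) (so *A<sub>n</sub>* is a Bernoulli-type polynomial), evaluates *A<sub>n</sub>*(*x*) through the representation $A_n(x) = \sum_k \binom{n}{k} A_{n-k}(0)\, x^k$ of (3), and compares the two sides of (64) at rational points:

```python
from fractions import Fraction
from math import comb

# Hypothetical coefficients: beta_i = 1/(i+1) gives Bernoulli-type polynomials.
N = 5
beta = [Fraction(1, i + 1) for i in range(N + 1)]
alpha = [Fraction(1) / beta[0]]                      # alpha_k = A_k(0), via (13)
for m in range(1, N + 1):
    alpha.append(-sum(comb(m, k) * beta[m - k] * alpha[k] for k in range(m)) / beta[0])

def A(n, x):
    """A_n(x) = sum_k C(n,k) alpha_{n-k} x^k, cf. (3)."""
    return sum(comb(n, k) * alpha[n - k] * x**k for k in range(n + 1))

# Compare both sides of (64) at rational sample points.
x, y = Fraction(2, 3), Fraction(-1, 4)
lhs = A(N, x + y)
rhs = sum(comb(N, i) * A(i, x) * y**(N - i) for i in range(N + 1))
print(lhs == rhs)
```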

**Theorem 12** (Generalized Appell identity)**.** *Let An*(*x*) *and Bn*(*x*) *be the Appell polynomial sequences for β<sup>i</sup> and γi*, *respectively. Then, if Cn*(*x*) *is the Appell polynomial sequence for δ<sup>i</sup> with*

$$\begin{cases} \delta_0 = \frac{1}{C_0(0)}, \\ \delta_i = -\frac{1}{C_0(0)} \sum_{k=1}^{i} \binom{i}{k} \delta_{i-k} C_k(0), \quad i = 1, \ldots \end{cases} \tag{66}$$

*and*

$$C_i(0) = \sum_{j=0}^{i} \binom{i}{j} B_{i-j}(0) A_j(0), \tag{67}$$

*where Ai*(0) *and Bi*(0) *are related to β<sup>i</sup> and γi, respectively, by relations similar to (66), we have*

$$C_n(y+z) = \sum_{k=0}^{n} \binom{n}{k} A_k(y) B_{n-k}(z). \tag{68}$$

*Proof.* Starting from (3) we have

$$C_n(y+z) = \sum_{k=0}^{n} \binom{n}{k} C_{n-k}(0)(y+z)^k. \tag{69}$$

Then, applying (67) and the well-known classical binomial identity, after some calculation, we obtain the thesis.
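Identity (68) can likewise be verified computationally. In the sketch below the sequences *β<sub>i</sub>* = 1/(*i* + 1) and *γ<sub>i</sub>* = (1/2)<sup>*i*</sup> are hypothetical sample choices; the values *A<sub>i</sub>*(0) and *B<sub>i</sub>*(0) come from relation (13), *C<sub>i</sub>*(0) from (67), and *C<sub>n</sub>*(*y* + *z*) is evaluated through (69):

```python
from fractions import Fraction
from math import comb

def zero_values(coefs):
    """Values P_k(0) of the Appell sequence for coefs, from relation (13)."""
    a = [Fraction(1) / coefs[0]]
    for m in range(1, len(coefs)):
        a.append(-sum(comb(m, k) * coefs[m - k] * a[k] for k in range(m)) / coefs[0])
    return a

def appell(a, n, x):
    # P_n(x) = sum_k C(n,k) P_{n-k}(0) x^k, cf. (3) and (69)
    return sum(comb(n, k) * a[n - k] * x**k for k in range(n + 1))

N = 5
beta = [Fraction(1, i + 1) for i in range(N + 1)]    # hypothetical sample choice
gamma = [Fraction(1, 2)**i for i in range(N + 1)]    # hypothetical sample choice
A0, B0 = zero_values(beta), zero_values(gamma)
C0 = [sum(comb(i, j) * B0[i - j] * A0[j] for j in range(i + 1))
      for i in range(N + 1)]                          # (67)

y, z = Fraction(1, 3), Fraction(3, 7)
lhs = appell(C0, N, y + z)                            # C_n(y+z), via (69)
rhs = sum(comb(N, k) * appell(A0, k, y) * appell(B0, N - k, z) for k in range(N + 1))
print(lhs == rhs)
```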

**Theorem 13** (Combinatorial identities)**.** *Let An*(*x*) *and Bn*(*x*) *be the Appell polynomial sequences for βi and γi*, *respectively. Then the following relations hold:*

Algebraic Theory of Appell Polynomials with Application to General Linear Interpolation Problem

$$\sum\_{k=0}^{n} \binom{n}{k} A\_k(\mathbf{x}) B\_{n-k}(-\mathbf{x}) = \sum\_{k=0}^{n} \binom{n}{k} A\_k(\mathbf{0}) B\_{n-k}(\mathbf{0}),\tag{70}$$

$$\sum\_{k=0}^{n} \binom{n}{k} A\_k(\mathbf{x}) B\_{n-k}(z) = \sum\_{k=0}^{n} \binom{n}{k} A\_k(\mathbf{x} + z) B\_{n-k}(0). \tag{71}$$

*Proof.* If *Cn*(*x*) is the Appell polynomial sequence for *δ<sup>i</sup>* defined as in (66), from the generalized Appell identity, we have

$$\sum_{k=0}^{n} \binom{n}{k} A_k(x) B_{n-k}(-x) = C_n(0) = \sum_{k=0}^{n} \binom{n}{k} A_k(0) B_{n-k}(0)$$

and

$$\sum_{k=0}^{n} \binom{n}{k} A_k(x) B_{n-k}(z) = C_n(x+z) = \sum_{k=0}^{n} \binom{n}{k} A_k(x+z) B_{n-k}(0).$$
 
$$\square$$

**Theorem 14** (Forward difference)**.** *If An* (*x*) *is the Appell polynomial sequence for β<sup>i</sup> we have*

$$\Delta A_n(x) \equiv A_n(x+1) - A_n(x) = \sum_{i=0}^{n-1} \binom{n}{i} A_i(x), \quad n = 0, 1, \ldots \tag{72}$$

*Proof.* The desired result follows from (64) with *y* = 1.

**Theorem 15** (Multiplication Theorem)**.** *Let An*(*x*) *be the Appell vector for βi.*

*The following identities hold:*

$$
\overline{A}_n(mx) = B(x)\,\overline{A}_n(x), \qquad n = 0, 1, \ldots, \quad m = 1, 2, \ldots \tag{73}
$$

$$
\overline{A}_n(mx) = M^{-1}DX(x), \qquad n = 0, 1, \ldots, \quad m = 1, 2, \ldots \tag{74}
$$

*where*

$$(B(x))_{i,j} = \begin{cases} \binom{i}{j} (m-1)^{i-j} x^{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad i, j = 0, \ldots, n, \tag{75}$$

$D = \operatorname{diag}[1, m, \ldots, m^n]$ *and* $M^{-1}$ *defined as in (53).*

*Proof.* Relation (73) follows from (64) by setting *y* = *x*(*m* − 1). In fact we get

$$A\_{\boldsymbol{n}}\left(m\mathbf{x}\right) = \sum\_{i=0}^{n} \binom{n}{i} A\_{\boldsymbol{i}}\left(\mathbf{x}\right) \left(m-1\right)^{n-i} \mathbf{x}^{n-i}.\tag{76}$$

Relation (74) follows from Theorem 7. In fact we get

$$
\overline{A}\_{\mathfrak{N}}(m\mathbf{x}) = M^{-1}X(m\mathbf{x}) = M^{-1}DX(\mathbf{x}),\tag{77}
$$


and

$$A_n(mx) = \sum_{i=0}^{n} \binom{n}{i} \alpha_{n-i}\, m^i x^i. \tag{78}$$
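The first identity of the Multiplication Theorem, in the componentwise form (76), can be tested the same way. The sketch below again uses the hypothetical choice *β<sub>i</sub>* = 1/(*i* + 1) and a rational sample point:

```python
from fractions import Fraction
from math import comb

N, m = 5, 3
beta = [Fraction(1, i + 1) for i in range(N + 1)]    # hypothetical sample choice
alpha = [Fraction(1) / beta[0]]                      # alpha_k = A_k(0), via (13)
for j in range(1, N + 1):
    alpha.append(-sum(comb(j, k) * beta[j - k] * alpha[k] for k in range(j)) / beta[0])

def A(n, x):
    # A_n(x) = sum_k C(n,k) alpha_{n-k} x^k, cf. (3)
    return sum(comb(n, k) * alpha[n - k] * x**k for k in range(n + 1))

x = Fraction(5, 7)
lhs = A(N, m * x)                                     # left side of (76)
rhs = sum(comb(N, i) * A(i, x) * (m - 1)**(N - i) * x**(N - i) for i in range(N + 1))
print(lhs == rhs)
```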

**Theorem 16** (Differential equation)**.** *If An* (*x*) *is the Appell polynomial sequence for β<sup>i</sup> then An* (*x*) *satisfies the linear differential equation:*

$$\frac{\beta\_n}{n!}y^{(n)}(\mathbf{x}) + \frac{\beta\_{n-1}}{(n-1)!}y^{(n-1)}(\mathbf{x}) + \dots + \frac{\beta\_2}{2!}y^{(2)}(\mathbf{x}) + \beta\_1 y^{(1)}(\mathbf{x}) + \beta\_0 y(\mathbf{x}) = \mathbf{x}^n \tag{79}$$

*Proof.* From Theorem 5 we have

$$A\_{n+1}(\mathbf{x}) = \frac{1}{\beta\_0} \left( \mathbf{x}^{n+1} - \sum\_{k=0}^{n} \binom{n+1}{k+1} \beta\_{k+1} A\_{n-k}(\mathbf{x}) \right). \tag{80}$$

From Theorem 1 we find that

$$A'_{n+1}(x) = (n+1)A_n(x), \quad \text{and} \quad A_{n-k}(x) = \frac{A_n^{(k)}(x)}{n(n-1)\cdots(n-k+1)}, \tag{81}$$

and replacing *An*−*k*(*x*) in (80) we obtain

$$A\_{n+1}(\mathbf{x}) = \frac{1}{\beta\_0} \left( \mathbf{x}^{n+1} - (n+1) \sum\_{k=0}^n \beta\_{k+1} \frac{A\_n^{(k)}(\mathbf{x})}{(k+1)!} \right). \tag{82}$$

Differentiating both sides of the last one and replacing $A'_{n+1}(x)$ with $(n+1)A_n(x)$, after some calculation we obtain the thesis.
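The differential equation (79) can be confirmed symbolically on coefficient lists. The sketch below (with the hypothetical data *β<sub>i</sub>* = 1/(*i* + 1)) builds *A<sub>N</sub>* as a coefficient vector, accumulates $\sum_{j=0}^{N} \frac{\beta_j}{j!} A_N^{(j)}(x)$, and checks that the result is exactly $x^N$:

```python
from fractions import Fraction
from math import comb, factorial

N = 5
beta = [Fraction(1, i + 1) for i in range(N + 1)]    # hypothetical sample choice
alpha = [Fraction(1) / beta[0]]                      # alpha_k = A_k(0), via (13)
for m in range(1, N + 1):
    alpha.append(-sum(comb(m, k) * beta[m - k] * alpha[k] for k in range(m)) / beta[0])

# A_N as a coefficient list c[k] of x^k: A_N(x) = sum_k C(N,k) alpha_{N-k} x^k.
c = [comb(N, k) * alpha[N - k] for k in range(N + 1)]

def deriv(p):
    """Coefficient list of p'(x)."""
    return [k * p[k] for k in range(1, len(p))] or [Fraction(0)]

# Accumulate sum_{j=0}^{N} (beta_j / j!) A_N^{(j)}(x); by (79) this must equal x^N.
total = [Fraction(0)] * (N + 1)
p = c[:]
for j in range(N + 1):
    for k, coef in enumerate(p):
        total[k] += beta[j] * coef / factorial(j)
    p = deriv(p)
expected = [Fraction(0)] * N + [Fraction(1)]
print(total == expected)
```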

**Remark 7.** *An alternative differential equation for Appell polynomial sequences can be determined by the recurrence relation referred to in Remark 6 ([18, 26]).*

#### **6. Appell polynomial sequences of second kind**

Let *f* : *I* ⊂ **R** → **R** and Δ be the finite difference operator ([23]), i.e.:

$$
\Delta[f](\mathbf{x}) = f(\mathbf{x} + \mathbf{1}) - f(\mathbf{x}) \tag{83}
$$

we define the finite difference operator of order *i*, with *i* ∈ **N**, as

$$
\Delta^i[f](\mathbf{x}) = \Delta(\Delta^{i-1}[f](\mathbf{x})) = \sum\_{j=0}^i (-1)^{i-j} \binom{i}{j} f(\mathbf{x} + j), \tag{84}
$$
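The closed form (84) agrees with the *i*-fold application of (83), as a quick check in exact arithmetic shows (the test function *f* and the evaluation point are hypothetical choices):

```python
from fractions import Fraction
from math import comb

def delta(f):
    """The operator (83): (Delta f)(x) = f(x+1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

def delta_i(f, i, x):
    """The closed form (84): Delta^i[f](x) = sum_j (-1)^(i-j) C(i,j) f(x+j)."""
    return sum((-1)**(i - j) * comb(i, j) * f(x + j) for j in range(i + 1))

f = lambda t: t**3 - 2 * t + Fraction(1, 2)   # hypothetical test function
x = Fraction(3, 4)                            # hypothetical evaluation point
g = f
ok = True
for i in range(5):
    ok = ok and delta_i(f, i, x) == g(x)      # (84) vs. i-fold application of (83)
    g = delta(g)
print(ok)
```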

meaning Δ<sup>0</sup> = *I* and Δ<sup>1</sup> = Δ, where *I* is the identity operator. Letting the sequence of falling factorials be defined by

$$\begin{cases} \left(\mathbf{x}\right)\_0 = 1, \\ \left(\mathbf{x}\right)\_n = \mathbf{x}\left(\mathbf{x} - 1\right)\left(\mathbf{x} - 2\right)\cdots\left(\mathbf{x} - n + 1\right), n = 1, 2, \ldots \end{cases} \tag{85}$$

we give the following

**Definition 4.** *Let ß<sup>i</sup>* ∈ **R***, i* = 0, 1, ..., *with ß*<sup>0</sup> ≠ 0*. The polynomial sequence*

$$\begin{cases} \mathcal{A}_0(x) = \frac{1}{\beta_0}, \\[4pt] \mathcal{A}_n(x) = \frac{(-1)^n}{(\beta_0)^{n+1}} \begin{vmatrix} 1 & (x)_1 & (x)_2 & \cdots & \cdots & (x)_{n-1} & (x)_n \\ \beta_0 & \beta_1 & \beta_2 & \cdots & \cdots & \beta_{n-1} & \beta_n \\ 0 & \beta_0 & \binom{2}{1}\beta_1 & \cdots & \cdots & \binom{n-1}{1}\beta_{n-2} & \binom{n}{1}\beta_{n-1} \\ 0 & 0 & \beta_0 & \cdots & \cdots & \binom{n-1}{2}\beta_{n-3} & \binom{n}{2}\beta_{n-2} \\ \vdots & & & \ddots & & \vdots & \vdots \\ 0 & \cdots & \cdots & \cdots & 0 & \beta_0 & \binom{n}{n-1}\beta_1 \end{vmatrix}, \quad n = 1, 2, \ldots \end{cases} \tag{86}$$

*is called Appell polynomial sequence of second kind.*

Then, we have

**Theorem 17.** *For Appell polynomial sequences of second kind we get*

$$
\Delta \mathcal{A}_n(x) = n \mathcal{A}_{n-1}(x), \quad n = 1, 2, \ldots \tag{87}
$$

*Proof.* By the well-known relation ([23])

$$
\Delta\left(\mathbf{x}\right)\_n = n \left(\mathbf{x}\right)\_{n-1}, \quad n = 1, 2, \ldots,\tag{88}
$$

applying the operator Δ to the definition (86) and using the properties of linearity of Δ we have

$$\Delta \mathcal{A}\_{\mathbb{R}}(\mathbf{x}) = \frac{(-1)^{n}}{\left(\boldsymbol{\beta}\_{0}\right)^{n+1}} \begin{vmatrix} \Delta \mathbf{1} \ \Delta \left(\mathbf{x}\right)\_{1} \ \Delta \left(\mathbf{x}\right)\_{2} & \cdots & \cdots & \Delta \left(\mathbf{x}\right)\_{n-1} & \Delta \left(\mathbf{x}\right)\_{n} \\ \boldsymbol{\beta}\_{0} & \boldsymbol{\beta}\_{1} & \boldsymbol{\beta}\_{2} & \cdots & \cdots & \boldsymbol{\beta}\_{n-1} & \boldsymbol{\beta}\_{n} \\ 0 & \boldsymbol{\beta}\_{0} & \binom{2}{1} \boldsymbol{\beta}\_{1} & \cdots & \cdots & \binom{n-1}{1} \boldsymbol{\beta}\_{n-2} & \binom{n}{1} \boldsymbol{\beta}\_{n-1} \\ 0 & 0 & \boldsymbol{\beta}\_{0} & \cdots & \cdots & \binom{n-1}{2} \boldsymbol{\beta}\_{n-3} & \binom{n}{2} \boldsymbol{\beta}\_{n-2} \\ \vdots & & & \ddots & \vdots & \vdots \\ \vdots & & & \ddots & \vdots & \vdots \\ \vdots & & & \ddots & \vdots & \vdots \\ 0 & \cdots & \cdots & \cdots & 0 & \beta\_{0} & \binom{n}{n-1} \boldsymbol{\beta}\_{1} \end{vmatrix}, n = 1, 2, \ldots \tag{89}$$

We can expand the determinant in (89) with respect to the first column and, after multiplying the *i*-th row by *i* − 1, *i* = 2, ..., *n*, and the *j*-th column by 1/*j*, *j* = 1, ..., *n*, we can recognize the factor $\mathcal{A}_{n-1}(x)$.
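Theorem 17 can be verified numerically. The sketch below assumes the second-kind analogue of the first-kind triangular relation, namely $(x)_n = \sum_k \binom{n}{k}\, \text{ß}_{n-k}\, \mathcal{A}_k(x)$ (which follows from the generating function (90)), and with the Bernoulli coefficients of the second kind (91) as sample data it checks $\Delta\mathcal{A}_n(x) = n\mathcal{A}_{n-1}(x)$:

```python
from fractions import Fraction
from math import comb, factorial

def ff(x, n):
    """Falling factorial (x)_n of (85)."""
    out = Fraction(1)
    for k in range(n):
        out *= x - k
    return out

# Bernoulli coefficients of the second kind, cf. (91).
N = 6
ss = [Fraction((-1)**i * factorial(i), i + 1) for i in range(N + 1)]

def A2(n, x):
    """Solve (x)_m = sum_k C(m,k) ss_{m-k} A2_k(x) pointwise for A2_n(x)."""
    vals = [Fraction(1) / ss[0]]
    for m in range(1, n + 1):
        s = sum(comb(m, k) * ss[m - k] * vals[k] for k in range(m))
        vals.append((ff(x, m) - s) / ss[0])
    return vals[n]

x = Fraction(2, 5)
ok = all(A2(n, x + 1) - A2(n, x) == n * A2(n - 1, x) for n in range(1, N + 1))
print(ok)
```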

We can observe that the structure of the determinant in (86) is similar to that of the determinant in (6). By virtue of this, it is possible to obtain a dual theory of Appell polynomials of the first kind, in the sense that similar properties can be proven ([19]).

For example, the generating function is

$$H(\mathbf{x}, h) = a(h)(1 + h)^{\mathbf{x}},\tag{90}$$

where *a*(*h*) is an invertible formal power series.

#### **7. Examples of Appell polynomial sequences of second kind**

The following are classical examples of Appell polynomial sequences of second kind.

**a)** Bernoulli polynomials of second kind ([19, 23]):

$$\text{ß}_i = \frac{(-1)^i\, i!}{i+1}, \quad i = 0, 1, \ldots, \tag{91}$$

$$H(\mathbf{x}, h) = \frac{h(1+h)^{\mathbf{x}}}{\ln(1+h)};\tag{92}$$
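As a consistency check between (91) and (92): since *H*(*x*, *h*) = *a*(*h*)(1 + *h*)<sup>*x*</sup>, the coefficients (91) must satisfy $\frac{1}{a(h)} = \sum_i \text{ß}_i \frac{h^i}{i!} = \frac{\ln(1+h)}{h}$ as formal power series. The truncated comparison below confirms this:

```python
from fractions import Fraction
from math import factorial

N = 8
# Coefficients (91) of the Bernoulli polynomials of the second kind.
ss = [Fraction((-1)**i * factorial(i), i + 1) for i in range(N + 1)]

# 1/a(h) = sum_i ss_i h^i/i! should match the Taylor series of ln(1+h)/h,
# since by (92) a(h) = h/ln(1+h).
inv_a = [ss[i] / factorial(i) for i in range(N + 1)]
log_series = [Fraction((-1)**i, i + 1) for i in range(N + 1)]  # ln(1+h)/h
print(inv_a == log_series)
```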

**b)** Boole polynomials ([19, 23]):

$$\text{ß}_i = \begin{cases} 1, & i = 0 \\ \frac{1}{2}, & i = 1 \\ 0, & i = 2, \ldots \end{cases} \tag{93}$$

$$H(x, h) = \frac{2(1+h)^x}{2+h}. \tag{94}$$

#### **8. An application to general linear interpolation problem**

Let *X* be the linear space of real functions defined on the interval [0, 1], continuous and with continuous derivatives of all necessary orders. Let *L* be a linear functional on *X* such that *L*(1) ≠ 0. If in (6) and respectively in (86) we set

$$\beta_i = L(x^i), \qquad \text{ß}_i = L((x)_i), \quad i = 0, 1, \ldots, \tag{95}$$

*An*(*x*) and A*n*(*x*) will be called Appell polynomial sequences of first or of second kind related to the functional *L*, and denoted by *AL*,*n*(*x*) and A*L*,*n*(*x*), respectively.

**Remark 8.** *The generating function of the sequence AL*,*n*(*x*) *is*

$$G(\mathbf{x}, h) = \frac{e^{\mathbf{x}h}}{L\_{\mathbf{x}}(e^{\mathbf{x}h})},\tag{96}$$

*and for* A*L*,*n*(*x*) *is*

$$H(x, h) = \frac{(1+h)^x}{L_x((1+h)^x)}, \tag{97}$$

*where Lx means that the functional L is applied to the argument as a function of x.*

*Proof.* For *AL*,*n*(*x*), if $G(x,h) = a(h)e^{xh}$ with $\frac{1}{a(h)} = \sum_{i=0}^{\infty} \beta_i \frac{h^i}{i!}$, we have

$$G(x,h) = \frac{e^{xh}}{\frac{1}{a(h)}} = \frac{e^{xh}}{\sum_{i=0}^{\infty} \beta_i \frac{h^i}{i!}} = \frac{e^{xh}}{\sum_{i=0}^{\infty} L(x^i) \frac{h^i}{i!}} = \frac{e^{xh}}{L\left(\sum_{i=0}^{\infty} x^i \frac{h^i}{i!}\right)} = \frac{e^{xh}}{L_x(e^{xh})}.$$

For A*L*,*n*(*x*), the proof follows similarly.

Then, we have


**Theorem 18.** *Let ω<sup>i</sup>* ∈ **R**, *i* = 0, ..., *n*. *Then the polynomials*

$$P_n(x) = \sum_{i=0}^{n} \frac{\omega_i}{i!} A_{L,i}(x), \tag{98}$$

$$P\_n^\*(\mathbf{x}) = \sum\_{i=0}^n \frac{\omega\_i}{i!} \mathcal{A}\_{L,i}(\mathbf{x}) \tag{99}$$

*are the unique polynomials of degree less than or equal to n*, *such that*

$$L(P_n^{(i)}) = i!\,\omega_i, \quad i = 0, \ldots, n, \tag{100}$$

$$L(\Delta^i P_n^*) = i!\,\omega_i, \quad i = 0, \ldots, n. \tag{101}$$

*Proof.* The proof follows observing that, by the hypothesis on the functional *L*, there exists a unique polynomial of degree ≤ *n* verifying (100) and, respectively, (101); moreover, from the properties of *AL*,*i*(*x*) and A*L*,*i*(*x*), we have

$$L(A_{L,i}^{(j)}(x)) = i(i-1)\cdots(i-j+1)\, L(A_{L,i-j}(x)) = j!\binom{i}{j}\delta_{ij}, \tag{102}$$

$$L(\Delta^j \mathcal{A}_{L,i}(x)) = i(i-1)\cdots(i-j+1)\, L(\mathcal{A}_{L,i-j}(x)) = j!\binom{i}{j}\delta_{ij}, \tag{103}$$

where *δij* is the Kronecker symbol.

From (102) and (103) it is easy to prove that the polynomials (98) and (99) verify (100) and (101), respectively.

**Remark 9.** *For every linear functional L on X, the sets* {*AL*,*i*(*x*)} *and* {A*L*,*i*(*x*)}, *i* = 0, ..., *n*, *are bases for* P*<sup>n</sup> and,* ∀*Pn*(*x*) ∈ P*n, we have*

$$P\_n(\mathbf{x}) = \sum\_{i=0}^n \frac{L(P\_n^{(i)})}{i!} \ A\_{L,i}(\mathbf{x}),\tag{104}$$

$$P\_n(\mathbf{x}) = \sum\_{i=0}^n \frac{L(\Delta^i P\_n)}{i!} \, \mathcal{A}\_{L,i}(\mathbf{x}). \tag{105}$$

Let us consider a function *f* ∈ *X*. Then we have the following


**Theorem 19.** *The polynomials*

$$P\_{L,n}[f](\mathbf{x}) = \sum\_{i=0}^{n} \frac{L(f^{(i)})}{i!} \, A\_{L,i}(\mathbf{x}),\tag{106}$$

$$P\_{L,n}^\*[f](\mathbf{x}) = \sum\_{i=0}^n \frac{L(\Delta^i f)}{i!} \, \mathcal{A}\_{L,i}(\mathbf{x}) \tag{107}$$

*are the unique polynomials of degree* ≤ *n such that*

$$L(P\_{L,n}[f]^{(i)}) = L(f^{(i)}), \ i = 0, \ldots, n,$$

$$L(\Delta^i P\_{L,n}^\*[f]) = L(\Delta^i f), \ i = 0, \ldots, n.$$

*Proof.* Setting $\omega_i = \frac{L(f^{(i)})}{i!}$ and, respectively, $\omega_i = \frac{L(\Delta^i f)}{i!}$, *i* = 0, ..., *n*, the result follows from Theorem 18.

**Definition 5.** *The polynomials (106) and (107) are called Appell interpolation polynomial for f of first and of second kind, respectively.*
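As a concrete illustration of (106), take the hypothetical functional $L(f) = \int_0^1 f(t)\,dt$, so that $\beta_i = L(x^i) = \frac{1}{i+1}$. The sketch below builds $P_{L,n}[f]$ for a cubic *f* with *n* = 3 and confirms the exactness on P*<sup>n</sup>* stated in Remark 10 below:

```python
from fractions import Fraction
from math import comb, factorial

n = 3
# Hypothetical functional: L(f) = integral of f over [0,1], so beta_i = L(x^i) = 1/(i+1).
beta = [Fraction(1, i + 1) for i in range(n + 1)]
alpha = [Fraction(1) / beta[0]]                      # alpha_k = A_{L,k}(0), via (13)
for m in range(1, n + 1):
    alpha.append(-sum(comb(m, k) * beta[m - k] * alpha[k] for k in range(m)) / beta[0])

def deriv(p):
    """Coefficient list of p'(x)."""
    return [k * p[k] for k in range(1, len(p))] or [Fraction(0)]

def L(p):
    """L applied to a polynomial given as a coefficient list: its integral over [0,1]."""
    return sum(c / (k + 1) for k, c in enumerate(p))

f = [Fraction(2), Fraction(-1), Fraction(0), Fraction(1)]  # f(x) = x^3 - x + 2

# P_{L,n}[f](x) = sum_i L(f^{(i)})/i! A_{L,i}(x), with A_{L,i}(x) = sum_k C(i,k) alpha_{i-k} x^k.
P = [Fraction(0)] * (n + 1)
g = f
for i in range(n + 1):
    w = L(g) / factorial(i)
    for k in range(i + 1):
        P[k] += w * comb(i, k) * alpha[i - k]
    g = deriv(g)
print(P == f)   # exactness on polynomials of degree <= n
```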

Now it is interesting to consider the estimation of the remainders

$$R_{L,n}[f](x) = f(x) - P_{L,n}[f](x), \quad \forall x \in [0,1], \tag{108}$$

$$R_{L,n}^*[f](x) = f(x) - P_{L,n}^*[f](x), \quad \forall x \in [0,1]. \tag{109}$$

**Remark 10.** *For any f* ∈ P*<sup>n</sup>*

$$R_{L,n}[f](x) = 0, \quad R_{L,n}[x^{n+1}] \neq 0, \quad \forall x \in [0,1], \tag{110}$$

$$R_{L,n}^*[f](x) = 0, \quad R_{L,n}^*[(x)_{n+1}] \neq 0, \quad \forall x \in [0,1], \tag{111}$$

*i. e. the polynomial operators (106) and (107) are exact on* P*n.*

For a fixed *x* we may consider the remainders $R_{L,n}[f](x)$ and $R_{L,n}^*[f](x)$ as linear functionals which act on *f* and annihilate all elements of P*n*. From Peano's Theorem ([27, p. 69]), if a linear functional has this property, then it must also have a simple representation in terms of $f^{(n+1)}$. Therefore we have

**Theorem 20.** *Let f* ∈ *C*<sup>*n*+1</sup>[*a*, *b*]. *Then the following relations hold:*

$$R_{L,n}(f, x) = \frac{1}{n!} \int_0^1 K_n(x,t)\, f^{(n+1)}(t)\, dt, \quad \forall x \in [0,1], \tag{112}$$

$$R_{L,n}^*(f, x) = \frac{1}{n!} \int_0^1 K_n^*(x,t)\, f^{(n+1)}(t)\, dt, \quad \forall x \in [0,1], \tag{113}$$

*where*

$$K_n(x,t) = R_{L,n}\left[(x-t)_+^n\right] = (x-t)_+^n - \sum_{i=0}^{n} \binom{n}{i} L\left((x-t)_+^{n-i}\right) A_{L,i}(x), \tag{114}$$


$$K\_n^\*(\mathbf{x}, t) = R\_{L, n}^\* \left[ (\mathbf{x} - t)\_+^n \right] = (\mathbf{x} - t)\_+^n - \sum\_{i=0}^n \frac{L \left( \Delta^i (\mathbf{x} - t)\_+^n \right)}{i!} \mathcal{A}\_{L, i}(\mathbf{x}). \tag{115}$$

*Proof.* After some calculation, the results follow by Remark 10 and Peano's Theorem.

**Remark 11** (Bounds)**.** *If* $f^{(n+1)} \in \mathcal{L}_p[0,1]$ *and* $K_n(x,t), K_n^*(x,t) \in \mathcal{L}_q[0,1]$ *with* $\frac{1}{p} + \frac{1}{q} = 1$, *then applying Hölder's inequality we get*

$$\begin{aligned} \left|R_{L,n}[f](x)\right| &\le \frac{1}{n!} \left(\int_0^1 |K_n(x,t)|^q\, dt\right)^{\frac{1}{q}} \left(\int_0^1 \left|f^{(n+1)}(t)\right|^p dt\right)^{\frac{1}{p}}, \\ \left|R_{L,n}^*[f](x)\right| &\le \frac{1}{n!} \left(\int_0^1 |K_n^*(x,t)|^q\, dt\right)^{\frac{1}{q}} \left(\int_0^1 \left|f^{(n+1)}(t)\right|^p dt\right)^{\frac{1}{p}}. \end{aligned}$$

The two most important cases are *p* = *q* = 2 and *q* = 1, *p* = ∞ :

**i)** for *p* = *q* = 2 we have the estimates

$$|R_{L,n}[f](x)| \le \sigma_n\, |||f|||, \qquad \left|R_{L,n}^*[f](x)\right| \le \sigma_n^*\, |||f|||, \tag{116}$$

where


$$(\sigma\_n)^2 = \left(\frac{1}{n!}\right)^2 \int\_0^1 \left(K\_n(\mathbf{x}, t)\right)^2 dt, \quad (\sigma\_n^\*)^2 = \left(\frac{1}{n!}\right)^2 \int\_0^1 \left(K\_n^\*(\mathbf{x}, t)\right)^2 dt,\tag{117}$$

and

$$|||f|||^2 = \int\_0^1 \left( f^{(n+1)}\left(t\right) \right)^2 dt;\tag{118}$$

**ii)** for *q* = 1, *p* = ∞ we have that

$$\left|R_{L,n}[f](x)\right| \leq \frac{1}{n!} M_{n+1} \int_0^1 \left|K_n(x,t)\right| dt, \quad \left|R_{L,n}^*[f](x)\right| \leq \frac{1}{n!} M_{n+1} \int_0^1 \left|K_n^*(x,t)\right| dt,\tag{119}$$

where

$$M_{n+1} = \sup_{a \le x \le b} \left| f^{(n+1)}(x) \right|. \tag{120}$$
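As a numeric sanity check of the bound (119), consider the Taylor case $L(f) = f(0)$ (our own choice for illustration, not singled out by the text): there the Peano kernel reduces to $(x-t)^n_+$, so the bound becomes the classical $|R_{L,n}[f](x)| \le M_{n+1}\, x^{n+1}/(n+1)!$. A short SymPy sketch:

```python
# Check (119) in the Taylor case L(f) = f(0): the kernel is (x-t)^n_+,
# giving |R_{L,n}[f](x)| <= M_{n+1} x^{n+1} / (n+1)! on [0,1].
import sympy as sp

x = sp.symbols('x')
f, n = sp.sin(x), 3
# degree-3 Taylor polynomial of sin at 0: x - x^3/6
taylor = sum(sp.diff(f, x, i).subs(x, 0) / sp.factorial(i) * x**i
             for i in range(n + 1))
remainder = (f - taylor).subs(x, 1)       # R_{L,3}[sin](1)
M = 1                                     # |sin^{(4)}| <= 1 on [0,1]
bound = sp.Rational(M, sp.factorial(n + 1))   # M_{n+1} * 1^{n+1} / 4!
assert abs(float(remainder)) <= float(bound)
```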

A further polynomial operator can be determined as follows: for any fixed *z* ∈ [0, 1] we consider the polynomial

$$\overline{P}\_{\text{L,n}}[f](\mathbf{x}) \equiv f(\mathbf{z}) + P\_{\text{L,n}}[f](\mathbf{x}) - P\_{\text{L,n}}[f](\mathbf{z}) = f(\mathbf{z}) + \sum\_{i=1}^{n} \frac{L(f^{(i)})}{i!} \left(A\_{\text{L,i}}(\mathbf{x}) - A\_{\text{L,i}}(\mathbf{z})\right), \tag{121}$$

and, respectively,

$$\overline{P}_{L,n}^*[f](x) \equiv f(z) + P_{L,n}^*[f](x) - P_{L,n}^*[f](z) = f(z) + \sum_{i=1}^{n} \frac{L(\Delta^i f)}{i!} \left(\mathcal{A}_{L,i}(x) - \mathcal{A}_{L,i}(z)\right). \tag{122}$$

#### Then we have the following

**Theorem 21.** *The polynomials $\overline{P}_{L,n}[f](x)$, $\overline{P}_{L,n}^*[f](x)$ are approximating polynomials of degree n for $f(x)$, i.e.:*

$$\forall x \in [0, 1], \quad f(x) = \overline{P}_{L,n}[f](x) + \overline{R}_{L,n}[f](x), \tag{123}$$

$$f(\mathbf{x}) = \overline{P}\_{L,n}^\*[f](\mathbf{x}) + \overline{R}\_{L,n}^\*[f](\mathbf{x}),\tag{124}$$

*where*

$$\overline{R}_{L,n}[f](x) = R_{L,n}[f](x) - R_{L,n}[f](z),\tag{125}$$

$$\overline{R}\_{L,n}^\*[f](\mathbf{x}) = R\_{L,n}^\*[f](\mathbf{x}) - R\_{L,n}^\*[f](\mathbf{z}),\tag{126}$$

*with*

$$\overline{R}_{L,n}[x^i] = 0, \quad i = 0, \dots, n, \quad \overline{R}_{L,n}[x^{n+1}] \neq 0, \tag{127}$$

$$\overline{R}_{L,n}^*[(x)_i] = 0, \quad i = 0, \dots, n, \quad \overline{R}_{L,n}^*[(x)_{n+1}] \neq 0. \tag{128}$$

*Proof.* ∀*x* ∈ [0, 1] and for any fixed *z* ∈ [0, 1], from (108), we have

$$f(x) - f(z) = P_{L,n}[f](x) - P_{L,n}[f](z) + R_{L,n}[f](x) - R_{L,n}[f](z),$$

from which we get (123) and (125). The exactness of the polynomial $\overline{P}_{L,n}[f](x)$ follows from the exactness of the polynomial $P_{L,n}[f](x)$.

Proceeding in the same manner we can prove the result for the polynomial $\overline{P}_{L,n}^*[f](x)$.

**Remark 12.** *The polynomials $\overline{P}_{L,n}[f](x)$, $\overline{P}_{L,n}^*[f](x)$ satisfy the interpolation conditions*

$$\overline{P}_{L,n}[f](z) = f(z), \quad L(\overline{P}_{L,n}^{(i)}[f]) = L(f^{(i)}), \; i = 1, \dots, n,\tag{129}$$

$$
\overline{P}\_{L,n}^\*[f](z) = f(z), \quad L(\Delta^i \overline{P}\_{L,n}^\*[f]) = L(\Delta^i f), \; i = 1, \ldots, n. \tag{130}
$$

#### **9. Examples of Appell interpolation polynomials**

**a)** Taylor interpolation and classical interpolation on equidistant points: Assuming

$$L(f) = f(\mathbf{x}\_0), \qquad \mathbf{x}\_0 \in [0, 1], \tag{131}$$

the polynomials *PL*,*n*[ *f* ](*x*) and *P*<sup>∗</sup> *<sup>L</sup>*,*n*[ *f* ](*x*) are, respectively, the Taylor interpolation polynomial and the classical interpolation polynomial on equidistant points;

**b)** Bernoulli interpolation of first and of second kind:

• Bernoulli interpolation of first kind ([15, 21]): Assuming

$$L(f) = \int\_0^1 f(\mathbf{x})d\mathbf{x},\tag{132}$$

the interpolation polynomials $P_{L,n}[f](x)$ and $\overline{P}_{L,n}[f](x)$ become

$$P_{L,n}[f](x) = \int_0^1 f(x)dx + \sum_{i=1}^n \frac{f^{(i-1)}(1) - f^{(i-1)}(0)}{i!} B_i(x), \tag{133}$$

42 Linear Algebra – Theorems and Applications Algebraic Theory of Appell Polynomials with Application to General Linear Interpolation Problem <sup>23</sup> Algebraic Theory of Appell Polynomials with Application to General Linear Interpolation Problem 43

$$\overline{P}_{L,n}[f](x) = f(0) + \sum_{i=1}^{n} \frac{f^{(i-1)}(1) - f^{(i-1)}(0)}{i!} \left(B_i(x) - B_i(0)\right),\tag{134}$$

where *Bi*(*x*) are the classical Bernoulli polynomials ([17, 23]);

• Bernoulli interpolation of second kind ([19]): Assuming

$$L(f) = \left[D\Delta^{-1}f\right]\_{x=0},\tag{135}$$

where $\Delta^{-1}$ denotes the indefinite summation operator, defined as the linear operator inverse of the finite difference operator $\Delta$, the interpolation polynomials $P^*_{L,n}[f](x)$ and $\overline{P}^*_{L,n}[f](x)$ become

$$P_{L,n}^*[f](x) = [\Delta^{-1}Df]_{x=0} + \sum_{i=0}^{n-1} f'(i)\mathcal{B}_{n,i}^{II}(x), \tag{136}$$

$$\overline{P}_{L,n}^*[f](x) = f(0) + \sum_{i=0}^{n-1} f'(i) \left( \mathcal{B}_{n,i}^{II}(x) - \mathcal{B}_{n,i}^{II}(0) \right), \tag{137}$$

where


$$\mathcal{B}\_{n,i}^{II}(\mathbf{x}) = \sum\_{j=i}^{n-1} \binom{j}{i} \frac{(-1)^{j-i}}{(j+1)!} \mathcal{B}\_{j+1}^{II}(\mathbf{x}) \, , \tag{138}$$

and $B_j^{II}(x)$ are the Bernoulli polynomials of second kind ([19]);

• Euler interpolation ([21]): Assuming

$$L(f) = \frac{f(0) + f(1)}{2},\tag{139}$$

the interpolation polynomials $P_{L,n}[f](x)$ and $\overline{P}_{L,n}[f](x)$ become

$$P\_{\mathbf{L},n}[f](\mathbf{x}) = \frac{f(0) + f(1)}{2} + \sum\_{i=1}^{n} \frac{f^{(i)}(0) + f^{(i)}(1)}{2i!} E\_i(\mathbf{x}) \, , \tag{140}$$

$$\overline{P}\_{\mathbf{L},n}[f](\mathbf{x}) = f(\mathbf{0}) + \sum\_{i=1}^{n} \frac{f^{(i)}(\mathbf{0}) + f^{(i)}(\mathbf{1})}{2i!} \left( E\_i \left( \mathbf{x} \right) - E\_i \left( \mathbf{0} \right) \right);\tag{141}$$

• Boole interpolation ([19]): Assuming

$$L(f) = [Mf]_{x=0}, \tag{142}$$

where $Mf$ is defined by

$$Mf(\mathbf{x}) = \frac{f(\mathbf{x}) + f(\mathbf{x} + 1)}{2},\tag{143}$$

the interpolation polynomials $P^*_{L,n}[f](x)$ and $\overline{P}^*_{L,n}[f](x)$ become

$$P\_{L, \text{II}}^{\*}[f](\mathbf{x}) = \frac{f(0) + f(1)}{2} \mathcal{E}\_{\mathbf{n}, 0}^{II}(\mathbf{x}) + \sum\_{i=1}^{n} \frac{f(i) + f(i+1)}{2} \mathcal{E}\_{\mathbf{n}, i}^{II}(\mathbf{x}),\tag{144}$$


$$\overline{P}_{L,n}^*[f](x) = f(0) + \sum_{i=1}^n \frac{f(i) + f(i+1)}{2} \left(\mathcal{E}_{n,i}^{II}(x) - \mathcal{E}_{n,i}^{II}(0)\right),\tag{145}$$

where

$$\mathcal{E}\_{n,i}^{II}(\mathbf{x}) = \sum\_{j=i}^{n} \binom{j}{i} \frac{(-1)^{j-i}}{j!} E\_j^{II}(\mathbf{x}) \, , \tag{146}$$

and $E_j^{II}(x)$ are the Boole polynomials ([19]).
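These formulas are easy to check symbolically. The following SymPy sketch (our own illustration, not from the text) verifies that the first-kind Bernoulli operator (133) and the Euler operator (140) reproduce low-degree monomials exactly, as the exactness on $\mathcal{P}_n$ asserts:

```python
# Exactness on P_n of the Bernoulli (133) and Euler (140) interpolants,
# using SymPy's classical Bernoulli and Euler polynomials.
import sympy as sp

x = sp.symbols('x')

def bernoulli_interpolant(f, n):
    # P_{L,n}[f](x) from (133), with L(f) = integral of f over [0,1]
    p = sp.integrate(f, (x, 0, 1))
    for i in range(1, n + 1):
        d = sp.diff(f, x, i - 1)
        p += (d.subs(x, 1) - d.subs(x, 0)) / sp.factorial(i) * sp.bernoulli(i, x)
    return sp.expand(p)

def euler_interpolant(f, n):
    # P_{L,n}[f](x) from (140), with L(f) = (f(0) + f(1)) / 2
    p = (f.subs(x, 0) + f.subs(x, 1)) / 2
    for i in range(1, n + 1):
        d = sp.diff(f, x, i)
        p += (d.subs(x, 0) + d.subs(x, 1)) / (2 * sp.factorial(i)) * sp.euler(i, x)
    return sp.expand(p)

for k in range(4):                 # both operators are exact on P_3
    assert bernoulli_interpolant(x**k, 3) == x**k
    assert euler_interpolant(x**k, 3) == x**k
```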

#### **10. The algebraic approach of Yang and Youn**

Yang and Youn ([18]) also proposed an algebraic approach to Appell polynomial sequences, but with different methods. In fact, they associated the Appell sequence $s_n(x)$ with an invertible analytic function $g(t)$:

$$s_n(x) = \left[ \frac{d^n}{dt^n} \left( \frac{1}{g(t)} e^{xt} \right) \right]_{t=0}, \tag{147}$$

and called Appell vector for *g*(*t*) the vector

$$\overline{S}\_{\mathfrak{n}}(\mathfrak{x}) = \left[ \mathbf{s}\_{0}(\mathfrak{x}), \dots, \mathbf{s}\_{\mathfrak{n}}(\mathfrak{x}) \right]^{T}. \tag{148}$$

Then, they proved that

$$\overline{S}_n(x) = P_n \left[ \frac{1}{g(t)} \right]_{t=0} W_n \left[ e^{xt} \right]_{t=0} = W_n \left[ \frac{1}{g(t)} e^{xt} \right]_{t=0}, \tag{149}$$

where $W_n[f(t)] = \left[ f(t), f'(t), \dots, f^{(n)}(t) \right]^T$ and $P_n[f(t)]$ is the generalized Pascal functional matrix of $f(t)$ ([28]), defined by

$$(P_n[f(t)])_{i,j} = \begin{cases} \binom{i}{j} f^{(i-j)}(t) & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad i, j = 0, \dots, n. \tag{150}$$

Expressing (149) in matrix form we have

$$\overline{S}_n(x) = SX(x), \tag{151}$$

with

$$S = \begin{bmatrix} s\_{00} & 0 & 0 & \cdots & 0 \\ s\_{10} & s\_{11} & 0 & \cdots & 0 \\ s\_{20} & s\_{21} & s\_{22} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s\_{n0} & s\_{n1} & s\_{n2} & \cdots & s\_{nn} \end{bmatrix}, \quad X(x) = \begin{bmatrix} 1, x, \dots, x^n \end{bmatrix}^T,\tag{152}$$

where

$$s\_{i,j} = \binom{i}{j} \left[ \left( \frac{1}{g(t)} \right)^{(i-j)} \right]\_{t=0}, \qquad i = 0, \ldots, n, \quad j = 0, \ldots, i. \tag{153}$$

It is easy to see that the matrix *S* coincides with the matrix *M*−<sup>1</sup> introduced in Section 5, Theorem 7.
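As a small check of (151)–(153) (our own illustration, with a concrete choice of $g$): for $g(t) = e^t$ the Appell sequence is $s_n(x) = (x-1)^n$, and the matrix $S$ built from (153) reproduces it via $SX(x)$.

```python
# Build S from (153) for g(t) = e^t and recover the Appell vector via (151).
import sympy as sp

x, t = sp.symbols('x t')
n = 4
g = sp.exp(t)

# s_{i,j} = C(i,j) [(1/g)^{(i-j)}]_{t=0}, lower triangular as in (152)
S = sp.zeros(n + 1, n + 1)
for i in range(n + 1):
    for j in range(i + 1):
        S[i, j] = sp.binomial(i, j) * sp.diff(1 / g, t, i - j).subs(t, 0)

X = sp.Matrix([x**k for k in range(n + 1)])   # X(x) = [1, x, ..., x^n]^T
appell = (S * X).applyfunc(sp.expand)          # \overline{S}_n(x), eq. (151)
for k in range(n + 1):
    assert appell[k] == sp.expand((x - 1)**k)  # s_k(x) = (x-1)^k for g = e^t
```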

#### **11. Conclusions**

We have presented an elementary algebraic approach to the theory of Appell polynomials. Given a sequence of real numbers $\beta_i$, $i = 0, 1, \dots$, with $\beta_0 \neq 0$, a polynomial sequence in determinantal form, called an Appell sequence, has been built. The equivalence of this approach with existing ones was proven and, almost always using elementary tools of linear algebra, the most important properties of Appell polynomials were proven too. A dual theory referred to the finite difference operator $\Delta$ has been proposed. This theory has provided a class of polynomials called Appell polynomials of second kind. Finally, given a linear functional $L$, with $L(1) \neq 0$, and defined

$$L(x^i) = \beta_i, \quad \left(L((x)_i) = \beta_i\right), \tag{154}$$

the linear interpolation problem

$$L(P_n^{(i)}) = i!\,\omega_i, \qquad \left(L(\Delta^i P_n) = i!\,\omega_i\right), \quad P_n \in \mathcal{P}_n, \quad \omega_i \in \mathbb{R},\tag{155}$$

has been considered and its solution has been expressed in the basis of Appell polynomials related to the functional $L$ by (154). This problem can be extended to appropriate real functions, providing a new approximating polynomial whose remainder can be estimated too. This theory lends itself to extension to the more general class of Sheffer polynomials and to the bi-dimensional case.

#### **Author details**

Costabile Francesco Aldo and Longo Elisabetta *Department of Mathematics, University of Calabria, Rende, CS, Italy.*

#### **12. References**


[10] Mullin R, Rota G.C (1970) On the foundations of combinatorial theory III. Theory of binomial enumeration. B. Harris (Ed.) Graph Theory and its Applications, Academic Press: 167-213.
[11] Roman S (1984) The Umbral Calculus. Academic Press. New York.
[12] Roman S, Rota G.C (1978) The Umbral Calculus. Adv. Math. 27: 95-188.
[13] Di Bucchianico A, Loeb D (2000) A Selected Survey of Umbral Calculus. Electronic J. Combinatorics Dynamical Survey DS3: 1-34.
[14] Lehmer D.H (1988) A New Approach to Bernoulli Polynomials. Amer. Math. Monthly 95: 905-911.
[15] Costabile F.A (1999) On expansion of a real function in Bernoulli polynomials and application. Conferenze del Seminario Matem. - Univ. Bari 273.
[16] Costabile F.A, Dell'Accio F, Gualtieri M.I (2006) A new approach to Bernoulli polynomials. Rendiconti di matematica e delle sue applicazioni 26: 1-12.
[17] Costabile F.A, Longo E (2010) A determinantal approach to Appell polynomials. Journal of Computational and Applied Mathematics 234 (5): 1528-1542.
[18] Yang Y, Youn H (2009) Appell polynomial sequences: a linear algebra approach. JP Journal of Algebra, Number Theory and Applications 13 (1): 65-98.
[19] Costabile F.A, Longo E. Appell polynomials sequences of second kind and related interpolation problem. Under submission.
[20] Fort T (1942) Generalizations of the Bernoulli polynomials and numbers and corresponding summation formulas. Bull. Amer. Math. Soc. 48: 567-574.
[21] Costabile F.A, Longo E (2011) The Appell interpolation problem. Journal of Computational and Applied Mathematics 236 (6): 1024-1032.
[22] Higham N.J (1996) Accuracy and stability of numerical Algorithms. SIAM. Philadelphia.
[23] Jordan C (1965) Calculus of finite difference. Chelsea Pub. Co. New York.
[24] Boas R.P, Buck R.C (1964) Polynomial Expansions of Analytic Functions. Springer-Verlag, New York.
[25] Tempesta P (2008) On Appell sequences of polynomials of Bernoulli and Euler type. Journal of Mathematical Analysis and Applications 341: 1295-1310.
[26] He M.X, Ricci P.E (2002) Differential equation of Appell polynomials via the factorization method. J. Comput. Appl. Math. 139: 231-237.
[27] Davis P.J (1975) Interpolation & Approximation. Dover Publication, Inc. New York.
[28] Yang Y, Micek C (2007) Generalized Pascal functional matrix and its applications. Linear Algebra Appl. 423: 230-245.

## **An Interpretation of Rosenbrock's Theorem via Local Rings**

A. Amparan, S. Marcaida and I. Zaballa

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/81051

**1. Introduction**

Consider a linear time invariant system

$$
\dot{\mathbf{x}}(t) = A\mathbf{x}(t) + Bu(t) \tag{1}
$$

to be identified with the pair of matrices $(A, B)$, where $A \in \mathbf{F}^{n \times n}$, $B \in \mathbf{F}^{n \times m}$ and $\mathbf{F} = \mathbf{R}$ or $\mathbf{C}$ is the field of real or complex numbers. If state feedback $u(t) = Fx(t) + v(t)$ is applied to system (1), Rosenbrock's Theorem on pole assignment (see [14]) characterizes, for the closed-loop system

$$
\dot{\mathbf{x}}(t) = (A + BF)\mathbf{x}(t) + Bv(t), \tag{2}
$$

the invariant factors of its state-space matrix $A + BF$. This result can be seen as the solution of an inverse problem: that of finding a non-singular polynomial matrix with prescribed invariant factors and left Wiener–Hopf factorization indices at infinity. To see this we recall that the invariant factors form a complete system of invariants for the finite equivalence of polynomial matrices (this equivalence relation will be revisited in Section 2) and it will be seen in Section 4 that any polynomial matrix is left Wiener–Hopf equivalent at infinity to a diagonal matrix $\mathrm{Diag}(s^{k_1}, \dots, s^{k_m})$, where the non-negative integers $k_1, \dots, k_m$ (that can be assumed in non-increasing order) form a complete system of invariants for the left Wiener–Hopf equivalence at infinity. Consider now the transfer function matrix $G(s) = (sI - (A + BF))^{-1}B$ of (2). This is a rational matrix that can be written as an irreducible matrix fraction description $G(s) = N(s)P(s)^{-1}$, where $N(s)$ and $P(s)$ are right coprime polynomial matrices. In the terminology of [18], $P(s)$ is a polynomial matrix representation of (2), a concept that is closely related to that of the polynomial model introduced by Fuhrmann (see for example [8] and the references therein). It turns out that all polynomial matrix representations of a system are right equivalent (see [8, 18]), that is, if $P_1(s)$ and $P_2(s)$ are polynomial matrix representations of the same system there exists a unimodular matrix $U(s)$ such that $P_2(s) = P_1(s)U(s)$. Therefore all polynomial matrix representations of (2) have the same invariant factors, which are the invariant factors of $sI_n - (A + BF)$ except for some trivial ones. Furthermore, all polynomial

©2012 Amparan et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


matrix representations also have the same left Wiener–Hopf factorization indices at infinity, which are equal to the controllability indices of (2) and (1), because the controllability indices are invariant under feedback. With all this in mind it is not hard to see that Rosenbrock's Theorem on pole assignment is equivalent to finding necessary and sufficient conditions for the existence of a non-singular polynomial matrix with prescribed invariant factors and left Wiener–Hopf factorization indices at infinity. This result will be precisely stated in Section 5 once all the elements that appear are properly defined. In addition, there is a similar result to Rosenbrock's Theorem on pole assignment but involving the infinite structure (see [1]).

Our goal is to generalize both results (the finite and infinite versions of Rosenbrock's Theorem) for rational matrices defined on arbitrary fields via local rings. This will be done in Section 5 and an extension to arbitrary fields of the concept of Wiener–Hopf equivalence will be needed. This concept is very well established for complex valued rational matrix functions (see for example [6, 10]). Originally it requires a closed contour, *γ*, that divides the extended complex plane (**C** ∪ {∞}) into two parts: the inner domain (Ω+) and the region outside *γ* (Ω−), which contains the point at infinity. Then two non-singular *m* × *m* complex rational matrices *T*1(*s*) and *T*2(*s*), with no poles and no zeros in *γ*, are said to be left Wiener–Hopf equivalent with respect to *γ* if there are *m* × *m* matrices *U*−(*s*) and *U*+(*s*) with no poles and no zeros in Ω<sup>−</sup> ∪ *γ* and Ω<sup>+</sup> ∪ *γ*, respectively, such that

$$T\_2(s) = \mathcal{U}\_-(s)T\_1(s)\mathcal{U}\_+(s). \tag{3}$$

It can be seen, then, that any non-singular *m* × *m* complex rational matrix *T*(*s*) is left Wiener–Hopf equivalent with respect to *γ* to a diagonal matrix

$$\text{Diag}\left(\left(\text{s}-\text{z}\_{0}\right)^{k\_{1}}, \dots, \left(\text{s}-\text{z}\_{0}\right)^{k\_{m}}\right) \tag{4}$$

where $z_0$ is any complex number in $\Omega^+$ and $k_1 \ge \cdots \ge k_m$ are integers uniquely determined by $T(s)$. They are called the left Wiener–Hopf factorization indices of $T(s)$ with respect to $\gamma$ (see again [6, 10]). The generalization to arbitrary fields relies on the following idea: we can identify $\Omega^+ \cup \gamma$ and $(\Omega^- \cup \gamma) \setminus \{\infty\}$ with two sets $M$ and $M'$, respectively, of maximal ideals of $\mathbf{C}[s]$. In fact, to each $z_0 \in \mathbf{C}$ we associate the ideal generated by $s - z_0$, which is a maximal ideal of $\mathbf{C}[s]$. Notice that $s - z_0$ is also a prime polynomial of $\mathbf{C}[s]$ but $M$ and $M'$, as defined, cannot contain the zero ideal, which is prime. Thus we are led to consider the set $\mathrm{Specm}(\mathbf{C}[s])$ of maximal ideals of $\mathbf{C}[s]$. By using this identification we define the left Wiener–Hopf equivalence of rational matrices over an arbitrary field $\mathbf{F}$ with respect to a subset $M$ of $\mathrm{Specm}(\mathbf{F}[s])$, the set of all maximal ideals of $\mathbf{F}[s]$. In this study local rings play a fundamental role. They will be introduced in Section 2. Localization techniques have been used previously in the algebraic theory of linear systems (see, for example, [7]). In Section 3 the algebraic structure of the rings of proper rational functions with prescribed finite poles is studied (i.e., for a fixed $M \subseteq \mathrm{Specm}(\mathbf{F}[s])$, the ring of proper rational functions $\frac{p(s)}{q(s)}$ with $\gcd(q(s), \pi(s)) = 1$ for all $(\pi(s)) \in M$). It will be shown that if there is an ideal generated by a linear polynomial outside $M$ then the set of proper rational functions with no poles in $M$ is a Euclidean domain and all rational matrices can be classified according to their Smith–McMillan invariants. In this case, two types of invariants live together for any non-singular rational matrix and any set $M \subseteq \mathrm{Specm}(\mathbf{F}[s])$: its Smith–McMillan and left Wiener–Hopf invariants. In Section 5 we show that a Rosenbrock-like Theorem holds true that completely characterizes the relationship between these two types of invariants.

#### **2. Preliminaries**

In the sequel **F**[*s*] will denote the ring of polynomials with coefficients in an arbitrary field **F** and Specm(**F**[*s*]) the set of all maximal ideals of **F**[*s*], that is,

$$\mathrm{Specm}(\mathbb{F}[s]) = \left\{ (\pi(s)) : \pi(s) \in \mathbb{F}[s], \text{ irreducible, monic, different from } 1 \right\}. \tag{5}$$

Let *π*(*s*) ∈ **F**[*s*] be a monic irreducible non-constant polynomial. Let *S* = **F**[*s*] \ (*π*(*s*)) be the multiplicative subset of **F**[*s*] whose elements are coprime with *π*(*s*). We denote by **F***π*(*s*) the quotient ring of **F**[*s*] by *S*; i.e., *S*−1**F**[*s*]:

$$\mathcal{F}\_{\pi}(s) = \left\{ \frac{p(s)}{q(s)} : p(s), q(s) \in \mathbb{F}[s], \gcd(q(s), \pi(s)) = 1 \right\}.\tag{6}$$

This is the localization of $\mathbf{F}[s]$ at $(\pi(s))$ (see [5]). The units of $\mathbf{F}_\pi(s)$ are the rational functions $u(s) = \frac{p(s)}{q(s)}$ such that $\gcd(p(s), \pi(s)) = 1$ and $\gcd(q(s), \pi(s)) = 1$. Consequently,

$$\mathbb{F}_{\pi}(s) = \left\{ u(s)\pi(s)^d : u(s) \text{ a unit and } d \ge 0 \right\} \cup \{ 0 \}. \tag{7}$$

For any *M* ⊆ Specm(**F**[*s*]), let

$$\mathbb{F}_{M}(s) = \bigcap_{(\pi(s)) \in M} \mathbb{F}_{\pi}(s) = \left\{ \frac{p(s)}{q(s)} : p(s), q(s) \in \mathbb{F}[s],\ \gcd(q(s), \pi(s)) = 1\ \forall\, (\pi(s)) \in M \right\}. \tag{8}$$

This is a ring whose units are the rational functions *u*(*s*) = *p*(*s*)/*q*(*s*) such that for all ideals (*π*(*s*)) ∈ *M*, gcd(*p*(*s*), *π*(*s*)) = 1 and gcd(*q*(*s*), *π*(*s*)) = 1. Notice that, in particular, if *M* = Specm(**F**[*s*]) then **F***M*(*s*) = **F**[*s*] and if *M* = ∅ then **F***M*(*s*) = **F**(*s*), the field of rational functions.

Moreover, if *α*(*s*) ∈ **F**[*s*] is a non-constant polynomial whose prime factorization, *α*(*s*) = *k* *α*1(*s*)^{*d*1} ··· *αm*(*s*)^{*dm*}, satisfies the condition that (*αi*(*s*)) ∈ *M* for all *i*, we will say that *α*(*s*) factorizes in *M* or that *α*(*s*) has all its zeros in *M*. We will consider that the only polynomials that factorize in *M* = ∅ are the constants. We say that a non-zero rational function factorizes in *M* if both its numerator and denominator factorize in *M*; in this case we will say that the rational function has all its zeros and poles in *M*. Similarly, we will say that *p*(*s*)/*q*(*s*) has no poles in *M* if *p*(*s*) ≠ 0 and gcd(*q*(*s*), *π*(*s*)) = 1 for all ideals (*π*(*s*)) ∈ *M*, and that it has no zeros in *M* if gcd(*p*(*s*), *π*(*s*)) = 1 for all ideals (*π*(*s*)) ∈ *M*. In other words, *p*(*s*)/*q*(*s*) has no poles and no zeros in *M* if and only if *p*(*s*)/*q*(*s*) is a unit of **F***M*(*s*). So, a non-zero rational function factorizes in *M* if and only if it is a unit in **F**Specm(**F**[*s*])\*M*(*s*).
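The membership and unit conditions above reduce to gcd computations, so they are easy to check mechanically. The following is a small SymPy sketch (our own illustration, not part of the chapter's development), representing *M* by the monic irreducible generators of its ideals; the helper names `in_FM` and `is_unit_FM` are ours:

```python
from sympy import symbols, Poly, gcd

s = symbols('s')

def in_FM(p, q, M):
    """p/q lies in F_M(s): the denominator is coprime with every pi in M."""
    return all(gcd(Poly(q, s), Poly(pi, s)).degree() == 0 for pi in M)

def is_unit_FM(p, q, M):
    """p/q is a unit of F_M(s): numerator AND denominator coprime with every pi."""
    return in_FM(p, q, M) and in_FM(q, p, M)

M = [s - 1, s**2 + 1]               # M = {(s-1), (s^2+1)} in Specm(Q[s])
print(in_FM(s + 2, s - 1, M))       # False: pole at the ideal (s-1)
print(in_FM(s - 1, s + 2, M))       # True
print(is_unit_FM(s - 1, s + 2, M))  # False: zero at (s-1)
print(is_unit_FM(s + 3, s + 2, M))  # True
```

With *M* = ∅ the `all(...)` is vacuously true, matching **F**∅(*s*) = **F**(*s*).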

Let **<sup>F</sup>***M*(*s*)*m*×*<sup>m</sup>* denote the set of *<sup>m</sup>* <sup>×</sup> *<sup>m</sup>* matrices with elements in **<sup>F</sup>***M*(*s*). A matrix is invertible in **F***M*(*s*)*m*×*<sup>m</sup>* if all its elements are in **F***M*(*s*) and its determinant is a unit in **F***M*(*s*). We denote by Gl*m*(**F***M*(*s*)) the group of units of **F***M*(*s*)*m*×*m*.

**Remark 1.** Let *M*1, *M*<sup>2</sup> ⊆ Specm(**F**[*s*]). Notice that

1. If *M*<sup>1</sup> ⊆ *M*<sup>2</sup> then **F***M*<sup>1</sup> (*s*) ⊇ **F***M*<sup>2</sup> (*s*) and Gl*m*(**F***M*<sup>1</sup> (*s*)) ⊇ Gl*m*(**F***M*<sup>2</sup> (*s*)).

2. **<sup>F</sup>***M*1∪*M*<sup>2</sup> (*s*) = **<sup>F</sup>***M*<sup>1</sup> (*s*) ∩ **<sup>F</sup>***M*<sup>2</sup> (*s*) and Gl*m*(**F***M*1∪*M*<sup>2</sup> (*s*)) = Gl*m*(**F***M*<sup>1</sup> (*s*)) ∩ Gl*m*(**F***M*<sup>2</sup> (*s*)).

#### Linear Algebra – Theorems and Applications

For any *M* ⊆ Specm(**F**[*s*]) the ring **F***M*(*s*) is a principal ideal domain (see [3]) and its field of fractions is **<sup>F</sup>**(*s*). Two matrices *<sup>T</sup>*1(*s*), *<sup>T</sup>*2(*s*) <sup>∈</sup> **<sup>F</sup>**(*s*)*m*×*<sup>m</sup>* are equivalent with respect to *<sup>M</sup>* if there exist matrices *U*(*s*), *V*(*s*) ∈ Gl*m*(**F***M*(*s*)) such that *T*2(*s*) = *U*(*s*)*T*1(*s*)*V*(*s*). Since **F***M*(*s*) is a principal ideal domain, for all non-singular *<sup>G</sup>*(*s*) <sup>∈</sup> **<sup>F</sup>***M*(*s*)*m*×*<sup>m</sup>* (see [13]) there exist matrices *U*(*s*), *V*(*s*) ∈ Gl*m*(**F***M*(*s*)) such that

$$G(s) = U(s) \operatorname{Diag}(\alpha_1(s), \dots, \alpha_m(s)) V(s) \tag{9}$$

with *α*1(*s*) | ··· | *αm*(*s*) ("|" stands for divisibility) monic polynomials factorizing in *M*, unique up to multiplication by units of **F***M*(*s*). The diagonal matrix is the Smith normal form of *G*(*s*) with respect to *M* and *α*1(*s*),..., *αm*(*s*) are called the invariant factors of *G*(*s*) with respect to *M*. Now we introduce the Smith–McMillan form with respect to *M*. Assume that *T*(*s*) ∈ **F**(*s*)*m*×*m* is a non-singular rational matrix. Then *T*(*s*) = *G*(*s*)/*d*(*s*) with *G*(*s*) ∈ **F***M*(*s*)*m*×*m* and *d*(*s*) ∈ **F**[*s*] monic, factorizing in *M*. Let *G*(*s*) = *U*(*s*) Diag(*α*1(*s*),..., *αm*(*s*))*V*(*s*) be the Smith normal form with respect to *M* of *G*(*s*), i.e., *U*(*s*), *V*(*s*) invertible in **F***M*(*s*)*m*×*m* and *α*1(*s*) |···| *αm*(*s*) monic polynomials factorizing in *M*. Then

$$T(s) = U(s)\operatorname{Diag}\left(\frac{\epsilon_1(s)}{\psi_1(s)}, \dots, \frac{\epsilon_m(s)}{\psi_m(s)}\right)V(s) \tag{10}$$

where *εi*(*s*)/*ψi*(*s*) are irreducible rational functions, the result of dividing *αi*(*s*) by *d*(*s*) and canceling the common factors. They satisfy that *ε*1(*s*) | ··· | *εm*(*s*) and *ψm*(*s*) | ··· | *ψ*1(*s*) are monic polynomials factorizing in *M*. The diagonal matrix in (10) is the Smith–McMillan form with respect to *M*. The rational functions *εi*(*s*)/*ψi*(*s*), *i* = 1, . . . , *m*, are called the invariant rational functions of *T*(*s*) with respect to *M* and constitute a complete system of invariants of the equivalence with respect to *M* for rational matrices.
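To get a concrete feel for invariant factors with respect to *M*, a 2 × 2 example can be computed by hand or in SymPy. A rough sketch under our own conventions (the shortcut *d*1 = gcd of the entries, *d*2 = det/*d*1 is specific to the 2 × 2 case, and `part_in_M` is a hypothetical helper, not the chapter's notation):

```python
from sympy import symbols, Matrix, gcd_list, cancel, factor_list

s = symbols('s')

# For a 2x2 polynomial matrix the invariant factors are
# d1 = gcd of the entries and d2 = det(G)/d1.
G = Matrix([[s, s], [s, s*(s - 1)]])
d1 = gcd_list(list(G))            # gcd of all four entries
d2 = cancel(G.det() / d1)         # equals s*(s - 2)

def part_in_M(poly, M):
    """Monic part of `poly` built from the irreducible factors lying in M."""
    _, factors = factor_list(poly)
    out = 1
    for base, exp in factors:
        if base in M:
            out *= base**exp
    return out

M = {s}                           # M = {(s)} in Specm(Q[s])
print(part_in_M(d1, M), part_in_M(d2, M))   # invariant factors w.r.t. M: s, s
```

Here the factor *s* − 2 of *d*2 is a unit of **F***M*(*s*), so it drops out of the Smith form with respect to *M* = {(*s*)}, exactly as in (9).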

In particular, if *M* = Specm(**F**[*s*]) then **F**Specm(**F**[*s*])(*s*) = **F**[*s*], the matrices *U*(*s*), *V*(*s*) ∈ Gl*m*(**F**[*s*]) are unimodular matrices, (10) is the global Smith–McMillan form of a rational matrix (see [15] or [14] when **F** = **R** or **C**) and *εi*(*s*)/*ψi*(*s*) are the global invariant rational functions of *T*(*s*).

From now on rational matrices will be assumed to be non-singular unless the opposite is specified. Given any *M* ⊆ Specm(**F**[*s*]) we say that an *m* × *m* non-singular rational matrix has no zeros and no poles in *M* if its global invariant rational functions are units of **F***M*(*s*). If its global invariant rational functions factorize in *M*, the matrix has its global finite structure localized in *M* and we say that the matrix has all zeros and poles in *M*. The former means that *T*(*s*) ∈ Gl*m*(**F***M*(*s*)) and the latter that *T*(*s*) ∈ Gl*m*(**F**Specm(**F**[*s*])\*M*(*s*)), because det *T*(*s*) = det *U*(*s*) det *V*(*s*) (*ε*1(*s*)···*εm*(*s*))/(*ψ*1(*s*)···*ψm*(*s*)) and det *U*(*s*), det *V*(*s*) are non-zero constants. The following result clarifies the relationship between the global finite structure of any rational matrix and its local structure with respect to any *M* ⊆ Specm(**F**[*s*]).

**Proposition 2.** *Let M* ⊆ Specm(**F**[*s*])*. Let T*(*s*) ∈ **F**(*s*)*m*×*m* *be non-singular with α*1(*s*)/*β*1(*s*),..., *αm*(*s*)/*βm*(*s*) *its global invariant rational functions and let ε*1(*s*)/*ψ*1(*s*),..., *εm*(*s*)/*ψm*(*s*) *be irreducible rational functions such that ε*1(*s*) | ··· | *εm*(*s*)*, ψm*(*s*) | ··· | *ψ*1(*s*) *are monic polynomials factorizing in M. The following properties are equivalent:*

*1. T*(*s*) = *TL*(*s*)*TR*(*s*)*, where the global invariant rational functions of TL*(*s*) ∈ **F**(*s*)*m*×*m* *are ε*1(*s*)/*ψ*1(*s*),..., *εm*(*s*)/*ψm*(*s*) *and TR*(*s*) ∈ Gl*m*(**F***M*(*s*))*.*

*2. There exist U*1(*s*), *U*2(*s*) ∈ Gl*m*(**F***M*(*s*)) *such that*

$$T(s) = U_1(s) \operatorname{Diag} \left( \frac{\epsilon_1(s)}{\psi_1(s)}, \dots, \frac{\epsilon_m(s)}{\psi_m(s)} \right) U_2(s), \tag{11}$$

*i.e., ε*1(*s*)/*ψ*1(*s*),..., *εm*(*s*)/*ψm*(*s*) *are the invariant rational functions of T*(*s*) *with respect to M.*

*3. αi*(*s*) = *εi*(*s*)*ε*′*i*(*s*) *and βi*(*s*) = *ψi*(*s*)*ψ*′*i*(*s*) *with ε*′*i*(*s*), *ψ*′*i*(*s*) ∈ **F**[*s*] *units of* **F***M*(*s*)*, for i* = 1, . . . , *m.*

**Proof**.- 1 ⇒ 2. Since the global invariant rational functions of *TL*(*s*) are *ε*1(*s*)/*ψ*1(*s*),..., *εm*(*s*)/*ψm*(*s*), there exist *W*1(*s*), *W*2(*s*) ∈ Gl*m*(**F**[*s*]) such that *TL*(*s*) = *W*1(*s*) Diag(*ε*1(*s*)/*ψ*1(*s*),..., *εm*(*s*)/*ψm*(*s*))*W*2(*s*). As **F**Specm(**F**[*s*])(*s*) = **F**[*s*], by Remark 1.1, *W*1(*s*), *W*2(*s*) ∈ Gl*m*(**F***M*(*s*)). Therefore, putting *U*1(*s*) = *W*1(*s*) and *U*2(*s*) = *W*2(*s*)*TR*(*s*) it follows that *U*1(*s*) and *U*2(*s*) are invertible in **F***M*(*s*)*m*×*m* and *T*(*s*) = *U*1(*s*) Diag(*ε*1(*s*)/*ψ*1(*s*),..., *εm*(*s*)/*ψm*(*s*))*U*2(*s*).

2 ⇒ 3. There exist unimodular matrices *V*1(*s*), *V*2(*s*) ∈ **F**[*s*] *<sup>m</sup>*×*<sup>m</sup>* such that

$$T(s) = V_1(s) \operatorname{Diag} \left( \frac{\alpha_1(s)}{\beta_1(s)}, \dots, \frac{\alpha_m(s)}{\beta_m(s)} \right) V_2(s) \tag{12}$$

with *αi*(*s*)/*βi*(*s*) irreducible rational functions such that *α*1(*s*) |···| *αm*(*s*) and *βm*(*s*) |···| *β*1(*s*) are monic polynomials. Write *αi*(*s*)/*βi*(*s*) = *pi*(*s*)*p*′*i*(*s*)/(*qi*(*s*)*q*′*i*(*s*)) such that *pi*(*s*), *qi*(*s*) factorize in *M* and *p*′*i*(*s*), *q*′*i*(*s*) factorize in Specm(**F**[*s*]) \ *M*. Then

$$T(s) = V\_1(s) \operatorname{Diag}\left(\frac{p\_1(s)}{q\_1(s)}, \dots, \frac{p\_m(s)}{q\_m(s)}\right) \operatorname{Diag}\left(\frac{p\_1'(s)}{q\_1'(s)}, \dots, \frac{p\_m'(s)}{q\_m'(s)}\right) V\_2(s) \tag{13}$$

with *V*1(*s*) and Diag(*p*′1(*s*)/*q*′1(*s*),..., *p*′*m*(*s*)/*q*′*m*(*s*))*V*2(*s*) invertible in **F***M*(*s*)*m*×*m*. Since the Smith–McMillan form with respect to *M* is unique we get that *pi*(*s*)/*qi*(*s*) = *εi*(*s*)/*ψi*(*s*).

3 ⇒ 1. Write (12) as

$$T(s) = V\_1(s) \operatorname{Diag}\left(\frac{\epsilon\_1(s)}{\psi\_1(s)}, \dots, \frac{\epsilon\_m(s)}{\psi\_m(s)}\right) \operatorname{Diag}\left(\frac{\epsilon'\_1(s)}{\psi'\_1(s)}, \dots, \frac{\epsilon'\_m(s)}{\psi'\_m(s)}\right) V\_2(s) . \tag{14}$$

It follows that *T*(*s*) = *TL*(*s*)*TR*(*s*) with *TL*(*s*) = *V*1(*s*) Diag(*ε*1(*s*)/*ψ*1(*s*),..., *εm*(*s*)/*ψm*(*s*)) and *TR*(*s*) = Diag(*ε*′1(*s*)/*ψ*′1(*s*),..., *ε*′*m*(*s*)/*ψ*′*m*(*s*))*V*2(*s*) ∈ Gl*m*(**F***M*(*s*)).

**Corollary 3.** *Let T*(*s*) ∈ **F**(*s*)*m*×*m* *be non-singular and M*1, *M*2 ⊆ Specm(**F**[*s*]) *such that M*1 ∩ *M*2 = ∅*. If* ε^*i*_1(*s*)/ψ^*i*_1(*s*),..., ε^*i*_*m*(*s*)/ψ^*i*_*m*(*s*) *are the invariant rational functions of T*(*s*) *with respect to Mi, i* = 1, 2*, then* ε^1_1(*s*)ε^2_1(*s*)/(ψ^1_1(*s*)ψ^2_1(*s*)),..., ε^1_*m*(*s*)ε^2_*m*(*s*)/(ψ^1_*m*(*s*)ψ^2_*m*(*s*)) *are the invariant rational functions of T*(*s*) *with respect to M*1 ∪ *M*2*.*


**Proof**.- Let *α*1(*s*)/*β*1(*s*),..., *αm*(*s*)/*βm*(*s*) be the global invariant rational functions of *T*(*s*). By Proposition 2, *αi*(*s*) = ε^1_*i*(*s*)*n*^1_*i*(*s*) and *βi*(*s*) = ψ^1_*i*(*s*)*d*^1_*i*(*s*), with *n*^1_*i*(*s*), *d*^1_*i*(*s*) ∈ **F**[*s*] units of **F***M*1(*s*). On the other hand, *αi*(*s*) = ε^2_*i*(*s*)*n*^2_*i*(*s*) and *βi*(*s*) = ψ^2_*i*(*s*)*d*^2_*i*(*s*), with *n*^2_*i*(*s*), *d*^2_*i*(*s*) ∈ **F**[*s*] units of **F***M*2(*s*). So ε^1_*i*(*s*)*n*^1_*i*(*s*) = ε^2_*i*(*s*)*n*^2_*i*(*s*) or, equivalently, *n*^1_*i*(*s*) = ε^2_*i*(*s*)*n*^2_*i*(*s*)/ε^1_*i*(*s*) and *n*^2_*i*(*s*) = ε^1_*i*(*s*)*n*^1_*i*(*s*)/ε^2_*i*(*s*). The polynomials ε^1_*i*(*s*), ε^2_*i*(*s*) are coprime because ε^1_*i*(*s*) factorizes in *M*1, ε^2_*i*(*s*) factorizes in *M*2 and *M*1 ∩ *M*2 = ∅. In consequence, ε^1_*i*(*s*) | *n*^2_*i*(*s*) and ε^2_*i*(*s*) | *n*^1_*i*(*s*). Therefore, there exist polynomials *a*(*s*), unit of **F***M*2(*s*), and *a*′(*s*), unit of **F***M*1(*s*), such that *n*^2_*i*(*s*) = ε^1_*i*(*s*)*a*(*s*) and *n*^1_*i*(*s*) = ε^2_*i*(*s*)*a*′(*s*). Then *αi*(*s*) = ε^1_*i*(*s*)*n*^1_*i*(*s*) = ε^1_*i*(*s*)ε^2_*i*(*s*)*a*′(*s*) and *αi*(*s*) = ε^2_*i*(*s*)*n*^2_*i*(*s*) = ε^2_*i*(*s*)ε^1_*i*(*s*)*a*(*s*). This implies that *a*(*s*) = *a*′(*s*), a unit of **F***M*1(*s*) ∩ **F***M*2(*s*) = **F***M*1∪*M*2(*s*). Following the same ideas we can prove that *βi*(*s*) = ψ^1_*i*(*s*)ψ^2_*i*(*s*)*b*(*s*) with *b*(*s*) a unit of **F***M*1∪*M*2(*s*). By Proposition 2, ε^1_1(*s*)ε^2_1(*s*)/(ψ^1_1(*s*)ψ^2_1(*s*)),..., ε^1_*m*(*s*)ε^2_*m*(*s*)/(ψ^1_*m*(*s*)ψ^2_*m*(*s*)) are the invariant rational functions of *T*(*s*) with respect to *M*1 ∪ *M*2.
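In the scalar case (*m* = 1) Corollary 3 says that the invariant rational function with respect to *M*1 ∪ *M*2 is the product of those with respect to the disjoint sets *M*1 and *M*2, which is easy to confirm numerically. A SymPy sketch under our own conventions (sets of monic irreducible generators; `local_part` is a hypothetical helper):

```python
from sympy import symbols, factor_list, fraction, cancel

s = symbols('s')

def local_part(t, M):
    """Scalar case (m = 1): the invariant rational function of t with
    respect to M keeps exactly the zeros and poles lying in M."""
    def keep(poly):
        _, factors = factor_list(poly)
        out = 1
        for base, exp in factors:
            if base in M:
                out *= base**exp
        return out
    p, q = fraction(cancel(t))
    return keep(p) / keep(q)

t = s*(s - 1) / ((s + 1)*(s + 2))
M1, M2 = {s}, {s - 1, s + 1}        # disjoint subsets of Specm(Q[s])
t1, t2 = local_part(t, M1), local_part(t, M2)
# The product of the local parts is the local part w.r.t. the union:
print(cancel(t1*t2 - local_part(t, M1 | M2)))   # 0
```

Here *t*1 = *s* and *t*2 = (*s* − 1)/(*s* + 1), and their product recovers the invariant rational function *s*(*s* − 1)/(*s* + 1) with respect to *M*1 ∪ *M*2.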

**Corollary 4.** *Let M*1, *M*<sup>2</sup> ⊆ Specm(**F**[*s*])*. Two non-singular matrices are equivalent with respect to M*<sup>1</sup> ∪ *M*<sup>2</sup> *if and only if they are equivalent with respect to M*<sup>1</sup> *and with respect to M*2*.*

**Proof**.- Notice that by Remark 1.2 two matrices *<sup>T</sup>*1(*s*), *<sup>T</sup>*2(*s*) <sup>∈</sup> **<sup>F</sup>**(*s*)*m*×*<sup>m</sup>* are equivalent with respect to *<sup>M</sup>*<sup>1</sup> <sup>∪</sup> *<sup>M</sup>*<sup>2</sup> if and only if there exist *<sup>U</sup>*1(*s*), *<sup>U</sup>*2(*s*) invertible in **<sup>F</sup>***M*<sup>1</sup> (*s*)*m*×*<sup>m</sup>* <sup>∩</sup>**F***M*<sup>2</sup> (*s*)*m*×*<sup>m</sup>* such that *T*2(*s*) = *U*1(*s*)*T*1(*s*)*U*2(*s*). Since *U*1(*s*) and *U*2(*s*) are invertible in both **F***M*<sup>1</sup> (*s*)*m*×*<sup>m</sup>* and **F***M*<sup>2</sup> (*s*)*m*×*<sup>m</sup>* then *T*1(*s*) and *T*2(*s*) are equivalent with respect to *M*<sup>1</sup> and with respect to *M*2.

Conversely, if *T*1(*s*) and *T*2(*s*) are equivalent with respect to *M*1 and with respect to *M*2 then, by the necessity of this result, they are equivalent with respect to *M*1 \ (*M*1 ∩ *M*2), with respect to *M*2 \ (*M*1 ∩ *M*2) and with respect to *M*1 ∩ *M*2. Let ε^1_1(*s*)/ψ^1_1(*s*),..., ε^1_*m*(*s*)/ψ^1_*m*(*s*) be the invariant rational functions of *T*1(*s*) and *T*2(*s*) with respect to *M*1 \ (*M*1 ∩ *M*2), let ε^2_1(*s*)/ψ^2_1(*s*),..., ε^2_*m*(*s*)/ψ^2_*m*(*s*) be those with respect to *M*2 \ (*M*1 ∩ *M*2) and let ε^3_1(*s*)/ψ^3_1(*s*),..., ε^3_*m*(*s*)/ψ^3_*m*(*s*) be those with respect to *M*1 ∩ *M*2. By Corollary 3, ε^1_1(*s*)ε^2_1(*s*)ε^3_1(*s*)/(ψ^1_1(*s*)ψ^2_1(*s*)ψ^3_1(*s*)),..., ε^1_*m*(*s*)ε^2_*m*(*s*)ε^3_*m*(*s*)/(ψ^1_*m*(*s*)ψ^2_*m*(*s*)ψ^3_*m*(*s*)) must be the invariant rational functions of *T*1(*s*) and *T*2(*s*) with respect to *M*1 ∪ *M*2. Therefore, *T*1(*s*) and *T*2(*s*) are equivalent with respect to *M*1 ∪ *M*2.

Let **F***pr*(*s*) be the ring of proper rational functions, that is, rational functions with the degree of the numerator at most the degree of the denominator. The units in this ring are the rational functions whose numerators and denominators have the same degree. They are called biproper rational functions. A matrix *<sup>B</sup>*(*s*) <sup>∈</sup> **<sup>F</sup>***pr*(*s*)*m*×*<sup>m</sup>* is said to be biproper if it is a unit in **F***pr*(*s*)*m*×*<sup>m</sup>* or, what is the same, if its determinant is a biproper rational function.

Recall that a rational function *t*(*s*) has a pole (zero) at ∞ if *t*(1/*s*) has a pole (zero) at 0. Following this idea, we can define the local ring at ∞ as the set of rational functions *t*(*s*) such that *t*(1/*s*) does not have 0 as a pole, that is, **F**∞(*s*) = {*t*(*s*) ∈ **F**(*s*) : *t*(1/*s*) ∈ **F***s*(*s*)}. If *t*(*s*) = *p*(*s*)/*q*(*s*) with *p*(*s*) = *a*_*t* *s*^*t* + *a*_{*t*+1} *s*^{*t*+1} + ··· + *a*_*p* *s*^*p*, *a*_*p* ≠ 0, *q*(*s*) = *b*_*r* *s*^*r* + *b*_{*r*+1} *s*^{*r*+1} + ··· + *b*_*q* *s*^*q*, *b*_*q* ≠ 0, *p* = *d*(*p*(*s*)), *q* = *d*(*q*(*s*)), where *d*(·) stands for "degree of", then

$$t\left(\frac{1}{s}\right) = \frac{\frac{a_t}{s^t} + \frac{a_{t+1}}{s^{t+1}} + \dots + \frac{a_p}{s^p}}{\frac{b_r}{s^r} + \frac{b_{r+1}}{s^{r+1}} + \dots + \frac{b_q}{s^q}} = \frac{a_t s^{p-t} + a_{t+1} s^{p-t-1} + \dots + a_p}{b_r s^{q-r} + b_{r+1} s^{q-r-1} + \dots + b_q}\, s^{q-p} = \frac{f(s)}{g(s)}\, s^{q-p}. \tag{15}$$

$$\text{As } \mathbb{F}_{s}(s) = \left\{ \frac{f(s)}{g(s)}\, s^d : f(0) \neq 0,\ g(0) \neq 0 \text{ and } d \geq 0 \right\} \cup \{0\}, \text{ then }$$

$$\mathbb{F}_{\infty}(s) = \left\{ \frac{p(s)}{q(s)} \in \mathbb{F}(s) : d(q(s)) \geq d(p(s)) \right\}. \tag{16}$$

Thus, this set is the ring of proper rational functions, **F***pr*(*s*).
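Equations (15)–(16) amount to a degree test: *t*(*s*) is proper precisely when *d*(*q*(*s*)) ≥ *d*(*p*(*s*)), i.e. when *t*(1/*s*) has no pole at 0. A quick SymPy check of this characterization (our own illustration; `is_proper` is a hypothetical helper):

```python
from sympy import symbols, degree, fraction, cancel

s = symbols('s')

def is_proper(t):
    """t lies in F_pr(s) iff d(q) >= d(p), i.e. t(1/s) has no pole at 0."""
    p, q = fraction(cancel(t))
    return degree(q, s) >= degree(p, s)

t = (s + 1) / (s**2 + 3)
print(is_proper(t))        # True: degree 2 >= degree 1
print(is_proper(1/t))      # False: a pole at infinity

# Substituting 1/s and cancelling, as in (15), shows t(1/s) has no pole
# at 0 (in fact a zero of order q - p = 1 there):
u = cancel(t.subs(s, 1/s))
print(u.subs(s, 0))        # 0
```

The exponent *q* − *p* in (15) is exactly the order of the zero (or, if negative, the pole) of *t*(1/*s*) at 0.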


Two rational matrices *<sup>T</sup>*1(*s*), *<sup>T</sup>*2(*s*) <sup>∈</sup> **<sup>F</sup>**(*s*)*m*×*<sup>m</sup>* are equivalent at infinity if there exist biproper matrices *B*1(*s*), *B*2(*s*) ∈ Gl*m*(**F***pr*(*s*)) such that *T*2(*s*) = *B*1(*s*)*T*1(*s*)*B*2(*s*). Given a non-singular rational matrix *<sup>T</sup>*(*s*) <sup>∈</sup> **<sup>F</sup>**(*s*)*m*×*<sup>m</sup>* (see [15]) there always exist *<sup>B</sup>*1(*s*), *<sup>B</sup>*2(*s*) <sup>∈</sup> Gl*m*(**F***pr*(*s*)) such that

$$T(\mathbf{s}) = B\_1(\mathbf{s}) \operatorname{Diag}(\mathbf{s}^{q\_1}, \dots, \mathbf{s}^{q\_m}) B\_2(\mathbf{s}) \tag{17}$$

where *q*1 ≥ ··· ≥ *qm* are integers. They are called the invariant orders of *T*(*s*) at infinity, and the rational functions *s*<sup>*q*1</sup>, ..., *s*<sup>*qm*</sup> are called the invariant rational functions of *T*(*s*) at infinity.
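As an illustration (not part of the original text), the invariant orders at infinity can be computed from minors: for the valuation *δ*∞ introduced below, the minimal valuation of the *k* × *k* minors is preserved by biproper equivalence, so *q*1 + ··· + *qk* equals minus that minimum (Smith form over the discrete valuation ring of proper rational functions). A minimal sympy sketch, with function names of our own choosing:

```python
import sympy as sp
from itertools import combinations

s = sp.symbols('s')

def delta_inf(t):
    """Valuation of F(s) at infinity: deg(denominator) - deg(numerator)."""
    t = sp.cancel(t)
    if t == 0:
        return sp.oo
    num, den = sp.fraction(t)
    return sp.degree(den, s) - sp.degree(num, s)

def invariant_orders_at_infinity(T):
    """q_1 >= ... >= q_m of (17): q_1 + ... + q_k equals minus the minimal
    delta_inf-valuation over all k x k minors, an invariant of biproper
    equivalence."""
    m = T.rows
    sums = [0]
    for k in range(1, m + 1):
        vals = [delta_inf(T[list(rows), list(cols)].det())
                for rows in combinations(range(m), k)
                for cols in combinations(range(m), k)]
        sums.append(-min(vals))
    return [sums[k] - sums[k - 1] for k in range(1, m + 1)]

T = sp.Matrix([[s, 1], [0, 1/s]])
print(invariant_orders_at_infinity(T))  # [1, -1]
```

Here `T` has invariant orders 1 and −1, i.e. it is equivalent at infinity to Diag(*s*, *s*<sup>−1</sup>).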

## **3. Structure of the ring of proper rational functions with prescribed finite poles**

Let *<sup>M</sup>*� <sup>⊆</sup> Specm(**F**[*s*]). Any non-zero rational function *<sup>t</sup>*(*s*) can be uniquely written as *<sup>t</sup>*(*s*) = *<sup>n</sup>*(*s*) *d*(*s*) *n*� (*s*) *<sup>d</sup>*�(*s*) where *<sup>n</sup>*(*s*) *<sup>d</sup>*(*s*) is an irreducible rational function factorizing in *<sup>M</sup>*� and *<sup>n</sup>*� (*s*) *<sup>d</sup>*�(*s*) is a unit of **F***M*�(*s*). Define the following function over **F**(*s*) \ {0} (see [15], [16]):

$$\begin{array}{rcl} \delta: \mathbb{F}(s) \setminus \{0\} & \to & \mathbb{Z} \\ t(s) & \mapsto & d(d'(s)) - d(n'(s)). \end{array} \tag{18}$$

This mapping is not a discrete valuation of **F**(*s*) if *M*′ ≠ ∅: given two non-zero elements *t*1(*s*), *t*2(*s*) ∈ **F**(*s*) it is clear that *δ*(*t*1(*s*)*t*2(*s*)) = *δ*(*t*1(*s*)) + *δ*(*t*2(*s*)), but it may not satisfy *δ*(*t*1(*s*) + *t*2(*s*)) ≥ min(*δ*(*t*1(*s*)), *δ*(*t*2(*s*))). For example, let *M*′ = {(*s* − *a*) ∈ Specm(**R**[*s*]) : *a* ∉ [−2, −1]}. Put *t*1(*s*) = (*s* + 0.5)/(*s* + 1.5) and *t*2(*s*) = (*s* + 2.5)/(*s* + 1.5). We have that *δ*(*t*1(*s*)) = *d*(*s* + 1.5) − *d*(1) = 1 and *δ*(*t*2(*s*)) = *d*(*s* + 1.5) − *d*(1) = 1, but *δ*(*t*1(*s*) + *t*2(*s*)) = *δ*(2) = 0.
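The counterexample above can be checked mechanically. A small sympy sketch (the helper `delta` and its interval encoding of *M*′ are ours) computes *δ* by counting the linear factors whose roots lie outside *M*′, i.e. in [−2, −1]:

```python
import sympy as sp

s = sp.symbols('s')

def delta(t, unit_interval):
    """delta(t) = deg(d') - deg(n'), where n'/d' collects the factors of t
    that are units of F_M'(s): linear factors (s - a) with a inside the
    given interval (the roots excluded from M')."""
    num, den = sp.fraction(sp.cancel(t))
    def unit_degree(p):
        deg = 0
        for root, mult in sp.roots(sp.Poly(p, s)).items():
            if root.is_real and unit_interval[0] <= root <= unit_interval[1]:
                deg += mult
        return deg
    return unit_degree(den) - unit_degree(num)

t1 = (s + sp.Rational(1, 2)) / (s + sp.Rational(3, 2))
t2 = (s + sp.Rational(5, 2)) / (s + sp.Rational(3, 2))
print(delta(t1, (-2, -1)))       # 1
print(delta(t2, (-2, -1)))       # 1
print(delta(t1 + t2, (-2, -1)))  # 0, which is not >= min(1, 1)
```

The sum *t*1 + *t*2 cancels to the constant 2, so its *δ*-value drops to 0 and the valuation inequality fails.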

However, if *M*′ = ∅ and *t*(*s*) = *n*(*s*)/*d*(*s*) ∈ **F**(*s*) where *n*(*s*), *d*(*s*) ∈ **F**[*s*], *d*(*s*) ≠ 0, the map

$$\delta\_{\infty} \colon \mathbb{F}(s) \to \mathbb{Z} \cup \{+\infty\} \tag{19}$$

defined via *δ*∞(*t*(*s*)) = *d*(*d*(*s*)) − *d*(*n*(*s*)) if *t*(*s*) ≠ 0 and *δ*∞(*t*(*s*)) = +∞ if *t*(*s*) = 0 is a discrete valuation of **F**(*s*).

Consider the subset of **F**(*s*), **F***M*′(*s*) ∩ **F***pr*(*s*), consisting of all proper rational functions with poles in Specm(**F**[*s*]) \ *M*′; that is, the elements of **F***M*′(*s*) ∩ **F***pr*(*s*) are proper rational functions whose denominators are coprime with all the polynomials *π*(*s*) such that (*π*(*s*)) ∈ *M*′. Notice that *g*(*s*) ∈ **F***M*′(*s*) ∩ **F***pr*(*s*) if and only if *g*(*s*) = *n*(*s*) *n*′(*s*)/*d*′(*s*) where:

(a) *n*(*s*) ∈ **F**[*s*] is a polynomial factorizing in *M*′,

(b) *n*′(*s*)/*d*′(*s*) is an irreducible rational function and a unit of **F***M*′(*s*),

(c) *δ*(*g*(*s*)) − *d*(*n*(*s*)) ≥ 0, or equivalently *δ*∞(*g*(*s*)) ≥ 0.

In particular, (c) implies that *n*′(*s*)/*d*′(*s*) ∈ **F***pr*(*s*). The units in **F***M*′(*s*) ∩ **F***pr*(*s*) are biproper rational functions *n*′(*s*)/*d*′(*s*), that is *d*(*n*′(*s*)) = *d*(*d*′(*s*)), with *n*′(*s*), *d*′(*s*) factorizing in Specm(**F**[*s*]) \ *M*′. Furthermore, **F***M*′(*s*) ∩ **F***pr*(*s*) is an integral domain whose field of fractions is **F**(*s*) provided that *M*′ ≠ Specm(**F**[*s*]) (see, for example, [15, Prop. 5.22]). Notice that for *M*′ = Specm(**F**[*s*]), **F***M*′(*s*) ∩ **F***pr*(*s*) = **F**[*s*] ∩ **F***pr*(*s*) = **F**.

Assume that there are ideals in Specm(**F**[*s*]) \ *M*′ generated by linear polynomials and let (*s* − *a*) be any of them. The elements of **F***M*′(*s*) ∩ **F***pr*(*s*) can be written as *g*(*s*) = *n*(*s*)*u*(*s*)/(*s* − *a*)*<sup>d</sup>* where *n*(*s*) ∈ **F**[*s*] factorizes in *M*′, *u*(*s*) is a unit in **F***M*′(*s*) ∩ **F***pr*(*s*) and *d* = *δ*(*g*(*s*)) ≥ *d*(*n*(*s*)). If **F** is algebraically closed, for example **F** = **C**, and *M*′ ≠ Specm(**F**[*s*]), the previous condition is always fulfilled.

The divisibility in **F***M*′(*s*) ∩ **F***pr*(*s*) is characterized in the following lemma.

**Lemma 5.** *Let M*′ ⊆ Specm(**F**[*s*])*. Let g*1(*s*), *g*2(*s*) ∈ **F***M*′(*s*) ∩ **F***pr*(*s*) *be such that g*1(*s*) = *n*1(*s*) *n*′1(*s*)/*d*′1(*s*) *and g*2(*s*) = *n*2(*s*) *n*′2(*s*)/*d*′2(*s*) *with n*1(*s*), *n*2(*s*) ∈ **F**[*s*] *factorizing in M*′ *and n*′1(*s*)/*d*′1(*s*), *n*′2(*s*)/*d*′2(*s*) *irreducible rational functions, units of* **F***M*′(*s*)*. Then g*1(*s*) *divides g*2(*s*) *in* **F***M*′(*s*) ∩ **F***pr*(*s*) *if and only if*

$$n\_1(\mathbf{s}) \mid n\_2(\mathbf{s}) \text{ in } \mathbb{F}[\mathbf{s}] \tag{20}$$

$$\delta(g\_1(s)) - d(n\_1(s)) \le \delta(g\_2(s)) - d(n\_2(s)).\tag{21}$$

**Proof**.- If *g*1(*s*) | *g*2(*s*) then there exists *g*(*s*) = *n*(*s*) *n*′(*s*)/*d*′(*s*) ∈ **F***M*′(*s*) ∩ **F***pr*(*s*), with *n*(*s*) ∈ **F**[*s*] factorizing in *M*′ and *n*′(*s*), *d*′(*s*) ∈ **F**[*s*] coprime, factorizing in Specm(**F**[*s*]) \ *M*′, such that *g*2(*s*) = *g*(*s*)*g*1(*s*). Equivalently, *n*2(*s*) *n*′2(*s*)/*d*′2(*s*) = *n*(*s*) (*n*′(*s*)/*d*′(*s*)) *n*1(*s*) (*n*′1(*s*)/*d*′1(*s*)) = *n*(*s*)*n*1(*s*) (*n*′(*s*)*n*′1(*s*))/(*d*′(*s*)*d*′1(*s*)). So *n*2(*s*) = *n*(*s*)*n*1(*s*) and *δ*(*g*2(*s*)) − *d*(*n*2(*s*)) = *δ*(*g*(*s*)) − *d*(*n*(*s*)) + *δ*(*g*1(*s*)) − *d*(*n*1(*s*)). Moreover, as *g*(*s*) is a proper rational function, *δ*(*g*(*s*)) − *d*(*n*(*s*)) ≥ 0 and *δ*(*g*2(*s*)) − *d*(*n*2(*s*)) ≥ *δ*(*g*1(*s*)) − *d*(*n*1(*s*)).

Conversely, if *n*1(*s*) | *n*2(*s*) then there is *n*(*s*) ∈ **F**[*s*], factorizing in *M*′, such that *n*2(*s*) = *n*(*s*)*n*1(*s*). Write *g*(*s*) = *n*(*s*) *n*′(*s*)/*d*′(*s*) where *n*′(*s*)/*d*′(*s*) is an irreducible fraction representation of (*n*′2(*s*)*d*′1(*s*))/(*d*′2(*s*)*n*′1(*s*)), i.e., *n*′(*s*)/*d*′(*s*) = (*n*′2(*s*)*d*′1(*s*))/(*d*′2(*s*)*n*′1(*s*)) after canceling possible common factors. Thus *n*′2(*s*)/*d*′2(*s*) = (*n*′(*s*)/*d*′(*s*))(*n*′1(*s*)/*d*′1(*s*)) and

$$\begin{array}{l} \delta(g(s)) - d(n(s)) = d(d'(s)) - d(n'(s)) - d(n(s)) \\ = d(d\_2'(s)) + d(n\_1'(s)) - d(n\_2'(s)) - d(d\_1'(s)) - d(n\_2(s)) + d(n\_1(s)) \\ = \delta(g\_2(s)) - d(n\_2(s)) - (\delta(g\_1(s)) - d(n\_1(s))) \ge 0. \end{array} \tag{22}$$

Then *g*(*s*) ∈ **F***M*′(*s*) ∩ **F***pr*(*s*) and *g*2(*s*) = *g*(*s*)*g*1(*s*).

Notice that condition (20) means that *g*1(*s*) | *g*2(*s*) in **F***M*′(*s*) and condition (21) means that *g*1(*s*) | *g*2(*s*) in **F***pr*(*s*). So, *g*1(*s*) | *g*2(*s*) in **F***M*′(*s*) ∩ **F***pr*(*s*) if and only if *g*1(*s*) | *g*2(*s*) simultaneously in **F***M*′(*s*) and **F***pr*(*s*).

**Lemma 6.** *Let M*′ ⊆ Specm(**F**[*s*])*. Let g*1(*s*), *g*2(*s*) ∈ **F***M*′(*s*) ∩ **F***pr*(*s*) *be such that g*1(*s*) = *n*1(*s*) *n*′1(*s*)/*d*′1(*s*) *and g*2(*s*) = *n*2(*s*) *n*′2(*s*)/*d*′2(*s*) *as in Lemma 5. If n*1(*s*) *and n*2(*s*) *are coprime in* **F**[*s*] *and either δ*(*g*1(*s*)) = *d*(*n*1(*s*)) *or δ*(*g*2(*s*)) = *d*(*n*2(*s*)) *then g*1(*s*) *and g*2(*s*) *are coprime in* **F***M*′(*s*) ∩ **F***pr*(*s*)*.*

**Proof**.- Suppose that *g*1(*s*) and *g*2(*s*) are not coprime. Then there exists a non-unit *g*(*s*) = *n*(*s*) *n*′(*s*)/*d*′(*s*) ∈ **F***M*′(*s*) ∩ **F***pr*(*s*) such that *g*(*s*) | *g*1(*s*) and *g*(*s*) | *g*2(*s*). As *g*(*s*) is not a unit, *n*(*s*) is not a constant or *δ*(*g*(*s*)) > 0. If *n*(*s*) is not a constant then *n*(*s*) | *n*1(*s*) and *n*(*s*) | *n*2(*s*), which is impossible because *n*1(*s*) and *n*2(*s*) are coprime. Otherwise, if *n*(*s*) is a constant then *δ*(*g*(*s*)) > 0 and, by Lemma 5, *δ*(*g*(*s*)) ≤ *δ*(*g*1(*s*)) − *d*(*n*1(*s*)) and *δ*(*g*(*s*)) ≤ *δ*(*g*2(*s*)) − *d*(*n*2(*s*)). But this is again impossible, because by hypothesis one of these two upper bounds is zero.

It follows from this lemma that if *g*1(*s*), *g*2(*s*) are coprime in both rings **F***M*′(*s*) and **F***pr*(*s*) then *g*1(*s*), *g*2(*s*) are coprime in **F***M*′(*s*) ∩ **F***pr*(*s*). The following example shows that the converse is not true in general.

**Example 7.** Suppose that **F** = **R** and *M*′ = Specm(**R**[*s*]) \ {(*s*<sup>2</sup> + 1)}. It is not difficult to prove that *g*1(*s*) = *s*<sup>2</sup>/(*s*<sup>2</sup> + 1) and *g*2(*s*) = *s*/(*s*<sup>2</sup> + 1) are coprime elements in **R***M*′(*s*) ∩ **R***pr*(*s*). Assume that there exists a non-unit *g*(*s*) = *n*(*s*) *n*′(*s*)/*d*′(*s*) ∈ **R***M*′(*s*) ∩ **R***pr*(*s*) such that *g*(*s*) | *g*1(*s*) and *g*(*s*) | *g*2(*s*). Then *n*(*s*) | *s*<sup>2</sup>, *n*(*s*) | *s* and *δ*(*g*(*s*)) − *d*(*n*(*s*)) = 0. Since *g*(*s*) is not a unit, *n*(*s*) cannot be a constant. Hence, *n*(*s*) = *cs*, *c* ≠ 0, and *δ*(*g*(*s*)) = 1, but this is impossible because *d*′(*s*) and *n*′(*s*) are powers of *s*<sup>2</sup> + 1. Therefore *g*1(*s*) and *g*2(*s*) must be coprime. However *n*1(*s*) = *s*<sup>2</sup> and *n*2(*s*) = *s* are not coprime.
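Example 7 can be verified computationally with the characterization of Lemma 5. In the sketch below (the pair encoding is ours: (n, d) stands for *n*(*s*)/(*s*<sup>2</sup> + 1)*<sup>d</sup>*, for which *δ* = 2*d*), only units survive as common divisors:

```python
import sympy as sp

s = sp.symbols('s')

def member(n, d):
    """n/(s^2+1)^d lies in R_M'(s) ∩ R_pr(s) iff gcd(n, s^2+1) = 1
    and 2d >= deg n."""
    return sp.gcd(n, s**2 + 1) == 1 and 2*d >= sp.degree(n, s)

def divides(g, h):
    """Lemma 5 for this ring, with delta((n, d)) = 2d."""
    (n1, d1), (n2, d2) = g, h
    return sp.rem(n2, n1, s) == 0 and \
        2*d1 - sp.degree(n1, s) <= 2*d2 - sp.degree(n2, s)

g1, g2 = (s**2, 1), (s, 1)   # s^2/(s^2+1) and s/(s^2+1)
# any monic common divisor has n | s^2 and n | s, so n is 1, s or s^2
common = [(n, d) for n in (sp.Integer(1), s, s**2) for d in range(3)
          if member(n, d) and divides((n, d), g1) and divides((n, d), g2)]
print(common)   # [(1, 0)]: only units, so g1 and g2 are coprime
```

The search confirms the example: the only common divisor (up to non-zero constants) is the unit 1, even though the numerators *s*<sup>2</sup> and *s* are not coprime.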

Now, we have the following property when there are ideals in Specm(**F**[*s*]) \ *M*′, *M*′ ⊆ Specm(**F**[*s*]), generated by linear polynomials.

**Lemma 8.** *Let M*′ ⊆ Specm(**F**[*s*])*. Assume that there are ideals in* Specm(**F**[*s*]) \ *M*′ *generated by linear polynomials and let* (*s* − *a*) *be any of them. Let g*1(*s*), *g*2(*s*) ∈ **F***M*′(*s*) ∩ **F***pr*(*s*) *be such that g*1(*s*) = *n*1(*s*)*u*1(*s*)/(*s* − *a*)<sup>*d*1</sup> *and g*2(*s*) = *n*2(*s*)*u*2(*s*)/(*s* − *a*)<sup>*d*2</sup>*. If g*1(*s*) *and g*2(*s*) *are coprime in* **F***M*′(*s*) ∩ **F***pr*(*s*) *then n*1(*s*) *and n*2(*s*) *are coprime in* **F**[*s*] *and either d*1 = *d*(*n*1(*s*)) *or d*2 = *d*(*n*2(*s*))*.*

**Proof**.- Suppose that *n*1(*s*) and *n*2(*s*) are not coprime in **F**[*s*]. Then there exists a non-constant *n*(*s*) ∈ **F**[*s*] such that *n*(*s*) | *n*1(*s*) and *n*(*s*) | *n*2(*s*). Let *d* = *d*(*n*(*s*)). Then *g*(*s*) = *n*(*s*)/(*s* − *a*)*<sup>d</sup>* is not a unit in **F***M*′(*s*) ∩ **F***pr*(*s*) and divides *g*1(*s*) and *g*2(*s*) because 0 = *d* − *d*(*n*(*s*)) ≤ *d*1 − *d*(*n*1(*s*)) and 0 = *d* − *d*(*n*(*s*)) ≤ *d*2 − *d*(*n*2(*s*)). This is impossible, so *n*1(*s*) and *n*2(*s*) must be coprime.

Now suppose that *d*1 > *d*(*n*1(*s*)) and *d*2 > *d*(*n*2(*s*)). Let *d* = min{*d*1 − *d*(*n*1(*s*)), *d*2 − *d*(*n*2(*s*))}. We have that *d* > 0. Thus *g*(*s*) = 1/(*s* − *a*)*<sup>d</sup>* is not a unit in **F***M*′(*s*) ∩ **F***pr*(*s*) and divides *g*1(*s*) and *g*2(*s*) because *d* ≤ *d*1 − *d*(*n*1(*s*)) and *d* ≤ *d*2 − *d*(*n*2(*s*)). This is again impossible, and either *d*1 = *d*(*n*1(*s*)) or *d*2 = *d*(*n*2(*s*)).

The above lemmas yield a characterization of coprimeness of elements in **F***M*′(*s*) ∩ **F***pr*(*s*) when *M*′ excludes at least one ideal generated by a linear polynomial.

Following the same steps as in [16, p. 11] and [15, p. 271] we get the following result.

**Lemma 9.** *Let M*′ ⊆ Specm(**F**[*s*]) *and assume that there is at least one ideal in* Specm(**F**[*s*]) \ *M*′ *generated by a linear polynomial. Then* **F***M*′(*s*) ∩ **F***pr*(*s*) *is a Euclidean domain.*

The following examples show that if all ideals generated by polynomials of degree one are in *M*′, the ring **F***M*′(*s*) ∩ **F***pr*(*s*) may not be a Bezout domain; thus, it may not be a Euclidean domain. Moreover, it may not even be a greatest common divisor domain.

**Example 10.** Let **F** = **R** and *M*′ = Specm(**R**[*s*]) \ {(*s*<sup>2</sup> + 1)}. Let *g*1(*s*) = *s*<sup>2</sup>/(*s*<sup>2</sup> + 1), *g*2(*s*) = *s*/(*s*<sup>2</sup> + 1) ∈ **R***M*′(*s*) ∩ **R***pr*(*s*). We have seen, in the previous example, that *g*1(*s*), *g*2(*s*) are coprime. We show now that the Bezout identity is not fulfilled, that is, there are no *a*(*s*), *b*(*s*) ∈ **R***M*′(*s*) ∩ **R***pr*(*s*) such that *a*(*s*)*g*1(*s*) + *b*(*s*)*g*2(*s*) = *u*(*s*), with *u*(*s*) a unit in **R***M*′(*s*) ∩ **R***pr*(*s*). Elements in **R***M*′(*s*) ∩ **R***pr*(*s*) are of the form *n*(*s*)/(*s*<sup>2</sup> + 1)*<sup>d</sup>* with *n*(*s*) relatively prime with *s*<sup>2</sup> + 1 and 2*d* ≥ *d*(*n*(*s*)), and the units in **R***M*′(*s*) ∩ **R***pr*(*s*) are non-zero constants. We will see that there are no elements *a*(*s*) = *n*(*s*)/(*s*<sup>2</sup> + 1)*<sup>d</sup>*, *b*(*s*) = *n*′(*s*)/(*s*<sup>2</sup> + 1)<sup>*d*′</sup> with *n*(*s*) and *n*′(*s*) coprime with *s*<sup>2</sup> + 1, 2*d* ≥ *d*(*n*(*s*)) and 2*d*′ ≥ *d*(*n*′(*s*)) such that *a*(*s*)*g*1(*s*) + *b*(*s*)*g*2(*s*) = *c*, with *c* a non-zero constant. Assume that (*n*(*s*)/(*s*<sup>2</sup> + 1)*<sup>d</sup>*)(*s*<sup>2</sup>/(*s*<sup>2</sup> + 1)) + (*n*′(*s*)/(*s*<sup>2</sup> + 1)<sup>*d*′</sup>)(*s*/(*s*<sup>2</sup> + 1)) = *c*. We conclude that *c*(*s*<sup>2</sup> + 1)<sup>*d*+1</sup> or *c*(*s*<sup>2</sup> + 1)<sup>*d*′+1</sup> is a multiple of *s*, which is impossible.

**Example 11.** Let **F** = **R** and *M*′ = Specm(**R**[*s*]) \ {(*s*<sup>2</sup> + 1)}. A fraction *g*(*s*) = *n*(*s*)/(*s*<sup>2</sup> + 1)*<sup>d</sup>* ∈ **R***M*′(*s*) ∩ **R***pr*(*s*) if and only if 2*d* − *d*(*n*(*s*)) ≥ 0. Let *g*1(*s*) = *s*<sup>2</sup>/(*s*<sup>2</sup> + 1)<sup>3</sup>, *g*2(*s*) = *s*(*s* + 1)/(*s*<sup>2</sup> + 1)<sup>4</sup> ∈ **R***M*′(*s*) ∩ **R***pr*(*s*). By Lemma 5:

• *g*(*s*) | *g*1(*s*) ⇔ *n*(*s*) | *s*<sup>2</sup> and 0 ≤ 2*d* − *d*(*n*(*s*)) ≤ 6 − 2 = 4,

• *g*(*s*) | *g*2(*s*) ⇔ *n*(*s*) | *s*(*s* + 1) and 0 ≤ 2*d* − *d*(*n*(*s*)) ≤ 8 − 2 = 6.

If *n*(*s*) | *s*<sup>2</sup> and *n*(*s*) | *s*(*s* + 1) then *n*(*s*) = *c* or *n*(*s*) = *cs* with *c* a non-zero constant. Then *g*(*s*) | *g*1(*s*) and *g*(*s*) | *g*2(*s*) if and only if *n*(*s*) = *c* and *d* ≤ 2 or *n*(*s*) = *cs* and 2*d* ≤ 5. So, the list of common divisors of *g*1(*s*) and *g*2(*s*) is:

$$\left\{c, \frac{c}{s^2 + 1}, \frac{c}{(s^2 + 1)^2}, \frac{cs}{s^2 + 1}, \frac{cs}{(s^2 + 1)^2} : c \in \mathbb{F}, c \neq 0\right\}.\tag{23}$$

If there were a greatest common divisor, say *n*(*s*)/(*s*<sup>2</sup> + 1)*<sup>d</sup>*, then *n*(*s*) = *cs* because *n*(*s*) must be a multiple of *c* and *cs*. Thus such a greatest common divisor should be either *cs*/(*s*<sup>2</sup> + 1) or *cs*/(*s*<sup>2</sup> + 1)<sup>2</sup>, but *c*/(*s*<sup>2</sup> + 1)<sup>2</sup> divides neither of them because

$$4 = \delta\left(\frac{c}{(s^2+1)^2}\right) - d(c) > \max\left\{\delta\left(\frac{cs}{s^2+1}\right) - d(cs), \delta\left(\frac{cs}{(s^2+1)^2}\right) - d(cs)\right\} = 3. \tag{24}$$

Thus, *g*1(*s*) and *g*2(*s*) do not have a greatest common divisor.
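Example 11 can also be checked by brute force. The following sketch (our own encoding: a pair (n, d) stands for *n*(*s*)/(*s*<sup>2</sup> + 1)*<sup>d</sup>*, so *δ* = 2*d*) enumerates the common divisors, recovering list (23) up to constants, and confirms that none of them is divisible by all the others:

```python
import sympy as sp

s = sp.symbols('s')

def divides(g, h):
    """Lemma 5: (n1, d1) | (n2, d2) iff n1 | n2 in R[s] and
    2*d1 - deg(n1) <= 2*d2 - deg(n2)."""
    (n1, d1), (n2, d2) = g, h
    return sp.rem(n2, n1, s) == 0 and \
        2*d1 - sp.degree(n1, s) <= 2*d2 - sp.degree(n2, s)

g1 = (s**2, 3)          # s^2/(s^2+1)^3
g2 = (s*(s + 1), 4)     # s(s+1)/(s^2+1)^4

# a monic common divisor needs n | s^2 and n | s(s+1), so n is 1 or s;
# membership in the ring also requires 2d >= deg n
candidates = [(n, d) for n in (sp.Integer(1), s) for d in range(5)
              if 2*d >= sp.degree(n, s)]
common = [g for g in candidates if divides(g, g1) and divides(g, g2)]
# a gcd would be a common divisor divisible by every other common divisor
gcds = [g for g in common if all(divides(h, g) for h in common)]
print(common)   # the five common divisors of (23), up to constants
print(gcds)     # []: no greatest common divisor exists
```

The five surviving pairs correspond exactly to the divisors listed in (23), and the empty `gcds` list reproduces the conclusion of the example.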

#### **3.1. Smith–McMillan form**


A matrix *U*(*s*) is invertible in **F***M*′(*s*)*m*×*<sup>m</sup>* ∩ **F***pr*(*s*)*m*×*<sup>m</sup>* if *U*(*s*) ∈ **F***M*′(*s*)*m*×*<sup>m</sup>* ∩ **F***pr*(*s*)*m*×*<sup>m</sup>* and its determinant is a unit in both rings **F***M*′(*s*) and **F***pr*(*s*); i.e., *U*(*s*) ∈ Gl*m*(**F***M*′(*s*) ∩ **F***pr*(*s*)) if and only if *U*(*s*) ∈ Gl*m*(**F***M*′(*s*)) ∩ Gl*m*(**F***pr*(*s*)).

Two matrices *G*1(*s*), *G*2(*s*) ∈ **F***M*′(*s*)*m*×*<sup>m</sup>* ∩ **F***pr*(*s*)*m*×*<sup>m</sup>* are equivalent in **F***M*′(*s*) ∩ **F***pr*(*s*) if there exist *U*1(*s*), *U*2(*s*) invertible in **F***M*′(*s*)*m*×*<sup>m</sup>* ∩ **F***pr*(*s*)*m*×*<sup>m</sup>* such that

$$G\_2(s) = U\_1(s)G\_1(s)U\_2(s). \tag{25}$$

If there are ideals in Specm(**F**[*s*]) \ *M*′ generated by linear polynomials then **F***M*′(*s*) ∩ **F***pr*(*s*) is a Euclidean ring and any matrix with elements in **F***M*′(*s*) ∩ **F***pr*(*s*) admits a Smith normal form (see [13], [15] or [16]). Bearing in mind the characterization of divisibility in **F***M*′(*s*) ∩ **F***pr*(*s*) given in Lemma 5, we have

**Theorem 12.** *(Smith normal form in* **F***M*′(*s*) ∩ **F***pr*(*s*)*) Let M*′ ⊆ Specm(**F**[*s*])*. Assume that there are ideals in* Specm(**F**[*s*]) \ *M*′ *generated by linear polynomials and let* (*s* − *a*) *be one of them. Let G*(*s*) ∈ **F***M*′(*s*)*m*×*<sup>m</sup>* ∩ **F***pr*(*s*)*m*×*<sup>m</sup> be non-singular. Then there exist U*1(*s*), *U*2(*s*) *invertible in* **F***M*′(*s*)*m*×*<sup>m</sup>* ∩ **F***pr*(*s*)*m*×*<sup>m</sup> such that*

$$G(s) = \mathcal{U}\_1(s) \operatorname{Diag} \left( n\_1(s) \frac{1}{(s-a)^{d\_1}}, \dots, n\_m(s) \frac{1}{(s-a)^{d\_m}} \right) \mathcal{U}\_2(s) \tag{26}$$

*with n*1(*s*) | ··· | *nm*(*s*) *monic polynomials factorizing in M*′ *and d*1, ..., *dm integers such that* 0 ≤ *d*1 − *d*(*n*1(*s*)) ≤ ··· ≤ *dm* − *d*(*nm*(*s*))*.*

Under the hypothesis of the last theorem, *n*1(*s*)/(*s* − *a*)<sup>*d*1</sup>, ..., *nm*(*s*)/(*s* − *a*)<sup>*dm*</sup> form a complete system of invariants for the equivalence in **F***M*′(*s*) ∩ **F***pr*(*s*) and are called the invariant rational functions of *G*(*s*) in **F***M*′(*s*) ∩ **F***pr*(*s*). Notice that 0 ≤ *d*1 ≤ ··· ≤ *dm* because *ni*(*s*) divides *n*<sub>*i*+1</sub>(*s*).

Recall that the field of fractions of **F***M*′(*s*) ∩ **F***pr*(*s*) is **F**(*s*) when *M*′ ≠ Specm(**F**[*s*]). Thus we can talk about equivalence of rational matrices. Two rational matrices *T*1(*s*), *T*2(*s*) ∈ **F**(*s*)*m*×*<sup>m</sup>* are equivalent in **F***M*′(*s*) ∩ **F***pr*(*s*) if there are *U*1(*s*), *U*2(*s*) invertible in **F***M*′(*s*)*m*×*<sup>m</sup>* ∩ **F***pr*(*s*)*m*×*<sup>m</sup>* such that

$$T\_2(s) = \mathcal{U}\_1(s)T\_1(s)\mathcal{U}\_2(s). \tag{27}$$

When not all ideals generated by linear polynomials are in *M*′, each rational matrix admits a reduction to Smith–McMillan form with respect to **F***M*′(*s*) ∩ **F***pr*(*s*).

**Theorem 13.** *(Smith–McMillan form in* **F***M*′(*s*) ∩ **F***pr*(*s*)*) Let M*′ ⊆ Specm(**F**[*s*])*. Assume that there are ideals in* Specm(**F**[*s*]) \ *M*′ *generated by linear polynomials and let* (*s* − *a*) *be any of them. Let T*(*s*) ∈ **F**(*s*)*m*×*<sup>m</sup> be a non-singular matrix. Then there exist U*1(*s*), *U*2(*s*) *invertible in* **F***M*′(*s*)*m*×*<sup>m</sup>* ∩ **F***pr*(*s*)*m*×*<sup>m</sup> such that*

$$T(s) = \mathcal{U}\_1(s) \text{Diag}\left(\frac{\frac{\varepsilon\_1(s)}{(s-a)^{n\_1}}}{\frac{\psi\_1(s)}{(s-a)^{d\_1}}}, \dots, \frac{\frac{\varepsilon\_m(s)}{(s-a)^{n\_m}}}{\frac{\psi\_m(s)}{(s-a)^{d\_m}}}\right) \mathcal{U}\_2(s) \tag{28}$$


*with εi*(*s*)/(*s* − *a*)<sup>*ni*</sup>, *ψi*(*s*)/(*s* − *a*)<sup>*di*</sup> ∈ **F***M*′(*s*) ∩ **F***pr*(*s*) *coprime for all i, such that εi*(*s*)*, ψi*(*s*) *are monic polynomials factorizing in M*′*, εi*(*s*)/(*s* − *a*)<sup>*ni*</sup> *divides ε*<sub>*i*+1</sub>(*s*)/(*s* − *a*)<sup>*n*<sub>*i*+1</sub></sup> *for i* = 1, . . . , *m* − 1 *while ψi*(*s*)/(*s* − *a*)<sup>*di*</sup> *divides ψ*<sub>*i*−1</sub>(*s*)/(*s* − *a*)<sup>*d*<sub>*i*−1</sub></sup> *for i* = 2, . . . , *m.*

The elements (*εi*(*s*)/(*s* − *a*)<sup>*ni*</sup>)/(*ψi*(*s*)/(*s* − *a*)<sup>*di*</sup>) of the diagonal matrix, satisfying the conditions of the previous theorem, constitute a complete system of invariants for the equivalence in **F***M*′(*s*) ∩ **F***pr*(*s*) of rational matrices. However, this system of invariants is not minimal. A smaller one can be obtained by substituting each pair of integers (*ni*, *di*) by its difference *li* = *ni* − *di*.

**Theorem 14.** *Under the conditions of Theorem 13, the elements $\frac{\varepsilon_i(s)}{\psi_i(s)}\frac{1}{(s-a)^{l_i}}$, with $\varepsilon_i(s), \psi_i(s)$ monic and coprime polynomials factorizing in $M'$, $\varepsilon_i(s) \mid \varepsilon_{i+1}(s)$ while $\psi_i(s) \mid \psi_{i-1}(s)$, and $l_1, \dots, l_m$ integers such that $l_1 + d(\psi_1(s)) - d(\varepsilon_1(s)) \le \cdots \le l_m + d(\psi_m(s)) - d(\varepsilon_m(s))$, also constitute a complete system of invariants for the equivalence in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$.*

**Proof.-** We only have to show that from the system $\frac{\varepsilon_i(s)}{\psi_i(s)}\frac{1}{(s-a)^{l_i}}$, $i = 1, \dots, m$, satisfying the conditions of Theorem 14, a system $\frac{\varepsilon_i(s)/(s-a)^{n_i}}{\psi_i(s)/(s-a)^{d_i}}$, $i = 1, \dots, m$, can be constructed satisfying the conditions of Theorem 13.

Suppose that $\varepsilon_i(s), \psi_i(s)$ are monic and coprime polynomials factorizing in $M'$ such that $\varepsilon_i(s) \mid \varepsilon_{i+1}(s)$ and $\psi_i(s) \mid \psi_{i-1}(s)$. Suppose also that $l_1, \dots, l_m$ are integers such that $l_1 + d(\psi_1(s)) - d(\varepsilon_1(s)) \le \cdots \le l_m + d(\psi_m(s)) - d(\varepsilon_m(s))$. If $l_i + d(\psi_i(s)) - d(\varepsilon_i(s)) \le 0$ for all $i$, we define the non-negative integers $n_i = d(\varepsilon_i(s))$ and $d_i = d(\varepsilon_i(s)) - l_i$ for $i = 1, \dots, m$. If $l_i + d(\psi_i(s)) - d(\varepsilon_i(s)) > 0$ for all $i$, we define $n_i = l_i + d(\psi_i(s))$ and $d_i = d(\psi_i(s))$. Otherwise there is an index $k \in \{2, \dots, m\}$ such that

$$l_{k-1} + d(\psi_{k-1}(s)) - d(\varepsilon_{k-1}(s)) \le 0 < l_k + d(\psi_k(s)) - d(\varepsilon_k(s)). \tag{29}$$

Define now the non-negative integers *ni*, *di* as follows:

$$n_i = \begin{cases} d(\varepsilon_i(s)) & \text{if } i < k \\ l_i + d(\psi_i(s)) & \text{if } i \ge k \end{cases} \qquad d_i = \begin{cases} d(\varepsilon_i(s)) - l_i & \text{if } i < k \\ d(\psi_i(s)) & \text{if } i \ge k \end{cases} \tag{30}$$

Notice that *li* = *ni* − *di*. Moreover,

$$n_i - d(\varepsilon_i(s)) = \begin{cases} 0 & \text{if } i < k \\ l_i + d(\psi_i(s)) - d(\varepsilon_i(s)) & \text{if } i \ge k \end{cases} \tag{31}$$

$$d_i - d(\psi_i(s)) = \begin{cases} -l_i - d(\psi_i(s)) + d(\varepsilon_i(s)) & \text{if } i < k \\ 0 & \text{if } i \ge k \end{cases} \tag{32}$$

and using (29), (30)

$$n\_1 - d(\varepsilon\_1(s)) = \dots = n\_{k-1} - d(\varepsilon\_{k-1}(s)) = 0 < n\_k - d(\varepsilon\_k(s)) \le \dots \le n\_m - d(\varepsilon\_m(s)) \tag{33}$$

$$d\_1 - d(\psi\_1(s)) \ge \dots \ge d\_{k-1} - d(\psi\_{k-1}(s)) \ge 0 = d\_k - d(\psi\_k(s)) = \dots = d\_m - d(\psi\_m(s)).\tag{34}$$

In any case $\frac{\varepsilon_i(s)}{(s-a)^{n_i}}$ and $\frac{\psi_i(s)}{(s-a)^{d_i}}$ are elements of $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$. Now, on the one hand, $\varepsilon_i(s), \psi_i(s)$ are coprime and $n_i - d(\varepsilon_i(s)) = 0$ or $d_i - d(\psi_i(s)) = 0$. This means (Lemma 6) that $\frac{\varepsilon_i(s)}{(s-a)^{n_i}}, \frac{\psi_i(s)}{(s-a)^{d_i}}$ are coprime for all $i$. On the other hand, $\varepsilon_i(s) \mid \varepsilon_{i+1}(s)$ and $0 \le n_i - d(\varepsilon_i(s)) \le n_{i+1} - d(\varepsilon_{i+1}(s))$. Then (Lemma 5) $\frac{\varepsilon_i(s)}{(s-a)^{n_i}}$ divides $\frac{\varepsilon_{i+1}(s)}{(s-a)^{n_{i+1}}}$. Similarly, since $\psi_i(s) \mid \psi_{i-1}(s)$ and $0 \le d_i - d(\psi_i(s)) \le d_{i-1} - d(\psi_{i-1}(s))$, it follows that $\frac{\psi_i(s)}{(s-a)^{d_i}}$ divides $\frac{\psi_{i-1}(s)}{(s-a)^{d_{i-1}}}$.

We call $\frac{\varepsilon_i(s)}{\psi_i(s)}\frac{1}{(s-a)^{l_i}}$, $i = 1, \dots, m$, the invariant rational functions of $T(s)$ in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$.
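The case analysis in the proof above is effectively an algorithm for recovering the exponents $(n_i, d_i)$ of Theorem 13 from the invariants of Theorem 14. A minimal Python sketch (the function name and list-based interface are our own, not from the text):

```python
def split_indices(deg_eps, deg_psi, l):
    """Recover the exponents (n_i, d_i) of Theorem 13 from the degrees of
    eps_i(s), psi_i(s) and the integers l_i, following eq. (30).  Requires
    l_i + d(psi_i) - d(eps_i) to be non-decreasing (Theorem 14)."""
    m = len(l)
    t = [l[i] + deg_psi[i] - deg_eps[i] for i in range(m)]
    assert all(t[i] <= t[i + 1] for i in range(m - 1)), "ordering condition fails"
    k = next((i for i in range(m) if t[i] > 0), m)  # first index with t_i > 0
    n = [deg_eps[i] if i < k else l[i] + deg_psi[i] for i in range(m)]
    d = [deg_eps[i] - l[i] if i < k else deg_psi[i] for i in range(m)]
    assert all(n[i] - d[i] == l[i] for i in range(m))  # l_i = n_i - d_i
    return n, d
```

For instance, `split_indices([0, 1, 2], [2, 1, 0], [-3, 0, 3])` returns `([0, 1, 3], [3, 1, 0])`, and one can check that the sequences $n_i - d(\varepsilon_i(s))$ and $d_i - d(\psi_i(s))$ behave as in (33) and (34).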

There is a particular case worth considering: if $M' = \emptyset$ then $\mathbf{F}_\emptyset(s) \cap \mathbf{F}_{pr}(s) = \mathbf{F}_{pr}(s)$ and $(s) \in \operatorname{Specm}(\mathbf{F}[s]) \setminus M' = \operatorname{Specm}(\mathbf{F}[s])$. In this case, we obtain the invariant rational functions of $T(s)$ at infinity (recall (17)).

#### **4. Wiener–Hopf equivalence**


The left Wiener–Hopf equivalence of rational matrices with respect to a closed contour in the complex plane has been extensively studied ([6] or [10]). Now we present the generalization to arbitrary fields ([4]).

**Definition 15.** *Let $M$ and $M'$ be subsets of $\operatorname{Specm}(\mathbf{F}[s])$ such that $M \cup M' = \operatorname{Specm}(\mathbf{F}[s])$. Let $T_1(s), T_2(s) \in \mathbf{F}(s)^{m \times m}$ be two non-singular rational matrices with no zeros and no poles in $M \cap M'$. The matrices $T_1(s), T_2(s)$ are said to be left Wiener–Hopf equivalent with respect to $(M, M')$ if there exist both $U_1(s)$ invertible in $\mathbf{F}_{M'}(s)^{m \times m} \cap \mathbf{F}_{pr}(s)^{m \times m}$ and $U_2(s)$ invertible in $\mathbf{F}_M(s)^{m \times m}$ such that*

$$T_2(s) = \mathcal{U}_1(s)T_1(s)\mathcal{U}_2(s). \tag{35}$$

This is, in fact, an equivalence relation, as is easily seen. It would be an equivalence relation even if no condition on the union and intersection of $M$ and $M'$ were imposed. It will be seen later on that these conditions are natural assumptions for the existence of unique diagonal representatives in each class.

The right Wiener–Hopf equivalence with respect to $(M, M')$ is defined in a similar manner: there are invertible matrices $U_1(s)$ in $\mathbf{F}_{M'}(s)^{m \times m} \cap \mathbf{F}_{pr}(s)^{m \times m}$ and $U_2(s)$ in $\mathbf{F}_M(s)^{m \times m}$ such that

$$T\_2(s) = \mathcal{U}\_2(s)T\_1(s)\mathcal{U}\_1(s). \tag{36}$$

In the following only the left Wiener–Hopf equivalence will be considered, but, by transposition, all results hold for the right Wiener–Hopf equivalence as well.

The aim of this section is to obtain a complete system of invariants for the Wiener–Hopf equivalence with respect to $(M, M')$ of rational matrices, and to obtain, if possible, a canonical form.

There is a particular case worth considering: if $M = \operatorname{Specm}(\mathbf{F}[s])$ and $M' = \emptyset$, the invertible matrices in $\mathbf{F}_\emptyset(s)^{m \times m} \cap \mathbf{F}_{pr}(s)^{m \times m}$ are the biproper matrices, and the invertible matrices in $\mathbf{F}_{\operatorname{Specm}(\mathbf{F}[s])}(s)^{m \times m}$ are the unimodular matrices. In this case, the left Wiener–Hopf equivalence with respect to $(M, M') = (\operatorname{Specm}(\mathbf{F}[s]), \emptyset)$ is the so-called left Wiener–Hopf equivalence at infinity (see [9]). It is known that any non-singular rational matrix is left Wiener–Hopf equivalent at infinity to a diagonal matrix $\operatorname{Diag}(s^{g_1}, \dots, s^{g_m})$ where $g_1, \dots, g_m$ are integers; that is, for any non-singular $T(s) \in \mathbf{F}(s)^{m \times m}$ there exist both a biproper matrix $B(s) \in \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$ and a unimodular matrix $U(s) \in \operatorname{Gl}_m(\mathbf{F}[s])$ such that

$$T(s) = B(s)\operatorname{Diag}(s^{g_1}, \dots, s^{g_m})\mathcal{U}(s) \tag{37}$$

where $g_1 \ge \cdots \ge g_m$ are integers uniquely determined by $T(s)$. They are called the left Wiener–Hopf factorization indices at infinity and form a complete system of invariants for the left Wiener–Hopf equivalence at infinity. These are the basic objects that will produce the complete system of invariants for the left Wiener–Hopf equivalence with respect to $(M, M')$.

For polynomial matrices, the left Wiener–Hopf factorization indices at infinity are the column degrees of any right equivalent (by a unimodular matrix) column proper matrix. Namely, a polynomial matrix is column proper if it can be written as $P_c \operatorname{Diag}(s^{g_1}, \dots, s^{g_m}) + L(s)$ with $P_c \in \mathbf{F}^{m \times m}$ non-singular, $g_1, \dots, g_m$ non-negative integers and $L(s)$ a polynomial matrix such that the degree of the $i$th column of $L(s)$ is smaller than $g_i$, $1 \le i \le m$. Let $P(s) \in \mathbf{F}[s]^{m \times m}$ be a non-singular polynomial matrix. There exists a unimodular matrix $V(s) \in \mathbf{F}[s]^{m \times m}$ such that $P(s)V(s)$ is column proper. The column degrees of $P(s)V(s)$ are uniquely determined by $P(s)$, although $V(s)$ is not (see [9], [12, p. 388], [17]). Since $P(s)V(s)$ is column proper, it can be written as $P(s)V(s) = P_c D(s) + L(s)$ with $P_c$ non-singular, $D(s) = \operatorname{Diag}(s^{g_1}, \dots, s^{g_m})$ and the degree of the $i$th column of $L(s)$ smaller than $g_i$, $1 \le i \le m$. Then $P(s)V(s) = (P_c + L(s)D(s)^{-1})D(s)$. Put $B(s) = P_c + L(s)D(s)^{-1}$. Since $P_c$ is non-singular and $L(s)D(s)^{-1}$ is a strictly proper matrix, $B(s)$ is biproper, and $P(s) = B(s)D(s)U(s)$ where $U(s) = V(s)^{-1}$.
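The column-properness test above amounts to a rank condition: collect the coefficient of $s^{g_i}$ in the $i$th column into the matrix $P_c$ and check that it is non-singular. A short SymPy sketch (the helper names `column_degrees` and `is_column_proper` are our own, not from the text):

```python
import sympy as sp

s = sp.symbols('s')

def column_degrees(P):
    """Column degrees g_i of a non-singular polynomial matrix P(s)."""
    return [max(sp.degree(P[r, c], s) for r in range(P.rows))
            for c in range(P.cols)]

def is_column_proper(P):
    """P(s) is column proper iff its highest-column-degree coefficient
    matrix P_c (coefficient of s**g_i in column i) is non-singular."""
    g = column_degrees(P)
    Pc = sp.Matrix(P.rows, P.cols, lambda r, c: P[r, c].coeff(s, g[c]))
    return sp.det(Pc) != 0, g
```

For example, `is_column_proper(sp.Matrix([[s**2, 1], [1, s]]))` returns `(True, [2, 1])`, so the left Wiener–Hopf factorization indices of this matrix at infinity are $2$ and $1$, whereas $\begin{pmatrix} s^2+1 & s \\ s & 1 \end{pmatrix}$ is not column proper.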

The left Wiener–Hopf factorization indices at infinity can be used to associate a sequence of integers with every non-singular rational matrix and every $M \subseteq \operatorname{Specm}(\mathbf{F}[s])$. This is done as follows: if $T(s) \in \mathbf{F}(s)^{m \times m}$ then it can always be written as $T(s) = T_L(s)T_R(s)$ such that the global invariant rational functions of $T_L(s)$ factorize in $M$ and $T_R(s) \in \operatorname{Gl}_m(\mathbf{F}_M(s))$ or, equivalently, the global invariant rational functions of $T_R(s)$ factorize in $\operatorname{Specm}(\mathbf{F}[s]) \setminus M$ (see Proposition 2). There may be many factorizations of this type, but it turns out (see [1, Proposition 3.2] for the polynomial case) that the left factors in all of them are right equivalent. This means that if $T(s) = T_{L1}(s)T_{R1}(s) = T_{L2}(s)T_{R2}(s)$ with the global invariant rational functions of $T_{L1}(s)$ and $T_{L2}(s)$ factorizing in $M$ and the global invariant rational functions of $T_{R1}(s)$ and $T_{R2}(s)$ factorizing in $\operatorname{Specm}(\mathbf{F}[s]) \setminus M$, then there is a unimodular matrix $U(s)$ such that $T_{L1}(s) = T_{L2}(s)U(s)$. In particular, $T_{L1}(s)$ and $T_{L2}(s)$ have the same left Wiener–Hopf factorization indices at infinity. Thus the following definition makes sense:

**Definition 16.** *Let $T(s) \in \mathbf{F}(s)^{m \times m}$ be a non-singular rational matrix and $M \subseteq \operatorname{Specm}(\mathbf{F}[s])$. Let $T_L(s), T_R(s) \in \mathbf{F}(s)^{m \times m}$ be such that*

*i) $T(s) = T_L(s)T_R(s)$,*

*ii) the global invariant rational functions of $T_L(s)$ factorize in $M$, and*

*iii) the global invariant rational functions of $T_R(s)$ factorize in $\operatorname{Specm}(\mathbf{F}[s]) \setminus M$.*

*Then the left Wiener–Hopf factorization indices of $T(s)$ with respect to $M$ are defined to be the left Wiener–Hopf factorization indices of $T_L(s)$ at infinity.*

In the particular case $M = \operatorname{Specm}(\mathbf{F}[s])$, we can put $T_L(s) = T(s)$ and $T_R(s) = I_m$. Therefore, the left Wiener–Hopf factorization indices of $T(s)$ with respect to $\operatorname{Specm}(\mathbf{F}[s])$ are the left Wiener–Hopf factorization indices of $T(s)$ at infinity.
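As a small illustration (our own example, not from the text), take $M = \{(s)\}$ and

$$T(s) = \operatorname{Diag}\!\left(\frac{s^2}{s+1}, \frac{1}{s}\right) = \underbrace{\operatorname{Diag}\!\left(s^2, \frac{1}{s}\right)}_{T_L(s)} \underbrace{\operatorname{Diag}\!\left(\frac{1}{s+1}, 1\right)}_{T_R(s)}.$$

The global invariant rational functions of $T_L(s)$ factorize in $M$, while $T_R(s) \in \operatorname{Gl}_2(\mathbf{F}_M(s))$ since $\det T_R(s) = \frac{1}{s+1}$ has no zeros and no poles at $(s)$. As $T_L(s) = \operatorname{Diag}(s^2, s^{-1})$ is already in the form (37) with $B(s) = \mathcal{U}(s) = I_2$, the left Wiener–Hopf factorization indices of $T(s)$ with respect to $M$ are $2 \ge -1$.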

We prove now that the left Wiener–Hopf equivalence with respect to $(M, M')$ can be characterized through the left Wiener–Hopf factorization indices with respect to $M$.

**Theorem 17.** *Let $M, M' \subseteq \operatorname{Specm}(\mathbf{F}[s])$ be such that $M \cup M' = \operatorname{Specm}(\mathbf{F}[s])$. Let $T_1(s), T_2(s) \in \mathbf{F}(s)^{m \times m}$ be two non-singular rational matrices with no zeros and no poles in $M \cap M'$. The matrices $T_1(s)$ and $T_2(s)$ are left Wiener–Hopf equivalent with respect to $(M, M')$ if and only if $T_1(s)$ and $T_2(s)$ have the same left Wiener–Hopf factorization indices with respect to $M$.*

**Proof.-** By Proposition 2 we can write $T_1(s) = T_{L1}(s)T_{R1}(s)$ and $T_2(s) = T_{L2}(s)T_{R2}(s)$ with the global invariant rational functions of $T_{L1}(s)$ and of $T_{L2}(s)$ factorizing in $M \setminus M'$ (recall that $T_1(s)$ and $T_2(s)$ have no zeros and no poles in $M \cap M'$) and the global invariant rational functions of $T_{R1}(s)$ and of $T_{R2}(s)$ factorizing in $M' \setminus M$.

Assume that $T_1(s), T_2(s)$ have the same left Wiener–Hopf factorization indices with respect to $M$. By definition, this means that $T_{L1}(s)$ and $T_{L2}(s)$ have the same left Wiener–Hopf factorization indices at infinity. Hence there exist matrices $B(s) \in \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$ and $U(s) \in \operatorname{Gl}_m(\mathbf{F}[s])$ such that $T_{L2}(s) = B(s)T_{L1}(s)U(s)$. We have that $T_2(s) = T_{L2}(s)T_{R2}(s) = B(s)T_{L1}(s)U(s)T_{R2}(s) = B(s)T_1(s)\left(T_{R1}(s)^{-1}U(s)T_{R2}(s)\right)$. We aim to prove that $B(s) = T_{L2}(s)U(s)^{-1}T_{L1}(s)^{-1}$ is invertible in $\mathbf{F}_{M'}(s)^{m \times m}$ and that $T_{R1}(s)^{-1}U(s)T_{R2}(s) \in \operatorname{Gl}_m(\mathbf{F}_M(s))$. Since the global invariant rational functions of $T_{L2}(s)$ and $T_{L1}(s)$ factorize in $M \setminus M'$, we have $T_{L2}(s), T_{L1}(s) \in \mathbf{F}_{M'}(s)^{m \times m}$ and $B(s) \in \mathbf{F}_{M'}(s)^{m \times m}$. Moreover, $\det B(s)$ is a unit in $\mathbf{F}_{M'}(s)$, as desired. Now, $T_{R1}(s)^{-1}U(s)T_{R2}(s) \in \operatorname{Gl}_m(\mathbf{F}_M(s))$ because $T_{R1}(s), T_{R2}(s) \in \mathbf{F}_M(s)^{m \times m}$ and $\det T_{R1}(s)$ and $\det T_{R2}(s)$ factorize in $M' \setminus M$. Therefore $T_1(s)$ and $T_2(s)$ are left Wiener–Hopf equivalent with respect to $(M, M')$.

Conversely, let $U_1(s) \in \operatorname{Gl}_m(\mathbf{F}_{M'}(s)) \cap \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$ and $U_2(s) \in \operatorname{Gl}_m(\mathbf{F}_M(s))$ be such that $T_1(s) = U_1(s)T_2(s)U_2(s)$. Hence, $T_1(s) = T_{L1}(s)T_{R1}(s) = U_1(s)T_{L2}(s)T_{R2}(s)U_2(s)$. Put $\overline{T}_{L2}(s) = U_1(s)T_{L2}(s)$ and $\overline{T}_{R2}(s) = T_{R2}(s)U_2(s)$. Therefore,

$$\text{(i)}\ T_1(s) = T_{L1}(s)T_{R1}(s) = \overline{T}_{L2}(s)\overline{T}_{R2}(s),$$


(ii) the global invariant rational functions of $T_{L1}(s)$ and of $\overline{T}_{L2}(s)$ factorize in $M$, and

(iii) the global invariant rational functions of $T_{R1}(s)$ and of $\overline{T}_{R2}(s)$ factorize in $\operatorname{Specm}(\mathbf{F}[s]) \setminus M$.

Then $T_{L1}(s)$ and $\overline{T}_{L2}(s)$ are right equivalent (see the remark preceding Definition 16). So there exists $U(s) \in \operatorname{Gl}_m(\mathbf{F}[s])$ such that $T_{L1}(s) = \overline{T}_{L2}(s)U(s)$. Thus, $T_{L1}(s) = U_1(s)T_{L2}(s)U(s)$. Since $U_1(s)$ is biproper and $U(s)$ is unimodular, $T_{L1}(s)$ and $T_{L2}(s)$ have the same left Wiener–Hopf factorization indices at infinity. Consequently, $T_1(s)$ and $T_2(s)$ have the same left Wiener–Hopf factorization indices with respect to $M$.

In conclusion, for non-singular rational matrices with no zeros and no poles in $M \cap M'$, the left Wiener–Hopf factorization indices with respect to $M$ form a complete system of invariants for the left Wiener–Hopf equivalence with respect to $(M, M')$ with $M \cup M' = \operatorname{Specm}(\mathbf{F}[s])$.

A straightforward consequence of the above theorem is the following corollary.

**Corollary 18.** *Let $M, M' \subseteq \operatorname{Specm}(\mathbf{F}[s])$ be such that $M \cup M' = \operatorname{Specm}(\mathbf{F}[s])$. Let $T_1(s), T_2(s) \in \mathbf{F}(s)^{m \times m}$ be non-singular with no zeros and no poles in $M \cap M'$. Then $T_1(s)$ and $T_2(s)$ are left Wiener–Hopf equivalent with respect to $(M, M')$ if and only if, for any factorizations $T_1(s) = T_{L1}(s)T_{R1}(s)$ and $T_2(s) = T_{L2}(s)T_{R2}(s)$ satisfying conditions (i)–(iii) of Definition 16, $T_{L1}(s)$ and $T_{L2}(s)$ are left Wiener–Hopf equivalent at infinity.*

Next we deal with the problem of factorizing or reducing a rational matrix to diagonal form by Wiener–Hopf equivalence. It will be shown that if there exists in $M$ an ideal generated by a monic irreducible polynomial of degree $1$ which is not in $M'$, then any non-singular rational matrix with no zeros and no poles in $M \cap M'$ admits a factorization with respect to $(M, M')$. Afterwards, some examples will be given in which these conditions on $M$ and $M'$ are removed and the factorization fails to exist.

**Theorem 19.** *Let $M, M' \subseteq \operatorname{Specm}(\mathbf{F}[s])$ be such that $M \cup M' = \operatorname{Specm}(\mathbf{F}[s])$. Assume that there are ideals in $M \setminus M'$ generated by linear polynomials. Let $(s-a)$ be any of them and $T(s) \in \mathbf{F}(s)^{m \times m}$ a non-singular matrix with no zeros and no poles in $M \cap M'$. There exist both $U_1(s)$ invertible in $\mathbf{F}_{M'}(s)^{m \times m} \cap \mathbf{F}_{pr}(s)^{m \times m}$ and $U_2(s)$ invertible in $\mathbf{F}_M(s)^{m \times m}$ such that*

$$T(s) = \mathcal{U}_1(s)\operatorname{Diag}((s-a)^{k_1}, \dots, (s-a)^{k_m})\mathcal{U}_2(s), \tag{38}$$

*where $k_1 \ge \cdots \ge k_m$ are integers uniquely determined by $T(s)$. Moreover, they are the left Wiener–Hopf factorization indices of $T(s)$ with respect to $M$.*

**Proof.-** The matrix $T(s)$ can be written (see Proposition 2) as $T(s) = T_L(s)T_R(s)$ with the global invariant rational functions of $T_L(s)$ factorizing in $M \setminus M'$ and the global invariant rational functions of $T_R(s)$ factorizing in $\operatorname{Specm}(\mathbf{F}[s]) \setminus M = M' \setminus M$. As $k_1, \dots, k_m$ are the left Wiener–Hopf factorization indices of $T_L(s)$ at infinity, there exist matrices $U(s) \in \operatorname{Gl}_m(\mathbf{F}[s])$ and $B(s) \in \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$ such that $T_L(s) = B(s)D_1(s)U(s)$ with $D_1(s) = \operatorname{Diag}(s^{k_1}, \dots, s^{k_m})$. Put $D(s) = \operatorname{Diag}((s-a)^{k_1}, \dots, (s-a)^{k_m})$ and $U_1(s) = B(s)\operatorname{Diag}\!\left(\frac{s^{k_1}}{(s-a)^{k_1}}, \dots, \frac{s^{k_m}}{(s-a)^{k_m}}\right)$. Then $T_L(s) = U_1(s)D(s)U(s)$. If $U_2(s) = U(s)T_R(s)$, then this matrix is invertible in $\mathbf{F}_M(s)^{m \times m}$ and $T(s) = U_1(s)\operatorname{Diag}((s-a)^{k_1}, \dots, (s-a)^{k_m})U_2(s)$. We only have to prove that $U_1(s)$ is invertible in $\mathbf{F}_{M'}(s)^{m \times m} \cap \mathbf{F}_{pr}(s)^{m \times m}$. It is clear that $U_1(s)$ is in $\mathbf{F}_{pr}(s)^{m \times m}$ and biproper. Moreover, the global invariant rational functions of $U_1(s) = T_L(s)(D(s)U(s))^{-1}$ factorize in $M \setminus M'$. Therefore, $U_1(s)$ is invertible in $\mathbf{F}_{M'}(s)^{m \times m}$.

We prove now the uniqueness of the factorization. Assume that *T*(*s*) also factorizes as

$$T(s) = \tilde{\mathcal{U}}_1(s)\operatorname{Diag}((s-a)^{\tilde{k}_1}, \dots, (s-a)^{\tilde{k}_m})\tilde{\mathcal{U}}_2(s), \tag{39}$$

with ˜ *<sup>k</sup>*<sup>1</sup> ≥···≥ ˜ *km* integers. Then,

$$\operatorname{Diag}((s-a)^{\tilde{k}_1}, \dots, (s-a)^{\tilde{k}_m}) = \tilde{\mathcal{U}}_1(s)^{-1}\mathcal{U}_1(s)\operatorname{Diag}((s-a)^{k_1}, \dots, (s-a)^{k_m})\mathcal{U}_2(s)\tilde{\mathcal{U}}_2(s)^{-1}. \tag{40}$$

The diagonal matrices have no zeros and no poles in $M \cap M'$ (because $(s-a) \in M \setminus M'$) and they are left Wiener–Hopf equivalent with respect to $(M, M')$. By Theorem 17, they have the same left Wiener–Hopf factorization indices with respect to $M$. Thus, $\tilde{k}_i = k_i$ for all $i = 1, \dots, m$.

Following [6], we could call the exponents $k_1 \ge \cdots \ge k_m$ appearing in the diagonal matrix of Theorem 19 the left Wiener–Hopf factorization indices with respect to $(M, M')$. They are, actually, the left Wiener–Hopf factorization indices with respect to $M$.

Several examples follow that exhibit some remarkable features of the results proved so far. The first two show that if no assumption is made on the intersection and/or union of $M$ and $M'$, then existence and/or uniqueness of the diagonal factorization may fail.

**Example 20.** If $P(s)$ is a polynomial matrix with zeros in $M \cap M'$, then invertible matrices $U_1(s) \in \operatorname{Gl}_m(\mathbf{F}_{M'}(s)) \cap \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$ and $U_2(s) \in \operatorname{Gl}_m(\mathbf{F}_M(s))$ such that $P(s) = U_1(s)\operatorname{Diag}((s-a)^{k_1}, \dots, (s-a)^{k_m})U_2(s)$ with $(s-a) \in M \setminus M'$ may fail to exist. In fact, suppose that $M = \{(s), (s+1)\}$ and $M' = \operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s)\}$. Therefore $M \cap M' = \{(s+1)\}$ and $(s) \in M \setminus M'$. Consider $p_1(s) = s+1$. Assume that $s+1 = u_1(s)\,s^k\,u_2(s)$ with $u_1(s)$ a unit in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$ and $u_2(s)$ a unit in $\mathbf{F}_M(s)$. Then $u_1(s) = c$, a nonzero constant, and $u_2(s) = \frac{1}{c}\frac{s+1}{s^k}$, which is not a unit in $\mathbf{F}_M(s)$.
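The obstruction in Example 20 can be checked mechanically: a unit of $\mathbf{F}_M(s)$ must have neither zeros nor poles at the ideals in $M$. A small SymPy sketch (the helper `is_unit_in_FM` and the root-list encoding of $M$ are our own, not from the text):

```python
import sympy as sp

s = sp.symbols('s')

def is_unit_in_FM(f, roots_of_M):
    """f is a unit of F_M(s) iff, after cancellation, neither its numerator
    nor its denominator vanishes at any root associated with an ideal of M."""
    num, den = sp.fraction(sp.cancel(f))
    return all(num.subs(s, r) != 0 and den.subs(s, r) != 0 for r in roots_of_M)

# Example 20: M = {(s), (s + 1)}; the forced factor is u2 = (s + 1)/(c*s**k),
# here with c = 1 and k = 1
print(is_unit_in_FM((s + 1) / s, [0, -1]))  # False: u2 vanishes at s = -1
```

The same check confirms that nonzero constants are units of $\mathbf{F}_M(s)$, which is why $u_1(s)$ is forced to be a constant $c$.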


Put *<sup>D</sup>*(*s*) = Diag((*<sup>s</sup>* <sup>−</sup> *<sup>a</sup>*)*k*<sup>1</sup> ,...,(*<sup>s</sup>* <sup>−</sup> *<sup>a</sup>*)*km* ) and *<sup>U</sup>*1(*s*) = *<sup>B</sup>*(*s*) Diag

*<sup>T</sup>*(*s*) = *<sup>U</sup>*˜ <sup>1</sup>(*s*) Diag((*<sup>s</sup>* <sup>−</sup> *<sup>a</sup>*)

*km* integers. Then,

˜

and they are left Wiener–Hopf equivalent with respect to (*M*, *M*�

the left Wiener–Hopf factorization indices with respect to *M*.

the same left Wiener–Hopf factorization indices with respect to *M*. Thus, ˜

*<sup>k</sup>*<sup>1</sup> ,...,(*<sup>s</sup>* <sup>−</sup> *<sup>a</sup>*)

(*M*, *M*�

factorize in *M* \ *M*�

*<sup>k</sup>*<sup>1</sup> ≥···≥ ˜

˜

factorization may fail to exist.

Diag((*s* − *a*)

with ˜

1, . . . , *m*.

**Example 21.** If $M \cup M' \neq \operatorname{Specm} \mathbf{F}[s]$ then the factorization indices with respect to $(M, M')$ may not be unique. Suppose that $(\beta(s)) \notin M \cup M'$ and $(\pi(s)) \in M \setminus M'$ with $d(\pi(s)) = 1$, and let $p(s) = u_1(s)\pi(s)^k u_2(s)$, with $u_1(s)$ a unit in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$ and $u_2(s)$ a unit in $\mathbf{F}_M(s)$. Then $p(s)$ can also be factorized as $p(s) = \tilde{u}_1(s)\pi(s)^{k - d(\beta(s))}\tilde{u}_2(s)$ with $\tilde{u}_1(s) = u_1(s)\frac{\pi(s)^{d(\beta(s))}}{\beta(s)}$ a unit in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$ and $\tilde{u}_2(s) = \beta(s)u_2(s)$ a unit in $\mathbf{F}_M(s)$.

The following example shows that if all ideals generated by polynomials of degree equal to one are in *M*′ \ *M* then a factorization as in Theorem 19 may not exist.

**Example 22.** Suppose that $\mathbf{F} = \mathbf{R}$. Consider $M = \{(s^2+1)\} \subseteq \operatorname{Specm}(\mathbf{R}[s])$ and $M' = \operatorname{Specm}(\mathbf{R}[s]) \setminus \{(s^2+1)\}$. Let

$$P(s) = \begin{bmatrix} s & 0 \\ -s^2 & (s^2+1)^2 \end{bmatrix}.\tag{41}$$

Notice that $P(s)$ has no zeros and no poles in $M \cap M' = \emptyset$. We will see that it is not possible to find invertible matrices $U_1(s) \in \mathbf{R}_{M'}(s)^{2 \times 2} \cap \mathbf{R}_{pr}(s)^{2 \times 2}$ and $U_2(s) \in \mathbf{R}_M(s)^{2 \times 2}$ such that

$$U_1(s)P(s)U_2(s) = \operatorname{Diag}((p(s)/q(s))^{c_1}, (p(s)/q(s))^{c_2}).\tag{42}$$

We can write $\frac{p(s)}{q(s)} = u(s)(s^2+1)^a$ with $u(s)$ a unit in $\mathbf{R}_M(s)$ and $a \in \mathbf{Z}$. Therefore,

$$\operatorname{Diag}((p(s)/q(s))^{c_1}, (p(s)/q(s))^{c_2}) = \operatorname{Diag}((s^2+1)^{ac_1}, (s^2+1)^{ac_2}) \cdot \operatorname{Diag}(u(s)^{c_1}, u(s)^{c_2}). \tag{43}$$

$\operatorname{Diag}(u(s)^{c_1}, u(s)^{c_2})$ is invertible in $\mathbf{R}_M(s)^{2 \times 2}$ and $P(s)$ is also left Wiener–Hopf equivalent with respect to $(M, M')$ to the diagonal matrix $\operatorname{Diag}((s^2+1)^{ac_1}, (s^2+1)^{ac_2})$.

Assume that there exist invertible matrices $U_1(s) \in \mathbf{R}_{M'}(s)^{2 \times 2} \cap \mathbf{R}_{pr}(s)^{2 \times 2}$ and $U_2(s) \in \mathbf{R}_M(s)^{2 \times 2}$ such that $U_1(s)P(s)U_2(s) = \operatorname{Diag}((s^2+1)^{d_1}, (s^2+1)^{d_2})$, with $d_1 \geq d_2$ integers. Notice first that $\det U_1(s)$ is a nonzero constant and, since $\det P(s) = s(s^2+1)^2$ and $\det U_2(s)$ is a rational function with numerator and denominator relatively prime with $s^2+1$, it follows that $cs(s^2+1)^2 \det U_2(s) = (s^2+1)^{d_1+d_2}$. Thus, $d_1 + d_2 = 2$. Let

$$U_1(s)^{-1} = \begin{bmatrix} b_{11}(s) & b_{12}(s) \\ b_{21}(s) & b_{22}(s) \end{bmatrix}, \quad U_2(s) = \begin{bmatrix} u_{11}(s) & u_{12}(s) \\ u_{21}(s) & u_{22}(s) \end{bmatrix}. \tag{44}$$

From $P(s)U_2(s) = U_1(s)^{-1} \operatorname{Diag}((s^2+1)^{d_1}, (s^2+1)^{d_2})$ we get

$$s u_{11}(s) = b_{11}(s)(s^2+1)^{d_1}, \tag{45}$$

$$-s^2 u_{11}(s) + (s^2+1)^2 u_{21}(s) = b_{21}(s)(s^2+1)^{d_1}, \tag{46}$$

$$s u_{12}(s) = b_{12}(s)(s^2+1)^{d_2}, \tag{47}$$

$$-s^2 u_{12}(s) + (s^2+1)^2 u_{22}(s) = b_{22}(s)(s^2+1)^{d_2}. \tag{48}$$

As $u_{11}(s) \in \mathbf{R}_M(s)$ and $b_{11}(s) \in \mathbf{R}_{M'}(s) \cap \mathbf{R}_{pr}(s)$, we can write $u_{11}(s) = \frac{f_1(s)}{g_1(s)}$ and $b_{11}(s) = \frac{h_1(s)}{(s^2+1)^{q_1}}$ with $f_1(s), g_1(s), h_1(s) \in \mathbf{R}[s]$, $\gcd(g_1(s), s^2+1) = 1$ and $d(h_1(s)) \leq 2q_1$. Therefore, by (45), $s\frac{f_1(s)}{g_1(s)} = \frac{h_1(s)}{(s^2+1)^{q_1}}(s^2+1)^{d_1}$. Hence, $u_{11}(s) = f_1(s)$ or $u_{11}(s) = \frac{f_1(s)}{s}$. In the same way, using (47), $u_{12}(s) = f_2(s)$ or $u_{12}(s) = \frac{f_2(s)}{s}$ with $f_2(s)$ a polynomial. Moreover, by (47), $d_2$ must be non-negative. Hence, $d_1 \geq d_2 \geq 0$. Using now (46) and (48), and bearing in mind again that $u_{21}(s), u_{22}(s) \in \mathbf{R}_M(s)$ and $b_{21}(s), b_{22}(s) \in \mathbf{R}_{M'}(s) \cap \mathbf{R}_{pr}(s)$, we conclude that $u_{21}(s)$ and $u_{22}(s)$ are polynomials.

We can distinguish two cases: $d_1 = 2, d_2 = 0$ and $d_1 = d_2 = 1$. If $d_1 = 2$ and $d_2 = 0$ then, by (47), $b_{12}(s)$ is a polynomial and, since $b_{12}(s)$ is proper, it is constant: $b_{12}(s) = c_1$. Thus $u_{12}(s) = \frac{c_1}{s}$. By (48), $b_{22}(s) = -c_1 s + (s^2+1)^2 u_{22}(s)$. Since $u_{22}(s)$ is a polynomial and $b_{22}(s)$ is proper, $b_{22}(s)$ is also constant; then $u_{22}(s) = 0$ and $c_1 = 0$. Consequently, $b_{22}(s) = 0$ and $b_{12}(s) = 0$. This is impossible because $U_1(s)$ is invertible.

If $d_1 = d_2 = 1$ then, using (46),

$$\begin{split} b_{21}(s) &= \frac{-s^2 u_{11}(s) + (s^2+1)^2 u_{21}(s)}{s^2+1} = \frac{-s^2 \frac{b_{11}(s)}{s}(s^2+1) + (s^2+1)^2 u_{21}(s)}{s^2+1} \\ &= -s b_{11}(s) + (s^2+1) u_{21}(s) = -s \frac{h_1(s)}{(s^2+1)^{q_1}} + (s^2+1) u_{21}(s) \\ &= \frac{-s h_1(s) + (s^2+1)^{q_1+1} u_{21}(s)}{(s^2+1)^{q_1}}. \end{split} \tag{49}$$

Notice that $d(-s h_1(s)) \leq 1 + 2q_1$ and $d((s^2+1)^{q_1+1} u_{21}(s)) = 2(q_1+1) + d(u_{21}(s)) \geq 2q_1 + 2$ unless $u_{21}(s) = 0$. Hence, if $u_{21}(s) \neq 0$, $d(-s h_1(s) + (s^2+1)^{q_1+1} u_{21}(s)) \geq 2q_1 + 2$, which is greater than $d((s^2+1)^{q_1}) = 2q_1$. This cannot happen because $b_{21}(s)$ is proper. Thus, $u_{21}(s) = 0$. Reasoning in the same way with (48), we get that $u_{22}(s)$ is also zero. This is again impossible because $U_2(s)$ is invertible. Therefore no left Wiener–Hopf factorization of $P(s)$ with respect to $(M, M')$ exists.

We end this section with an example where the left Wiener–Hopf factorization indices of the matrix polynomial in the previous example are computed. Then an ideal generated by a polynomial of degree 1 is added to *M* and the Wiener–Hopf factorization indices of the same matrix are obtained in two different cases.

**Example 23.** Let **<sup>F</sup>** <sup>=</sup> **<sup>R</sup>** and *<sup>M</sup>* <sup>=</sup> {(*s*<sup>2</sup> <sup>+</sup> <sup>1</sup>)}. Consider the matrix

$$P(s) = \begin{bmatrix} s & 0 \\ -s^2 & (s^2+1)^2 \end{bmatrix} \tag{50}$$

which has a zero at 0. It can be written as *P*(*s*) = *P*1(*s*)*P*2(*s*) with

$$P_1(s) = \begin{bmatrix} 1 & 0 \\ -s & (s^2+1)^2 \end{bmatrix}, \quad P_2(s) = \begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix}, \tag{51}$$

where the global invariant factors of *P*1(*s*) are powers of *s*<sup>2</sup> + 1 and the global invariant factors of *P*2(*s*) are relatively prime with *s*<sup>2</sup> + 1. Moreover, the left Wiener–Hopf factorization indices of *P*1(*s*) at infinity are 3, 1 (add the first column multiplied by *s*<sup>3</sup> + 2*s* to the second column; the result is a column proper matrix with column degrees 1 and 3). Therefore, the left Wiener–Hopf factorization indices of *P*(*s*) with respect to *M* are 3, 1.
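The column operation just described can be checked mechanically. Below is a minimal sketch, not from the source: polynomials are plain coefficient lists (lowest degree first) and all helper names are our own. It applies the stated operation to $P_1(s)$ and reads off the column degrees.

```python
# Verifies the column operation of Example 23: adding (s^3 + 2s) times the
# first column of P1(s) to the second column yields a column proper matrix
# with column degrees 1 and 3. Polynomials are coefficient lists, lowest
# degree first. Illustrative sketch; helper names are not from the source.

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def deg(p):
    return max((i for i, c in enumerate(p) if c != 0), default=-1)

# P1(s) = [[1, 0], [-s, (s^2+1)^2]]
P1 = [[[1], [0]],
      [[0, -1], poly_mul([1, 0, 1], [1, 0, 1])]]

f = [0, 2, 0, 1]  # s^3 + 2s
for row in P1:
    row[1] = poly_add(row[1], poly_mul(f, row[0]))  # col2 += f * col1

col_degrees = [max(deg(P1[0][j]), deg(P1[1][j])) for j in (0, 1)]
print(col_degrees)  # [1, 3]
```

The second column collapses to the constant 1, so the transformed matrix is column proper with column degrees 1 and 3, matching the indices 3, 1 claimed in the text.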

Consider now $\tilde{M} = \{(s^2+1), (s)\}$ and $\tilde{M}' = \operatorname{Specm}(\mathbf{R}[s]) \setminus \tilde{M}$. There is a unimodular matrix $U(s) = \begin{bmatrix} 1 & s^2+2 \\ 0 & 1 \end{bmatrix}$, invertible in $\mathbf{R}_{\tilde{M}}(s)^{2 \times 2}$, such that $P(s)U(s) = \begin{bmatrix} s & s^3+2s \\ -s^2 & 1 \end{bmatrix}$ is column proper with column degrees 3 and 2. We can write

$$P(s)U(s) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} s^2 & 0 \\ 0 & s^3 \end{bmatrix} + \begin{bmatrix} s & 2s \\ 0 & 1 \end{bmatrix} = B(s) \begin{bmatrix} s^2 & 0 \\ 0 & s^3 \end{bmatrix} \tag{52}$$

where *B*(*s*) is the following biproper matrix

$$B(s) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} + \begin{bmatrix} s & 2s \\ 0 & 1 \end{bmatrix} \begin{bmatrix} s^{-2} & 0 \\ 0 & s^{-3} \end{bmatrix} = \begin{bmatrix} \frac{1}{s} & \frac{s^2+2}{s^2} \\ -1 & \frac{1}{s^3} \end{bmatrix}.\tag{53}$$

Moreover, the denominators of its entries are powers of $s$ and $\det B(s) = \frac{(s^2+1)^2}{s^4}$. Therefore, $B(s)$ is invertible in $\mathbf{R}_{\tilde{M}'}(s)^{2 \times 2} \cap \mathbf{R}_{pr}(s)^{2 \times 2}$. Since $B(s)^{-1}P(s)U(s) = \operatorname{Diag}(s^2, s^3)$, the left Wiener–Hopf factorization indices of $P(s)$ with respect to $\tilde{M}$ are 3, 2.
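As a quick sanity check, ours rather than the source's, one can verify the determinant identity for $B(s)$ at an arbitrary rational point using exact arithmetic:

```python
# Spot-check of det B(s) = (s^2+1)^2 / s^4 for the biproper matrix B(s)
# of (53), using exact rational arithmetic. Illustrative sketch only.

from fractions import Fraction

def det_B(s):
    b11 = 1 / s
    b12 = (s * s + 2) / (s * s)
    b21 = Fraction(-1)
    b22 = 1 / s ** 3
    return b11 * b22 - b12 * b21

s = Fraction(3, 2)
print(det_B(s) == (s * s + 1) ** 2 / s ** 4)  # True
```

The check agrees with the closed form, consistent with $B(s)$ having its zeros and poles confined to $s = 0$, $s = \infty$ and $(s^2+1)$.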

If $\tilde{M} = \{(s^2+1), (s-1)\}$, for example, a similar procedure shows that $P(s)$ has 3, 1 as left Wiener–Hopf factorization indices with respect to $\tilde{M}$; the same indices as with respect to $M$. The reason is that $s-1$ is not a divisor of $\det P(s)$, so $P(s) = P_1(s)P_2(s)$ with $P_1(s)$ and $P_2(s)$ as in (51) and $P_1(s)$ factorizing in $\tilde{M}$.

**Remark 24.** It must be noticed that a procedure has been given to compute, at least theoretically, the left Wiener–Hopf factorization indices of any rational matrix with respect to any subset *M* of Specm(**F**[*s*]). In fact, given a rational matrix *T*(*s*) and *M*, write *T*(*s*) = *TL*(*s*)*TR*(*s*) with the global invariant rational functions of *TL*(*s*) factorizing in *M* and the global invariant rational functions of *TR*(*s*) factorizing in Specm(**F**[*s*]) \ *M* (for example, using the global Smith–McMillan form of *T*(*s*)). We need to compute the left Wiener–Hopf factorization indices at infinity of the rational matrix *TL*(*s*). The idea is as follows: let *d*(*s*) be the monic least common denominator of all the elements of *TL*(*s*). The matrix *TL*(*s*) can be written as $T_L(s) = \frac{P(s)}{d(s)}$, with *P*(*s*) polynomial. The left Wiener–Hopf factorization indices of *P*(*s*) at infinity are the column degrees of any column proper matrix right equivalent to *P*(*s*). If *k*1,..., *km* are the left Wiener–Hopf factorization indices at infinity of *P*(*s*) then *k*1 − *d*,..., *km* − *d* are the left Wiener–Hopf factorization indices of *TL*(*s*), where *d* = *d*(*d*(*s*)) (see [1]). Free and commercial software exists that computes such column degrees.
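The last step of this procedure, reading the indices off a column proper matrix, can be sketched as follows. This is an illustrative fragment rather than the authors' code: polynomials are coefficient lists (lowest degree first), the helper names are our own, and the column-properness test is hard-coded for the 2×2 case.

```python
# Once a polynomial matrix is column proper, its left Wiener-Hopf
# factorization indices at infinity are its column degrees (Remark 24).
# Coefficient lists, lowest degree first; 2x2 properness test only.

def deg(p):
    return max((i for i, c in enumerate(p) if c != 0), default=-1)

def column_degrees(P):
    m = len(P)
    return [max(deg(P[i][j]) for i in range(m)) for j in range(m)]

def is_column_proper(P):
    """Column proper iff the highest-column-degree coefficient matrix
    is non-singular (checked here via the 2x2 determinant)."""
    degs = column_degrees(P)
    L = [[P[i][j][degs[j]] if deg(P[i][j]) == degs[j] else 0
          for j in range(2)] for i in range(2)]
    return L[0][0] * L[1][1] - L[0][1] * L[1][0] != 0

# P(s)U(s) from Example 23: [[s, s^3 + 2s], [-s^2, 1]]
PU = [[[0, 1], [0, 2, 0, 1]],
      [[0, 0, -1], [1]]]

assert is_column_proper(PU)
print(sorted(column_degrees(PU), reverse=True))  # [3, 2]
```

The printed degrees reproduce the indices 3, 2 obtained for $P(s)$ with respect to $\tilde{M}$ in Example 23.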

#### **5. Rosenbrock's Theorem via local rings**

As said in the Introduction, Rosenbrock's Theorem ([14]) on pole assignment by state feedback provides, in its polynomial formulation, a complete characterization of the relationship between the invariant factors and the left Wiener–Hopf factorization indices at infinity of any non-singular matrix polynomial. The precise statement of this result is the following theorem:

**Theorem 25.** *Let* $g_1 \geq \cdots \geq g_m$ *and* $\alpha_1(s) \mid \cdots \mid \alpha_m(s)$ *be non-negative integers and monic polynomials, respectively. Then there exists a non-singular matrix* $P(s) \in \mathbf{F}[s]^{m \times m}$ *with* $\alpha_1(s), \ldots, \alpha_m(s)$ *as invariant factors and* $g_1, \ldots, g_m$ *as left Wiener–Hopf factorization indices at infinity if and only if the following relation holds:*

$$(g_1, \ldots, g_m) \prec (d(\alpha_m(s)), \ldots, d(\alpha_1(s))).\tag{54}$$

Symbol ≺ appearing in (54) is the majorization symbol (see [11]) and it is defined as follows: If (*a*1,..., *am*) and (*b*1,..., *bm*) are two finite sequences of real numbers and *a*[1] ≥···≥ *a*[*m*] and *b*[1] ≥···≥ *b*[*m*] are the given sequences arranged in non-increasing order then (*a*1,..., *am*) ≺ (*b*1,..., *bm*) if

$$\sum_{i=1}^{j} a_{[i]} \le \sum_{i=1}^{j} b_{[i]}, \quad 1 \le j \le m-1 \tag{55}$$

with equality for *j* = *m*.
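The majorization test (54)–(55) is easy to implement directly. The following sketch, our own helper rather than anything from the source, checks the partial-sum inequalities on the decreasingly sorted sequences:

```python
# Majorization test (55): (a_1,...,a_m) ≺ (b_1,...,b_m) iff every partial
# sum of the sorted a's is bounded by the corresponding partial sum of the
# sorted b's, with equality for the full sums. Illustrative helper only.

from itertools import accumulate

def majorizes(a, b):
    """Return True when a ≺ b (a is majorized by b)."""
    pa = list(accumulate(sorted(a, reverse=True)))
    pb = list(accumulate(sorted(b, reverse=True)))
    return (len(a) == len(b)
            and all(x <= y for x, y in zip(pa[:-1], pb[:-1]))
            and pa[-1] == pb[-1])

# Rosenbrock's condition (54) for indices (2, 1) against invariant-factor
# degrees (3, 0): 2 <= 3 and 2 + 1 == 3 + 0, so such a matrix exists.
print(majorizes([2, 1], [3, 0]))  # True
print(majorizes([3, 1], [3, 0]))  # False: the total sums differ
```

The same helper applies unchanged to conditions (56)–(59) below, since they are all majorization statements.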

The above Theorem 25 can be extended to cover rational matrix functions. Any rational matrix $T(s)$ can be written as $\frac{N(s)}{d(s)}$, where $d(s)$ is the monic least common denominator of all the elements of $T(s)$ and $N(s)$ is polynomial. It turns out that the invariant rational functions of $T(s)$ are the invariant factors of $N(s)$ divided by $d(s)$, after canceling common factors. We also have the following characterization of the left Wiener–Hopf factorization indices at infinity of $T(s)$: these are those of $N(s)$ minus the degree of $d(s)$ (see [1]). Bearing all this in mind, one can easily prove (see [1])

**Theorem 26.** *Let* $g_1 \geq \cdots \geq g_m$ *be integers and* $\frac{\alpha_1(s)}{\beta_1(s)}, \ldots, \frac{\alpha_m(s)}{\beta_m(s)}$ *irreducible rational functions, where* $\alpha_i(s), \beta_i(s) \in \mathbf{F}[s]$ *are monic such that* $\alpha_1(s) \mid \cdots \mid \alpha_m(s)$ *while* $\beta_m(s) \mid \cdots \mid \beta_1(s)$*. Then there exists a non-singular rational matrix* $T(s) \in \mathbf{F}(s)^{m \times m}$ *with* $g_1, \ldots, g_m$ *as left Wiener–Hopf factorization indices at infinity and* $\frac{\alpha_1(s)}{\beta_1(s)}, \ldots, \frac{\alpha_m(s)}{\beta_m(s)}$ *as global invariant rational functions if and only if*

$$(g_1, \ldots, g_m) \prec (d(\alpha_m(s)) - d(\beta_m(s)), \ldots, d(\alpha_1(s)) - d(\beta_1(s))).\tag{56}$$

Recall that for *M* ⊆ Specm(**F**[*s*]) any rational matrix *T*(*s*) can be factorized into two matrices (see Proposition 2) such that the global invariant rational functions and the left Wiener–Hopf factorization indices at infinity of the left factor of *T*(*s*) give the invariant rational functions and the left Wiener–Hopf factorization indices of *T*(*s*) with respect to *M*. Using Theorem 26 on the left factor of *T*(*s*) we get:

**Theorem 27.** *Let* $M \subseteq \operatorname{Specm}(\mathbf{F}[s])$*. Let* $k_1 \geq \cdots \geq k_m$ *be integers and* $\frac{\varepsilon_1(s)}{\psi_1(s)}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}$ *be irreducible rational functions such that* $\varepsilon_1(s) \mid \cdots \mid \varepsilon_m(s)$*,* $\psi_m(s) \mid \cdots \mid \psi_1(s)$ *are monic polynomials factorizing in M. Then there exists a non-singular matrix* $T(s) \in \mathbf{F}(s)^{m \times m}$ *with* $\frac{\varepsilon_1(s)}{\psi_1(s)}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}$ *as invariant rational functions with respect to M and* $k_1, \ldots, k_m$ *as left Wiener–Hopf factorization indices with respect to M if and only if*

$$(k_1, \ldots, k_m) \prec (d(\varepsilon_m(s)) - d(\psi_m(s)), \ldots, d(\varepsilon_1(s)) - d(\psi_1(s))).\tag{57}$$

Theorem 27 relates the left Wiener–Hopf factorization indices with respect to *M* to the finite structure inside *M*. Our last result will relate the left Wiener–Hopf factorization indices with respect to *M* to the structure outside *M*, including that at infinity. The next theorem is an extension of Rosenbrock's Theorem to the point at infinity, which was proved in [1]:

**Theorem 28.** *Let* $g_1 \geq \cdots \geq g_m$ *and* $q_1 \geq \cdots \geq q_m$ *be integers. Then there exists a non-singular matrix* $T(s) \in \mathbf{F}(s)^{m \times m}$ *with* $g_1, \ldots, g_m$ *as left Wiener–Hopf factorization indices at infinity and* $s^{q_1}, \ldots, s^{q_m}$ *as invariant rational functions at infinity if and only if*

$$(g_1, \ldots, g_m) \prec (q_1, \ldots, q_m). \tag{58}$$

Notice that Theorem 26 can be obtained from Theorem 27 when *M* = Specm(**F**[*s*]). In the same way, taking into account that the equivalence at infinity is a particular case of the equivalence in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$ when $M' = \emptyset$, we can give a more general result than that of Theorem 28. Specifically, necessary and sufficient conditions can be provided for the existence of a non-singular rational matrix with prescribed left Wiener–Hopf factorization indices with respect to *M* and invariant rational functions in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$.

**Theorem 29.** *Let* $M, M' \subseteq \operatorname{Specm}(\mathbf{F}[s])$ *be such that* $M \cup M' = \operatorname{Specm}(\mathbf{F}[s])$*. Assume that there are ideals in* $M \setminus M'$ *generated by linear polynomials and let* $(s-a)$ *be any of them. Let* $k_1 \geq \cdots \geq k_m$ *be integers,* $\frac{\varepsilon_1(s)}{\psi_1(s)}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}$ *irreducible rational functions such that* $\varepsilon_1(s) \mid \cdots \mid \varepsilon_m(s)$*,* $\psi_m(s) \mid \cdots \mid \psi_1(s)$ *are monic polynomials factorizing in* $M' \setminus M$*, and* $l_1, \ldots, l_m$ *integers such that* $l_1 + d(\psi_1(s)) - d(\varepsilon_1(s)) \leq \cdots \leq l_m + d(\psi_m(s)) - d(\varepsilon_m(s))$*. Then there exists a non-singular matrix* $T(s) \in \mathbf{F}(s)^{m \times m}$ *with no zeros and no poles in* $M \cap M'$*, with* $k_1, \ldots, k_m$ *as left Wiener–Hopf factorization indices with respect to M and* $\frac{\varepsilon_1(s)}{\psi_1(s)} \frac{1}{(s-a)^{l_1}}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)} \frac{1}{(s-a)^{l_m}}$ *as invariant rational functions in* $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$ *if and only if the following condition holds:*

$$(k\_1, \ldots, k\_m) \prec (-l\_1, \ldots, -l\_m). \tag{59}$$

The proof of this theorem will be given along the following two subsections. We will use several auxiliary results that will be stated and proved when needed.

#### **5.1. Necessity**


We can give the following result for rational matrices using a similar result given in Lemma 4.2 in [2] for matrix polynomials.

**Lemma 30.** *Let* $M, M' \subseteq \operatorname{Specm}(\mathbf{F}[s])$ *be such that* $M \cup M' = \operatorname{Specm}(\mathbf{F}[s])$*. Let* $T(s) \in \mathbf{F}(s)^{m \times m}$ *be a non-singular matrix with no zeros and no poles in* $M \cap M'$*, with* $g_1 \geq \cdots \geq g_m$ *as left Wiener–Hopf factorization indices at infinity and* $k_1 \geq \cdots \geq k_m$ *as left Wiener–Hopf factorization indices with respect to M. If* $\frac{\varepsilon_1(s)}{\psi_1(s)}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}$ *are the invariant rational functions of* $T(s)$ *with respect to* $M'$ *then*

$$(g_1 - k_1, \ldots, g_m - k_m) \prec (d(\varepsilon_m(s)) - d(\psi_m(s)), \ldots, d(\varepsilon_1(s)) - d(\psi_1(s))).\tag{60}$$

It must be pointed out that (*g*<sup>1</sup> − *k*1,..., *gm* − *km*) may be an unordered *m*-tuple.

**Proof.-** By Proposition 2 there exist unimodular matrices $U(s), V(s) \in \mathbf{F}[s]^{m \times m}$ such that

$$T(s) = U(s) \operatorname{Diag}\left(\frac{\alpha_1(s)}{\beta_1(s)}, \ldots, \frac{\alpha_m(s)}{\beta_m(s)}\right) \operatorname{Diag}\left(\frac{\varepsilon_1(s)}{\psi_1(s)}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}\right) V(s) \tag{61}$$

with $\alpha_i(s) \mid \alpha_{i+1}(s)$, $\beta_i(s) \mid \beta_{i-1}(s)$, $\varepsilon_i(s) \mid \varepsilon_{i+1}(s)$, $\psi_i(s) \mid \psi_{i-1}(s)$, $\alpha_i(s), \beta_i(s)$ units in $\mathbf{F}_{M' \setminus M}(s)$ and $\varepsilon_i(s), \psi_i(s)$ factorizing in $M' \setminus M$, because $T(s)$ has no poles and no zeros in $M \cap M'$. Therefore $T(s) = T_L(s)T_R(s)$, where $T_L(s) = U(s) \operatorname{Diag}\left(\frac{\alpha_1(s)}{\beta_1(s)}, \ldots, \frac{\alpha_m(s)}{\beta_m(s)}\right)$ has $k_1, \ldots, k_m$ as left Wiener–Hopf factorization indices at infinity and $T_R(s) = \operatorname{Diag}\left(\frac{\varepsilon_1(s)}{\psi_1(s)}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}\right) V(s)$ has $\frac{\varepsilon_1(s)}{\psi_1(s)}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}$ as global invariant rational functions. Let $d(s) = \beta_1(s)\psi_1(s)$. Hence,

$$d(s)T(s) = U(s)\operatorname{Diag}(\bar{\alpha}_1(s), \dots, \bar{\alpha}_m(s))\operatorname{Diag}(\bar{\varepsilon}_1(s), \dots, \bar{\varepsilon}_m(s))V(s) \tag{62}$$

with $\bar{\alpha}_i(s) = \alpha_i(s)\frac{\beta_1(s)}{\beta_i(s)}$ units in $\mathbf{F}_{M' \setminus M}(s)$ and $\bar{\varepsilon}_i(s) = \varepsilon_i(s)\frac{\psi_1(s)}{\psi_i(s)}$ factorizing in $M' \setminus M$. Put $P(s) = d(s)T(s)$. Its left Wiener–Hopf factorization indices at infinity are $g_1 + d(d(s)), \dots, g_m + d(d(s))$ [1, Lemma 2.3]. The matrix $P_1(s) = U(s)\operatorname{Diag}(\bar{\alpha}_1(s), \dots, \bar{\alpha}_m(s)) = \beta_1(s)T_L(s)$ has $k_1 + d(\beta_1(s)), \dots, k_m + d(\beta_1(s))$ as left Wiener–Hopf factorization indices at infinity. Now if $P_2(s) = \operatorname{Diag}(\bar{\varepsilon}_1(s), \dots, \bar{\varepsilon}_m(s))V(s) = \psi_1(s)T_R(s)$ then its invariant factors are $\bar{\varepsilon}_1(s), \dots, \bar{\varepsilon}_m(s)$, $P(s) = P_1(s)P_2(s)$ and, by [2, Lemma 4.2],

$$(g_1 + d(d(s)) - k_1 - d(\beta_1(s)), \dots, g_m + d(d(s)) - k_m - d(\beta_1(s))) \prec (d(\bar{\varepsilon}_m(s)), \dots, d(\bar{\varepsilon}_1(s))).\tag{63}$$

Since $d(d(s)) = d(\beta_1(s)) + d(\psi_1(s))$ and $d(\bar{\varepsilon}_i(s)) = d(\varepsilon_i(s)) - d(\psi_i(s)) + d(\psi_1(s))$, subtracting $d(\psi_1(s))$ from every component on both sides of (63), (60) follows.
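The majorization relation $\prec$ used in (60) and (63) can be checked numerically. A minimal sketch follows (the helper name `majorizes` is ours, not the chapter's); sorting both tuples in decreasing order first also covers the possibly unordered left-hand side noted after Lemma 30:

```python
def majorizes(x, y):
    """Return True if x ≺ y, i.e. y majorizes x: after sorting both
    tuples in decreasing order, every partial sum of x is bounded by
    the corresponding partial sum of y, and the total sums agree."""
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    sx = sy = 0
    for u, v in zip(xs, ys):
        sx += u
        sy += v
        if sx > sy:
            return False
    return sx == sy  # equal total sums are required for ≺

print(majorizes([1, 3, 2], [4, 2, 0]))  # True:  3<=4, 5<=6, 6==6
print(majorizes([4, 2, 0], [1, 3, 2]))  # False: 4 > 3 already fails
```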

#### *5.1.1. Proof of Theorem 29: Necessity*

If $\frac{\varepsilon_1(s)}{\psi_1(s)}\frac{1}{(s-a)^{l_1}}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}\frac{1}{(s-a)^{l_m}}$ are the invariant rational functions of $T(s)$ in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$ then there exist matrices $U_1(s), U_2(s)$ invertible in $\mathbf{F}_{M'}(s)^{m \times m} \cap \mathbf{F}_{pr}(s)^{m \times m}$ such that

$$T(s) = U_1(s) \operatorname{Diag}\left(\frac{\varepsilon_1(s)}{\psi_1(s)} \frac{1}{(s-a)^{l_1}}, \dots, \frac{\varepsilon_m(s)}{\psi_m(s)} \frac{1}{(s-a)^{l_m}}\right) U_2(s). \tag{64}$$

We analyze first the finite structure of $T(s)$ with respect to $M'$. If $D_1(s) = \operatorname{Diag}((s-a)^{-l_1}, \ldots, (s-a)^{-l_m}) \in \mathbf{F}_{M'}(s)^{m \times m}$, we can write $T(s)$ as follows:

$$T(s) = U_1(s) \operatorname{Diag}\left(\frac{\varepsilon_1(s)}{\psi_1(s)}, \dots, \frac{\varepsilon_m(s)}{\psi_m(s)}\right) D_1(s) U_2(s), \tag{65}$$

with $U_1(s)$ and $D_1(s)U_2(s)$ invertible matrices in $\mathbf{F}_{M'}(s)^{m \times m}$. Thus $\frac{\varepsilon_1(s)}{\psi_1(s)}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}$ are the invariant rational functions of $T(s)$ with respect to $M'$. Let $g_1 \geq \cdots \geq g_m$ be the left Wiener–Hopf factorization indices of $T(s)$ at infinity. By Lemma 30 we have

$$(g_1 - k_1, \dots, g_m - k_m) \prec (d(\varepsilon_m(s)) - d(\psi_m(s)), \dots, d(\varepsilon_1(s)) - d(\psi_1(s))).\tag{66}$$

As far as the structure of *T*(*s*) at infinity is concerned, let

$$D_2(s) = \operatorname{Diag}\left(\frac{\varepsilon_1(s)}{\psi_1(s)} \frac{s^{l_1 + d(\psi_1(s)) - d(\varepsilon_1(s))}}{(s-a)^{l_1}}, \dots, \frac{\varepsilon_m(s)}{\psi_m(s)} \frac{s^{l_m + d(\psi_m(s)) - d(\varepsilon_m(s))}}{(s-a)^{l_m}}\right). \tag{67}$$

Then $D_2(s) \in \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$ and


$$T(s) = U_1(s)\operatorname{Diag}\left(s^{-l_1 - d(\psi_1(s)) + d(\varepsilon_1(s))}, \dots, s^{-l_m - d(\psi_m(s)) + d(\varepsilon_m(s))}\right) D_2(s) U_2(s) \tag{68}$$

where $U_1(s) \in \mathbf{F}_{pr}(s)^{m \times m}$ and $D_2(s)U_2(s) \in \mathbf{F}_{pr}(s)^{m \times m}$ are biproper matrices. Therefore $s^{-l_1 - d(\psi_1(s)) + d(\varepsilon_1(s))}, \ldots, s^{-l_m - d(\psi_m(s)) + d(\varepsilon_m(s))}$ are the invariant rational functions of $T(s)$ at infinity. By Theorem 28,

$$(g_1, \ldots, g_m) \prec (-l_1 - d(\psi_1(s)) + d(\varepsilon_1(s)), \ldots, -l_m - d(\psi_m(s)) + d(\varepsilon_m(s))).\tag{69}$$

Let $\sigma \in \Sigma_m$ (the symmetric group on $m$ symbols) be a permutation such that $g_{\sigma(1)} - k_{\sigma(1)} \geq \cdots \geq g_{\sigma(m)} - k_{\sigma(m)}$ and define $c_i = g_{\sigma(i)} - k_{\sigma(i)}$, $i = 1, \ldots, m$. Using (66) and (69) we obtain

$$\begin{aligned}\sum_{j=1}^{r}k_{j} + \sum_{j=1}^{r}\left(d(\varepsilon_{j}(s)) - d(\psi_{j}(s))\right) &\leq \sum_{j=1}^{r}k_{j} + \sum_{j=m-r+1}^{m} c_{j}\\ &\leq \sum_{j=1}^{r}k_{j} + \sum_{j=1}^{r}(g_{j} - k_{j}) = \sum_{j=1}^{r} g_{j}\\ &\leq \sum_{j=1}^{r} -l_{j} + \sum_{j=1}^{r}\left(d(\varepsilon_{j}(s)) - d(\psi_{j}(s))\right)\end{aligned} \tag{70}$$

for *r* = 1, . . . , *m* − 1. When *r* = *m* the previous inequalities are all equalities and condition (59) is satisfied.

**Remark 31.** It has been seen in the above proof that if a matrix has $\frac{\varepsilon_1(s)}{\psi_1(s)}\frac{1}{(s-a)^{l_1}}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}\frac{1}{(s-a)^{l_m}}$ as invariant rational functions in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$, then $\frac{\varepsilon_1(s)}{\psi_1(s)}, \ldots, \frac{\varepsilon_m(s)}{\psi_m(s)}$ are its invariant rational functions with respect to $M'$ and $s^{-l_1 - d(\psi_1(s)) + d(\varepsilon_1(s))}, \ldots, s^{-l_m - d(\psi_m(s)) + d(\varepsilon_m(s))}$ are its invariant rational functions at infinity.

#### **5.2. Sufficiency**

Let $a, b \in \mathbf{F}$ be arbitrary elements such that $ab \neq 1$. Consider the changes of indeterminate

$$f(s) = a + \frac{1}{s-b}, \quad \tilde{f}(s) = b + \frac{1}{s-a} \tag{71}$$

and notice that $f(\tilde{f}(s)) = \tilde{f}(f(s)) = s$. For $\alpha(s) \in \mathbf{F}[s]$, let $\mathbf{F}[s] \setminus (\alpha(s))$ denote the multiplicative subset of $\mathbf{F}[s]$ whose elements are coprime with $\alpha(s)$. For $a, b \in \mathbf{F}$ as above define

$$\begin{aligned} t_{a,b}: \mathbf{F}[s] &\to \mathbf{F}[s] \setminus (s-b)\\ \pi(s) &\mapsto (s-b)^{d(\pi(s))}\, \pi\left(a + \frac{1}{s-b}\right) = (s-b)^{d(\pi(s))}\, \pi(f(s)). \end{aligned} \tag{72}$$

In words, if $\pi(s) = p_d(s-a)^d + p_{d-1}(s-a)^{d-1} + \cdots + p_1(s-a) + p_0$ ($p_d \neq 0$) then

$$t\_{a,b}(\pi(s)) = p\_0(s-b)^d + p\_1(s-b)^{d-1} + \dots + p\_{d-1}(s-b) + p\_d. \tag{73}$$
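As an illustration of (72) and (73), the two descriptions of $t_{a,b}$ can be compared on a concrete polynomial with `sympy`; the values $a = 2$, $b = 3$ and the coefficients below are our own choices, not the chapter's:

```python
import sympy as sp

s = sp.symbols('s')
a, b = 2, 3                         # sample values; only ab != 1 is required

# pi(s) written in powers of (s - a): coefficients p_0, ..., p_d
p = [5, 7, 1]                       # pi(s) = (s-2)^2 + 7(s-2) + 5
pi = sum(c * (s - a)**i for i, c in enumerate(p))
d = sp.degree(pi, s)

# Definition (72): t_{a,b}(pi) = (s-b)^d * pi(a + 1/(s-b))
t_pi = sp.expand((s - b)**d * pi.subs(s, a + 1 / (s - b)))

# Formula (73): the same coefficients reversed, now in powers of (s - b)
t_pi_73 = sum(c * (s - b)**(d - i) for i, c in enumerate(p))

print(sp.simplify(t_pi - sp.expand(t_pi_73)))  # 0, so (72) and (73) agree
```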

In general $d(t_{a,b}(\pi(s))) \leq d(\pi(s))$, with equality if and only if $\pi(s) \in \mathbf{F}[s] \setminus (s-a)$. This shows that the restriction $h_{a,b} : \mathbf{F}[s] \setminus (s-a) \to \mathbf{F}[s] \setminus (s-b)$ of $t_{a,b}$ to $\mathbf{F}[s] \setminus (s-a)$ is a bijection. In addition $h_{a,b}^{-1}$ is the restriction of $t_{b,a}$ to $\mathbf{F}[s] \setminus (s-b)$; i.e.,

$$\begin{aligned} h_{a,b}^{-1}: \mathbf{F}[s] \setminus (s-b) &\to \mathbf{F}[s] \setminus (s-a) \\ \alpha(s) &\mapsto (s-a)^{d(\alpha(s))}\, \alpha\left(b + \frac{1}{s-a}\right) = (s-a)^{d(\alpha(s))}\, \alpha(\tilde{f}(s)) \end{aligned} \tag{74}$$


or $h_{a,b}^{-1} = h_{b,a}$.

In what follows we will think of $a, b$ as given elements of $\mathbf{F}$, and the subindices of $t_{a,b}$, $h_{a,b}$ and $h_{a,b}^{-1}$ will be removed. The following properties of $h$ (and $h^{-1}$) can be easily proved.
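That the restriction $h_{a,b}$ is inverted by $h_{b,a}$ can also be verified symbolically; a small sketch with `sympy` follows (the polynomial and the values of $a$, $b$ are our own choices):

```python
import sympy as sp

s = sp.symbols('s')
a, b = 2, 3                          # sample values with ab != 1

def t(poly, a, b):
    """t_{a,b} from (72): poly(s) |-> (s-b)^deg(poly) * poly(a + 1/(s-b))."""
    d = sp.degree(poly, s)
    return sp.cancel((s - b)**d * poly.subs(s, a + 1 / (s - b)))

# pi is coprime with (s - a), i.e. pi(a) != 0, so h_{a,b}(pi) = t_{a,b}(pi)
pi = s**2 + s + 1
assert pi.subs(s, a) != 0

round_trip = t(t(pi, a, b), b, a)    # h_{b,a}(h_{a,b}(pi))
print(sp.expand(round_trip - pi))    # 0: the round trip recovers pi
```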

**Lemma 32.** *Let $\pi_1(s), \pi_2(s) \in \mathbf{F}[s] \setminus (s-a)$. The following properties hold:*

*1. $h(\pi_1(s)\pi_2(s)) = h(\pi_1(s))h(\pi_2(s))$.*

*2. If $\pi_1(s) \mid \pi_2(s)$ then $h(\pi_1(s)) \mid h(\pi_2(s))$.*

*3. If $\pi_1(s)$ is an irreducible polynomial then $h(\pi_1(s))$ is an irreducible polynomial.*

*4. If $\pi_1(s)$, $\pi_2(s)$ are coprime polynomials then $h(\pi_1(s))$, $h(\pi_2(s))$ are coprime polynomials.*
As a consequence the map

$$\begin{aligned} H: \operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s-a)\} &\to \operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s-b)\} \\ (\pi(s)) &\mapsto \left(\tfrac{1}{p_0}h(\pi(s))\right) \end{aligned} \tag{75}$$

with $p_0 = \pi(a)$, is a bijection whose inverse is

$$\begin{aligned} H^{-1}: \operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s-b)\} &\to \operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s-a)\} \\ (\alpha(s)) &\mapsto \left(\tfrac{1}{\alpha_0}h^{-1}(\alpha(s))\right) \end{aligned} \tag{76}$$

where $\alpha_0 = \alpha(b)$. In particular, if $M' \subseteq \operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s-a)\}$ and $\tilde{M} = \operatorname{Specm}(\mathbf{F}[s]) \setminus (M' \cup \{(s-a)\})$ (i.e. the complementary subset of $M'$ in $\operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s-a)\}$) then

$$H(\tilde{M}) = \operatorname{Specm}(\mathbf{F}[s]) \setminus \left(H(M') \cup \{(s-b)\}\right). \tag{77}$$

In what follows and for notational simplicity we will assume *b* = 0.

**Lemma 33.** *Let $M' \subseteq \operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s-a)\}$, where $a \in \mathbf{F}$ is arbitrary.*

*1. If $\pi(s) \in \mathbf{F}[s]$ factorizes in $M'$ then $h(\pi(s))$ factorizes in $H(M')$.*

*2. If $\pi(s) \in \mathbf{F}[s]$ is a unit of $\mathbf{F}_{M'}(s)$ then $t(\pi(s))$ is a unit of $\mathbf{F}_{H(M')}(s)$.*

**Proof.-** 1. Let $\pi(s) = c\,\pi_1(s)^{g_1} \cdots \pi_m(s)^{g_m}$ with $c \neq 0$ constant, $(\pi_i(s)) \in M'$ and $g_i \geq 1$. Then $h(\pi(s)) = c\,(h(\pi_1(s)))^{g_1} \cdots (h(\pi_m(s)))^{g_m}$. By Lemma 32, $h(\pi_i(s))$ is an irreducible polynomial (that may not be monic). If $c_i$ is the leading coefficient of $h(\pi_i(s))$ then $\frac{1}{c_i}h(\pi_i(s))$ is monic, irreducible and $\left(\frac{1}{c_i}h(\pi_i(s))\right) \in H(M')$. Hence $h(\pi(s))$ factorizes in $H(M')$.

2. If $\pi(s) \in \mathbf{F}[s]$ is a unit of $\mathbf{F}_{M'}(s)$ then it can be written as $\pi(s) = (s-a)^g \pi_1(s)$, where $g \geq 0$ and $\pi_1(s)$ is a unit of $\mathbf{F}_{M' \cup \{(s-a)\}}(s)$. Therefore $\pi_1(s)$ factorizes in $\operatorname{Specm}(\mathbf{F}[s]) \setminus (M' \cup \{(s-a)\})$. Since $t(\pi(s)) = h(\pi_1(s))$, it factorizes in (recall that we are assuming $b = 0$) $H(\operatorname{Specm}(\mathbf{F}[s]) \setminus (M' \cup \{(s-a)\})) = \operatorname{Specm}(\mathbf{F}[s]) \setminus (H(M') \cup \{(s)\})$. So, $t(\pi(s))$ is a unit of $\mathbf{F}_{H(M')}(s)$.

**Lemma 34.** *Let $a \in \mathbf{F}$ be an arbitrary element. Then:*

*1. If $M' \subseteq \operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s-a)\}$ and $U(s) \in \operatorname{Gl}_m(\mathbf{F}_{M'}(s))$ then $U(f(s)) \in \operatorname{Gl}_m(\mathbf{F}_{H(M')}(s))$.*

*2. If $U(s) \in \operatorname{Gl}_m(\mathbf{F}_{s-a}(s))$ then $U(f(s)) \in \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$.*

*3. If $U(s) \in \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$ then $U(f(s)) \in \operatorname{Gl}_m(\mathbf{F}_{s}(s))$.*

*4. If $M \subseteq \operatorname{Specm}(\mathbf{F}[s])$ with $(s-a) \in M$ and $U(s) \in \operatorname{Gl}_m(\mathbf{F}_{M}(s))$ then $U(f(s)) \in \operatorname{Gl}_m(\mathbf{F}_{H(M \setminus \{(s-a)\})}(s)) \cap \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$.*


**Proof.-** Let $\frac{p(s)}{q(s)}$ be a rational function with $p(s), q(s) \in \mathbf{F}[s]$. Then


$$\frac{p(f(s))}{q(f(s))} = \frac{s^{d(p(s))} p(f(s))}{s^{d(q(s))} q(f(s))} s^{d(q(s)) - d(p(s))} = \frac{t(p(s))}{t(q(s))} s^{d(q(s)) - d(p(s))}.\tag{78}$$

1. Assume that $U(s) \in \operatorname{Gl}_m(\mathbf{F}_{M'}(s))$ and let $\frac{p(s)}{q(s)}$ be any element of $U(s)$. Therefore $q(s)$ is a unit of $\mathbf{F}_{M'}(s)$ and, by Lemma 33.2, $t(q(s))$ is a unit of $\mathbf{F}_{H(M')}(s)$. Moreover, $s$ is also a unit of $\mathbf{F}_{H(M')}(s)$. Hence, $\frac{p(f(s))}{q(f(s))} \in \mathbf{F}_{H(M')}(s)$. Furthermore, if $\det U(s) = \frac{\tilde{p}(s)}{\tilde{q}(s)}$, it is a unit of $\mathbf{F}_{M'}(s)$ and $\det U(f(s)) = \frac{\tilde{p}(f(s))}{\tilde{q}(f(s))}$ is a unit of $\mathbf{F}_{H(M')}(s)$.

2. If $\frac{p(s)}{q(s)}$ is any element of $U(s) \in \operatorname{Gl}_m(\mathbf{F}_{s-a}(s))$ then $q(s) \in \mathbf{F}[s] \setminus (s-a)$ and so $d(h(q(s))) = d(q(s))$. Since $s-a$ may divide $p(s)$, we have that $d(t(p(s))) \leq d(p(s))$. Hence, $d(h(q(s))) - d(q(s)) \geq d(t(p(s))) - d(p(s))$ and $\frac{p(f(s))}{q(f(s))} = \frac{t(p(s))}{h(q(s))} s^{d(q(s)) - d(p(s))} \in \mathbf{F}_{pr}(s)$. Moreover, if $\det U(s) = \frac{\tilde{p}(s)}{\tilde{q}(s)}$ then $\tilde{p}(s), \tilde{q}(s) \in \mathbf{F}[s] \setminus (s-a)$, $d(h(\tilde{p}(s))) = d(\tilde{p}(s))$ and $d(h(\tilde{q}(s))) = d(\tilde{q}(s))$. Thus, $\det U(f(s)) = \frac{h(\tilde{p}(s))}{h(\tilde{q}(s))} s^{d(\tilde{q}(s)) - d(\tilde{p}(s))}$ is a biproper rational function, i.e., a unit of $\mathbf{F}_{pr}(s)$.

3. If $U(s) \in \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$ and $\frac{p(s)}{q(s)}$ is any element of $U(s)$ then $d(q(s)) \geq d(p(s))$. Since $\frac{p(f(s))}{q(f(s))} = \frac{t(p(s))}{t(q(s))} s^{d(q(s)) - d(p(s))}$ and $t(p(s)), t(q(s)) \in \mathbf{F}[s] \setminus (s)$, we obtain that $U(f(s)) \in \mathbf{F}_{s}(s)^{m \times m}$. In addition, if $\det U(s) = \frac{\tilde{p}(s)}{\tilde{q}(s)}$, which is a unit of $\mathbf{F}_{pr}(s)$, then $d(\tilde{q}(s)) = d(\tilde{p}(s))$ and, since $t(\tilde{p}(s)), t(\tilde{q}(s)) \in \mathbf{F}[s] \setminus (s)$, we conclude that $\det U(f(s)) = \frac{t(\tilde{p}(s))}{t(\tilde{q}(s))}$ is a unit of $\mathbf{F}_{s}(s)$.

4. It is a consequence of 1., 2. and Remark 1.2.

**Proposition 35.** *Let $M \subseteq \operatorname{Specm}(\mathbf{F}[s])$ and $(s-a) \in M$. If $T(s) \in \mathbf{F}(s)^{m \times m}$ is non-singular with $\frac{n_i(s)}{d_i(s)} = (s-a)^{g_i}\frac{\varepsilon_i(s)}{\psi_i(s)}$ ($\varepsilon_i(s), \psi_i(s) \in \mathbf{F}[s] \setminus (s-a)$) as invariant rational functions with respect to $M$, then $T(f(s))^T \in \mathbf{F}(s)^{m \times m}$ is a non-singular matrix with $\frac{1}{c_i}\frac{h(\varepsilon_i(s))}{h(\psi_i(s))} s^{-g_i + d(\psi_i(s)) - d(\varepsilon_i(s))}$ as invariant rational functions in $\mathbf{F}_{H(M \setminus \{(s-a)\})}(s)^{m \times m} \cap \mathbf{F}_{pr}(s)^{m \times m}$, where $c_i = \frac{\varepsilon_i(a)}{\psi_i(a)}$.*

**Proof.-** Since $(s-a)^{g_i}\frac{\varepsilon_i(s)}{\psi_i(s)}$ are the invariant rational functions of $T(s)$ with respect to $M$, there are $U_1(s), U_2(s) \in \operatorname{Gl}_m(\mathbf{F}_M(s))$ such that

$$T(s) = U_1(s) \operatorname{Diag}\left((s-a)^{g_1} \frac{\varepsilon_1(s)}{\psi_1(s)}, \dots, (s-a)^{g_m} \frac{\varepsilon_m(s)}{\psi_m(s)}\right) U_2(s). \tag{79}$$


Notice that $(f(s) - a)^{g_i} \frac{\varepsilon_i(f(s))}{\psi_i(f(s))} = \frac{h(\varepsilon_i(s))}{h(\psi_i(s))} s^{-g_i + d(\psi_i(s)) - d(\varepsilon_i(s))}$. Let $c_i = \frac{\varepsilon_i(a)}{\psi_i(a)}$, which is a non-zero constant, and put $D = \operatorname{Diag}(c_1, \dots, c_m)$. Hence,

$$T(f(s))^T = U_2(f(s))^T D L(s) U_1(f(s))^T \tag{80}$$

with

$$L(s) = \operatorname{Diag}\left(\frac{1}{c_1} \frac{h(\varepsilon_1(s))}{h(\psi_1(s))} s^{-g_1+d(\psi_1(s))-d(\varepsilon_1(s))}, \dots, \frac{1}{c_m} \frac{h(\varepsilon_m(s))}{h(\psi_m(s))} s^{-g_m+d(\psi_m(s))-d(\varepsilon_m(s))}\right). \tag{81}$$

By 4 of Lemma 34, the matrices $U_1(f(s))^T, U_2(f(s))^T \in \operatorname{Gl}_m(\mathbf{F}_{H(M \setminus \{(s-a)\})}(s)) \cap \operatorname{Gl}_m(\mathbf{F}_{pr}(s))$, and the proposition follows.
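The identity $(f(s)-a)^{g_i}\,\frac{\varepsilon_i(f(s))}{\psi_i(f(s))} = \frac{h(\varepsilon_i(s))}{h(\psi_i(s))}\,s^{-g_i+d(\psi_i(s))-d(\varepsilon_i(s))}$ used at the start of this proof can be checked symbolically for a concrete choice of data (here $b = 0$, so $f(s) = a + 1/s$; the value of $a$, the exponent and the polynomials are our own choices):

```python
import sympy as sp

s = sp.symbols('s')
a, g = 2, 3                          # sample value of a and exponent g_i
f = a + 1/s                          # f(s) with b = 0

def h(poly):
    """h_{a,0} from (72), restricted to polynomials coprime with (s - a)."""
    return sp.cancel(s**sp.degree(poly, s) * poly.subs(s, f))

eps = s**2 + 1                       # eps(a) = 5 != 0: coprime with (s - a)
psi = s + 1                          # psi(a) = 3 != 0: coprime with (s - a)

lhs = (f - a)**g * eps.subs(s, f) / psi.subs(s, f)
rhs = (h(eps) / h(psi)) * s**(-g + sp.degree(psi, s) - sp.degree(eps, s))
print(sp.simplify(lhs - rhs))        # 0: both sides agree
```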

**Proposition 36.** *Let $M, M' \subseteq \operatorname{Specm}(\mathbf{F}[s])$ be such that $M \cup M' = \operatorname{Specm}(\mathbf{F}[s])$. Assume that there are ideals in $M \setminus M'$ generated by linear polynomials and let $(s-a)$ be any of them. If $T(s) \in \mathbf{F}(s)^{m \times m}$ is a non-singular rational matrix with no poles and no zeros in $M \cap M'$ and $k_1, \dots, k_m$ as left Wiener–Hopf factorization indices with respect to $M$, then $T(f(s))^T \in \mathbf{F}(s)^{m \times m}$ is a non-singular rational matrix with no poles and no zeros in $H(M \cap M')$ and $-k_m, \dots, -k_1$ as left Wiener–Hopf factorization indices with respect to $H(M') \cup \{(s)\}$.*

**Proof.-** By Theorem 19 there are matrices $U_1(s)$ invertible in $\mathbf{F}_{M'}(s)^{m \times m} \cap \mathbf{F}_{pr}(s)^{m \times m}$ and $U_2(s)$ invertible in $\mathbf{F}_{M}(s)^{m \times m}$ such that $T(s) = U_1(s) \operatorname{Diag}\left((s-a)^{k_1}, \dots, (s-a)^{k_m}\right) U_2(s)$. By Lemma 34, $U_2(f(s))^T$ is invertible in $\mathbf{F}_{H(M \setminus \{(s-a)\})}(s)^{m \times m} \cap \mathbf{F}_{pr}(s)^{m \times m}$ and $U_1(f(s))^T$ is invertible in $\mathbf{F}_{H(M')}(s)^{m \times m} \cap \mathbf{F}_{s}(s)^{m \times m} = \mathbf{F}_{H(M') \cup \{(s)\}}(s)^{m \times m}$. Moreover, $H(M \setminus \{(s-a)\}) \cup H(M') \cup \{(s)\} = \operatorname{Specm}(\mathbf{F}[s])$ and $H(M \setminus \{(s-a)\}) \cap (H(M') \cup \{(s)\}) = H(M \cap M')$. Thus, $T(f(s))^T = U_2(f(s))^T \operatorname{Diag}\left(s^{-k_1}, \dots, s^{-k_m}\right) U_1(f(s))^T$ has no poles and no zeros in $H(M \cap M')$, and $-k_m, \dots, -k_1$ are its left Wiener–Hopf factorization indices with respect to $H(M') \cup \{(s)\}$.

#### *5.2.1. Proof of Theorem 29: Sufficiency*

Let $k_1 \geq \cdots \geq k_m$ be integers, $\frac{\varepsilon_1(s)}{\psi_1(s)}, \dots, \frac{\varepsilon_m(s)}{\psi_m(s)}$ irreducible rational functions such that $\varepsilon_1(s) \mid \cdots \mid \varepsilon_m(s)$ and $\psi_m(s) \mid \cdots \mid \psi_1(s)$ are monic polynomials factorizing in $M' \setminus M$, and $l_1, \dots, l_m$ integers such that $l_1 + d(\psi_1(s)) - d(\varepsilon_1(s)) \leq \cdots \leq l_m + d(\psi_m(s)) - d(\varepsilon_m(s))$, satisfying (59).

Since $\varepsilon_i(s)$ and $\psi_i(s)$ are coprime polynomials that factorize in $M' \setminus M$ and $(s-a) \in M \setminus M'$, by Lemmas 32 and 33, $\frac{h(\varepsilon_1(s))}{h(\psi_1(s))} s^{l_1 + d(\psi_1(s)) - d(\varepsilon_1(s))}, \dots, \frac{h(\varepsilon_m(s))}{h(\psi_m(s))} s^{l_m + d(\psi_m(s)) - d(\varepsilon_m(s))}$ are irreducible rational functions with numerators and denominators polynomials factorizing in $H(M') \cup \{(s)\}$ (actually, in $H(M' \setminus M) \cup \{(s)\}$), and such that each numerator divides the next one and each denominator divides the previous one.

By (59) and Theorem 27 there is a matrix $G(s) \in \mathbf{F}(s)^{m \times m}$ with $-k_m, \dots, -k_1$ as left Wiener–Hopf factorization indices with respect to $H(M') \cup \{(s)\}$ and $\frac{1}{c_1}\frac{h(\varepsilon_1(s))}{h(\psi_1(s))} s^{l_1 + d(\psi_1(s)) - d(\varepsilon_1(s))}, \dots, \frac{1}{c_m}\frac{h(\varepsilon_m(s))}{h(\psi_m(s))} s^{l_m + d(\psi_m(s)) - d(\varepsilon_m(s))}$ as invariant rational functions with respect to $H(M') \cup \{(s)\}$, where $c_i = \frac{\varepsilon_i(a)}{\psi_i(a)}$, $i = 1, \dots, m$. Notice that $G(s)$ has no zeros and poles in $H(M \cap M')$ because the numerator and denominator of each rational function $\frac{h(\varepsilon_i(s))}{h(\psi_i(s))} s^{l_i + d(\psi_i(s)) - d(\varepsilon_i(s))}$ factorizes in $H(M' \setminus M) \cup \{(s)\}$ and so it is a unit of $\mathbf{F}_{H(M \cap M')}(s)$.

Put $\overline{M} = H(M') \cup \{(s)\}$ and $\overline{M}' = H(M \setminus \{(s-a)\})$. As remarked in the proof of Proposition 36, $\overline{M} \cup \overline{M}' = \operatorname{Specm}(\mathbf{F}[s])$ and $\overline{M} \cap \overline{M}' = H(M \cap M')$. Now $(s) \in \overline{M}$, so we can apply Proposition 35 to $G(s)$ with the change of indeterminate $\overline{f}(s) = \frac{1}{s-a}$. Thus the invariant rational functions of $G(\overline{f}(s))^T$ in $\mathbf{F}_{M'}(s) \cap \mathbf{F}_{pr}(s)$ are $\frac{\varepsilon_1(s)}{\psi_1(s)}\frac{1}{(s-a)^{l_1}}, \dots, \frac{\varepsilon_m(s)}{\psi_m(s)}\frac{1}{(s-a)^{l_m}}$.

On the other hand, $\overline{M}' = H(M \setminus \{(s-a)\}) \subseteq \operatorname{Specm}(\mathbf{F}[s]) \setminus \{(s)\}$ and so $(s) \in \overline{M} \setminus \overline{M}'$. Then we can apply Proposition 36 to $G(s)$ with $\overline{f}(s) = \frac{1}{s-a}$, so that $G(\overline{f}(s))^T$ is a non-singular matrix with no poles and no zeros in $H^{-1}(\overline{M} \cap \overline{M}') = H^{-1}(H(M \cap M')) = M \cap M'$ and $k_1, \dots, k_m$ as left Wiener–Hopf factorization indices with respect to $H^{-1}(\overline{M}') \cup \{(s-a)\} = (M \setminus \{(s-a)\}) \cup \{(s-a)\} = M$. The theorem follows by letting $T(s) = G(\overline{f}(s))^T$.

**Remark 37.** Notice that when $M' = \emptyset$ and $M = \operatorname{Specm}(\mathbf{F}[s])$ in Theorem 29 we obtain Theorem 28 ($q_i = -l_i$).

#### **Author details**


*<sup>ψ</sup>i*(*f*(*s*)) <sup>=</sup> *<sup>h</sup>*(*�i*(*s*))

A. Amparan, S. Marcaida, I. Zaballa *Universidad del País Vasco/Euskal Herriko Unibertsitatea UPV/EHU, Spain*

#### **6. References**

	- [12] Kailath, T. [1980]. *Linear systems*, Prentice Hall, New Jersey.
	- [13] Newman, M. [1972]. *Integral matrices*, Academic Press, New York and London.
	- [14] Rosenbrock, H. H. [1970]. *State-space and multivariable theory*, Thomas Nelson and Sons, London.
	- [15] Vardulakis, A. I. G. [1991]. *Linear multivariable control*, John Wiley and Sons, New York.
	- [16] Vidyasagar, M. [1985]. *Control system synthesis. A factorization approach*, The MIT Press, New York.
	- [17] Wolovich, W. A. [1974]. *Linear multivariable systems*, Springer-Verlag, New York.
	- [18] Zaballa, I. [1997]. Controllability and hermite indices of matrix pairs, *Int. J. Control* 68(1): 61–86.

## **Gauge Theory, Combinatorics, and Matrix Models**

Taro Kimura



http://dx.doi.org/10.5772/46481

**1. Introduction**

Quantum field theory is the most universal method in physics, applied to all areas from condensed-matter physics to high-energy physics. The standard tool for dealing with quantum field theory is the perturbation method, which is quite useful if we know the vacuum of the system, namely the starting point of our analysis. On the other hand, sometimes the vacuum itself is not obvious due to the quantum nature of the system. In that case, since the perturbative method is no longer available, we have to treat the theory in a non-perturbative way.

Supersymmetric gauge theory plays an important role in the study of the non-perturbative aspects of quantum field theory. The milestone paper by Seiberg and Witten proposed a *solution* to N = 2 supersymmetric gauge theory [48, 49], which completely describes the low energy effective behavior of the theory. Their solution can be written down in terms of an auxiliary complex curve, called the *Seiberg-Witten curve*, but its meaning was not yet clear and its origin was still mysterious. Since the establishment of Seiberg-Witten theory, a tremendous number of works have been devoted to understanding the Seiberg-Witten solution, not only by physicists but also by mathematicians. In this sense the solution was not a *solution* at that time, but just a *starting point* of the exploration.

One of the most remarkable advances in N = 2 theories building on Seiberg-Witten theory is the exact derivation of the gauge theory partition function by performing the integral over the instanton moduli space [43]. The partition function is written down in terms of multiple partitions, thus we can discuss it in a combinatorial way. It was mathematically proved that the partition function correctly reproduces the Seiberg-Witten solution. This means that Seiberg-Witten theory was mathematically established at that time.

The recent progress on the four dimensional N = 2 supersymmetric gauge theory has revealed a remarkable relation to the two dimensional conformal field theory [1]. This relation provides the explicit interpretation of the partition function of the four dimensional gauge theory as the conformal block of the two dimensional Liouville field theory. It is naturally regarded as a consequence of the M-brane compactifications [23, 60], and also reproduces the results of Seiberg-Witten theory. It shows how the Seiberg-Witten curve characterizes the corresponding four dimensional gauge theory, and thus we can obtain a novel viewpoint on Seiberg-Witten theory.

Based on the connection between the two and four dimensional theories, established results on the two dimensional side can be reconsidered from the viewpoint of the four dimensional theory, and vice versa. One of the useful applications is the matrix model description of the supersymmetric gauge theory [12, 16, 17, 47]. This is based on the fact that the conformal block on the sphere can also be regarded as a matrix integral, which is called the Dotsenko-Fateev integral representation [14, 15]. In this direction some extensions of the matrix model description have been performed by starting with the two dimensional conformal field theory.

Another type of matrix model has also been investigated [27, 28, 30, 52, 53]. This is apparently different from the Dotsenko-Fateev type matrix models, but both of them correctly reproduce the results of the four dimensional gauge theory, e.g. the Seiberg-Witten curve. While these studies mainly focus on rederiving the gauge theory results, the present author has revealed a new kind of Seiberg-Witten curve by studying the corresponding new matrix model [27, 28]. Such a matrix model is directly derived from the combinatorial representation of the partition function by considering its asymptotic behavior. This treatment is quite analogous to the matrix integral representation of combinatorial objects, for example, the longest increasing subsequences in random permutations [3], the non-equilibrium stochastic model called TASEP [26], and so on (see also [46]). Their remarkable connection to the Tracy-Widom distribution [56] can be understood from the viewpoint of random matrix theory through the Robinson-Schensted-Knuth (RSK) correspondence (see e.g. [51]).

In this article we review such a universal relation between combinatorics and the matrix model, and discuss its relation to the gauge theory. The gauge theory consequences can be naturally extracted from such a matrix model description. Actually the spectral curve of the matrix model can be interpreted as the Seiberg-Witten curve for N = 2 supersymmetric gauge theory. This identification suggests that some aspects of the gauge theory are also described by the significant universality of the matrix model.

This article is organized as follows. In section 2 we introduce statistical models defined in a combinatorial manner. These models are based on the Plancherel measure on a combinatorial object, and their origin from the gauge theory perspective is also discussed. In section 3 it is shown that the matrix model is derived from the combinatorial model by considering its asymptotic limit. There are various matrix integral representations, corresponding to some deformations of the combinatorial model. In section 4 we investigate the large matrix size limit of the matrix model. It is pointed out that the algebraic curve is quite useful to study the one-point function. Its relation to Seiberg-Witten theory is also discussed. Section 5 is devoted to the conclusion.

## **2. Combinatorial partition function**

In this section we introduce several kinds of combinatorial models. Their partition functions are defined as summation over partitions with a certain weight function, which is called *Plancherel measure*. It is also shown that such a combinatorial partition function is obtained by performing the path integral for supersymmetric gauge theories.

**Figure 1.** Graphical representation of a partition $\lambda = (5, 4, 2, 1, 1)$ and its transposed partition $\check\lambda = (5, 3, 2, 2, 1)$ by the associated Young diagrams. There are 5 non-zero entries in both of them, $\ell(\lambda) = \check\lambda_1 = 5$ and $\ell(\check\lambda) = \lambda_1 = 5$.

#### **2.1. Random partition model**


Let us first recall a partition of a positive integer *n*: it is a way of writing *n* as a sum of positive integers

$$
\lambda = (\lambda\_1, \lambda\_2, \dots, \lambda\_{\ell(\lambda)}) \tag{1}
$$

satisfying the following conditions,

$$n = \sum_{i=1}^{\ell(\lambda)} \lambda_i \equiv |\lambda|, \qquad \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{\ell(\lambda)} > 0 \tag{2}$$

Here $\ell(\lambda)$ is the number of non-zero entries in $\lambda$. It is convenient to define $\lambda_i = 0$ for $i > \ell(\lambda)$. Fig. 1 shows the *Young diagram*, which graphically describes a partition $\lambda = (5, 4, 2, 1, 1)$ with $\ell(\lambda) = 5$.
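As a quick illustration (ours, not from the chapter), partitions and their transposes are easy to generate programmatically; `partitions` and `conjugate` below are hypothetical helper names:

```python
def partitions(n, max_part=None):
    """Generate all partitions of n as non-increasing tuples of positive integers."""
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def conjugate(lam):
    """Transposed partition: column j of the Young diagram counts the parts > j."""
    return tuple(sum(1 for p in lam if p > j) for j in range(lam[0])) if lam else ()

lam = (5, 4, 2, 1, 1)
print(conjugate(lam))            # (5, 3, 2, 2, 1)
print(len(list(partitions(5))))  # 7, the number of partitions of n = 5
```

Note that the height relation $\ell(\lambda) = \check\lambda_1$ holds by construction: the first column counts all non-zero rows.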

It is known that partitions are quite useful in representation theory. The irreducible representations of the symmetric group $\mathfrak{S}_n$ are in one-to-one correspondence with the partitions $\lambda$ with $|\lambda| = n$. For such a finite group, one can define a natural measure, which is called the *Plancherel measure*,

$$
\mu\_{\mathfrak{n}}(\lambda) = \frac{(\dim \lambda)^2}{n!} \tag{3}
$$

This measure is normalized as

$$\sum\_{\substack{\lambda \text{ s.t. } |\lambda|=n}} \mu\_n(\lambda) = 1 \tag{4}$$

It is also interpreted as the Fourier transform of the Haar measure on the group. This measure has another useful representation, which is described in a combinatorial way,

$$\mu\_{\mathfrak{n}}(\lambda) = n! \prod\_{(i,j)\in\lambda} \frac{1}{h(i,j)^2} \tag{5}$$

Here $h(i,j)$ is called the *hook length*, which is defined in terms of the *arm length* and the *leg length*,

$$\begin{aligned} h(i,j) &= a(i,j) + l(i,j) + 1, \\ a(i,j) &= \lambda_i - j, \\ l(i,j) &= \check\lambda_j - i \end{aligned} \tag{6}$$

Here $\check\lambda$ stands for the transposed partition. Thus the height of a partition $\lambda$ can be explicitly written as $\ell(\lambda) = \check\lambda_1$.

**Figure 2.** Combinatorics of Young diagram. Definitions of hook, arm and leg lengths are shown in (6). For the shaded box in this figure, *a*(2, 3) = 4, *l*(2, 3) = 3, and *h*(2, 3) = 8.
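The hook lengths (6) and the combinatorial form (5) of the Plancherel measure are straightforward to check numerically; the following sketch (helper names are ours) verifies the normalization (4) exactly for n = 5, using rational arithmetic:

```python
from fractions import Fraction
from math import factorial

def partitions(n, max_part=None):
    """All partitions of n as non-increasing tuples."""
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def hook(lam, i, j):
    """h(i,j) = a(i,j) + l(i,j) + 1 for the 1-based box (i,j), cf. (6)."""
    arm = lam[i - 1] - j                      # a(i,j) = lambda_i - j
    leg = sum(1 for p in lam if p >= j) - i   # l(i,j) = (transposed lambda)_j - i
    return arm + leg + 1

def plancherel(lam):
    """mu_n(lambda) = n! * product over boxes of 1/h(i,j)^2, cf. (5)."""
    n = sum(lam)
    mu = Fraction(factorial(n))
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            mu /= hook(lam, i, j) ** 2
    return mu

# normalization (4): the measure sums to exactly 1 over partitions of fixed n
print(sum(plancherel(lam) for lam in partitions(5)) == 1)  # True
```

The exact normalization reflects the hook length formula $\dim\lambda = n!/\prod h(i,j)$, so that $\mu_n(\lambda) = (\dim\lambda)^2/n!$ in (3) and (5) agree.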

With this combinatorial measure, we now introduce the following partition function,

$$Z_{\mathrm{U}(1)} = \sum_{\lambda} \left(\frac{\Lambda}{\hbar}\right)^{2|\lambda|} \prod_{(i,j)\in\lambda} \frac{1}{h(i,j)^2} \tag{7}$$

This model is often called the *random partition model*. Here $\Lambda$ is regarded as a parameter like a *chemical potential*, or a *fugacity*, and $\hbar$ stands for the size of the boxes.

Note that a deformed model, which includes higher Casimir potentials, is also investigated in detail [19],

$$Z_{\text{higher}} = \sum_{\lambda} \prod_{(i,j)\in\lambda} \frac{1}{h(i,j)^2} \prod_{k=1} e^{-g_k C_k(\lambda)} \tag{8}$$

In this case the chemical potential term is absorbed by the linear potential term. There is an interesting interpretation of this deformation in terms of topological string, gauge theory and so on [18, 38].

In order to compute the U(1) partition function it is useful to rewrite it in a "canonical form" instead of the "grand canonical form" which is originally shown in (7),

$$Z_{\mathrm{U}(1)} = \sum_{n=0}^{\infty} \sum_{\lambda \text{ s.t. } |\lambda|=n} \left(\frac{\Lambda}{\hbar}\right)^{2n} \prod_{(i,j)\in\lambda} \frac{1}{h(i,j)^2} \tag{9}$$

Due to the normalization condition (4), this partition function can be computed as

$$Z_{\mathrm{U}(1)} = \exp\left[\left(\frac{\Lambda}{\hbar}\right)^{2}\right] \tag{10}$$

Although this is explicitly solvable, its universal property and explicit connections to other models are not yet obvious. We will show, in section 3 and section 4, that the matrix model description plays an important role in discussing such interesting aspects of the combinatorial model.
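The closed form (10) can be confirmed by truncating the sum (9) at a finite number of boxes; a minimal Python check (the cutoff N and the value of x = Λ/ℏ are arbitrary choices of ours):

```python
from math import exp

def partitions(n, max_part=None):
    """All partitions of n as non-increasing tuples."""
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def hook_product(lam):
    """Product of hook lengths h(i,j) = a + l + 1 over all boxes of lam."""
    prod = 1
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            arm = row - j
            leg = sum(1 for p in lam if p >= j) - i
            prod *= arm + leg + 1
    return prod

x, N = 0.7, 12   # x = Lambda/hbar; truncate the sum at |lambda| = N
z = sum(x ** (2 * sum(lam)) / hook_product(lam) ** 2
        for n in range(N + 1) for lam in partitions(n))
print(abs(z - exp(x ** 2)) < 1e-9)  # True: Z_U(1) = exp((Lambda/hbar)^2)
```

The truncation converges very fast because, by the normalization (4), each fixed-n layer contributes $x^{2n}/n!$.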

Let us now remark on one interesting observation, which is partially related to the following discussion. The combinatorial partition function (7) has another field theoretical representation using the free boson field [44]. We now consider the following coherent state,

$$|\psi\rangle = \exp\left(\frac{\Lambda}{\hbar}a\_{-1}\right)|0\rangle \tag{11}$$

Here we introduce the Heisenberg algebra, satisfying the commutation relation $[a_n, a_m] = n\delta_{n+m,0}$, and the vacuum $|0\rangle$ annihilated by all positive modes, $a_n|0\rangle = 0$ for $n > 0$. Then it is easy to show that the norm of this state gives rise to the partition function,

$$Z\_{\mathbf{U}(1)} = \langle \psi | \psi \rangle \tag{12}$$

Similar observations have also been made for the generalized combinatorial models introduced in section 2.2 [22, 44, 55].

Let us then introduce some generalizations of the U(1) model. The first is what we call the *β-deformed model*, including an arbitrary parameter *β* ∈ **R**,

$$Z\_{\mathbf{U}(1)}^{(\beta)} = \sum\_{\lambda} \left(\frac{\Lambda}{\hbar}\right)^{2|\lambda|} \prod\_{(i,j)\in\lambda} \frac{1}{h\_{\beta}(i,j)h^{\beta}(i,j)}\tag{13}$$

Here we involve the deformed hook lengths,


$$h_{\beta}(i,j) = a(i,j) + \beta\, l(i,j) + 1, \qquad h^{\beta}(i,j) = a(i,j) + \beta\, l(i,j) + \beta \tag{14}$$

This generalized model corresponds to the Jack polynomial, which is a kind of symmetric polynomial obtained by introducing a free parameter into the Schur polynomial [34]. The Jack polynomial is applied to several physical theories: the quantum integrable model called the Calogero-Sutherland model [10, 54], the quantum Hall effect [4–6], and so on.

The second is a further generalized model involving two free parameters,

$$Z\_{\mathbf{U}(1)}^{(q,t)} = \sum\_{\lambda} \left(\frac{\Lambda}{\hbar}\right)^{2|\lambda|} \prod\_{(i,j)\in\lambda} \frac{(1-q)(1-q^{-1})}{(1-q^{a(i,j)+1}t^{l(i,j)})(1-q^{-a(i,j)}t^{-l(i,j)-1})} \tag{15}$$

This is just a *q*-analog of the previous combinatorial model. One can see that it reduces to the *β*-deformed model (13) in the limit $q \to 1$ with $t = q^\beta$ fixed. This generalization is also related to a symmetric polynomial, which is called the *Macdonald polynomial* [34]. This symmetric polynomial is used to study the Ruijsenaars-Schneider model [45], and a stochastic process based on this function has recently been proposed [8].

Next is the **Z***r*-generalization of the model, which is defined as

$$Z_{\text{orbifold},\mathrm{U}(1)} = \sum_{\lambda} \left(\frac{\Lambda}{\hbar}\right)^{2|\lambda|} \prod_{\substack{(i,j)\in\lambda \\ \Gamma\text{-inv}}} \frac{1}{h(i,j)^2} \tag{16}$$


**Figure 3.** Γ-invariant sector for U(1) theory with $\lambda = (8, 5, 5, 4, 2, 2, 2, 1)$. Numbers in boxes stand for their hook lengths $h(i,j) = \lambda_i - j + \check\lambda_j - i + 1$. Shaded boxes are invariant under the action of $\Gamma = \mathbb{Z}_3$.

Here the product is taken only for the Γ-invariant sector as shown in Fig. 3,

$$h(i,j) = a(i,j) + l(i,j) + 1 \equiv 0 \pmod{r} \tag{17}$$

This restriction is considered in order to study the four dimensional supersymmetric gauge theory on orbifold **R**4/**Z***<sup>r</sup>* ∼= **C**2/**Z***<sup>r</sup>* [11, 20, 27], thus we call this *orbifold partition function*. This also corresponds to a certain symmetric polynomial [57] (see also [32]), which is related to the Calogero-Sutherland model involving spin degrees of freedom. We can further generalize this model (16) to the *β*- or the *q*-deformed **Z***r*-orbifold model, and the generic toric orbifold model [28].
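For the partition of Fig. 3, the Γ-invariant sector (17) can be enumerated directly; a small sketch (the helper code is ours, not from the text):

```python
lam = (8, 5, 5, 4, 2, 2, 2, 1)   # the partition of Fig. 3
r = 3                            # Gamma = Z_3

def hooks(lam):
    """Hook lengths h(i,j) = lambda_i - j + (transposed lambda)_j - i + 1."""
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            yield row - j + sum(1 for p in lam if p >= j) - i + 1

invariant = [h for h in hooks(lam) if h % r == 0]
print(len(list(hooks(lam))), len(invariant))  # 29 boxes, 9 of them Gamma-invariant
```

Only the 9 boxes whose hook length is a multiple of 3 contribute to the product in (16) for this λ.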

Let us comment on a relation between the orbifold partition function and the *q*-deformed model. Taking the limit $q \to 1$, the latter reduces to the U(1) model because the *q*-integer is simply replaced by the usual integer in this limit,

$$[x]_q \equiv \frac{1 - q^{-x}}{1 - q^{-1}} \longrightarrow x \tag{18}$$

This can be easily shown by l'Hôpital's rule and so on. On the other hand, parametrizing $q \to \omega_r q$ with $\omega_r = \exp(2\pi i/r)$ being the primitive *r*-th root of unity, we have

$$\frac{1 - (\omega_r q)^{-x}}{1 - (\omega_r q)^{-1}} \xrightarrow{q \to 1} \begin{cases} x & (x \equiv 0 \bmod r) \\ 1 & (x \not\equiv 0 \bmod r) \end{cases} \tag{19}$$

Therefore the orbifold partition function (16) is derived from the *q*-deformed one (15) by taking this root of unity limit. This prescription is useful to study its asymptotic behavior.
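The plain q → 1 limit (18) is easy to verify numerically (the function name below is ours):

```python
def q_int(x, q):
    """The q-integer [x]_q = (1 - q**(-x)) / (1 - q**(-1)) from (18)."""
    return (1 - q ** (-x)) / (1 - q ** (-1))

# as q -> 1 the q-integer approaches the ordinary integer x
q = 1 - 1e-8
for x in range(1, 6):
    assert abs(q_int(x, q) - x) < 1e-5

print([round(q_int(x, 0.99), 4) for x in range(1, 4)])  # [1.0, 2.0101, 3.0304]
```

Away from q = 1 the deviation is visible at first order, consistent with the geometric-sum identity $[x]_q = 1 + q^{-1} + \cdots + q^{-(x-1)}$ for positive integer x.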

#### **2.2. Gauge theory partition function**

The path integral in quantum field theory involves some kinds of divergence, which are due to infinite degrees of freedom in the theory. On the other hand, we can exactly perform the path integral for several highly supersymmetric theories. We now show that the gauge theory partition function can be described in a combinatorial way, and yields some extended versions of the model we have introduced in section 2.1.

The main part of the gauge theory path integral is just the evaluation of the moduli space volume for a topological excitation, for example, a vortex in a two dimensional theory or an instanton in a four dimensional theory. Here we concentrate on the four dimensional case; see [13, 21, 50] for the two dimensional vortex partition function. The most useful method for dealing with the instanton is the ADHM construction [2]. According to this, the instanton moduli space for *k*-instantons in SU(*n*) gauge theory on **R**4 is written as a kind of hyper-Kähler quotient,


$$\mathcal{M}_{n,k} = \left\{ (B_1, B_2, I, J) \,\middle|\, \mu_{\mathbb{R}} = 0,\ \mu_{\mathbb{C}} = 0 \right\} / \mathrm{U}(k) \tag{20}$$

$$B_{1,2} \in \operatorname{Hom}(\mathbb{C}^k, \mathbb{C}^k), \quad I \in \operatorname{Hom}(\mathbb{C}^n, \mathbb{C}^k), \quad J \in \operatorname{Hom}(\mathbb{C}^k, \mathbb{C}^n) \tag{21}$$

$$
\mu_{\mathbb{R}} = [B_1, B_1^\dagger] + [B_2, B_2^\dagger] + II^\dagger - J^\dagger J, \tag{22}
$$

$$
\mu_{\mathbb{C}} = [B_1, B_2] + IJ \tag{23}
$$

The $k \times k$ matrix conditions $\mu_{\mathbb{R}} = \mu_{\mathbb{C}} = 0$ are called the ADHM equations, and parameters $(B_1, B_2, I, J)$ satisfying them are called ADHM data. Note that they are identified under the following U(*k*) transformation,

$$(B_1, B_2, I, J) \sim (g B_1 g^{-1},\ g B_2 g^{-1},\ g I,\ J g^{-1}), \qquad g \in \mathrm{U}(k) \tag{24}$$

Thus all we have to do is estimate the volume of this parameter space. However, it is well known that there are some singularities in this moduli space, so one has to regularize it in order to obtain a meaningful result. Its regularized volume was derived by applying the localization formula to the moduli space integral [41], and it was then shown that the partition function correctly reproduces Seiberg-Witten theory [43].

We then consider the action of isometries on $\mathbb{C}^2 \cong \mathbb{R}^4$ on the ADHM data. If we assign $(z_1, z_2) \to (e^{i\epsilon_1} z_1, e^{i\epsilon_2} z_2)$ for the spatial coordinates of $\mathbb{C}^2$, together with the $\mathrm{U}(1)^{n-1}$ rotation coming from the gauge symmetry SU(*n*), the ADHM data transform as

$$(B_1, B_2, I, J) \longrightarrow \left(T_1 B_1,\ T_2 B_2,\ I T_a^{-1},\ T_1 T_2 T_a J\right) \tag{25}$$

where we define the torus actions as $T_a = \operatorname{diag}(e^{ia_1}, \cdots, e^{ia_n}) \in \mathrm{U}(1)^{n-1}$ and $T_\alpha = e^{i\epsilon_\alpha} \in \mathrm{U}(1)^2$. Note that these toric actions are based on the maximal torus of the gauge theory symmetry, $\mathrm{U}(1)^2 \times \mathrm{U}(1)^{n-1} \subset \mathrm{SO}(4) \times \mathrm{SU}(n)$. We have to consider the fixed points of these isometries up to the gauge transformation $g \in \mathrm{U}(k)$ in order to apply the localization formula.

The localization formula in the instanton moduli space is based on the vector field *ξ*∗, which is associated with *<sup>ξ</sup>* <sup>∈</sup> <sup>U</sup>(1)<sup>2</sup> <sup>×</sup> <sup>U</sup>(1)*n*−1. It generates the one-parameter flow *<sup>e</sup>t<sup>ξ</sup>* on the moduli space M, corresponding to the isometries. The vector field is represented by the element of the maximal torus of the gauge theory symmetry under the Ω-background deformation. The gauge theory action is invariant under the deformed BRST transformation, whose generator satisfies *ξ*<sup>∗</sup> = {*Q*∗, *Q*∗}/2. Thus this generator can be interpreted as the equivariant derivative *d<sup>ξ</sup>* = *d* + *iξ*<sup>∗</sup> where *iξ*<sup>∗</sup> stands for the contraction with the vector field *ξ*∗. The localization formula is given by

$$\int_{\mathcal{M}} \alpha(\xi) = (-2\pi)^{n/2} \sum_{x_0} \frac{\alpha_0(\xi)(x_0)}{\det^{1/2} \mathcal{L}_{x_0}} \tag{26}$$


where $\alpha(\xi)$ is an equivariant form, which is related to the gauge theory action, $\alpha_0(\xi)$ is its zero degree part, and $\mathcal{L}_{x_0}: T_{x_0}\mathcal{M} \to T_{x_0}\mathcal{M}$ is the map generated by the vector field $\xi^*$ at the fixed points $x_0$. These fixed points are defined by $\xi^*(x_0) = 0$ up to U(*k*) transformations of the instanton moduli space.

Let us then study the fixed points in the moduli space. The fixed point conditions are obtained from the infinitesimal versions of (24) and (25) as

$$(\phi_i - \phi_j + \epsilon_a) B_{a,ij} = 0, \qquad (\phi_i - a_l) I_{il} = 0, \qquad (-\phi_i + a_l + \epsilon) J_{li} = 0 \tag{27}$$

where the element of the U(*k*) gauge transformation is diagonalized as $e^{i\phi} = \operatorname{diag}(e^{i\phi_1}, \cdots, e^{i\phi_k}) \in \mathrm{U}(k)$, with $\epsilon = \epsilon_1 + \epsilon_2$. We can show that an eigenvalue of $\phi$ turns out to be

$$a\_l + (j-1)\varepsilon\_1 + (i-1)\varepsilon\_2 \tag{28}$$

and the corresponding eigenvector is given by

$$B\_1^{j-1} B\_2^{i-1} I\_l \tag{29}$$

Since $\phi$ is a finite dimensional matrix, we can obtain $k_l$ independent vectors from (29) with $k_1 + \cdots + k_n = k$. This means that the solution of this condition can be characterized by $n$-tuple Young diagrams, or partitions $\vec\lambda = (\lambda^{(1)}, \cdots, \lambda^{(n)})$ [42]. Thus the characters of the vector spaces are given by

$$V = \sum_{l=1}^{n} \sum_{(i,j)\in\lambda^{(l)}} T_{a_l} T_1^{-j+1} T_2^{-i+1}, \qquad W = \sum_{l=1}^{n} T_{a_l} \tag{30}$$

and that of the tangent space at the fixed point under the isometries can be represented in terms of the *n*-tuple partition as

$$\begin{split} \chi_{\vec\lambda} &= -V^* V (1 - T_1)(1 - T_2) + W^* V + V^* W T_1 T_2 \\ &= \sum_{l,m}^{n} \sum_{(i,j)\in\lambda^{(l)}} \left( T_{a_{ml}} T_1^{-\check\lambda_j^{(l)}+i} T_2^{\lambda_i^{(m)}-j+1} + T_{a_{lm}} T_1^{\check\lambda_j^{(l)}-i+1} T_2^{-\lambda_i^{(m)}+j} \right) \end{split} \tag{31}$$

Here $\check\lambda$ denotes the conjugated partition. Therefore the instanton partition function is obtained by reading the weight function from the character [43, 44],

$$Z_{\mathrm{SU}(n)} = \sum_{\vec\lambda} \Lambda^{2n|\vec\lambda|} Z_{\vec\lambda} \tag{32}$$

$$Z_{\vec\lambda} = \prod_{l,m}^{n} \prod_{(i,j)\in\lambda^{(l)}} \frac{1}{a_{ml} + \varepsilon_2(\lambda_i^{(m)} - j + 1) - \varepsilon_1(\check\lambda_j^{(l)} - i)} \, \frac{1}{a_{lm} - \varepsilon_2(\lambda_i^{(m)} - j) + \varepsilon_1(\check\lambda_j^{(l)} - i + 1)} \tag{33}$$

This is regarded as a generalized model of (7) or (13). Furthermore, by lifting it to the five dimensional theory on $\mathbf{R}^4 \times S^1$, one can obtain a generalized version of the $q$-deformed partition function (15). It is easy to see that these $\mathrm{SU}(n)$ models reduce to the $\mathrm{U}(1)$ models in the case of $n = 1$. Note that if we take into account other matter contributions in addition to the vector multiplet, this partition function involves the associated combinatorial factors. We can extract various properties of the gauge theory from these partition functions, especially their asymptotic behavior.
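As a quick consistency check of the $n = 1$ reduction just mentioned, the sketch below evaluates the product (33) for a single partition with $a_{11} = 0$ and the self-dual choice $\varepsilon_1 = -\varepsilon_2 = 1$, and compares it with the $\mathrm{U}(1)$ hook-length weight $\prod 1/h(i,j)^2$ of (7). The function names and the use of exact rational arithmetic are our own conventions, not the chapter's.

```python
from fractions import Fraction

def conjugate(lam):
    """Transpose of the Young diagram: entry j counts rows with lambda_i >= j."""
    return [sum(1 for p in lam if p >= j) for j in range(1, lam[0] + 1)] if lam else []

def z_weight_n1(lam, e1=1, e2=-1):
    """|Z_lambda| from eq. (33) specialized to n = 1 with a_11 = 0."""
    ck = conjugate(lam)
    z = Fraction(1)
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            f1 = e2 * (row - j + 1) - e1 * (ck[j - 1] - i)
            f2 = -e2 * (row - j) + e1 * (ck[j - 1] - i + 1)
            z *= Fraction(1, f1 * f2)
    return abs(z)

def hook_weight(lam):
    """U(1) weight: product of 1/h(i,j)^2 with h = arm + leg + 1."""
    ck = conjugate(lam)
    w = Fraction(1)
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            w *= Fraction(1, ((row - j) + (ck[j - 1] - i) + 1) ** 2)
    return w
```

For every partition the two weights agree box by box, since for $n = 1$ each pair of factors in (33) combines into $-h(i,j)^2$ (in units of $\varepsilon_2$); e.g. both functions return $1/64$ for $\lambda = (3,1)$.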

#### **3. Matrix model description**

In this section we discuss the matrix model description of the combinatorial partition function. The matrix integral representation can be treated in a standard manner, which is developed in random matrix theory [40].

#### **3.1. Matrix integral**

Let us consider the following *N* × *N* matrix integral,

$$Z\_{\text{matrix}} = \int \mathcal{D}X \, e^{-\frac{1}{\hbar} \text{Tr} \, V(X)} \tag{34}$$

Here $X$ is an hermitian matrix, and $\mathcal{D}X$ is the associated matrix measure. This matrix can be diagonalized by a unitary transformation, $gXg^{-1} = \mathrm{diag}(x_1, \cdots, x_N)$ with $g \in \mathrm{U}(N)$, and the integrand is invariant under this transformation, $\mathrm{Tr}\, V(X) = \mathrm{Tr}\, V(gXg^{-1}) = \sum_{i=1}^{N} V(x_i)$. On the other hand, we have to take care of the matrix measure in (34): a non-trivial Jacobian arises from the matrix diagonalization (see, e.g. [40]),

$$\mathcal{D}X = \mathcal{D}x \mathcal{D}U \Delta(x)^2 \tag{35}$$

The Jacobian part is called the *Vandermonde determinant*, which is written as

$$\Delta(x) = \prod_{i<j}^{N} (x_j - x_i) \tag{36}$$

and $\mathcal{D}U$ is the Haar measure, which is invariant under unitary transformation, $\mathcal{D}(gU) = \mathcal{D}U$. The diagonal part is simply given by $\mathcal{D}x \equiv \prod_{i=1}^{N} dx_i$. Therefore, by integrating out the off-diagonal part, the matrix integral (34) is reduced to an integral over the matrix eigenvalues,

$$Z\_{\text{matrix}} = \int \mathcal{D}\mathbf{x} \,\Delta(\mathbf{x})^2 \, e^{-\frac{1}{\hbar} \sum\_{i=1}^{N} V(\mathbf{x}\_i)} \tag{37}$$

This expression holds up to a constant factor, associated with the volume of the unitary group, $\mathrm{vol}(\mathrm{U}(N))$, coming from the off-diagonal integral.
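The Jacobian (36) is the determinant of the Vandermonde matrix $V_{ij} = x_j^{\,i}$, $i = 0, \cdots, N-1$, which a brute-force check makes explicit (the sample points are arbitrary illustrations, not from the text):

```python
import itertools

def vandermonde(xs):
    """Delta(x) = prod_{i<j} (x_j - x_i), eq. (36)."""
    p = 1
    for i, j in itertools.combinations(range(len(xs)), 2):
        p *= xs[j] - xs[i]
    return p

def vandermonde_det(xs):
    """Determinant of the matrix M[i][j] = xs[j]**i via the permutation expansion."""
    n = len(xs)
    det = 0
    for perm in itertools.permutations(range(n)):
        inversions = sum(1 for a, b in itertools.combinations(range(n), 2)
                         if perm[a] > perm[b])
        term = (-1) ** inversions
        for i in range(n):
            term *= xs[perm[i]] ** i
        det += term
    return det
```

Both evaluations agree, e.g. they give 540 for the points (1, 2, 4, 7).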

When we consider a real symmetric or a quaternionic self-dual matrix, it can be diagonalized by orthogonal/symplectic transformation. In these cases, the Jacobian part is slightly modified,

$$Z_{\text{matrix}} = \int \mathcal{D}x \, \Delta(x)^{2\beta} \, e^{-\frac{1}{\hbar}\sum_{i=1}^{N} V(x_i)} \tag{38}$$

The power of the Vandermonde determinant is given by $\beta = \frac{1}{2}, 1, 2$ for symmetric, hermitian and self-dual matrices, respectively.<sup>1</sup> They correspond to the orthogonal, unitary and symplectic ensembles in random matrix theory, and the model with a generic $\beta \in \mathbf{R}$ is called the *β-ensemble matrix model*.

<sup>1</sup> This notation is different from the standard one: 2*<sup>β</sup>* <sup>→</sup> *<sup>β</sup>* <sup>=</sup> 1, 2, 4 for symmetric, hermitian and self-dual matrices.
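For $N = 2$ with a Gaussian potential, the $\beta$-ensemble integral (38) can be evaluated in closed form by Mehta's integral, $\int \prod_i dx_i\, e^{-x_i^2/2} \prod_{i<j}|x_i - x_j|^{2\beta} = (2\pi)^{N/2}\prod_{j=1}^{N}\Gamma(1+j\beta)/\Gamma(1+\beta)$, which a crude quadrature reproduces. The Gaussian potential, $\hbar = 1$, and the grid parameters are illustrative choices of ours, not from the text:

```python
import math

def z_beta_numeric(beta, L=8.0, n=400):
    """Midpoint-rule estimate of eq. (38) for N = 2, V(x) = x^2/2, hbar = 1."""
    h = 2.0 * L / n
    xs = [-L + (k + 0.5) * h for k in range(n)]
    total = 0.0
    for x1 in xs:
        w1 = math.exp(-0.5 * x1 * x1)
        for x2 in xs:
            total += abs(x1 - x2) ** (2 * beta) * w1 * math.exp(-0.5 * x2 * x2)
    return total * h * h

def z_beta_exact(beta):
    """Mehta's integral, specialized to N = 2."""
    return 2.0 * math.pi * math.gamma(1 + 2 * beta) / math.gamma(1 + beta)
```

The three classical ensembles correspond to $\beta = \frac{1}{2}, 1, 2$, and the comparison works equally well for any real $\beta > 0$.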

**Figure 4.** Shape of Young diagram can be represented by introducing one-dimensional exclusive particles. Positions of particles would be interpreted as eigenvalues of the matrix.

#### **3.2.** U(1) **partition function**

We would like to show an essential connection between the combinatorial partition function and the matrix model. By considering the thermodynamical limit of the partition function, it can be represented as the matrix integral discussed above.

Let us start with the most fundamental partition function (7). The main part of this partition function is the product over all boxes in the partition $\lambda$. After some calculations, we can show this combinatorial factor is rewritten as

$$\prod_{(i,j)\in\lambda} \frac{1}{h(i,j)} = \prod_{i<j}^{N} (\lambda_i - \lambda_j + j - i) \prod_{i=1}^{N} \frac{1}{\Gamma(\lambda_i + N - i + 1)} \tag{39}$$

where $N$ is an arbitrary integer satisfying $N > \ell(\lambda)$. This can also be represented in an infinite product form,

$$\prod_{(i,j)\in\lambda} \frac{1}{h(i,j)} = \prod_{i<j}^{\infty} \frac{\lambda_i - \lambda_j + j - i}{j - i} \tag{40}$$

These expressions correspond to an embedding of the finite dimensional symmetric group S*<sup>N</sup>* into the infinite dimensional one S∞.
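Identity (39) is easy to verify in exact arithmetic for small partitions, using $\Gamma(m+1) = m!$; the sketch below is our own check, not part of the chapter:

```python
import math
from fractions import Fraction

def conjugate(lam):
    """Transpose of the Young diagram."""
    return [sum(1 for p in lam if p >= j) for j in range(1, lam[0] + 1)] if lam else []

def inv_hook_product(lam):
    """Left-hand side of eq. (39): product of 1/h(i,j) over the boxes of lambda."""
    ck = conjugate(lam)
    w = Fraction(1)
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            w *= Fraction(1, (row - j) + (ck[j - 1] - i) + 1)
    return w

def rhs_eq39(lam, N):
    """Right-hand side of eq. (39); lambda is padded with zeros to length N."""
    l = list(lam) + [0] * (N - len(lam))
    r = Fraction(1)
    for i in range(N):
        for j in range(i + 1, N):
            r *= l[i] - l[j] + j - i                         # lambda_i - lambda_j + j - i
        r *= Fraction(1, math.factorial(l[i] + N - i - 1))   # 1/Gamma(lambda_i + N - i + 1)
    return r
```

The right-hand side is independent of the auxiliary size $N$ as long as $N \ge \ell(\lambda)$, in line with the embedding $\mathfrak{S}_N \hookrightarrow \mathfrak{S}_\infty$ mentioned above.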

By introducing a new set of variables $\xi_i = \lambda_i + N - i + 1$, we have another representation of the partition function,

$$Z_{\mathrm{U}(1)} = \sum_{\lambda} \left(\frac{\Lambda}{\hbar}\right)^{2\sum_{i=1}^{N}\xi_i - N(N+1)} \prod_{i<j}^{N} (\xi_i - \xi_j)^2 \prod_{i=1}^{N} \frac{1}{\Gamma(\xi_i)^2} \tag{41}$$

These new variables satisfy $\xi_1 > \xi_2 > \cdots > \xi_{\ell(\lambda)}$ while the original ones satisfy $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{\ell(\lambda)}$. This means $\{\xi_i\}$ and $\{\lambda_i\}$ are interpreted as fermionic and bosonic degrees of freedom, respectively. Fig. 4 shows the correspondence between the bosonic and fermionic variables. The bosonic excitation is regarded as a density fluctuation of the fermionic particles around the Fermi energy. This is just the bosonization method, which is often used to study quantum one-dimensional systems (for example, see [24]). In particular, we concentrate on only one of the Fermi points. Thus it yields the chiral conformal field theory.

We would like to show that the matrix integral form is obtained from the expression (41). First we rewrite the summation over partitions as

$$\sum_{\lambda} = \sum_{\lambda_1 \ge \cdots \ge \lambda_N} = \sum_{\xi_1 > \cdots > \xi_N} = \frac{1}{N!} \sum_{\xi_1, \cdots, \xi_N} \tag{42}$$

Then, introducing another variable defined as $x_i = \hbar \xi_i$, it can be regarded as a continuous variable in the large $N$ limit,

$$N \longrightarrow \infty, \qquad \hbar \longrightarrow 0, \qquad \hbar N = \mathcal{O}(1) \tag{43}$$

This is called the 't Hooft limit. The measure for this variable is given by


$$d\mathbf{x}\_i \approx \hbar \sim \frac{1}{N} \tag{44}$$

Therefore the partition function (41) is rewritten as the following matrix integral,

$$Z\_{\mathbb{U}(1)} \approx \int \mathcal{D}\mathbf{x} \,\Delta(\mathbf{x})^2 \, e^{-\frac{1}{\hbar} \sum\_{l=1}^{N} V(\mathbf{x}\_l)} \tag{45}$$

Here the matrix potential is derived from the asymptotic behavior of the Γ-function,

$$
\hbar \log \Gamma(\mathbf{x}/\hbar) \longrightarrow \mathbf{x} \log \mathbf{x} - \mathbf{x}, \qquad \hbar \longrightarrow \mathbf{0} \tag{46}
$$
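Numerically, the limit (46) emerges after subtracting the divergent piece $x \log(1/\hbar)$, which is the part combined with the fugacity factor $(\Lambda/\hbar)^{2\sum_i \xi_i}$ of (41); a quick check of this finite part with `math.lgamma` (the sample value $x = 2$ is arbitrary):

```python
import math

def stirling_limit(x, hbar):
    """hbar * log Gamma(x/hbar) + x * log(hbar): the finite part of eq. (46)."""
    return hbar * math.lgamma(x / hbar) + x * math.log(hbar)

x = 2.0
target = x * math.log(x) - x   # the claimed hbar -> 0 limit, x log x - x
errs = [abs(stirling_limit(x, h) - target) for h in (1e-2, 1e-3, 1e-4)]
```

The deviation shrinks roughly linearly in $\hbar$ (up to logarithms), consistent with Stirling's expansion of the $\Gamma$-function.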

Since this variable can take a negative value, the potential term should be simply extended to the region of *x* < 0. Thus, taking into account the fugacity parameter Λ, the matrix potential is given by

$$V(\mathbf{x}) = 2\left[\mathbf{x}\log\left|\frac{\mathbf{x}}{\Lambda}\right| - \mathbf{x}\right] \tag{47}$$

This is the simplest version of the **CP**<sup>1</sup> matrix model [18]. If we start with the partition function including the higher Casimir operators (8), the associated integral expression just yields the **CP**<sup>1</sup> matrix model.

Let us comment on other possibilities to obtain the matrix model. It is shown that the matrix integral form can be derived without taking the large *N* limit [19]. Anyway one can see that it is reduced to the model we discussed above in the large *N* limit. There is another kind of the matrix model derived from the combinatorial partition function by *poissonizing* the probability measure. In this case, only the linear potential is arising in the matrix potential term. Such a matrix model is called Bessel-type matrix model, where its short range fluctuation is described by the Bessel kernel.

Next we shall derive the matrix model corresponding to the *β*-deformed U(1) model (13). The combinatorial part of the partition function is similarly given by

$$\prod_{(i,j)\in\lambda} \frac{1}{h_\beta(i,j)\,h^\beta(i,j)} = \Gamma(\beta)^N \prod_{i<j}^{N} \frac{\Gamma(\lambda_i - \lambda_j + \beta(j-i) + \beta)\,\Gamma(\lambda_i - \lambda_j + \beta(j-i) + 1)}{\Gamma(\lambda_i - \lambda_j + \beta(j-i))\,\Gamma(\lambda_i - \lambda_j + \beta(j-i) + 1 - \beta)} \times \prod_{i=1}^{N} \frac{1}{\Gamma(\lambda_i + \beta(N-i) + \beta)} \frac{1}{\Gamma(\lambda_i + \beta(N-i) + 1)} \tag{48}$$


In this case we shall introduce the following variables, $\xi_i^{(\beta)} = \lambda_i + \beta(N-i) + 1$ or $\xi_i^{(\beta)} = \lambda_i + \beta(N-i) + \beta$, satisfying $\xi_i^{(\beta)} - \xi_{i+1}^{(\beta)} \ge \beta$. This means the parameter $\beta$ characterizes how exclusive they are. They satisfy the generalized fractional exclusive statistics for $\beta \neq 1$ [25] (see also [32]). They are reduced to fermions and bosons for $\beta = 1$ and $\beta = 0$, respectively. Then, rescaling the variables, $x_i = \hbar \xi_i^{(\beta)}$, the combinatorial part (48) in the 't Hooft limit yields

$$\prod\_{(i,j)\in\lambda} \frac{1}{h\_{\beta}(i,j)h^{\beta}(i,j)} \longrightarrow \Delta(\mathbf{x})^{2\beta} \ e^{-\frac{1}{\hbar} \sum\_{i=1}^{N} V(\mathbf{x}\_i)}\tag{49}$$

Here we use $\Gamma(\alpha+\beta)/\Gamma(\alpha) \sim \alpha^\beta$ as $\alpha \to \infty$. The matrix potential obtained here is the same as (47). Therefore the matrix model associated with the $\beta$-deformed partition function is given by

$$Z\_{\mathbf{U}(1)}^{(\beta)} \approx \int \mathcal{D}\mathbf{x} \,\Delta(\mathbf{x})^{2\beta} \, e^{-\frac{1}{\hbar} \sum\_{l=1}^{N} V(\mathbf{x}\_l)} \tag{50}$$

This is just the *β*-ensemble matrix model shown in (38).
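Identity (48) itself can be tested numerically; the sketch below assumes the $\beta$-deformed hook lengths $h_\beta(i,j) = a(i,j) + \beta l(i,j) + 1$ and $h^\beta(i,j) = a(i,j) + \beta l(i,j) + \beta$ (both reduce to the ordinary hook length at $\beta = 1$); the choice $\beta = 0.7$ is an arbitrary sample value:

```python
import math

def conjugate(lam):
    """Transpose of the Young diagram."""
    return [sum(1 for p in lam if p >= j) for j in range(1, lam[0] + 1)] if lam else []

def lhs_eq48(lam, beta):
    """Product over boxes of 1 / (h_beta(i,j) * h^beta(i,j))."""
    ck = conjugate(lam)
    v = 1.0
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            arm, leg = row - j, ck[j - 1] - i
            v /= (arm + beta * leg + 1) * (arm + beta * leg + beta)
    return v

def rhs_eq48(lam, beta, N):
    """Right-hand side of eq. (48), with lambda padded by zeros to length N."""
    g = math.gamma
    l = list(lam) + [0] * (N - len(lam))
    v = g(beta) ** N
    for i in range(N):
        for j in range(i + 1, N):
            d = l[i] - l[j] + beta * (j - i)
            v *= g(d + beta) * g(d + 1) / (g(d) * g(d + 1 - beta))
    for i in range(N):
        v /= g(l[i] + beta * (N - 1 - i) + beta) * g(l[i] + beta * (N - 1 - i) + 1)
    return v
```

As with (39), the right-hand side does not depend on the auxiliary size $N \ge \ell(\lambda)$.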

We can consider the matrix model description of the (*q*, *t*)-deformed partition function. In this case the combinatorial part of (15) is written as

$$\prod_{(i,j)\in\lambda} \frac{1-q}{1-q^{a(i,j)+1}t^{l(i,j)}} = (1-q)^{|\lambda|} \prod_{i<j}^{N} \frac{(q^{\lambda_i-\lambda_j+1}t^{j-i-1};q)_\infty}{(q^{\lambda_i-\lambda_j+1}t^{j-i};q)_\infty} \prod_{i=1}^{N} \frac{(q^{\lambda_i+1}t^{N-i};q)_\infty}{(q;q)_\infty} \tag{51}$$

$$\prod_{(i,j)\in\lambda} \frac{1-q^{-1}}{1-q^{-a(i,j)}t^{-l(i,j)-1}} = (1-q^{-1})^{|\lambda|} \prod_{i<j}^{N} \frac{(q^{-\lambda_i+\lambda_j+1}t^{-j+i-1};q)_\infty}{(q^{-\lambda_i+\lambda_j+1}t^{-j+i};q)_\infty} \prod_{i=1}^{N} \frac{(qt^{-1};q)_\infty}{(q^{-\lambda_i+1}t^{-N+i-1};q)_\infty} \tag{52}$$

Here $(x;q)_n = \prod_{m=0}^{n-1}(1 - xq^m)$ is the $q$-Pochhammer symbol. When we parametrize $q = e^{-\hbar R}$ and $t = q^\beta$, a set of the variables $\{\xi_i^{(\beta)}\}$ plays an important role in considering the large $N$ limit, as well as in the $\beta$-deformed model. Thus, rescaling these as $x_i = \hbar \xi_i^{(\beta)}$ and taking the 't Hooft limit, we obtain the integral expression of the $q$-deformed partition function,

$$Z\_{\mathbf{U}(1)}^{(q,t)} \approx \int \mathcal{D}\mathbf{x} \ (\Delta\_{\mathbf{R}}(\mathbf{x}))^{2\beta} \ e^{-\frac{1}{\hbar} \sum\_{i=1}^{N} V\_{\mathbf{R}}(\mathbf{x}\_i)} \tag{53}$$

The matrix measure and potential are given by

$$\Delta_R(x) = \prod_{i<j}^{N} \frac{2}{R} \sinh \frac{R}{2}(x_i - x_j) \tag{54}$$

$$V\_R(\mathbf{x}) = -\frac{1}{R} \left[ \text{Li}\_2 \left( e^{R\mathbf{x}} \right) - \text{Li}\_2 \left( e^{-R\mathbf{x}} \right) \right] \tag{55}$$

We will discuss how to obtain these expressions below. We can see they are reduced to the standard ones in the limit of *R* → 0,

$$
\Delta\_R(\mathbf{x}) \longrightarrow \Delta(\mathbf{x}), \qquad V\_R(\mathbf{x}) \longrightarrow V(\mathbf{x}) \tag{56}
$$

Note that this hyperbolic-type matrix measure is also investigated in the Chern-Simons matrix model [35], which is extensively involved with the recent progress on the three dimensional supersymmetric gauge theory via the localization method [36].
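The $q$-Pochhammer identity (51) can be verified directly with truncated infinite products; the truncation depth and the sample values $q = 0.4$, $t = q^{0.6}$ below are arbitrary choices for illustration:

```python
def qpoch(x, q, terms=300):
    """Truncated q-Pochhammer symbol (x; q)_infinity, valid for |q| < 1."""
    p = 1.0
    for m in range(terms):
        p *= 1.0 - x * q ** m
    return p

def conjugate(lam):
    """Transpose of the Young diagram."""
    return [sum(1 for p in lam if p >= j) for j in range(1, lam[0] + 1)] if lam else []

def lhs_eq51(lam, q, t):
    """Product over boxes of (1 - q) / (1 - q^{a+1} t^l)."""
    ck = conjugate(lam)
    v = 1.0
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            arm, leg = row - j, ck[j - 1] - i
            v *= (1.0 - q) / (1.0 - q ** (arm + 1) * t ** leg)
    return v

def rhs_eq51(lam, q, t, N):
    """Right-hand side of eq. (51), lambda padded by zeros to length N."""
    l = list(lam) + [0] * (N - len(lam))
    v = (1.0 - q) ** sum(lam)
    for i in range(N):
        for j in range(i + 1, N):
            v *= qpoch(q ** (l[i] - l[j] + 1) * t ** (j - i - 1), q)
            v /= qpoch(q ** (l[i] - l[j] + 1) * t ** (j - i), q)
    for i in range(N):
        v *= qpoch(q ** (l[i] + 1) * t ** (N - 1 - i), q) / qpoch(q, q)
    return v
```

At $t = q$ (i.e. $\beta = 1$) the box factor reduces to $(1-q)/(1-q^{h(i,j)})$, the $q$-deformation of the hook-length weight.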

Let us comment on useful formulas to derive the integral expression (53). The measure part is relevant to the asymptotic form of the following function,

$$\frac{(x;q)_\infty}{(tx;q)_\infty} \longrightarrow (1-x)^\beta, \qquad q \longrightarrow 1 \tag{57}$$

This essentially corresponds to the $q \to 1$ limit of the $q$-Vandermonde determinant<sup>2</sup>,

$$\Delta\_{q,t}^2(\mathbf{x}) = \prod\_{i \neq j}^N \frac{(\mathbf{x}\_i/\mathbf{x}\_j; q)\_{\infty}}{(t\mathbf{x}\_i/\mathbf{x}\_j; q)\_{\infty}} \tag{58}$$

Then, to investigate the matrix potential term, we now introduce the quantum dilogarithm function,

$$g(\mathbf{x};q) = \prod\_{n=1}^{\infty} \left(1 - \frac{1}{\mathbf{x}} q^n\right) \tag{59}$$

Its asymptotic expansion is given by (see, e.g. [19])

$$\log g(\mathbf{x}; q = e^{-\hbar R}) = -\frac{1}{\hbar R} \sum\_{m=0}^{\infty} \text{Li}\_{2-m} \left( \mathbf{x}^{-1} \right) \frac{B\_{m}}{m!} (\hbar R)^{m} \tag{60}$$

where $B_m$ is the $m$-th Bernoulli number, and $\mathrm{Li}_m(x) = \sum_{k=1}^{\infty} x^k/k^m$ is the polylogarithm function. The potential term comes from the leading term of this expression.
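The leading ($m = 0$, $B_0 = 1$) term of the expansion (60) can be checked against a direct evaluation of (59); the sample values $R = 1$, $x = 2$ are our own:

```python
import math

def log_qdilog(x, q, terms=5000):
    """log g(x; q) from eq. (59), truncated; valid for 0 < q < 1."""
    return sum(math.log(1.0 - q ** n / x) for n in range(1, terms + 1))

def li2(z, terms=200):
    """Dilogarithm Li_2(z) as a power series."""
    return sum(z ** k / k ** 2 for k in range(1, terms + 1))

R, x = 1.0, 2.0
rel_errors = []
for hbar in (0.05, 0.01):
    q = math.exp(-hbar * R)
    leading = -li2(1.0 / x) / (hbar * R)   # m = 0 term of eq. (60)
    rel_errors.append(abs(log_qdilog(x, q) / leading - 1.0))
```

The relative deviation is of order $\hbar R$, as expected from the $m = 1$ correction proportional to $B_1 \mathrm{Li}_1(x^{-1})$.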

#### **3.3.** SU(*n*) **partition function**


Generalizing the result shown in section 3.2, we deal with the combinatorial partition function for $\mathrm{SU}(n)$ gauge theory (32). Its matrix model description is developed in [30].

The combinatorial factor of the SU(*n*) partition function (33) can be represented as

$$Z_{\vec\lambda} = \frac{1}{\varepsilon_2^{2n|\vec\lambda|}} \prod_{(l,i)\neq(m,j)} \frac{\Gamma(\lambda_i^{(l)} - \lambda_j^{(m)} + \beta(j-i) + b_{lm} + \beta)}{\Gamma(\lambda_i^{(l)} - \lambda_j^{(m)} + \beta(j-i) + b_{lm})} \frac{\Gamma(\beta(j-i) + b_{lm})}{\Gamma(\beta(j-i) + b_{lm} + \beta)} \tag{61}$$

where we define the parameters $\beta = -\varepsilon_1/\varepsilon_2$ and $b_{lm} = a_{lm}/\varepsilon_2$. This is an infinite product expression of the partition function. In this case one can see it is useful to introduce $n$ kinds of fermionic variables, corresponding to the $n$-tuple partition,

$$\xi_i^{(l)} = \lambda_i^{(l)} + \beta(N - i) + 1 + b_l \tag{62}$$

<sup>2</sup> This expression is up to logarithmic term, which can be regarded as the zero mode contribution of the free boson field. See [28, 29] for details.

**Figure 5.** The decomposition of the partition for **Z***r*=3. First suppose the standard correspondence between the one-dimensional particles and the original partition, and then rearrange them with respect to mod *r*.

Then, assuming $b_{lm} \gg 1$, let us introduce a set of variables,

$$(\zeta_1, \zeta_2, \cdots, \zeta_{nN}) = (\xi_1^{(n)}, \cdots, \xi_N^{(n)}, \xi_1^{(n-1)}, \cdots, \cdots, \xi_N^{(2)}, \xi_1^{(1)}, \cdots, \xi_N^{(1)}) \tag{63}$$

satisfying $\zeta_1 > \zeta_2 > \cdots > \zeta_{nN}$. The combinatorial factor (61) is rewritten with these variables as

$$Z_{\vec\lambda} = \frac{1}{\varepsilon_2^{2n|\vec\lambda|}} \prod_{i<j}^{nN} \frac{\Gamma(\zeta_i - \zeta_j + \beta)}{\Gamma(\zeta_i - \zeta_j)} \prod_{i=1}^{nN} \prod_{l=1}^{n} \frac{\Gamma(-\zeta_i + b_l + 1)}{\Gamma(\zeta_i - b_l - 1 + \beta)} \tag{64}$$

From this expression we can obtain the matrix model description of the $\mathrm{SU}(n)$ gauge theory partition function, by rescaling $x_i = \hbar \zeta_i$ with the reparametrization $\hbar = \varepsilon_2$,

$$Z\_{\rm SU(n)} \approx \int \mathcal{D}\mathbf{x} \,\Delta(\mathbf{x})^{2\mathcal{B}} \, e^{-\frac{1}{\hbar} \sum\_{l=1}^{nN} V\_{\rm SU(n)}(\mathbf{x}\_l)} \,\tag{65}$$

In this case the matrix potential is given by

$$V\_{\rm SU(n)}(\mathbf{x}) = 2 \sum\_{l=1}^{n} \left[ (\mathbf{x} - a\_l) \log \left| \frac{\mathbf{x} - a\_l}{\Lambda} \right| - (\mathbf{x} - a\_l) \right] \tag{66}$$

Note that this matrix model is regarded as the U(1) matrix model with external fields *al*. We will discuss how to extract the gauge theory consequences from this matrix model in section 4.
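The interleaving (62)-(63) can be made concrete: with well separated offsets $b_l$ (the $b_{lm} \gg 1$ assumption above), the concatenated $\zeta$ variables are automatically strictly decreasing. The sample partitions, offsets, and size $N$ below are our own illustrations:

```python
def zeta_variables(partitions, b, beta=1.0, N=4):
    """Build xi^{(l)}_i = lambda^{(l)}_i + beta*(N - i) + 1 + b_l  (eq. (62))
    and concatenate in the order xi^{(n)}, xi^{(n-1)}, ..., xi^{(1)}  (eq. (63))."""
    zeta = []
    for l in reversed(range(len(partitions))):   # l = n down to 1
        lam = list(partitions[l]) + [0] * (N - len(partitions[l]))
        for i in range(1, N + 1):
            zeta.append(lam[i - 1] + beta * (N - i) + 1 + b[l])
    return zeta

# two partitions (n = 2) with a large separation b_2 - b_1 = 50
zeta = zeta_variables([(3, 1), (2,)], b=[0.0, 50.0])
```

Each block of $N$ entries is strictly decreasing by construction, and the offsets keep the blocks from overlapping, so the whole sequence behaves as a single set of $nN$ free fermions.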

#### **3.4. Orbifold partition function**

The matrix model description for the random partition model is also possible for the orbifold theory. We would like to derive another kind of the matrix model from the combinatorial orbifold partition function (16). We now concentrate on the U(1) orbifold partition function for simplicity. See [27, 28] for details of the SU(*n*) theory.

To obtain the matrix integral representation of the combinatorial partition function, we have to find the associated one-dimensional particle description of the combinatorial factor. In this case, although the combinatorial weight itself is the same as in the standard $\mathrm{U}(1)$ model, there is a restriction on its product region. Thus it is useful to introduce another basis obtained by dividing the partition as follows,

$$\left\{ r\left(\lambda\_i^{(u)} + N^{(u)} - i\right) + u \Big| i = 1, \dots, N^{(u)}, u = 0, \dots, r - 1 \right\} = \{\lambda\_i + N - i | i = 1, \dots, N\} \tag{67}$$

Fig. 5 shows the meaning of this procedure graphically. We now assume $N^{(u)} = N$ for all $u$. With these one-dimensional particles, we now utilize the relation between the orbifold partition function and the $q$-deformed model as discussed in section 2.1. The calculation is quite straightforward, but a little bit complicated. See [27, 28] for details.
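A small sketch of the decomposition (67); the partition, rank $r$, and size $N$ are chosen arbitrarily, and since the residue classes of a generic $\lambda$ need not contain equally many particles, we keep the sizes $N^{(u)}$ variable here rather than imposing $N^{(u)} = N$:

```python
def decompose_mod_r(lam, N, r):
    """Split the particle positions lambda_i + N - i by residue mod r, eq. (67),
    and recover the sub-partition lambda^{(u)} of each residue class u."""
    l = list(lam) + [0] * (N - len(lam))
    positions = [l[i] + N - (i + 1) for i in range(N)]
    result = {}
    for u in range(r):
        ps = sorted((p for p in positions if p % r == u), reverse=True)
        Nu = len(ps)
        # invert p = r * (lambda^{(u)}_i + Nu - i) + u   for i = 1, ..., Nu
        result[u] = [(p - u) // r - Nu + i for i, p in enumerate(ps, start=1)]
    return result

parts = decompose_mod_r([5, 3, 3, 1], N=6, r=3)
```

Each recovered $\lambda^{(u)}$ is again a partition (non-increasing and nonnegative), which is the content of the set equality (67).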

After some computations, we finally obtain the matrix model for the *β*-deformed orbifold partition function,

$$Z^{(\beta)}_{\text{orbifold},\mathrm{U}(1)} \approx \int \mathcal{D}\vec{X} \left(\Delta^{(\beta)}_{\text{orb}}(x)\right)^2 e^{-\frac{1}{\hbar}\sum_{u=0}^{r-1}\sum_{i=1}^{N} V(x_i^{(u)})} \tag{68}$$

In this case, we have a multi-matrix integral representation, since we introduce *r* kinds of partitions from the original partition. The matrix measure and the matrix potential are given as follows,

$$\mathcal{D}\vec{\mathcal{X}} = \prod\_{u=0}^{r-1} \prod\_{i=1}^{N} d\boldsymbol{x}\_i^{(u)} \tag{69}$$

$$\left(\Delta^{(\beta)}_{\text{orb}}(x)\right)^2 = \prod_{u=0}^{r-1} \prod_{i<j}^{N} \left(x_i^{(u)} - x_j^{(u)}\right)^2 \prod_{u<v}^{r-1} \prod_{i,j}^{N} \left(x_i^{(u)} - x_j^{(v)}\right)^{2(\beta-1)/r} \tag{70}$$

$$V(\mathbf{x}) = \frac{2}{r} \left[ \mathbf{x} \log \left| \frac{\mathbf{x}}{\Lambda} \right| - \mathbf{x} \right] \tag{71}$$

The matrix measure consists of two parts: the interaction between eigenvalues from the same matrix and that between eigenvalues from different matrices. Note that in the case of $\beta = 1$, because the interaction part in the matrix measure between different matrices vanishes, this multi-matrix model is simply reduced to the one-matrix model.

#### **4. Large** *N* **analysis**


One of the most important aspects of the matrix model is the universality arising in the large *N* limit. The universality class described by the matrix model covers a huge variety of statistical models, characterized in particular by its fluctuation properties rather than by the eigenvalue density function. In the large *N* limit, which can be regarded as a justification for applying a kind of mean field approximation, the analysis of the matrix model reduces to a saddle point equation and a simple fluctuation around it.

#### **4.1. Saddle point equation and spectral curve**

Let us first define the prepotential, which is also interpreted as the effective action for the eigenvalues, from the matrix integral representation (37),

$$-\frac{1}{\hbar^2}\mathcal{F}(\{\mathbf{x}\_i\}) = -\frac{1}{\hbar}\sum\_{i=1}^N V(\mathbf{x}\_i) + 2\sum\_{i<j}^{N} \log(\mathbf{x}\_i - \mathbf{x}\_j) \tag{72}$$

This is essentially the genus zero part of the prepotential. In the large *N* limit, in particular the 't Hooft limit (43) with $N\hbar \equiv t$, we shall investigate the saddle point equation for the matrix integral. We can obtain the condition for criticality by differentiating the prepotential,

$$V'(\mathbf{x}\_i) = 2\hbar \sum\_{j(\neq i)}^{N} \frac{1}{\mathbf{x}\_i - \mathbf{x}\_j}, \qquad \text{for all } i \tag{73}$$

This is also given by the extremal condition of the effective potential defined as

$$V\_{\rm eff}(\mathbf{x}\_{i}) = V(\mathbf{x}\_{i}) - 2\hbar \sum\_{j(\neq i)}^{N} \log(\mathbf{x}\_{i} - \mathbf{x}\_{j}) \tag{74}$$

This potential involves a logarithmic Coulomb repulsion between eigenvalues. If the 't Hooft coupling is small, the potential term dominates the Coulomb interaction and the eigenvalues concentrate at the extrema of the potential, $V'(\mathbf{x}) = 0$. On the other hand, as the coupling gets bigger, the eigenvalue distribution spreads out.
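To see this competition concretely, here is a small numerical sketch (ours, not from the chapter): plain gradient descent on the effective potential (74) for the Gaussian potential $V(x) = x^2/2$, relaxing $N$ eigenvalues to the saddle point (73). The values of $N$, the step size, and the iteration count are ad hoc choices.

```python
import numpy as np

# Relax N eigenvalues to the saddle point of the effective potential (74)
# for V(x) = x**2 / 2, i.e. V'(x_i) = 2*hbar * sum_{j != i} 1/(x_i - x_j).
N, t = 8, 1.0
hbar = t / N                      # 't Hooft scaling: N * hbar = t

def force(x):
    d = x[:, None] - x[None, :]   # pairwise differences x_i - x_j
    np.fill_diagonal(d, np.inf)   # exclude the j = i term
    return -x + 2.0 * hbar * (1.0 / d).sum(axis=1)   # -dV_eff/dx_i

x = np.linspace(-1.0, 1.0, N)     # ordered start: no coincident eigenvalues
for _ in range(20000):
    x += 0.02 * force(x)          # small-step gradient descent

residual = np.abs(force(x)).max()
# The eigenvalues spread over an interval inside [-2, 2] (the support of the
# semi-circle law for t = 1) instead of collapsing to the minimum at x = 0.
```

Even though $V(x) = x^2/2$ has a single minimum, the relaxed configuration occupies a finite interval, which is exactly the one-cut situation described below.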

To deal with such a situation, we now define the density of eigenvalues,

$$\rho(\mathbf{x}) = \frac{1}{N} \sum\_{i=1}^{N} \delta(\mathbf{x} - \mathbf{x}\_i) \tag{75}$$

where *xi* is the solution of the criticality condition (73). In the large *N* limit, it is natural to think that this eigenvalue distribution is smeared and becomes a continuous function. Furthermore, we assume the eigenvalues are distributed around the critical points of the potential *V*(*x*) as linear segments. Thus we generically denote the *l*-th segment for *ρ*(*x*) as C*l*, and the total number of eigenvalues *N* splits into *n* integers for these segments,

$$N = \sum\_{l=1}^{n} N\_l \tag{76}$$

where *Nl* is the number of eigenvalues in the interval C*l*. The density of eigenvalues *ρ*(*x*) takes non-zero value only on the segment C*l*, and is normalized as

$$\int\_{\mathcal{C}\_l} d\mathbf{x} \,\rho(\mathbf{x}) = \frac{N\_l}{N} \equiv \nu\_l \tag{77}$$

where $\nu_l$ is called the *filling fraction*. According to these fractions, we can introduce the partial 't Hooft parameters, $t_l = N_l \hbar$. Note there are *n* 't Hooft couplings and filling fractions, but only $n-1$ fractions are independent, since they have to satisfy $\sum_{l=1}^{n} \nu_l = 1$, while all the 't Hooft couplings are independent.

We then introduce the resolvent for this model as an auxiliary function, a kind of Green function. By taking the large *N* limit, it can be given by the integral representation,


$$
\omega(\mathbf{x}) = t \int dy \, \frac{\rho(y)}{\mathbf{x} - y} \tag{78}
$$

This means that the density of states is regarded as the Hilbert transformation of this resolvent function. Indeed the density of states is associated with the discontinuities of the resolvent,

$$\rho(\mathbf{x}) = -\frac{1}{2\pi it} \left( \omega(\mathbf{x} + i\boldsymbol{\epsilon}) - \omega(\mathbf{x} - i\boldsymbol{\epsilon}) \right) \tag{79}$$

Thus, instead of the density of states, all we have to do is determine the resolvent, subject to the asymptotic behavior,

$$
\omega(\mathbf{x}) \longrightarrow \frac{1}{\mathbf{x}}, \qquad \mathbf{x} \longrightarrow \infty \tag{80}
$$
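As a concrete illustration (our example, not from the chapter), the Gaussian model $V(x) = x^2/2$ with $t = 1$ has the well-known resolvent $\omega(x) = (x - \sqrt{x^2-4})/2$ and the semi-circle density; a short numerical check confirms the integral representation (78), the discontinuity relation (79), and the asymptotics (80). Writing the square root as $\sqrt{x-2}\sqrt{x+2}$ places the branch cut exactly on $[-2, 2]$.

```python
import numpy as np

# Resolvent of the Gaussian one-matrix model with t = 1; the factored square
# root puts the branch cut of omega on the eigenvalue support [-2, 2].
def omega(z):
    z = np.asarray(z, dtype=complex)
    return (z - np.sqrt(z - 2.0) * np.sqrt(z + 2.0)) / 2.0

# (78): omega is the Stieltjes transform of rho(y) = sqrt(4 - y**2)/(2*pi).
y = np.linspace(-2.0, 2.0, 400001)
f = np.sqrt(4.0 - y**2) / (2.0 * np.pi) / (3.0 - y)
stieltjes_at_3 = np.sum((f[1:] + f[:-1]) / 2.0 * np.diff(y))  # ~ omega(3)

# (79): the density is the discontinuity of omega across the cut (t = 1).
xc, eps = 1.3, 1e-8
rho_from_disc = (-(omega(xc + 1j * eps) - omega(xc - 1j * eps))
                 / (2j * np.pi)).real

# (80): omega(x) ~ 1/x far away from the cut.
tail_ratio = (1e6 * omega(1e6)).real
```

Here `stieltjes_at_3` reproduces the closed form $(3-\sqrt 5)/2$, `rho_from_disc` reproduces $\sqrt{4-x^2}/(2\pi)$ at $x = 1.3$, and `tail_ratio` approaches 1.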

Writing down the prepotential with the density of states,

$$\mathcal{F}(\{\mathbf{x}\_{i}\}) = t \int d\mathbf{x} \,\rho(\mathbf{x}) V(\mathbf{x}) - t^{2} \mathbf{P} \int d\mathbf{x} dy \,\rho(\mathbf{x}) \rho(y) \log(\mathbf{x} - y) \tag{81}$$

the criticality condition is given by

$$\frac{1}{2t}V'(\mathbf{x}) = \mathbf{P} \int dy \, \frac{\rho(y)}{\mathbf{x} - y} \tag{82}$$

Here P stands for the principal value. Thus this saddle point equation can be also written in the following convenient form to discuss its analytic property,

$$V'(\mathbf{x}) = \omega(\mathbf{x} + i\varepsilon) + \omega(\mathbf{x} - i\varepsilon) \tag{83}$$

On the other hand, we have another convenient form to treat the saddle point equation, which is called *loop equation*, given by

$$y^2(\mathbf{x}) - V'(\mathbf{x})^2 + R(\mathbf{x}) = 0 \tag{84}$$

where we denote


$$y(\mathbf{x}) = V'(\mathbf{x}) - 2\omega(\mathbf{x}) = -2\omega\_{\text{sing}}(\mathbf{x})\tag{85}$$

$$R(\mathbf{x}) = \frac{4t}{N} \sum\_{i=1}^{N} \frac{V'(\mathbf{x}) - V'(\mathbf{x}\_i)}{\mathbf{x} - \mathbf{x}\_i} \tag{86}$$

It is obtained from the saddle point equation by multiplying by $1/(\mathbf{x} - \mathbf{x}_i)$, summing over $i$, and taking the large *N* limit. This representation (84) is more appropriate for revealing its geometric meaning. Indeed this algebraic curve is interpreted as the hyperelliptic curve obtained by resolving the singular form,

$$y^2(\mathbf{x}) - V'(\mathbf{x})^2 = 0 \tag{87}$$
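As a quick sanity check (ours, not from the chapter), for the Gaussian potential the quantity $R(\mathbf{x})$ in (86) is exactly constant, $R(\mathbf{x}) = 4t$, so the loop equation (84) reproduces the one-cut curve $y^2(\mathbf{x}) = \mathbf{x}^2 - 4t$ for any eigenvalue configuration:

```python
import numpy as np

# For V(x) = x**2/2 the difference quotient (V'(x) - V'(x_i))/(x - x_i) is 1,
# so R(x) of (86) equals 4t for any eigenvalue configuration, and the loop
# equation (84) gives the one-cut curve y(x)**2 = x**2 - 4*t.
t, N = 1.0, 8
vprime = lambda z: z                                  # V'(z) for V(z) = z**2/2
xi = np.random.default_rng(0).uniform(-2.0, 2.0, N)   # arbitrary eigenvalues
x = 3.7                                               # point away from the cut

R = (4.0 * t / N) * np.sum((vprime(x) - vprime(xi)) / (x - xi))
y_squared = vprime(x)**2 - R                          # (84): y^2 = V'^2 - R
```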

The genus of the Riemann surface is directly related to the number of cuts of the corresponding resolvent. The filling fraction, or the partial 't Hooft coupling, is simply given by the contour integral on the hyperelliptic curve

$$t\_l = \frac{1}{2\pi i} \oint\_{\mathcal{C}\_l} d\mathbf{x} \,\omega\_{\text{sing}}(\mathbf{x}) = -\frac{1}{4\pi i} \oint\_{\mathcal{C}\_l} d\mathbf{x} \, y(\mathbf{x})\tag{88}$$
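For the one-cut Gaussian example with $t = 1$ (our illustration, not from the chapter), $y(\mathbf{x}) = \sqrt{\mathbf{x}-2}\sqrt{\mathbf{x}+2}$, and the contour integral in (88) around the single cut returns the full 't Hooft coupling, $t_1 = t$:

```python
import numpy as np

# Evaluate t_1 = -(1/(4*pi*i)) * oint y(x) dx  around the cut [-2, 2] of the
# Gaussian model (t = 1), on a counterclockwise circle of radius 3.
theta = np.linspace(0.0, 2.0 * np.pi, 20001)
z = 3.0 * np.exp(1j * theta)
dz_dtheta = 3j * np.exp(1j * theta)
y_vals = np.sqrt(z - 2.0) * np.sqrt(z + 2.0)     # branch cut on [-2, 2]
g = y_vals * dz_dtheta
contour = np.sum((g[1:] + g[:-1]) / 2.0) * (theta[1] - theta[0])
t1 = (-contour / (4j * np.pi)).real              # should equal t = 1
```

The product $\sqrt{z-2}\sqrt{z+2}$ is continuous along the contour even where each factor separately jumps, so a plain trapezoidal sum over the circle already gives $t_1 \approx 1$ to high accuracy.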

#### **4.2. Relation to Seiberg-Witten theory**

We now discuss the relation between the Seiberg-Witten curve and the matrix model. In the first place, the matrix model captures the asymptotic behavior of the combinatorial representation of the partition function. The energy functional derived from the asymptotics of the partition function [44], written in terms of the profile function,

$$\mathcal{E}\_{\Lambda}(f) = \frac{1}{4} \text{P} \int\_{y<\mathbf{x}} d\mathbf{x} dy \, f^{\prime\prime}(\mathbf{x}) f^{\prime\prime}(y) (\mathbf{x} - y)^2 \left( \log \left( \frac{\mathbf{x} - y}{\Lambda} \right) - \frac{3}{2} \right) \tag{89}$$

can be rewritten as

$$\mathcal{E}\_{\Lambda}(\varrho) = -\mathbb{P} \int\_{\mathbf{x} \neq \mathbf{y}} d\mathbf{x} dy \, \frac{\varrho(\mathbf{x})\varrho(\mathbf{y})}{(\mathbf{x} - \mathbf{y})^2} - 2 \int d\mathbf{x} \, \varrho(\mathbf{x}) \log \prod\_{l=1}^{N} \left(\frac{\mathbf{x} - a\_l}{\Lambda}\right) \tag{90}$$

up to the perturbative contribution

$$\frac{1}{2} \sum\_{l,m} (a\_l - a\_m)^2 \log\left(\frac{a\_l - a\_m}{\Lambda}\right) \tag{91}$$

by identifying

$$f(\mathbf{x}) - \sum\_{l=1}^{n} |\mathbf{x} - a\_l| = \varrho(\mathbf{x}) \tag{92}$$

Then integrating (90) by parts, we have

$$\mathcal{E}\_{\Lambda}(\varrho) = -\mathbb{P} \int\_{\mathbf{x} \neq \mathbf{y}} d\mathbf{x} dy \,\varrho'(\mathbf{x}) \varrho'(\mathbf{y}) \log(\mathbf{x} - \mathbf{y}) + 2 \int d\mathbf{x} \,\varrho'(\mathbf{x}) \sum\_{l=1}^{n} \left[ (\mathbf{x} - a\_{l}) \log\left(\frac{\mathbf{x} - a\_{l}}{\Lambda}\right) - (\mathbf{x} - a\_{l}) \right] \tag{93}$$

This is just the matrix model discussed in section 3.3 if we identify $\varrho'(\mathbf{x}) = \rho(\mathbf{x})$. Therefore analysis of this matrix model is equivalent to that of [**?** ]. But in this section we reconsider the result of the gauge theory from the viewpoint of the matrix model.

We can introduce a regular function on the complex plane, except at the infinity,

$$P\_n(\mathbf{x}) = \Lambda^n \left( e^{y/2} + e^{-y/2} \right) \equiv \Lambda^n \left( w + \frac{1}{w} \right) \tag{94}$$

It is because the saddle point equation (83) yields the following equation,

$$e^{y(\mathbf{x}+i\varepsilon)/2} + e^{-y(\mathbf{x}+i\varepsilon)/2} = e^{y(\mathbf{x}-i\varepsilon)/2} + e^{-y(\mathbf{x}-i\varepsilon)/2} \tag{95}$$

This entire function turns out to be a monic polynomial *Pn*(*x*) = *<sup>x</sup><sup>n</sup>* <sup>+</sup> ··· , because it is an analytic function with the following asymptotic behavior,

$$
\Lambda^n e^{y/2} = \Lambda^n e^{-\omega(\mathbf{x})} \prod\_{l=1}^n \left( \frac{\mathbf{x} - a\_l}{\Lambda} \right) \longrightarrow \mathbf{x}^n, \qquad \mathbf{x} \longrightarrow \infty \tag{96}
$$

Here *w* should be the smaller root with the boundary condition as

$$w \longrightarrow \frac{\Lambda^n}{\mathbf{x}^n}, \qquad \mathbf{x} \longrightarrow \infty \tag{97}$$

thus we now identify


$$w = e^{-y/2} \tag{98}$$

Therefore from the hyperelliptic curve (94) we can relate Seiberg-Witten curve to the spectral curve of the matrix model,

$$\begin{split} dS &= \frac{1}{2\pi i} \mathbf{x} \frac{dw}{w} \\ &= -\frac{1}{2\pi i} \log w \, d\mathbf{x} \\ &= \frac{1}{4\pi i} y(\mathbf{x}) \, d\mathbf{x} \end{split} \tag{99}$$

Note that it is shown in [37, 38] that we have to take the vanishing fraction limit to obtain the Coulomb moduli from the matrix model contour integral. This is the essential difference between the profile function method and the matrix model description.

#### **4.3. Eigenvalue distribution**

We now demonstrate that the eigenvalue distribution function is indeed derived from the spectral curve of the matrix model. The spectral curve (94) in the case of *n* = 1 with setting Λ = 1 and *Pn*=1(*x*) = *x* is written as

$$x = w + \frac{1}{w} \tag{100}$$

From this relation the singular part of the resolvent can be extracted as

$$
\omega\_{\text{sing}}(\mathbf{x}) = \arccos \left( \frac{\mathbf{x}}{2} \right) \tag{101}
$$

This has a branch cut only on *x* ∈ [−2, 2], namely a one-cut solution. Thus the eigenvalue distribution function is written as follows, at least on *x* ∈ [−2, 2],

$$\rho(\mathbf{x}) = \frac{1}{\pi} \arccos\left(\frac{\mathbf{x}}{2}\right) \tag{102}$$

Note that this function has a non-zero value at the left boundary of the cut, *ρ*(−2) = 1, while at the right boundary we have *ρ*(2) = 0; equivalently, we choose the branch of the arccos function in this way. This may seem a little strange, because the eigenvalue density has to vanish away from the cut. On the other hand, recalling the meaning of the eigenvalues, i.e. positions of the one-dimensional particles as shown in Fig. 4, this situation is quite reasonable. The region below the Fermi level is filled with particles, and thus the density has to be a non-zero constant in such a region. This is just a property of the Fermi distribution function. (The 1/*N* correction could be interpreted as a finite temperature effect.)

**Figure 6.** The eigenvalue distribution function for the U(1) model.

Therefore the total eigenvalue distribution function is given by

$$\rho(\mathbf{x}) = \begin{cases} 1 & \mathbf{x} < -2 \\ \frac{1}{\pi} \arccos\left(\frac{\mathbf{x}}{2}\right) & |\mathbf{x}| < 2 \\ 0 & \mathbf{x} > 2 \end{cases} \tag{103}$$

Remark that the eigenvalue density (103) is quite similar to Wigner's semi-circle distribution function, especially in its behavior around the edge,

$$\rho\_{\text{circ}}(\mathbf{x}) = \frac{1}{\pi} \sqrt{1 - \left(\frac{\mathbf{x}}{2}\right)^2} \longrightarrow \frac{1}{\pi} \sqrt{2 - \mathbf{x}}, \qquad \mathbf{x} \longrightarrow 2 \tag{104}$$
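Both the boundary values in (103) and the edge scaling in (104) are easy to confirm numerically (our check, not from the chapter): $\arccos(x/2)/\pi$ equals 1 at $x = -2$, vanishes at $x = 2$, and behaves like $\sqrt{2-x}/\pi$ near the right edge.

```python
import numpy as np

# Boundary values and edge behavior of the U(1) eigenvalue density (103).
rho = lambda x: np.arccos(x / 2.0) / np.pi    # the |x| < 2 branch of (103)

edge_left = rho(-2.0)     # matches the constant 1 below the band
edge_right = rho(2.0)     # matches the constant 0 above the band

# Near x = 2, arccos(x/2)/pi ~ sqrt(2 - x)/pi: the same square-root edge as
# Wigner's semi-circle (104).
xe = 2.0 - 1e-6
edge_ratio = rho(xe) / (np.sqrt(2.0 - xe) / np.pi)
```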

The fluctuation at the spectral edge of the random matrix obeys the Tracy-Widom distribution [56], thus it is natural that the edge fluctuation of the combinatorial model is also described by the Tracy-Widom distribution. This remarkable fact was actually shown by [9]. Exploiting such a similarity to the Gaussian random matrix theory, the kernel of this model is also given by the following sine kernel,

$$K(\mathbf{x}, y) = \frac{\sin \rho\_0 \pi (\mathbf{x} - y)}{\pi (\mathbf{x} - y)} \tag{105}$$

where $\rho_0$ is the averaged density of eigenvalues. This means the U(1) combinatorial model belongs to the GUE random matrix universality class [40]. Then all the correlation functions can be written as a determinant of this kernel,

$$\rho(\mathbf{x}\_1, \dots, \mathbf{x}\_k) = \det\left[\mathcal{K}(\mathbf{x}\_i, \mathbf{x}\_j)\right]\_{1 \le i, j \le k} \tag{106}$$
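A minimal numerical illustration (ours, not from the chapter) of (105)–(106): build the 2 × 2 kernel matrix and take its determinant. On the diagonal the kernel reduces to the density, $K(x,x) = \rho_0$, and the determinant vanishes as two points collide, the usual level repulsion. The value $\rho_0 = 1/2$ is an arbitrary choice.

```python
import numpy as np

# Sine kernel (105) and the determinantal 2-point function (106), rho0 = 1/2.
rho0 = 0.5

def K(x, y):
    d = x - y
    if abs(d) < 1e-12:
        return rho0                                  # sin(z)/z -> 1 limit
    return np.sin(rho0 * np.pi * d) / (np.pi * d)

x1, x2 = 0.3, 1.1
pair = np.linalg.det(np.array([[K(x1, x1), K(x1, x2)],
                               [K(x2, x1), K(x2, x2)]]))

# Level repulsion: the 2-point function vanishes at coincident points.
coincident = np.linalg.det(np.array([[K(x1, x1), K(x1, x1 + 1e-7)],
                                     [K(x1 + 1e-7, x1), K(x1 + 1e-7, x1 + 1e-7)]]))
```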

Let us then remark a relation to the profile function of the Young diagram. It was shown that the shape of the Young diagram goes to the following form in the thermodynamical limit [33, 58, 59],

$$\Omega(\mathbf{x}) = \begin{cases} \frac{2}{\pi} \left( \mathbf{x} \arcsin \frac{\mathbf{x}}{2} + \sqrt{4 - \mathbf{x}^2} \right) & |\mathbf{x}| < 2 \\ |\mathbf{x}| & |\mathbf{x}| > 2 \end{cases} \tag{107}$$

Rather than this profile function itself, the derivative of this function is more relevant to our study,

$$\Omega'(\mathbf{x}) = \begin{cases} -1 & \mathbf{x} < -2 \\ \frac{2}{\pi} \arcsin\left(\frac{\mathbf{x}}{2}\right) & |\mathbf{x}| < 2 \\ 1 & \mathbf{x} > 2 \end{cases} \tag{108}$$

One can see the eigenvalue density (103) is directly related to this derivative function (108) as

$$\rho(\mathbf{x}) = \frac{1 - \Omega'(\mathbf{x})}{2} \tag{109}$$

This relation is easily obtained from the correspondence between the Young diagram and the one-dimensional particle as shown in Fig. 4.
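The identity (109) is just $\arccos z = \pi/2 - \arcsin z$ in disguise; a quick check on the cut (our verification, not from the chapter):

```python
import numpy as np

# Verify rho(x) = (1 - Omega'(x)) / 2 on the cut, eq. (109), which follows
# from arccos(z) = pi/2 - arcsin(z).
x = np.linspace(-1.99, 1.99, 1001)
rho = np.arccos(x / 2.0) / np.pi                    # (103) for |x| < 2
omega_prime = (2.0 / np.pi) * np.arcsin(x / 2.0)    # (108) for |x| < 2
max_diff = np.abs(rho - (1.0 - omega_prime) / 2.0).max()
# Outside the cut: (1 - (-1))/2 = 1 for x < -2 and (1 - 1)/2 = 0 for x > 2,
# matching (103) as well.
```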

#### **5. Conclusion**


In this article we have investigated the combinatorial statistical model through its matrix model description. Starting from the U(1) model, which is motivated by representation theory, we have dealt with its *β*-deformation and *q*-deformation. We have shown that its non-Abelian generalization, including external field parameters, is obtained as the four dimensional supersymmetric gauge theory partition function. We have also referred to the orbifold partition function, and its relation to the *q*-deformed model through the root of unity limit.

We have then shown that the matrix integral representation is derived from such a combinatorial partition function by considering its asymptotic behavior in the large *N* limit. Due to the variety of combinatorial models, we can obtain the *β*-ensemble matrix model, the hyperbolic matrix model, and those with external fields. Furthermore, from the orbifold partition function the multi-matrix model is derived.

Based on the matrix model description, we have studied the asymptotic behavior of the combinatorial models in the large *N* limit. In this limit we can extract various important properties of the matrix model by analysing the saddle point equation. Introducing the resolvent as an auxiliary function, we have obtained the algebraic curve for the matrix model, which is called the spectral curve. We have shown it can be interpreted as the Seiberg-Witten curve, and then the eigenvalue distribution function is also obtained from this algebraic curve.

Let us comment on some possibilities of generalization and perspective. As discussed in this article we can obtain various interesting results from the Macdonald polynomial by taking the corresponding limit. It is interesting to research its matrix model consequence from the exotic limit of the Macdonald polynomial. For example, the $q \to 0$ limit of the Macdonald polynomial, which is called the Hall-Littlewood polynomial, has not been investigated with respect to its connection with the matrix model. We would also like to study properties of the $BC$-type polynomial [31], which is associated with the corresponding root system. Recalling the meaning of the $q$-deformation in terms of the gauge theory, namely lifting up to the five dimensional theory on $\mathbf{R}^4 \times S^1$ by taking into account all the Kaluza-Klein modes, it seems interesting to study the six dimensional theory on $\mathbf{R}^4 \times T^2$. In this case it is natural to obtain the elliptic generalization of the matrix model. It can no longer be interpreted as a matrix integral representation; however, the large *N* analysis could still be performed in the standard manner. We expect further development beyond this work.

#### **Author details**

Taro Kimura *Mathematical Physics Laboratory, RIKEN Nishina Center, Japan*

#### **6. References**


[1] Alday, L. F., Gaiotto, D. & Tachikawa, Y. [2010]. Liouville Correlation Functions from Four-dimensional Gauge Theories, *Lett. Math. Phys.* 91: 167–197.

[2] Atiyah, M. F., Hitchin, N. J., Drinfeld, V. G. & Manin, Y. I. [1978]. Construction of instantons, *Phys. Lett.* A65: 185–187.

[3] Baik, J., Deift, P. & Johansson, K. [1999]. On the Distribution of the Length of the Longest Increasing Subsequence of Random Permutations, *J. Amer. Math. Soc.* 12: 1119–1178.

[4] Bernevig, B. A. & Haldane, F. D. M. [2008a]. Generalized clustering conditions of Jack polynomials at negative Jack parameter *α*, *Phys. Rev.* B77: 184502.

[5] Bernevig, B. A. & Haldane, F. D. M. [2008b]. Model Fractional Quantum Hall States and Jack Polynomials, *Phys. Rev. Lett.* 100: 246802.

[6] Bernevig, B. A. & Haldane, F. D. M. [2008c]. Properties of Non-Abelian Fractional Quantum Hall States at Filling *ν* = *k*/*r*, *Phys. Rev. Lett.* 101: 246806.

[7] Bonelli, G., Maruyoshi, K., Tanzini, A. & Yagi, F. [2011]. Generalized matrix models and AGT correspondence at all genera, *JHEP* 07: 055.

[8] Borodin, A. & Corwin, I. [2011]. Macdonald processes, arXiv:1111.4408 [math.PR].

[9] Borodin, A., Okounkov, A. & Olshanski, G. [2000]. On asymptotics of the Plancherel measures for symmetric groups, *J. Amer. Math. Soc.* 13: 481–515.

[10] Calogero, F. [1969]. Ground state of one-dimensional *N* body system, *J. Math. Phys.* 10: 2197.

[11] Dijkgraaf, R. & Sułkowski, P. [2008]. Instantons on ALE spaces and orbifold partitions, *JHEP* 03: 013.

[12] Dijkgraaf, R. & Vafa, C. [2009]. Toda Theories, Matrix Models, Topological Strings, and N = 2 Gauge Systems, arXiv:0909.2453 [hep-th].

[13] Dimofte, T., Gukov, S. & Hollands, L. [2010]. Vortex Counting and Lagrangian 3-manifolds, *Lett. Math. Phys.* 98: 225–287.

[14] Dotsenko, V. S. & Fateev, V. A. [1984]. Conformal algebra and multipoint correlation functions in 2D statistical models, *Nucl. Phys.* B240: 312–348.

[15] Dotsenko, V. S. & Fateev, V. A. [1985]. Four-point correlation functions and the operator algebra in 2D conformal invariant theories with central charge *c* ≤ 1, *Nucl. Phys.* B251: 691–734.

[16] Eguchi, T. & Maruyoshi, K. [2010a]. Penner Type Matrix Model and Seiberg-Witten Theory, *JHEP* 02: 022.

[17] Eguchi, T. & Maruyoshi, K. [2010b]. Seiberg-Witten theory, matrix model and AGT relation, *JHEP* 07: 081.

[18] Eguchi, T. & Yang, S.-K. [1994]. The Topological *CP*<sup>1</sup> model and the large *N* matrix integral, *Mod. Phys. Lett.* A9: 2893–2902.

[19] Eynard, B. [2008]. All orders asymptotic expansion of large partitions, *J. Stat. Mech.* 07: P07023.

[20] Fucito, F., Morales, J. F. & Poghossian, R. [2004]. Multi instanton calculus on ALE spaces, *Nucl. Phys.* B703: 518–536.

[21] Fujimori, T., Kimura, T., Nitta, M. & Ohashi, K. [2012]. Vortex counting from field theory, arXiv:1204.1968 [hep-th].

[22] Gaiotto, D. [2009a]. Asymptotically free N = 2 theories and irregular conformal blocks, arXiv:0908.0307 [hep-th].

[23] Gaiotto, D. [2009b]. N = 2 dualities, arXiv:0904.2715 [hep-th].

[24] Giamarchi, T. [2003]. *Quantum Physics in One Dimension*, Oxford University Press.

[25] Haldane, F. D. M. [1991]. "Fractional statistics" in arbitrary dimensions: A generalization of the Pauli principle, *Phys. Rev. Lett.* 67: 937–940.

	- [49] Seiberg, N. & Witten, E. [1994b]. Monopoles, duality and chiral symmetry breaking in N = 2 supersymmetric QCD, *Nucl. Phys.* B431: 484–550.
	- [50] Shadchin, S. [2007]. On F-term contribution to effective action, *JHEP* 08: 052.
	- [51] Stanley, R. P. [2001]. *Enumerative Combinatorics: Volume 2*, Cambridge Univ. Press.
	- [52] Sułkowski, P. [2009]. Matrix models for 2∗ theories, *Phys. Rev.* D80: 086006.
	- [53] Sułkowski, P. [2010]. Matrix models for *β*-ensembles from Nekrasov partition functions, *JHEP* 04: 063.
	- [54] Sutherland, B. [1971]. Quantum many body problem in one-dimension: Ground state, *J. Math. Phys.* 12: 246.
	- [55] Taki, M. [2011]. On AGT Conjecture for Pure Super Yang-Mills and W-algebra, *JHEP* 05: 038.
	- [56] Tracy, C. & Widom, H. [1994]. Level-spacing distributions and the Airy kernel, *Commun. Math. Phys.* 159: 151–174.
	- [57] Uglov, D. [1998]. Yangian Gelfand-Zetlin bases, gl*N*-Jack polynomials and computation of dynamical correlation functions in the spin Calogero-Sutherland model, *Commun. Math. Phys.* 193: 663–696.
	- [58] Vershik, A. & Kerov, S. [1977]. Asymptotics of the Plancherel measure of the symmetric group and the limit form of Young tableaux, *Soviet Math. Dokl.* 18: 527–531.
	- [59] Vershik, A. & Kerov, S. [1985]. Asymptotic of the largest and the typical dimensions of irreducible representations of a symmetric group, *Func. Anal. Appl.* 19: 21–31.
	- [60] Witten, E. [1997]. Solutions of four-dimensional field theories via M-theory, *Nucl. Phys.* B500: 3–42.

## **Nonnegative Inverse Eigenvalue Problem**

Ricardo L. Soto



http://dx.doi.org/10.5772/48279

## **1. Introduction**

Nonnegative matrices have long been a source of interesting and challenging mathematical problems. They are real matrices with all their entries nonnegative, and they arise in a number of important application areas: communications systems, biological systems, economics, ecology, computer sciences, machine learning, and many other engineering systems. Inverse eigenvalue problems constitute an important subclass of inverse problems that arise in the context of mathematical modeling and parameter identification. A simple application of such problems is the construction of Leontief models in economics [1]-[3].

The *nonnegative inverse eigenvalue problem* (*NIEP*) is the problem of characterizing those lists Λ = {*λ*1, *λ*2, ..., *λn*} of complex numbers which can be the spectra of *n* × *n* entrywise nonnegative matrices. If there exists a nonnegative matrix *A* with spectrum Λ, we say that Λ is realized by *A* and that *A* is the realizing matrix. A set *K* of conditions is said to be a *realizability criterion* if any list Λ = {*λ*1, *λ*2, ..., *λn*}, real or complex, satisfying the conditions *K* is realizable. The *NIEP* is an open problem; a full solution is unlikely in the near future. The problem has only been solved for *n* = 3 by Loewy and London ([4], 1978), and for *n* = 4 by Meehan ([5], 1998) and Torre-Mayo et al. ([6], 2007). The case *n* = 5 has been solved for matrices of trace zero in ([7], 1999). Other results, mostly in terms of sufficient conditions for the problem to have a solution (in the case of a complex list Λ), have been obtained, in chronological order, in [8]-[13].

Two main subproblems of the *NIEP* are of great interest: the *real nonnegative inverse eigenvalue problem* (*RNIEP*), in which Λ is a list of real numbers, and the *symmetric nonnegative inverse eigenvalue problem* (*SNIEP*), in which the realizing matrix must be symmetric. Both problems, *RNIEP* and *SNIEP*, are equivalent for *n* ≤ 4 (see [14]), but they are different otherwise (see [15]). Moreover, both problems remain unsolved for *n* ≥ 5. The *NIEP* is also of interest for nonnegative matrices with a particular structure, such as stochastic and doubly stochastic, circulant, persymmetric, centrosymmetric, Hermitian, Toeplitz, etc.

The first sufficient conditions for the existence of a nonnegative matrix with a given real spectrum (*RNIEP*) were obtained by Suleimanova ([16], 1949) and Perfect ([17, 18], 1953 and 1955). Other sufficient conditions have also been obtained, in chronological order, in [19]-[26] (see also [27, 28], and references therein, for a comprehensive survey).

The first sufficient conditions for the *SNIEP* were obtained by Fiedler ([29], 1974). Other results for symmetric realizability have been obtained in [8, 30] and [31]-[33]. Recently, new sufficient conditions for the *SNIEP* have been given in [34]-[37].

#### **1.1. Necessary conditions**

Let *A* be a nonnegative matrix with spectrum Λ = {*λ*1, *λ*2, ..., *λn*}. Then, from the Perron-Frobenius theory, we have the following basic necessary conditions:

$$\begin{array}{l} \text{(1)}\ \overline{\Lambda} = \{ \overline{\lambda_1}, \dots, \overline{\lambda_n} \} = \Lambda \\ \text{(2)}\ \max_{j} \{ |\lambda_{j}| \} \in \Lambda \\ \text{(3)}\ s_m(\Lambda) = \sum_{j=1}^n \lambda_{j}^m \ge 0, \ m = 1, 2, \dots, \end{array} \tag{1}$$

where $\overline{\Lambda} = \Lambda$ means that Λ is closed under complex conjugation.

Moreover, we have

$$\begin{array}{l} \text{(4)}\ (s_k(\Lambda))^m \le n^{m-1} s_{km}(\Lambda), \ k, m = 1, 2, \dots \\ \text{(5)}\ (s_2(\Lambda))^2 \le (n-1) s_4(\Lambda), \ n \text{ odd}, \ \operatorname{tr}(A) = 0. \end{array} \tag{2}$$

Necessary condition (4) is due to Loewy and London [4]. Necessary condition (5), which is a refinement of (4), is due to Laffey and Meehan [38]. The list Λ = {5, 4, −3, −3, −3}, for instance, satisfies all the above necessary conditions except condition (5). Therefore Λ is not a realizable list. In [39] a new necessary condition was obtained, which is independent of the previous ones. This result is based on Newton's inequalities associated with the normalized coefficients of the characteristic polynomial of an M-matrix or an inverse M-matrix.
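These necessary conditions are easy to test numerically. The following sketch (our own illustration in plain Python, not code from the chapter) checks conditions (3)-(5) for the list Λ = {5, 4, −3, −3, −3} and confirms that only condition (5) fails:

```python
# Check the necessary conditions (3), (4), (5) for the list {5, 4, -3, -3, -3}.
lam = [5, 4, -3, -3, -3]
n = len(lam)

def s(m):
    """Power sum s_m(Lambda) = sum of lam_j^m."""
    return sum(x**m for x in lam)

# (3): s_m >= 0 for m = 1, 2, ...
print(all(s(m) >= 0 for m in range(1, 7)))          # True

# (4): (s_k)^m <= n^(m-1) * s_{km}
print(all(s(k)**m <= n**(m-1) * s(k*m)
          for k in range(1, 4) for m in range(1, 4)))  # True

# (5) applies: trace zero and n odd
print(s(1) == 0 and n % 2 == 1)                     # True

# (5): (s_2)^2 <= (n-1) * s_4 -- here 68^2 = 4624 > 4 * 1124 = 4496
print(s(2)**2 <= (n - 1) * s(4))                    # False
```

So the list passes (1)-(4) but violates (5), exactly as stated above.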

The chapter is organized as follows: In section 2 we introduce two important matrix results, due to Brauer and Rado, which have made it possible to obtain many of the most general sufficient conditions for the *RNIEP*, the *SNIEP* and the complex case. In section 3 we consider the real case and introduce, without proof (we indicate where the proofs can be found), two sufficient conditions with illustrative examples. In section 4 we consider the symmetric case. Here we introduce a symmetric version of the Rado result, Theorem 2, and we state, without proof (see the appropriate references), three sufficient conditions, which are, as far as we know, the most general sufficient conditions for the *SNIEP*. In section 5, we discuss the complex (non-real) case and present several results with illustrative examples. Section 6 is devoted to some results of Fiedler and of Guo, which are closely related to the problem and have been employed with success to derive sufficient conditions. Finally, in section 7, we introduce some open questions.

### **2. Brauer and Rado Theorems**

A real matrix $A = (a_{ij})_{i,j=1}^{n}$ is said to have *constant row sums* if all its rows sum up to the same constant, say *α*, that is, $\sum_{j=1}^{n} a_{ij} = \alpha$, $i = 1, \dots, n$. The set of all real matrices with constant row sums equal to *α* is denoted by $CS_{\alpha}$. It is clear that any matrix in $CS_{\alpha}$ has the eigenvector $\mathbf{e} = (1, 1, \dots, 1)^{T}$ corresponding to the eigenvalue *α*. Denote by $\mathbf{e}_{k}$ the *n*-dimensional vector with one in the *k*-th position and zeros elsewhere.

It is well known that the problem of finding a nonnegative matrix with spectrum $\Lambda = \{\lambda_1, \dots, \lambda_n\}$ is equivalent to the problem of finding a nonnegative matrix in $CS_{\lambda_1}$ with spectrum Λ (see [40]). This will allow us to exploit the advantages of two important theorems, the Brauer Theorem and the Rado Theorem, which are introduced in this section.

The spectra of circulant nonnegative matrices have been characterized in [9], while in [10] a simple complex generalization of the Suleimanova result was proved, together with efficient and general sufficient conditions for the realizability of partitioned spectra. There, the partition allows some of its pieces to be nonrealizable, provided there are other pieces which are realizable and, in a certain way, compensate the nonrealizability of the former. This is the procedure we call *negativity compensation*. This strategy, based on the use of the following two perturbation results together with the properties of real matrices with constant row sums, has proved to be successful.

**Theorem 1.** *Brauer [41] Let A be an $n \times n$ arbitrary matrix with eigenvalues $\lambda_1, \dots, \lambda_n$. Let $\mathbf{v} = (v_1, \dots, v_n)^{T}$ be an eigenvector of A associated with the eigenvalue $\lambda_k$, and let $\mathbf{q} = (q_1, \dots, q_n)^{T}$ be any n-dimensional vector. Then the matrix $A + \mathbf{v}\mathbf{q}^{T}$ has eigenvalues $\lambda_1, \dots, \lambda_{k-1}, \lambda_k + \mathbf{v}^{T}\mathbf{q}, \lambda_{k+1}, \dots, \lambda_n$.*

**Proof.** Let *U* be an *n* × *n* nonsingular matrix such that


$$U^{-1}AU = \begin{bmatrix} \lambda_1 & * & \cdots & * \\ & \lambda_2 & \ddots & \vdots \\ & & \ddots & * \\ & & & \lambda_n \end{bmatrix}$$

is an upper triangular matrix, where we choose the first column of *U* as **v** (such a *U* exists by a well-known result of Schur). Then, since $U^{-1}\mathbf{v} = \mathbf{e}_1$,

$$U^{-1}(A+\mathbf{v}\mathbf{q}^{T})U = U^{-1}AU + \mathbf{e}_1\mathbf{q}^{T}U = \begin{bmatrix} \lambda_1 + \mathbf{q}^{T}\mathbf{v} & * & \cdots & * \\ & \lambda_2 & \ddots & \vdots \\ & & \ddots & * \\ & & & \lambda_n \end{bmatrix},$$

and the result follows. This proof is due to Reams [42].
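Brauer's theorem is straightforward to verify numerically. The sketch below (our own illustration with an arbitrary 3 × 3 matrix, not taken from the chapter; it assumes NumPy is available) perturbs one eigenvalue by a rank-one update $\mathbf{v}\mathbf{q}^T$ and checks that the remaining eigenvalues are untouched:

```python
# Numerical illustration of Brauer's theorem (Theorem 1).
import numpy as np

A = np.array([[3.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 3.0]])

w, V = np.linalg.eig(A)
k = 0                           # perturb the eigenvalue w[k]
v = V[:, k]                     # eigenvector for w[k]
q = np.array([0.5, -1.0, 2.0])  # any vector q

B = A + np.outer(v, q)          # rank-one update A + v q^T

# Brauer: spectrum of B is {w[k] + q^T v} together with the other w[j]
expected = sorted([w[k] + q @ v] + [w[j] for j in range(3) if j != k])
got = sorted(np.linalg.eigvals(B).real)
print(np.allclose(expected, got))   # True
```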

**Theorem 2.** *Rado [18] Let A be an $n \times n$ arbitrary matrix with eigenvalues $\lambda_1, \dots, \lambda_n$ and let $\Omega = \operatorname{diag}\{\lambda_1, \dots, \lambda_r\}$ for some $r \le n$. Let X be an $n \times r$ matrix with rank r such that its columns $x_1, x_2, \dots, x_r$ satisfy $Ax_i = \lambda_i x_i$, $i = 1, \dots, r$. Let C be an $r \times n$ arbitrary matrix. Then the matrix $A + XC$ has eigenvalues $\mu_1, \dots, \mu_r, \lambda_{r+1}, \dots, \lambda_n$, where $\mu_1, \dots, \mu_r$ are the eigenvalues of the matrix $\Omega + CX$.*

**Proof.** Let $S = [X \mid Y]$ be a nonsingular matrix with $S^{-1} = \begin{bmatrix} U \\ V \end{bmatrix}$. Then $UX = I_r$, $VY = I_{n-r}$ and $VX = 0$, $UY = 0$. Let $C = [C_1 \mid C_2]$, $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$, $Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}$. Then, since $AX = X\Omega$,

$$S^{-1}AS = \begin{bmatrix} U \\ V \end{bmatrix} \begin{bmatrix} X\Omega & AY \end{bmatrix} = \begin{bmatrix} \Omega & UAY \\ 0 & VAY \end{bmatrix}$$

and

$$S^{-1}XCS = \begin{bmatrix} I_r \\ 0 \end{bmatrix} \begin{bmatrix} C_1 & C_2 \end{bmatrix} S = \begin{bmatrix} C_1 & C_2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} X_1 & Y_1 \\ X_2 & Y_2 \end{bmatrix} = \begin{bmatrix} CX & CY \\ 0 & 0 \end{bmatrix}.$$

Thus,

$$S^{-1}(A+XC)S = S^{-1}AS + S^{-1}XCS = \begin{bmatrix} \Omega + CX & UAY + CY \\ 0 & VAY \end{bmatrix}$$

and we have *σ*(*A* + *XC*) = *σ*(Ω + *CX*) + *σ*(*A*) − *σ*(Ω).
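Rado's theorem can also be verified numerically. The sketch below (an arbitrary illustration of our own, assuming NumPy) perturbs the first two eigenvalues of a diagonal matrix and checks that the spectrum changes exactly as the theorem predicts:

```python
# Numerical illustration of Rado's theorem (Theorem 2).
import numpy as np

A = np.diag([5.0, 2.0, 1.0, -1.0])     # eigenvalues 5, 2, 1, -1
r = 2
Omega = np.diag([5.0, 2.0])
X = np.eye(4)[:, :r]                   # columns satisfy A x_i = lambda_i x_i
C = np.array([[0.0, 1.0, 0.5, 0.0],    # arbitrary r x n matrix
              [2.0, 0.0, 0.0, 1.0]])

M = A + X @ C                          # perturbed matrix

mu = np.linalg.eigvals(Omega + C @ X)  # new eigenvalues mu_1, mu_2
expected = sorted(list(mu) + [1.0, -1.0], key=lambda z: (z.real, z.imag))
got = sorted(np.linalg.eigvals(M), key=lambda z: (z.real, z.imag))
print(np.allclose(expected, got))      # True
```

The untouched eigenvalues 1 and −1 survive, while 5 and 2 are replaced by the eigenvalues of Ω + *CX*.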

#### **3. Real nonnegative inverse eigenvalue problem**

Regarding the *RNIEP,* by applying Brauer Theorem and Rado Theorem, efficient and general sufficient conditions have been obtained in [18, 22, 24, 36].

**Theorem 3.** *[24] Let $\Lambda = \{\lambda_1, \lambda_2, \dots, \lambda_n\}$ be a given list of real numbers. Suppose that:*

*i*) *There exists a partition* $\Lambda = \Lambda_1 \cup \cdots \cup \Lambda_t$, *where*

$$\Lambda_k = \{\lambda_{k1}, \lambda_{k2}, \dots, \lambda_{kp_k}\}, \quad \lambda_{11} = \lambda_1, \quad \lambda_{k1} \ge \cdots \ge \lambda_{kp_k}, \quad \lambda_{k1} \ge 0,$$

$k = 1, \dots, t$, *such that to each sublist* $\Lambda_k$ *we associate a corresponding list*

$$\Gamma_k = \{\omega_k, \lambda_{k2}, \dots, \lambda_{kp_k}\}, \quad 0 \le \omega_k \le \lambda_1,$$

*which is realizable by a nonnegative matrix* $A_k \in CS_{\omega_k}$ *of order* $p_k$.

*ii*) *There exists a nonnegative matrix* $B \in CS_{\lambda_1}$ *with eigenvalues* $\lambda_1, \lambda_{21}, \dots, \lambda_{t1}$ *(the first elements of the lists* $\Lambda_k$*) and diagonal entries* $\omega_1, \omega_2, \dots, \omega_t$ *(the first elements of the lists* $\Gamma_k$*).*

*Then* Λ *is realizable by a nonnegative matrix* $A \in CS_{\lambda_1}$.

Perfect [18] gave conditions under which $\lambda_1, \lambda_2, \dots, \lambda_t$ and $\omega_1, \omega_2, \dots, \omega_t$ are the eigenvalues and the diagonal entries, respectively, of a $t \times t$ nonnegative matrix $B \in CS_{\lambda_1}$. For $t = 2$ it is necessary and sufficient that $\lambda_1 + \lambda_2 = \omega_1 + \omega_2$, with $0 \le \omega_i \le \lambda_1$. For $t = 3$ Perfect gave the following result:

**Theorem 4.** *[18] The real numbers $\lambda_1, \lambda_2, \lambda_3$ and $\omega_1, \omega_2, \omega_3$ are the eigenvalues and the diagonal entries, respectively, of a $3 \times 3$ nonnegative matrix $B \in CS_{\lambda_1}$, if and only if:*

$$\begin{array}{ll} i) & 0 \le \omega\_i \le \lambda\_1, \quad i = 1, 2, 3\\ ii) & \lambda\_1 + \lambda\_2 + \lambda\_3 = \omega\_1 + \omega\_2 + \omega\_3\\ iii) & \lambda\_1\lambda\_2 + \lambda\_1\lambda\_3 + \lambda\_2\lambda\_3 \le \omega\_1\omega\_2 + \omega\_1\omega\_3 + \omega\_2\omega\_3\\ iv) & \max\_k \omega\_k \ge \lambda\_2 \end{array} \tag{3}$$

*Then, an appropriate* 3 × 3 *nonnegative matrix B is*

$$B = \begin{bmatrix} \omega\_1 & 0 & \lambda\_1 - \omega\_1 \\ \lambda\_1 - \omega\_2 - p & \omega\_2 & p \\ 0 & \lambda\_1 - \omega\_3 & \omega\_3 \end{bmatrix} \tag{4}$$

*where*

$$p = \frac{1}{\lambda_1 - \omega_3} \left( \omega_1\omega_2 + \omega_1\omega_3 + \omega_2\omega_3 - \lambda_1\lambda_2 - \lambda_1\lambda_3 - \lambda_2\lambda_3 \right).$$

For *t* ≥ 4, we only have a sufficient condition:


$$\begin{array}{ll} i) & 0 \le \omega_k \le \lambda_1, \ k = 1, 2, \dots, t, \\ ii) & \omega_1 + \omega_2 + \cdots + \omega_t = \lambda_1 + \lambda_2 + \cdots + \lambda_t, \\ iii) & \omega_k \ge \lambda_k, \ \omega_1 \ge \lambda_k, \ k = 2, 3, \dots, t, \end{array} \tag{5}$$

with the following matrix *B* ∈ *CSλ*<sup>1</sup> having eigenvalues and diagonal entries *λ*1, *λ*2,..., *λ<sup>t</sup>* and *ω*1, *ω*2,..., *ωt*, respectively:

$$B = \begin{bmatrix} \omega_1 & \omega_2 - \lambda_2 & \cdots & \omega_t - \lambda_t \\ \omega_1 - \lambda_2 & \omega_2 & \cdots & \omega_t - \lambda_t \\ \vdots & \vdots & \ddots & \vdots \\ \omega_1 - \lambda_t & \omega_2 - \lambda_2 & \cdots & \omega_t \end{bmatrix} \tag{6}$$
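Conditions (5) and the matrix (6) are easy to check numerically. The sketch below (our own NumPy illustration, with sample lists chosen by us to satisfy (5)) builds *B* with diagonal entries $\omega_j$, first-column entries $\omega_1 - \lambda_i$, and remaining entries $\omega_j - \lambda_j$, then verifies its eigenvalues, row sums, and nonnegativity:

```python
# Build Perfect's matrix B of (6) for illustrative lists satisfying (5).
import numpy as np

lam = [8.0, 3.0, 2.0, 1.0]   # lambda_1 >= lambda_2 >= ...
om  = [4.0, 4.0, 4.0, 2.0]   # 0 <= om_k <= lam_1, equal sums,
                             # om_k >= lam_k and om_1 >= lam_k for k >= 2
t = len(lam)

B = np.empty((t, t))
for i in range(t):
    for j in range(t):
        if i == j:
            B[i, j] = om[j]           # diagonal entries omega_j
        elif j == 0:
            B[i, j] = om[0] - lam[i]  # first column: omega_1 - lambda_i
        else:
            B[i, j] = om[j] - lam[j]  # other columns: omega_j - lambda_j

print(np.allclose(sorted(np.linalg.eigvals(B).real), sorted(lam)))  # True
print((B >= 0).all() and np.allclose(B.sum(axis=1), lam[0]))        # True
```

Note how row *i* of $B - \lambda_i I$ coincides with row 1 for $i \ge 2$, which is why each $\lambda_i$ is an eigenvalue.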

**Example 1.** *Let us consider the list* Λ = {6, 1, 1, −4, −4} *with the partition*

$$
\Lambda\_1 = \{6, -4\}, \ \Lambda\_2 = \{1, -4\}, \ \Lambda\_3 = \{1\}
$$

*and the realizable associated lists*

$$
\Gamma\_1 = \{4, -4\}, \; \Gamma\_2 = \{4, -4\}, \; \Gamma\_3 = \{0\}.
$$

*From (4) we compute the* 3 × 3 *nonnegative matrix*

$$B = \begin{bmatrix} 4 & 0 & 2 \\ \frac{3}{2} & 4 & \frac{1}{2} \\ 0 & 6 & 0 \end{bmatrix}$$

*with eigenvalues* 6, 1, 1, *and diagonal entries* 4, 4, 0. *Then*

$$\begin{aligned} A &= \begin{bmatrix} 0 & 4 & 0 & 0 & 0 \\ 4 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 & 0 \\ 0 & 0 & 4 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 & 0 & 2 \\ \frac{3}{2} & 0 & 0 & 0 & \frac{1}{2} \\ 0 & 0 & 6 & 0 & 0 \end{bmatrix} \\ &= \begin{bmatrix} 0 & 4 & 0 & 0 & 2 \\ 4 & 0 & 0 & 0 & 2 \\ \frac{3}{2} & 0 & 0 & 4 & \frac{1}{2} \\ \frac{3}{2} & 0 & 4 & 0 & \frac{1}{2} \\ 0 & 0 & 6 & 0 & 0 \end{bmatrix} \end{aligned}$$

*is nonnegative with spectrum* Λ.
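The computation in Example 1 can be reproduced numerically. The following sketch (our own NumPy check, not code from the chapter) rebuilds $A = \operatorname{diag}(A_1, A_2, A_3) + XC$ and verifies the resulting spectrum:

```python
# Reproduce Example 1: A = diag(A_1, A_2, A_3) + XC with spectrum {6,1,1,-4,-4}.
import numpy as np

A1 = np.array([[0.0, 4.0], [4.0, 0.0]])   # realizes Gamma_1 = {4, -4}
A = np.zeros((5, 5))
A[:2, :2] = A1                            # block A_1
A[2:4, 2:4] = A1                          # block A_2 also realizes {4, -4}
# block A_3 = [0] realizes Gamma_3 = {0}

X = np.zeros((5, 3))                      # Perron eigenvectors of the blocks
X[0, 0] = X[1, 0] = 1.0
X[2, 1] = X[3, 1] = 1.0
X[4, 2] = 1.0

C = np.array([[0.0, 0.0, 0.0, 0.0, 2.0],
              [1.5, 0.0, 0.0, 0.0, 0.5],
              [0.0, 0.0, 6.0, 0.0, 0.0]])

M = A + X @ C
print(np.allclose(sorted(np.linalg.eigvals(M).real), [-4, -4, 1, 1, 6]))  # True
print((M >= 0).all())                                                     # True
```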

A map of sufficient conditions for the *RNIEP* was constructed in [28]. There, the sufficient conditions were compared to establish inclusion or independence relations between them. It is also shown in [28] that the criterion given by Theorem 3 contains all realizability criteria for lists of real numbers studied therein. In [36], by means of a new special partition, Theorem 3 is extended. Now, the first element *λk*<sup>1</sup> of the sublist Λ*<sup>k</sup>* need not be nonnegative, and the realizable auxiliary list Γ*<sup>k</sup>* = {*ωk*, *λk*1, ..., *λkpk* } contains one more element. Moreover, the number of lists of the partition depends on the number of elements of the first list Λ1, and some lists Λ*<sup>k</sup>* can be empty.

**Theorem 5.** *[36] Let $\Lambda = \{\lambda_1, \lambda_2, \dots, \lambda_n\}$ be a list of real numbers and let the partition $\Lambda = \Lambda_1 \cup \cdots \cup \Lambda_{p_1+1}$ be such that*

$$\Lambda_k = \{\lambda_{k1}, \lambda_{k2}, \dots, \lambda_{kp_k}\}, \quad \lambda_{11} = \lambda_1, \quad \lambda_{k1} \ge \lambda_{k2} \ge \cdots \ge \lambda_{kp_k},$$

$k = 1, \dots, p_1+1$, *where $p_1$ is the number of elements of the list $\Lambda_1$ and some of the lists $\Lambda_k$ can be empty. Let $\omega_2, \dots, \omega_{p_1+1}$ be real numbers satisfying $0 \le \omega_k \le \lambda_1$, $k = 2, \dots, p_1+1$. Suppose that the following conditions hold:*

*i*) *For each $k = 2, \dots, p_1+1$, there exists a nonnegative matrix $A_k \in CS_{\omega_k}$ with spectrum $\Gamma_k = \{\omega_k, \lambda_{k1}, \dots, \lambda_{kp_k}\}$,*

*ii*) *There exists a $p_1 \times p_1$ nonnegative matrix $B \in CS_{\lambda_1}$ with spectrum $\Lambda_1$ and with diagonal entries $\omega_2, \dots, \omega_{p_1+1}$.*

*Then* Λ *is realizable by a nonnegative matrix* $A \in CS_{\lambda_1}$.

**Example 2.** *With this extension, the authors show for instance, that the list*

$$\{5, 4, 0, -3, -3, -3\}$$

*is realizable, which cannot be shown from the criterion given by Theorem 3. In fact, take the partition*

$$\begin{aligned} \Lambda\_1 &= \{5, 4, 0, -3\}, \,\Lambda\_2 = \{-3\}, \,\Lambda\_3 = \{-3\} \text{ with} \\ \Gamma\_2 &= \{3, -3\}, \,\Gamma\_3 = \{3, -3\}, \,\Gamma\_4 = \Gamma\_5 = \{0\}. \end{aligned}$$

*The nonnegative matrix*

$$B = \begin{bmatrix} 3 & 0 & 2 & 0 \\ 0 & 3 & 0 & 2 \\ 3 & 0 & 0 & 2 \\ 0 & 3 & 2 & 0 \end{bmatrix}$$

*has spectrum* Λ<sup>1</sup> *and diagonal entries* 3, 3, 0, 0. *It is clear that*

$$A_2 = A_3 = \begin{bmatrix} 0 & 3 \\ 3 & 0 \end{bmatrix} \text{ realizes } \Gamma_2 = \Gamma_3.$$

*Then*

$$A = \begin{bmatrix} A_2 & & & \\ & A_3 & & \\ & & 0 & \\ & & & 0 \end{bmatrix} + \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 2 \\ 3 & 0 & 0 & 0 & 0 & 2 \\ 0 & 0 & 3 & 0 & 2 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 3 & 0 & 0 & 2 & 0 \\ 3 & 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 & 0 & 2 \\ 0 & 0 & 3 & 0 & 0 & 2 \\ 3 & 0 & 0 & 0 & 0 & 2 \\ 0 & 0 & 3 & 0 & 2 & 0 \end{bmatrix}$$

*has the desired spectrum* {5, 4, 0, −3, −3, −3}.
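Example 2 can likewise be checked numerically; the sketch below (our own NumPy verification) confirms the spectrum of the final 6 × 6 matrix:

```python
# Verify the final matrix of Example 2 has spectrum {5, 4, 0, -3, -3, -3}.
import numpy as np

M = np.array([[0, 3, 0, 0, 2, 0],
              [3, 0, 0, 0, 2, 0],
              [0, 0, 0, 3, 0, 2],
              [0, 0, 3, 0, 0, 2],
              [3, 0, 0, 0, 0, 2],
              [0, 0, 3, 0, 2, 0]], dtype=float)

eigs = sorted(np.linalg.eigvals(M).real)
# Loose tolerance: -3 has multiplicity 3 and may be computed less accurately.
print(np.allclose(eigs, [-3, -3, -3, 0, 4, 5], atol=1e-4))  # True
```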

#### **4. Symmetric nonnegative inverse eigenvalue problem**

Several realizability criteria which were first obtained for the *RNIEP* have later been shown to be symmetric realizability criteria as well. For example, the Kellogg criterion [19] was shown by Fiedler [29] to imply symmetric realizability. It was proved by Radwan [8] that Borobia's criterion [21] is also a symmetric realizability criterion, and it was proved in [33] that Soto's criterion for the *RNIEP* is also a criterion for the *SNIEP.* In this section we shall consider what are, as far as we know, the most general and efficient symmetric realizability criteria for the *SNIEP*. We start by introducing a symmetric version of the Rado Theorem:

**Theorem 6.** *[34] Let A be an $n \times n$ symmetric matrix with spectrum $\Lambda = \{\lambda_1, \lambda_2, \dots, \lambda_n\}$ and, for some $r \le n$, let $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_r\}$ be an orthonormal set of eigenvectors of A spanning the invariant subspace associated with $\lambda_1, \lambda_2, \dots, \lambda_r$. Let X be the $n \times r$ matrix with i-th column $\mathbf{x}_i$, let $\Omega = \operatorname{diag}\{\lambda_1, \dots, \lambda_r\}$, and let C be any $r \times r$ symmetric matrix. Then the symmetric matrix $A + XCX^{T}$ has eigenvalues $\mu_1, \mu_2, \dots, \mu_r, \lambda_{r+1}, \dots, \lambda_n$, where $\mu_1, \mu_2, \dots, \mu_r$ are the eigenvalues of the matrix $\Omega + C$.*

**Proof.** Since the columns of *X* form an orthonormal set, we may complete *X* to an orthogonal matrix $W = [X \; Y]$, that is, $X^{T}X = I_r$, $Y^{T}Y = I_{n-r}$, $X^{T}Y = 0$, $Y^{T}X = 0$. Then

$$W^{-1}AW = \begin{bmatrix} X^{T} \\ Y^{T} \end{bmatrix} A \begin{bmatrix} X & Y \end{bmatrix} = \begin{bmatrix} \Omega & X^{T}AY \\ 0 & Y^{T}AY \end{bmatrix},$$

$$W^{-1}(XCX^{T})W = \begin{bmatrix} I_r \\ 0 \end{bmatrix} C \begin{bmatrix} I_r & 0 \end{bmatrix} = \begin{bmatrix} C & 0 \\ 0 & 0 \end{bmatrix}.$$

Therefore,


$$W^{-1}(A + XCX^{T})W = \begin{bmatrix} \Omega + C & X^{T}AY \\ 0 & Y^{T}AY \end{bmatrix}$$

and $A + XCX^{T}$ is symmetric with eigenvalues $\mu_1, \dots, \mu_r, \lambda_{r+1}, \dots, \lambda_n$.

By using Theorem 6, the following sufficient condition was proved in [34]:

**Theorem 7.** *[34] Let $\Lambda = \{\lambda_1, \lambda_2, \dots, \lambda_n\}$ be a list of real numbers with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and, for some $t \le n$, let $\omega_1, \dots, \omega_t$ be real numbers satisfying $0 \le \omega_k \le \lambda_1$, $k = 1, \dots, t$. Suppose there exist:*

*i*) *a partition* $\Lambda = \Lambda_1 \cup \cdots \cup \Lambda_t$ *with*

$$\Lambda_k = \{\lambda_{k1}, \lambda_{k2}, \dots, \lambda_{kp_k}\}, \quad \lambda_{11} = \lambda_1, \quad \lambda_{k1} \ge 0, \quad \lambda_{k1} \ge \lambda_{k2} \ge \cdots \ge \lambda_{kp_k},$$

*such that for each* $k = 1, \dots, t$, *the list* $\Gamma_k = \{\omega_k, \lambda_{k2}, \dots, \lambda_{kp_k}\}$ *is realizable by a symmetric nonnegative matrix* $A_k$ *of order* $p_k$, *and*

*ii*) *a* $t \times t$ *symmetric nonnegative matrix B with eigenvalues* $\lambda_{11}, \lambda_{21}, \dots, \lambda_{t1}$ *and with diagonal entries* $\omega_1, \omega_2, \dots, \omega_t$.

*Then* Λ *is realizable by a symmetric nonnegative matrix.*

**Proof.** Since *Ak* is a *pk* × *pk* symmetric nonnegative matrix realizing Γ*k*, the matrix *A* = *diag*{*A*1, *A*2,..., *At*} is symmetric nonnegative with spectrum Γ<sup>1</sup> ∪ Γ<sup>2</sup> ∪ ··· ∪ Γ*t*. Let {**x**1,..., **x***t*} be an orthonormal set of eigenvectors of *A* associated with *ω*1,..., *ωt*, respectively. Then the *n* × *t* matrix *X* with *i*-th column **x***<sup>i</sup>* satisfies *AX* = *X*Ω for Ω = *diag*{*ω*1,..., *ωt*}. Moreover, *X* is entrywise nonnegative, since each **x***<sup>i</sup>* contains the Perron eigenvector of *Ai* and zeros. Now, if we set *C* = *B* − Ω, the matrix *C* is symmetric nonnegative and Ω + *C* has eigenvalues *λ*1,..., *λt*. Therefore, by Theorem 6 the symmetric matrix *A* + *XCX<sup>T</sup>* has spectrum Λ. Besides, it is nonnegative, since all the entries of *A*, *X*, and *C* are nonnegative.
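The construction in this proof is easy to carry out numerically. The sketch below (our own small instance with NumPy, not an example from the chapter) takes Γ<sub>1</sub> = {3, −3}, Γ<sub>2</sub> = {2, −2} and a 2 × 2 symmetric *B* with diagonal 3, 2, assembles $A + XCX^T$, and checks the result:

```python
# Symmetric Rado construction (proof of Theorem 7) on a small instance.
import numpy as np

A1 = np.array([[0.0, 3.0], [3.0, 0.0]])     # symmetric, realizes {3, -3}
A2 = np.array([[0.0, 2.0], [2.0, 0.0]])     # symmetric, realizes {2, -2}
A = np.block([[A1, np.zeros((2, 2))],
              [np.zeros((2, 2)), A2]])

Omega = np.diag([3.0, 2.0])
# Orthonormal Perron eigenvectors of the blocks, padded with zeros
X = np.array([[1, 0], [1, 0], [0, 1], [0, 1]]) / np.sqrt(2)

B = np.array([[3.0, 1.0], [1.0, 2.0]])      # symmetric, diagonal 3, 2
C = B - Omega                               # symmetric nonnegative, zero diagonal

M = A + X @ C @ X.T                         # symmetric realizing matrix
mu = np.linalg.eigvalsh(B)                  # eigenvalues of Omega + C = B
expected = sorted(list(mu) + [-3.0, -2.0])
print(np.allclose(sorted(np.linalg.eigvalsh(M)), expected))  # True
print((M >= 0).all() and np.allclose(M, M.T))                # True
```

The spectrum of *M* consists of the eigenvalues of *B* together with the non-Perron eigenvalues −3 and −2 of the blocks, as Theorem 6 predicts.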

Theorem 7 not only ensures the existence of a realizing matrix; it also allows us to construct it. Of course, the key is to know under which conditions there exists a *t* × *t* symmetric nonnegative matrix *B* with eigenvalues *λ*1,..., *λ<sup>t</sup>* and diagonal entries *ω*1,..., *ωt*.


The following conditions for the existence of a real symmetric matrix, not necessarily nonnegative, with prescribed eigenvalues and diagonal entries are due to Horn [43]: *There exists a real symmetric matrix with eigenvalues λ*<sup>1</sup> ≥ *λ*<sup>2</sup> ≥ ··· ≥ *λ<sup>t</sup> and diagonal entries ω*<sup>1</sup> ≥ *ω*<sup>2</sup> ≥···≥ *ω<sup>t</sup> if and only if*

$$\begin{aligned} \sum_{i=1}^{k} \lambda_i &\ge \sum_{i=1}^{k} \omega_i, \quad k = 1, \dots, t-1 \\ \sum_{i=1}^{t} \lambda_i &= \sum_{i=1}^{t} \omega_i \end{aligned} \tag{7}$$

For *t* = 2, the conditions (7) become

$$\lambda_1 \ge \omega_1, \quad \lambda_1 + \lambda_2 = \omega_1 + \omega_2,$$

and they are also sufficient for the existence of a 2 × 2 symmetric nonnegative matrix *B* with eigenvalues *λ*<sup>1</sup> ≥ *λ*<sup>2</sup> and diagonal entries *ω*<sup>1</sup> ≥ *ω*<sup>2</sup> ≥ 0, namely,

$$B = \begin{bmatrix} \omega_1 & \sqrt{(\lambda_1 - \omega_1)(\lambda_1 - \omega_2)} \\ \sqrt{(\lambda_1 - \omega_1)(\lambda_1 - \omega_2)} & \omega_2 \end{bmatrix}.$$
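As a quick numerical sanity check of this 2 × 2 construction (with illustrative values of our own choosing, λ = (5, 1) and ω = (4, 2), so that λ<sub>1</sub> ≥ ω<sub>1</sub> and the traces agree):

```python
# Check the 2x2 symmetric construction for lambda = (5, 1), omega = (4, 2).
import numpy as np

l1, l2 = 5.0, 1.0
w1, w2 = 4.0, 2.0
off = np.sqrt((l1 - w1) * (l1 - w2))   # off-diagonal entry
B = np.array([[w1, off], [off, w2]])
print(np.allclose(np.linalg.eigvalsh(B), [l2, l1]))  # True
```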

For *t* = 3, we have the following conditions:

**Lemma 1.** *[29] The conditions*

$$\begin{array}{c} \lambda\_1 \ge \omega\_1\\ \lambda\_1 + \lambda\_2 \ge \omega\_1 + \omega\_2\\ \lambda\_1 + \lambda\_2 + \lambda\_3 = \omega\_1 + \omega\_2 + \omega\_3\\ \omega\_1 \ge \lambda\_2 \end{array} \tag{8}$$

*are necessary and sufficient for the existence of a* 3 × 3 *symmetric nonnegative matrix B with eigenvalues λ*<sup>1</sup> ≥ *λ*<sup>2</sup> ≥ *λ*<sup>3</sup> *and diagonal entries ω*<sup>1</sup> ≥ *ω*<sup>2</sup> ≥ *ω*<sup>3</sup> ≥ 0.

In [34], the following symmetric nonnegative matrix *B*, satisfying conditions (8), was constructed:

$$B = \begin{bmatrix} \omega_1 & \sqrt{\frac{\mu - \omega_3}{2\mu - \omega_2 - \omega_3}}\, s & \sqrt{\frac{\mu - \omega_2}{2\mu - \omega_2 - \omega_3}}\, s \\ \sqrt{\frac{\mu - \omega_3}{2\mu - \omega_2 - \omega_3}}\, s & \omega_2 & \sqrt{(\mu - \omega_2)(\mu - \omega_3)} \\ \sqrt{\frac{\mu - \omega_2}{2\mu - \omega_2 - \omega_3}}\, s & \sqrt{(\mu - \omega_2)(\mu - \omega_3)} & \omega_3 \end{bmatrix}, \tag{9}$$

where $\mu = \lambda_1 + \lambda_2 - \omega_1$ and $s = \sqrt{(\lambda_1 - \mu)(\lambda_1 - \omega_1)}$.
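The construction (9) can likewise be verified numerically. The sketch below uses the illustrative data λ = (6, 2, 1) and ω = (4, 3, 2) (our own choice, satisfying conditions (8)), builds *B*, and checks its spectrum and nonnegativity:

```python
# Build the 3x3 symmetric matrix B of (9) for lambda = (6,2,1), omega = (4,3,2).
import numpy as np

l1, l2, l3 = 6.0, 2.0, 1.0
w1, w2, w3 = 4.0, 3.0, 2.0

mu = l1 + l2 - w1                      # mu = lambda_1 + lambda_2 - omega_1
s = np.sqrt((l1 - mu) * (l1 - w1))
d = 2 * mu - w2 - w3

b12 = np.sqrt((mu - w3) / d) * s
b13 = np.sqrt((mu - w2) / d) * s
b23 = np.sqrt((mu - w2) * (mu - w3))

B = np.array([[w1, b12, b13],
              [b12, w2, b23],
              [b13, b23, w3]])

print(np.allclose(np.linalg.eigvalsh(B), [l3, l2, l1]))  # True
print((B >= 0).all())                                    # True
```

Here μ = 4 and s = 2, and *B* indeed has eigenvalues 6, 2, 1 with diagonal 4, 3, 2.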

For *t* ≥ 4 we have only a sufficient condition:

**Theorem 8.** *Fiedler [29] If λ*<sup>1</sup> ≥···≥ *λ<sup>t</sup> and ω*<sup>1</sup> ≥···≥ *ω<sup>t</sup> satisfy*

$$\begin{aligned} i)\ & \sum_{i=1}^{s} \lambda_i \ge \sum_{i=1}^{s} \omega_i, \quad s = 1, \dots, t-1 \\ ii)\ & \sum_{i=1}^{t} \lambda_i = \sum_{i=1}^{t} \omega_i \\ iii)\ & \omega_{k-1} \ge \lambda_k, \quad k = 2, \dots, t-1 \end{aligned} \tag{10}$$

*then there exists a t* × *t symmetric nonnegative matrix with eigenvalues λ*1,..., *λ<sup>t</sup> and diagonal entries ω*1,..., *ωt*.

The following conditions for the existence of a real symmetric matrix, not necessarily nonnegative, with prescribed eigenvalues and diagonal entries are due to Horn [43]: *There exists a real symmetric matrix with eigenvalues λ*<sup>1</sup> ≥ *λ*<sup>2</sup> ≥ ··· ≥ *λ<sup>t</sup> and diagonal entries ω*<sup>1</sup> ≥ *ω*<sup>2</sup> ≥ ··· ≥ *ω<sup>t</sup> if and only if*

$$\sum_{i=1}^{k} \lambda_i \ge \sum_{i=1}^{k} \omega_i, \ k = 1, \dots, t-1, \qquad \text{and} \qquad \sum_{i=1}^{t} \lambda_i = \sum_{i=1}^{t} \omega_i.$$

Observe that the conditions (10) are not necessary: the matrix

$$B = \begin{bmatrix} 5 & 2 & \frac{1}{2} & \frac{1}{2} \\ 2 & 5 & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & 5 & 2 \\ \frac{1}{2} & \frac{1}{2} & 2 & 5 \end{bmatrix}$$

has eigenvalues 8, 6, 3, 3, but *λ*<sup>2</sup> = 6 > 5 = *ω*1.
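This observation is easy to verify numerically. The following sketch (our own NumPy code; the chapter itself contains no code) computes the eigenvalues of the matrix above:

```python
import numpy as np

# The 4x4 symmetric nonnegative matrix above: every diagonal entry
# is 5, yet the second-largest eigenvalue is 6 > 5, so condition
# iii) of (10) fails even though such a matrix exists.
B = np.array([[5.0, 2.0, 0.5, 0.5],
              [2.0, 5.0, 0.5, 0.5],
              [0.5, 0.5, 5.0, 2.0],
              [0.5, 0.5, 2.0, 5.0]])

eigs = np.sort(np.linalg.eigvalsh(B))[::-1]  # descending order
```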

**Example 3.** *Let us consider the list* Λ = {7, 5, 1, −3, −4, −6} *with the partition*

$$\begin{aligned} \Lambda\_1 &= \{7, -6\}, & \Lambda\_2 &= \{5, -4\}, & \Lambda\_3 &= \{1, -3\} \text{ with} \\ \Gamma\_1 &= \{6, -6\}, & \Gamma\_2 &= \{4, -4\}, & \Gamma\_3 &= \{3, -3\}. \end{aligned}$$

*We look for a symmetric nonnegative matrix B with eigenvalues* 7, 5, 1 *and diagonal entries* 6, 4, 3. *Then conditions (8) are satisfied and from (9) we compute*

$$B = \begin{bmatrix} 6 & \sqrt{\frac{3}{5}} & \sqrt{\frac{2}{5}} \\ \sqrt{\frac{3}{5}} & 4 & \sqrt{6} \\ \sqrt{\frac{2}{5}} & \sqrt{6} & 3 \end{bmatrix} \quad \text{and} \quad C = B - \Omega,$$

*where* Ω = *diag*{6, 4, 3}. *The symmetric matrices*

$$A_1 = \begin{bmatrix} 0 & 6 \\ 6 & 0 \end{bmatrix}, \ A_2 = \begin{bmatrix} 0 & 4 \\ 4 & 0 \end{bmatrix}, \ A_3 = \begin{bmatrix} 0 & 3 \\ 3 & 0 \end{bmatrix}$$

*realize* Γ1, Γ2, Γ3. *Then*

$$A = \begin{bmatrix} A_1 & & \\ & A_2 & \\ & & A_3 \end{bmatrix} + XCX^T, \quad \text{where} \quad X = \begin{bmatrix} \frac{\sqrt{2}}{2} & 0 & 0 \\ \frac{\sqrt{2}}{2} & 0 & 0 \\ 0 & \frac{\sqrt{2}}{2} & 0 \\ 0 & \frac{\sqrt{2}}{2} & 0 \\ 0 & 0 & \frac{\sqrt{2}}{2} \\ 0 & 0 & \frac{\sqrt{2}}{2} \end{bmatrix},$$

*is symmetric nonnegative with spectrum* Λ.
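The whole construction of Example 3 can be reproduced numerically. The sketch below (our own NumPy code, not part of the chapter) rebuilds *B* from formula (9) and applies the symmetric lifting of Theorem 6:

```python
import numpy as np

# Eigenvalues and diagonal entries prescribed for B in Example 3
lam = np.array([7.0, 5.0, 1.0])
w = np.array([6.0, 4.0, 3.0])

mu = lam[0] + lam[1] - w[0]                    # mu = 6
s = np.sqrt((lam[0] - mu) * (lam[0] - w[0]))   # s = 1
den = 2 * mu - w[1] - w[2]                     # 2*mu - w2 - w3 = 5

b12 = np.sqrt((mu - w[2]) / den) * s           # sqrt(3/5)
b13 = np.sqrt((mu - w[1]) / den) * s           # sqrt(2/5)
b23 = np.sqrt((mu - w[1]) * (mu - w[2]))       # sqrt(6)
B = np.array([[w[0], b12, b13],
              [b12, w[1], b23],
              [b13, b23, w[2]]])               # matrix (9)

# Block diagonal of the A_k realizing Gamma_k = {w_k, -w_k},
# with X holding their unit Perron eigenvectors as columns
D = np.zeros((6, 6))
X = np.zeros((6, 3))
for k in range(3):
    D[2*k:2*k+2, 2*k:2*k+2] = [[0, w[k]], [w[k], 0]]
    X[2*k, k] = X[2*k+1, k] = np.sqrt(2) / 2

A = D + X @ (B - np.diag(w)) @ X.T             # symmetric Rado lift
```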

In the same way as Theorem 3 was extended to Theorem 5 (in the real case), Theorem 7 was also extended to the following result:

**Theorem 9.** *[36] Let* Λ = {*λ*1, *λ*2,..., *λn*} *be a list of real numbers and let the partition* Λ = Λ<sup>1</sup> ∪···∪ Λ*p*1+<sup>1</sup> *be such that*

$$\Lambda_k = \{\lambda_{k1}, \lambda_{k2}, \dots, \lambda_{kp_k}\}, \quad \lambda_{11} = \lambda_1, \quad \lambda_{k1} \ge \lambda_{k2} \ge \dots \ge \lambda_{kp_k},$$

*k* = 1, . . . , *p*<sup>1</sup> + 1, *where* Λ<sup>1</sup> *is symmetrically realizable, p*<sup>1</sup> *is the number of elements of* Λ<sup>1</sup> *and some lists* Λ*<sup>k</sup> can be empty. Let ω*2,..., *ωp*1+<sup>1</sup> *be real numbers satisfying* 0 ≤ *ω<sup>k</sup>* ≤ *λ*1, *k* = 2, . . . , *p*<sup>1</sup> + 1. *Suppose that the following conditions hold:*

*i*) *For each k* = 2, . . . , *p*<sup>1</sup> + 1, *there exists a symmetric nonnegative matrix Ak with spectrum* Γ*<sup>k</sup>* = {*ωk*, *λk*1, ..., *λkpk* },

*ii*) *There exists a p*<sup>1</sup> × *p*<sup>1</sup> *symmetric nonnegative matrix B with spectrum* Λ<sup>1</sup> *and with diagonal entries ω*2,..., *ωp*1+1.

*Then* Λ *is symmetrically realizable.*

**Example 4.** *Now, from Theorem 9, we can see that there exists a symmetric nonnegative matrix with spectrum* Λ = {5, 4, 0, −3, −3, −3}*, which cannot be seen from Theorem 7. Moreover, we can compute a realizing matrix. In fact, let the partition*

$$\begin{aligned} \Lambda\_1 &= \{5, 4, 0, -3\}, \,\Lambda\_2 = \{-3\}, \,\Lambda\_3 = \{-3\} \text{ with} \\ \Gamma\_2 &= \{3, -3\}, \,\Gamma\_3 = \{3, -3\}, \,\Gamma\_4 = \Gamma\_5 = \{0\}. \end{aligned}$$

*The symmetric nonnegative matrix*

$$B = \begin{bmatrix} 3 & 0 & \sqrt{6} & 0 \\ 0 & 3 & 0 & \sqrt{6} \\ \sqrt{6} & 0 & 0 & 2 \\ 0 & \sqrt{6} & 2 & 0 \end{bmatrix}$$

*has spectrum* Λ<sup>1</sup> *and diagonal entries* 3, 3, 0, 0. *Let* Ω = *diag*{3, 3, 0, 0} *and*

$$X = \begin{bmatrix} \frac{\sqrt{2}}{2} & 0 & 0 & 0 \\ \frac{\sqrt{2}}{2} & 0 & 0 & 0 \\ 0 & \frac{\sqrt{2}}{2} & 0 & 0 \\ 0 & \frac{\sqrt{2}}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \ A_2 = A_3 = \begin{bmatrix} 0 & 3 \\ 3 & 0 \end{bmatrix}, \ C = B - \Omega.$$

*Then, from Theorem 6 we obtain*

$$A = \begin{bmatrix} A_2 & & & \\ & A_3 & & \\ & & 0 & \\ & & & 0 \end{bmatrix} + XCX^T,$$

*which is symmetric nonnegative with spectrum* Λ.
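Example 4 can also be checked numerically; again the NumPy code below is our own sketch:

```python
import numpy as np

r6, r2 = np.sqrt(6), np.sqrt(2) / 2

# B realizes Lambda_1 = {5, 4, 0, -3} with diagonal entries 3, 3, 0, 0
B = np.array([[3, 0, r6, 0],
              [0, 3, 0, r6],
              [r6, 0, 0, 2],
              [0, r6, 2, 0]])
C = B - np.diag([3.0, 3.0, 0.0, 0.0])          # C = B - Omega

X = np.zeros((6, 4))                           # Perron eigenvectors
X[0, 0] = X[1, 0] = r2
X[2, 1] = X[3, 1] = r2
X[4, 2] = 1.0
X[5, 3] = 1.0

A = np.zeros((6, 6))
A[0:2, 0:2] = [[0, 3], [3, 0]]                 # A_2
A[2:4, 2:4] = [[0, 3], [3, 0]]                 # A_3
A += X @ C @ X.T                               # Theorem 6 lift
```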

The following result, although not written in the fashion of a sufficient condition, is indeed a very general and efficient sufficient condition for the *SNIEP*.

**Theorem 10.** *[35] Let A be an n* × *n irreducible symmetric nonnegative matrix with spectrum* Λ = {*λ*1, *λ*2,..., *λn*}, *Perron eigenvalue λ*<sup>1</sup> *and a diagonal element c*. *Let B be an m* × *m symmetric nonnegative matrix with spectrum* Γ = {*μ*1, *μ*2,..., *μm*} *and Perron eigenvalue μ*1.

*i*) *If μ*<sup>1</sup> ≤ *c*, *then there exists a symmetric nonnegative matrix C*, *of order* (*n* + *m* − 1), *with spectrum* {*λ*1,..., *λn*, *μ*2,..., *μm*}.

*ii*) *If μ*<sup>1</sup> ≥ *c*, *then there exists a symmetric nonnegative matrix C*, *of order* (*n* + *m* − 1), *with spectrum* {*λ*<sup>1</sup> + *μ*<sup>1</sup> − *c*, *λ*2,..., *λn*, *μ*2,..., *μm*}.

**Example 5.** *The following example, given in [35], shows that* {7, 5, 0, −4, −4, −4} *with the partition*

$$\Lambda = \{7, 5, 0, -4\}, \quad \Gamma = \{4, -4\}$$

*satisfies the conditions of Theorem 10, where*


$$A = \begin{bmatrix} 4 & 0 & b & 0 \\ 0 & 4 & 0 & d \\ b & 0 & 0 & \sqrt{6} \\ 0 & d & \sqrt{6} & 0 \end{bmatrix} \quad \text{with} \ b^2 + d^2 = 23, \ bd = 4\sqrt{6},$$

*is symmetric nonnegative with spectrum* Λ. *Then there exists a symmetric nonnegative matrix C with spectrum* {7, 5, 0, −4, −4} *and a diagonal element* 4. *By applying again Theorem 10 to the lists* {7, 5, 0, −4, −4} *and* {4, −4}, *we obtain the desired symmetric nonnegative matrix.*
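Since b² + d² = 23 and bd = 4√6 mean that b² and d² are the roots of z² − 23z + 96 = 0, the matrix A can be built and checked numerically; the following sketch is ours, not part of the chapter:

```python
import numpy as np

# b^2 and d^2 are the roots of z^2 - 23 z + 96 = 0
b2 = (23 + np.sqrt(23**2 - 4 * 96)) / 2
b, d = np.sqrt(b2), np.sqrt(23 - b2)
r6 = np.sqrt(6)

# Symmetric nonnegative matrix with spectrum {7, 5, 0, -4}
# and a diagonal element equal to 4
A = np.array([[4, 0, b, 0],
              [0, 4, 0, d],
              [b, 0, 0, r6],
              [0, d, r6, 0]])
```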

It is not hard to show that both results, Theorem 9 and Theorem 10, are equivalent (see [44]). Thus, the list in Example 4 is also realizable from Theorem 10, while the list in Example 5 is also realizable from Theorem 9.

#### **5. List of complex numbers**

In this section we consider lists of complex nonreal numbers. We start with a complex generalization of a well-known result of Suleimanova, usually considered one of the important results in the *RNIEP* (see [16]): *the list λ*<sup>1</sup> > 0 > *λ*<sup>2</sup> ≥···≥ *λ<sup>n</sup> is the spectrum of a nonnegative matrix if and only if λ*<sup>1</sup> + *λ*<sup>2</sup> + ··· + *λ<sup>n</sup>* ≥ 0.

**Theorem 11.** *[10] Let* Λ = {*λ*0, *λ*1,..., *λn*} *be a list of complex numbers closed under complex conjugation, with*

$$\Lambda' = \{\lambda\_1, \dots, \lambda\_n\} \subset \{z \in \mathbb{C} : \operatorname{Re} z \le 0; \ |\operatorname{Re} z| \ge |\operatorname{Im} z|\}.$$

*Then* Λ *is realizable if and only if* $\sum_{i=0}^{n} \lambda_i \ge 0$.

**Proof.** Suppose that the elements of Λ′ are ordered in such a way that *λ*2*p*+1,..., *λ<sup>n</sup>* are real and *λ*1,..., *λ*2*<sup>p</sup>* are complex nonreal, with

$$x_k = \operatorname{Re} \lambda_{2k-1} = \operatorname{Re} \lambda_{2k} \quad \text{and} \quad y_k = \operatorname{Im} \lambda_{2k-1} = \operatorname{Im} \lambda_{2k}$$

for *k* = 1, . . . , *p*. Consider the matrix

$$B = \begin{bmatrix} 0 & & & & & & & & \\ -x_1 + y_1 & x_1 & -y_1 & & & & & & \\ -x_1 - y_1 & y_1 & x_1 & & & & & & \\ \vdots & & & \ddots & & & & & \\ -x_p + y_p & & & & x_p & -y_p & & & \\ -x_p - y_p & & & & y_p & x_p & & & \\ -\lambda_{2p+1} & & & & & & \lambda_{2p+1} & & \\ \vdots & & & & & & & \ddots & \\ -\lambda_n & & & & & & & & \lambda_n \end{bmatrix}.$$

It is clear that *B* ∈ CS<sup>0</sup> with spectrum {0, *λ*1,..., *λn*} and all the entries on its first column are nonnegative. Define **q** = (*q*0, *q*1,..., *qn*)*<sup>T</sup>* with $q_0 = \lambda_0 + \sum_{i=1}^{n} \lambda_i$ and

*qk* = − Re *λ<sup>k</sup>* for *k* = 1, . . . , 2*p* and *qk* = −*λ<sup>k</sup>* for *k* = 2*p* + 1, . . . , *n*.

Then, from the Brauer Theorem 1 *A* = *B* + **eq***<sup>T</sup>* is nonnegative with spectrum Λ.
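To make this construction concrete, here is a numerical sketch for an illustrative list of our own choosing, Λ = {8, −2 ± 2i, −1 ± i}, which satisfies the hypotheses of Theorem 11 (the NumPy code is ours):

```python
import numpy as np

# Illustrative list (our own choice): lambda_0 = 8 and
# Lambda' = {-2+2i, -2-2i, -1+i, -1-i}, which lies in the region
# {Re z <= 0, |Re z| >= |Im z|}; the total sum is 2 >= 0.
lam0 = 8.0
pairs = [(-2.0, 2.0), (-1.0, 1.0)]        # (x_k, y_k) per conjugate pair

n = 2 * len(pairs)
B = np.zeros((n + 1, n + 1))
for k, (x, y) in enumerate(pairs):
    r = 1 + 2 * k
    B[r, 0], B[r + 1, 0] = -x + y, -x - y     # nonnegative first column
    B[r:r + 2, r:r + 2] = [[x, -y], [y, x]]   # block with eigenvalues x +- iy

# B has zero row sums (B in CS_0) and spectrum {0} U Lambda'
q = np.zeros(n + 1)
q[0] = lam0 + 2 * sum(x for x, _ in pairs)    # q_0 = lam0 + sum Re(lambda_i)
q[1:] = [-x for x, _ in pairs for _ in range(2)]

A = B + np.ones((n + 1, 1)) * q               # Brauer: A = B + e q^T
```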

In the case when all numbers in the given list, except one (the Perron eigenvalue), have real parts smaller than or equal to zero, remarkably simple necessary and sufficient conditions were obtained in [11].

**Theorem 12.** *[11] Let λ*2, *λ*3,..., *λ<sup>n</sup> be nonzero complex numbers with real parts less than or equal to zero and let λ*<sup>1</sup> *be a positive real number. Then the list* Λ = {*λ*1, *λ*2,..., *λn*} *is the nonzero spectrum of a nonnegative matrix if the following conditions are satisfied:*

$$\begin{aligned} \text{i)} \quad & \overline{\Lambda} = \Lambda \\ \text{ii)} \ & s_1 = \sum_{i=1}^{n} \lambda_i \ge 0 \\ \text{iii)} \ & s_2 = \sum_{i=1}^{n} \lambda_i^2 \ge 0 \end{aligned} \tag{11}$$

*The minimal number of zeros that need to be added to* Λ *to make it realizable is the smallest nonnegative integer N for which the following inequality is satisfied:*

$$s\_1^2 \le (n+N)s\_2.$$

*Furthermore, the list* {*λ*1, *λ*2,..., *λn*, 0, . . . , 0} *can be realized by C* + *αI*, *where C is a nonnegative companion matrix with trace zero, α is a nonnegative scalar and I is the n* × *n identity matrix.*

**Corollary 1.** *[11] Let λ*2, *λ*3,..., *λ<sup>n</sup> be complex numbers with real parts less than or equal to zero and let λ*<sup>1</sup> *be a positive real number. Then the list* Λ = {*λ*1, *λ*2,..., *λn*} *is the spectrum of a nonnegative matrix if and only if the following conditions are satisfied:*

$$\begin{aligned} \text{i)} \quad & \overline{\Lambda} = \Lambda \\ \text{ii)} \ & s_1 = \sum_{i=1}^{n} \lambda_i \ge 0 \\ \text{iii)} \ & s_2 = \sum_{i=1}^{n} \lambda_i^2 \ge 0 \\ \text{iv)} \ & s_1^2 \le n s_2 \end{aligned} \tag{12}$$

**Example 6.** *The list* Λ = {8, −1 + 3*i*, −1 − 3*i*, −2 + 5*i*, −2 − 5*i*} *satisfies conditions (12). Then* Λ *is the spectrum of the nonnegative companion matrix*

$$\mathbf{C} = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 2320 & 494 & 278 & 1 & 2 \end{bmatrix}.$$

*Observe that Theorem 11 gives no information about the realizability of* Λ.
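The spectrum of this companion matrix is easy to confirm numerically; a brief sketch of our own:

```python
import numpy as np

# Companion matrix of p(x) = x^5 - 2x^4 - x^3 - 278x^2 - 494x - 2320,
# whose roots are 8, -1 +- 3i and -2 +- 5i (Example 6)
C = np.zeros((5, 5))
C[:4, 1:] = np.eye(4)            # superdiagonal of ones
C[4] = [2320, 494, 278, 1, 2]    # coefficients of p, lowest degree first
```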

*The list* {19, −1 + 11*i*, −1 − 11*i*, −3 + 8*i*, −3 − 8*i*} *was given in [11]. It does not satisfy conditions (12): s*<sup>1</sup> = 11, *s*<sup>2</sup> = 11 *and* $s_1^2 > ns_2$. *The inequality* 11<sup>2</sup> ≤ (5 + *N*)11 *is satisfied for N* ≥ 6. *Then we need to add* 6 *zeros to the list to make it realizable.*
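The computation of this minimal number of zeros can be scripted; the sketch below is our own:

```python
import math
import numpy as np

lam = np.array([19, -1 + 11j, -1 - 11j, -3 + 8j, -3 - 8j])
s1 = lam.sum().real          # power sum s_1 = 11
s2 = (lam ** 2).sum().real   # power sum s_2 = 11
n = len(lam)

# Smallest nonnegative integer N with s_1^2 <= (n + N) s_2
N = max(0, math.ceil(s1 ** 2 / s2) - n)
```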

Theorem 3 (in section 3), can also be extended to the complex case:

**Theorem 13.** *[13] Let* Λ = {*λ*1, *λ*2,..., *λn*} *be a list of complex numbers such that* $\overline{\Lambda} = \Lambda$, *λ*<sup>1</sup> ≥ max*<sup>i</sup>* |*λi*| , *i* = 2, . . . , *n*, *and* $\sum_{i=1}^{n} \lambda_i \ge 0$. *Suppose that: i*) *there exists a partition* Λ = Λ<sup>1</sup> ∪···∪ Λ*<sup>t</sup> with*

$$\Lambda_k = \{\lambda_{k1}, \lambda_{k2}, \dots, \lambda_{kp_k}\}, \quad \lambda_{11} = \lambda_1,$$

*k* = 1, . . . , *t*, *such that* Γ*<sup>k</sup>* = {*ωk*, *λk*2, ..., *λkpk* } *is realizable by a nonnegative matrix Ak* ∈ CS*ω<sup>k</sup>* , *and*

*ii*) *there exists a t* × *t nonnegative matrix B* ∈ CS*λ*<sup>1</sup> , *with eigenvalues λ*1, *λ*21,..., *λt*<sup>1</sup> *(the first elements of the lists* Λ*k*) *and with diagonal entries ω*1, *ω*2,..., *ω<sup>t</sup> (the Perron eigenvalues of the lists* Γ*k*).

*Then* Λ *is realizable.*

**Example 7.** *Let* Λ = {7, 1, −2, −2, −2 + 4*i*, −2 − 4*i*}. *Consider the partition*

$$\begin{aligned} \Lambda\_1 &= \{7, 1, -2, -2\}, \,\,\Lambda\_2 = \{-2 + 4i\}, \,\,\Lambda\_3 = \{-2 - 4i\} \text{ with} \\ \Gamma\_1 &= \{3, 1, -2, -2\}, \,\,\Gamma\_2 = \{0\}, \,\,\Gamma\_3 = \{0\}. \end{aligned}$$

*We look for a nonnegative matrix B* ∈ CS<sup>7</sup> *with eigenvalues* 7, −2 + 4*i*, −2 − 4*i and diagonal entries* 3, 0, 0, *and a nonnegative matrix A*<sup>1</sup> *realizing* Γ1. *They are*

$$B = \begin{bmatrix} 3 & 0 & 4 \\ \frac{41}{7} & 0 & \frac{8}{7} \\ 0 & 7 & 0 \end{bmatrix} \quad \text{and} \quad A_1 = \begin{bmatrix} 0 & 2 & 0 & 1 \\ 2 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 1 & 2 & 0 \end{bmatrix}.$$

*Then*


$$A = \begin{bmatrix} A_1 & & \\ & 0 & \\ & & 0 \end{bmatrix} + \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 4 \\ \frac{41}{7} & 0 & 0 & 0 & 0 & \frac{8}{7} \\ 0 & 0 & 0 & 0 & 7 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 2 & 0 & 1 & 0 & 4 \\ 2 & 0 & 0 & 1 & 0 & 4 \\ 0 & 1 & 0 & 2 & 0 & 4 \\ 0 & 1 & 2 & 0 & 0 & 4 \\ \frac{41}{7} & 0 & 0 & 0 & 0 & \frac{8}{7} \\ 0 & 0 & 0 & 0 & 7 & 0 \end{bmatrix}$$

*has the spectrum* Λ.

#### **6. Fiedler and Guo results**

One of the most important works about the *SNIEP* is due to Fiedler [29]. In [29] Fiedler showed, as was mentioned before, that the Kellogg sufficient conditions for the *RNIEP* are also sufficient for the *SNIEP*. Three important and very useful results of Fiedler are:

**Lemma 2.** *[29] Let A be a symmetric m* × *m matrix with eigenvalues α*1,..., *αm*, *A***u** = *α*1**u**, ‖**u**‖ = 1. *Let B be a symmetric n* × *n matrix with eigenvalues β*1,..., *βn*, *B***v** = *β*1**v**, ‖**v**‖ = 1. *Then for any ρ*, *the matrix*

$$\mathbf{C} = \begin{bmatrix} A & \rho \mathbf{u} \mathbf{v}^T \\ \rho \mathbf{v} \mathbf{u}^T & B \end{bmatrix}$$

*has eigenvalues α*2,..., *αm*, *β*2,..., *βn*, *γ*1, *γ*2, *where γ*1, *γ*<sup>2</sup> *are eigenvalues of the matrix*

$$
\tilde{\mathcal{C}} = \begin{bmatrix} \alpha\_1 & \rho \\ \rho & \beta\_1 \end{bmatrix}.
$$

**Lemma 3.** *[29] If* {*α*1,..., *αm*} *and* {*β*1,..., *βn*} *are lists symmetrically realizable and α*<sup>1</sup> ≥ *β*1*, then for any t* ≥ 0, *the list*

$$\{\alpha_1 + t, \ \beta_1 - t, \ \alpha_2, \dots, \alpha_m, \ \beta_2, \dots, \beta_n\}$$

*is also symmetrically realizable.*

**Lemma 4.** *[29] If* Λ = {*λ*1, *λ*2,..., *λn*} *is symmetrically realizable by a nonnegative matrix and if t* > 0, *then*

$$\Lambda_t = \{\lambda_1 + t, \ \lambda_2, \dots, \lambda_n\},$$

*is symmetrically realizable by a positive matrix.*

**Remark 1.** *It is not hard to see that Lemma 2 can be obtained from Theorem 6. In fact, it is enough to consider*

$$\begin{aligned} \mathbf{C} &= \begin{bmatrix} A & \\ & B \end{bmatrix} + \begin{bmatrix} \mathbf{u} & \mathbf{0} \\ \mathbf{0} & \mathbf{v} \end{bmatrix} \begin{bmatrix} 0 & \rho \\ \rho & 0 \end{bmatrix} \begin{bmatrix} \mathbf{u}^T & \mathbf{0}^T \\ \mathbf{0}^T & \mathbf{v}^T \end{bmatrix} \\ &= \begin{bmatrix} A & \rho \mathbf{u}\mathbf{v}^T \\ \rho \mathbf{v}\mathbf{u}^T & B \end{bmatrix} \end{aligned}$$

*which is symmetric with eigenvalues γ*1, *γ*2, *α*2,..., *αm*, *β*2,..., *βn*, *where γ*1, *γ*<sup>2</sup> *are eigenvalues of*

$$B = \begin{bmatrix} \alpha\_1 & \rho \\ \rho & \beta\_1 \end{bmatrix}.$$

Now we consider a relevant result due to Guo [45]:

**Theorem 14.** *[45] If the list of complex numbers* Λ = {*λ*1, *λ*2,..., *λn*} *is realizable, where λ*<sup>1</sup> *is the Perron eigenvalue and λ*<sup>2</sup> ∈ **R**, *then for any t* ≥ 0 *the list* Λ*<sup>t</sup>* = {*λ*<sup>1</sup> + *t*, *λ*<sup>2</sup> ± *t*, *λ*3,..., *λn*} *is also realizable.*

**Corollary 2.** *[45] If the list of real numbers* Λ = {*λ*1, *λ*2,..., *λn*} *is realizable and* $t_1 = \sum_{i=2}^{n} |t_i|$ *with ti* ∈ **R**, *i* = 2, . . . , *n*, *then the list* Λ*<sup>t</sup>* = {*λ*<sup>1</sup> + *t*1, *λ*<sup>2</sup> + *t*2,..., *λ<sup>n</sup>* + *tn*} *is also realizable.*

**Example 8.** *Let* Λ = {8, 6, 3, 3, −5, −5, −5, −5} *be a given list. Since the lists* Λ<sup>1</sup> = Λ<sup>2</sup> = {7, 3, −5, −5} *are both realizable (see [22] for a simple criterion which shows the realizability of* Λ<sup>1</sup> = Λ<sup>2</sup>*), then*

$$
\Lambda\_1 \cup \Lambda\_2 = \{7, 7, 3, 3, -5, -5, -5, -5\}
$$

*is also realizable. Now, from Theorem 14, with t* = 1, Λ *is realizable.*

Guo also posed the following two questions:

Question 1: Do complex eigenvalues of nonnegative matrices have a property similar to Theorem 14?

Question 2: If the list Λ = {*λ*1, *λ*2,..., *λn*} is symmetrically realizable and *t* > 0, is the list Λ*<sup>t</sup>* = {*λ*<sup>1</sup> + *t*, *λ*<sup>2</sup> ± *t*, *λ*3,..., *λn*} symmetrically realizable?

It was shown in [12] and also in [46] that Question 1 has an affirmative answer.

**Theorem 15.** *[12] Let* Λ = {*λ*1, *a* + *bi*, *a* − *bi*, *λ*4,..., *λn*} *be a realizable list of complex numbers. Then for all t* ≥ 0, *the perturbed list*

$$\Lambda\_t = \{\lambda\_1 + 2t, a - t + bi, a - t - bi, \lambda\_{4\prime}, \dots, \lambda\_n\}$$

*is also realizable.*


Question 2, however, remains open. An affirmative answer to Question 2 was given in [47] for the case in which the symmetric realizing matrix is a nonnegative circulant matrix or a nonnegative left circulant matrix. The use of circulant matrices has been shown to be very useful for the *NIEP* [9, 24]. In [24] a necessary and sufficient condition was given for a list of 5 real numbers, corresponding to an even-conjugate vector, to be the spectrum of a 5 × 5 symmetric nonnegative circulant matrix:

**Lemma 5.** *[24] Let λ* = (*λ*1, *λ*2, *λ*3, *λ*3, *λ*2)*<sup>T</sup> be a vector of real numbers (even-conjugate) such that*

$$\begin{aligned} \lambda\_1 &\ge \left| \lambda\_j \right|, \ j = 2, 3\\ \lambda\_1 &\ge \lambda\_2 \ge \lambda\_3\\ \lambda\_1 + 2\lambda\_2 + 2\lambda\_3 &\ge 0 \end{aligned} \tag{13}$$

*A necessary and sufficient condition for* {*λ*1, *λ*2, *λ*3, *λ*3, *λ*2} *to be the spectrum of a symmetric nonnegative circulant matrix is*

$$
\lambda\_1 + (\lambda\_3 - \lambda\_2) \frac{\sqrt{5} - 1}{2} - \lambda\_2 \ge 0. \tag{14}
$$

**Example 9.** *From Lemma 5 we may know, for instance, that the list* {6, 1, 1, −4, −4} *is the spectrum of a symmetric nonnegative circulant matrix.*
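For Example 9, a realizing symmetric nonnegative circulant matrix can be built from the inverse discrete Fourier transform of the even-conjugate eigenvalue vector; the code below is our own sketch (it relies on NumPy's FFT conventions):

```python
import numpy as np

# Eigenvalues in even-conjugate order (lam1, lam2, lam3, lam3, lam2)
f = np.array([6.0, 1.0, -4.0, -4.0, 1.0])

# A circulant matrix with first row c has eigenvalues fft(c);
# invert: c = ifft(f), which is real since f is real and symmetric
c = np.fft.ifft(f).real

# Build the circulant matrix: row k is c shifted right by k
C = np.array([np.roll(c, k) for k in range(5)])
```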

#### **7. Some open questions**

We finish this chapter by setting two open questions:

Question 1: *If the list of real numbers* Λ = {*λ*1, *λ*2,..., *λn*} *is symmetrically realizable, and t* > 0, *is the list* Λ*<sup>t</sup>* = {*λ*<sup>1</sup> + *t*, *λ*<sup>2</sup> ± *t*, *λ*3,..., *λn*} also symmetrically realizable?

Some progress has been made on this question. In [47], an affirmative answer to Question 1 was given in the case that the realizing matrix is a symmetric nonnegative circulant matrix or a nonnegative left circulant matrix. In [48] it was shown that if 1 > *λ*<sup>2</sup> ≥ ··· ≥ *λ<sup>n</sup>* ≥ 0, then Theorem 14 holds for positive stochastic, positive doubly stochastic and positive symmetric matrices.

Question 2: *How adding one or more zeros to a list can lead to its symmetric realizability by different symmetric patterned matrices?*

The famous Boyle-Handelman Theorem [49] gives a nonconstructive proof of the fact that if $s_k = \lambda_1^k + \lambda_2^k + \cdots + \lambda_n^k > 0$ for *k* = 1, 2, . . . , then there exists a nonnegative number *N* for which the list {*λ*1,..., *λn*, 0, . . . , 0}, with *N* zeros added, is realizable. In [11] Laffey and Šmigoc completely solve the *NIEP* for lists of complex numbers Λ = {*λ*1,..., *λn*}, closed under conjugation, with *λ*2,..., *λ<sup>n</sup>* having real parts smaller than or equal to zero. They show the existence of *N* ≥ 0 for which Λ with *N* zeros added is realizable and show how to compute the least such *N*. The situation for symmetrically realizable spectra is different and even less is known.

## **8. Conclusion**

The *nonnegative* inverse eigenvalue problem is an open and difficult problem. A full solution is unlikely in the near future. A number of partial results are known in the literature about the problem, most of them in terms of sufficient conditions. Some matrix results, like the Brauer Theorem (Theorem 1), the Rado Theorem (Theorem 2), and its symmetric version (Theorem 6), have been shown to be very useful for deriving good sufficient conditions. This way, however, seems to be quite narrow, and perhaps other techniques should be explored and applied.

## **Author details**

Ricardo L. Soto

*Department of Mathematics, Universidad Católica del Norte, Casilla 1280, Antofagasta, Chile.*

## **9. References**


[11] T. J. Laffey, H. Šmigoc (2006) Nonnegative realization of spectra having negative real parts. In: Linear Algebra Appl. 416 148-159.

16 Will-be-set-by-IN-TECH

The famous Boyle-Handelman Theorem [49] gives a nonconstructive proof of the fact that if

for which the list {*λ*1,..., *λn*, 0, . . . , 0}, with *N* zeros added, is realizable. In [11] Laffey and Šmigoc completely solve the *NIEP* for lists of complex numbers Λ = {*λ*1,..., *λn*}, closed under conjugation, with *λ*2,..., *λ<sup>n</sup>* having real parts smaller than or equal to zero. They show the existence of *N* ≥ 0 for which Λ with *N* zeros added is realizable and show how to compute the least such *N*. The situation for symmetrically realizable spectra is different and even less

The *nonnegative* inverse eigenvalue problem is an open and difficult problem. A full solution is unlikely in the near future. A number of partial results are known in the literature about the problem, most of them in terms of sufficient conditions. Some matrix results, like Brauer Theorem (Theorem 1), Rado Theorem (Theorem 2), and its symmetric version (Theorem 6) have been shown to be very useful to derive good sufficient conditions. This way, however, seems to be quite narrow and may be other techniques should be explored and applied.

*Department of Mathematics, Universidad Católica del Norte, Casilla 1280, Antofagasta, Chile.*

[1] A. Berman, R. J. Plemmons (1994) Nonnegative Matrices in the Mathematical Sciences. In: Classics in Applied Mathematics 9, Society for Industrial and Applied Mathematics

[2] M. T. Chu, G. H. Golub (2005) Inverse eigenvalue problems: theory, algorithms and

[4] R. Loewy, D. London (1978) A note on an inverse problem for nonnegative matrices. In:

[5] M.E. Meehan (1998) Some results on matrix spectra, Ph. D. Thesis, National University

[6] J. Torre-Mayo, M.R. Abril-Raymundo, E. Alarcia-Estévez, C. Marijuán, M. Pisonero (2007) The nonnegative inverse eigenvalue problem from the coefficients of the

[8] N. Radwan (1996) An inverse eigenvalue problem for symmetric and normal matrices.

[9] O. Rojo, R. L. Soto (2003) Existence and construction of nonnegative matrices with

[10] A. Borobia, J. Moro, R. L. Soto (2004) Negativity compensation in the nonnegative inverse

characteristic polynomial. EBL digraphs In: Linear Algebra Appl. 426 729-773. [7] T. J. Laffey, E. Meehan (1999) A characterization of trace zero nonnegative 5x5 matrices.

*<sup>n</sup>* > 0, for *k* = 1, 2, . . . , then there exists a nonnegative number *N*

*sk* = *λ<sup>k</sup>*

is known.

**8. Conclusion**

**Author details** Ricardo L. Soto

**9. References**

(SIAM), Philadelphia, PA.

of Ireland, Dublin.

applications, Oxford University Press, New York.

Linear and Multilinear Algebra 6 83-90.

In: Linear Algebra Appl. 302-303 295-302.

complex spectrum. In: Linear Algebra Appl. 368 53-69

eigenvalue problem. In: Linear Algebra Appl. 393 73-89.

In: Linear Algebra Appl. 248 101-109.

[3] H. Minc (1988) Nonnegative Matrices, John Wiley & Sons, New York.

<sup>1</sup> <sup>+</sup> *<sup>λ</sup><sup>k</sup>*

<sup>2</sup> <sup>+</sup> ··· <sup>+</sup> *<sup>λ</sup><sup>k</sup>*


[34] R. L. Soto, O. Rojo, J. Moro, A. Borobia (2007) Symmetric nonnegative realization of spectra. In: Electronic Journal of Linear Algebra 16 1-18.

[35] T. J. Laffey, H. Šmigoc (2007) Construction of nonnegative symmetric matrices with given spectrum. In: Linear Algebra Appl. 421 97-109.

[36] R. L. Soto, O. Rojo, C. B. Manzaneda (2011) On nonnegative realization of partitioned spectra. In: Electronic Journal of Linear Algebra 22 557-572.

[37] O. Spector (2011) A characterization of trace zero symmetric nonnegative 5 × 5 matrices. In: Linear Algebra Appl. 434 1000-1017.

[38] T. J. Laffey, E. Meehan (1998) A refinement of an inequality of Johnson, Loewy and London on nonnegative matrices and some applications. In: Electronic Journal of Linear Algebra 3 119-128.

[39] O. Holtz (2004) M-matrices satisfy Newton's inequalities. In: Proceedings of the AMS 133 (3) 711-717.

[40] C. R. Johnson (1981) Row stochastic matrices similar to doubly stochastic matrices. In: Linear and Multilinear Algebra 10 113-130.

[41] A. Brauer (1952) Limits for the characteristic roots of a matrix. IV: Applications to stochastic matrices. In: Duke Math. J. 19 75-91.

[42] R. Reams (1996) An inequality for nonnegative matrices and the inverse eigenvalue problem. In: Linear and Multilinear Algebra 41 367-375.

[43] R. A. Horn, C. R. Johnson (1991) Matrix Analysis, Cambridge University Press, Cambridge.

[44] R. L. Soto, A. I. Julio (2011) A note on the symmetric nonnegative inverse eigenvalue problem. In: International Mathematical Forum 6 No. 50, 2447-2460.

[45] W. Guo (1997) Eigenvalues of nonnegative matrices. In: Linear Algebra Appl. 266 261-270.

[46] S. Guo, W. Guo (2007) Perturbing non-real eigenvalues of nonnegative real matrices. In: Linear Algebra Appl. 426 199-203.

[47] O. Rojo, R. L. Soto (2009) Guo perturbations for symmetric nonnegative circulant matrices. In: Linear Algebra Appl. 431 594-607.

[48] J. Ccapa, R. L. Soto (2009) On spectra perturbation and elementary divisors of positive matrices. In: Electron. J. of Linear Algebra 18 462-481.

[49] M. Boyle, D. Handelman (1991) The spectra of nonnegative matrices via symbolic dynamics. In: Ann. of Math. 133 249-316.

## **Identification of Linear, Discrete-Time Filters via Realization**

Daniel N. Miller and Raymond A. de Callafon

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48293

## **1. Introduction**

The realization of a discrete-time, linear, time-invariant (LTI) filter from its impulse response provides insight into the role of linear algebra in the analysis of both dynamical systems and rational functions. For an LTI filter, a sequence of output data measured over some finite period of time may be expressed as the linear combination of the past input and the input measured over that same period. For a finite-dimensional LTI filter, the mapping from past input to future output is a finite-rank linear operator, and the effect of past input, that is, the memory of the system, may be represented as a finite-dimensional vector. This vector is the *state* of the system.

The central idea of realization theory is to first identify the mapping from past input to future output and to then factor it into two parts: a map from the input to the state and another from the state to the output. This factorization guarantees that the resulting system representation is both causal and finite-dimensional; thus it can be physically constructed, or *realized*.

*System identification* is the science of constructing dynamic models from experimentally measured data. Realization-based identification methods construct models by estimating the mapping from past input to future output based on this measured data. The non-deterministic nature of the estimation process causes this mapping to have an arbitrarily large rank, and so a rank-reduction step is required to factor the mapping into a suitable state-space model. Both these steps must be carefully considered to guarantee unbiased estimates of dynamic systems.

The foundations of realization theory are primarily due to Kalman and first appear in the landmark paper of [1], though the problem is not defined explicitly until [2], which also coins the term "realization" as a state-space model of a linear system constructed from an experimentally measured impulse response. It was [3] that introduced the structured-matrix approach now synonymous with the term "realization theory" by re-interpreting a theorem originally due to [4] in a state-space LTI system framework.

Although Kalman's original definition of "realization" implied an identification problem, it was not until [5] proposed rank-reduction by means of the singular-value decomposition that Ho's method became feasible for use with non-deterministic data sets. The combination of Ho's method and the singular-value decomposition was finally generalized to use with experimentally measured data by Kung in [6].

With the arrival of Kung's method came the birth of what is now known as the field of *subspace identification* methods. These methods use structured matrices of arbitrary input and output data to estimate a state-sequence from the system. The system is then identified from the propagation of the state over time. While many subspace methods exist, the most popular are the Multivariable Output-Error State Space (MOESP) family, due to [7], and the Numerical Algorithms for Subspace State-Space System Identification (N4SID) family, due to [8]. Related to subspace methods is the Eigensystem Realization Algorithm [9], which applies Kung's algorithm to impulse-response estimates, which are typically estimated through an Observer/Kalman Filter Identification (OKID) algorithm [10].

This chapter presents the central theory behind realization-based system identification in a chronological context, beginning with Kronecker's theorem, proceeding through the work of Kalman and Kung, and presenting a generalization of the procedure to arbitrary sets of data. This journey provides an interesting perspective on the original role of linear algebra in the analysis of rational functions and highlights the similarities of the different representations of LTI filters. Realization theory is a diverse field that connects many tools of linear algebra, including structured matrices, the QR-decomposition, the singular-value decomposition, and linear least-squares problems.

## **2. Transfer-function representations**

We begin by reviewing some properties of discrete-time linear filters, focusing on the role of infinite series expansions in analyzing the properties of rational functions. The reconstruction of a transfer function from an infinite impulse response is equivalent to the reconstruction of a rational function from its Laurent series expansion. The reconstruction problem is introduced and solved by forming structured matrices of impulse-response coefficients.

### **2.1. Difference equations and transfer functions**

Discrete-time linear filters are most frequently encountered in the form of difference equations that relate an input signal *uk* to an output signal *yk*. A simple example is an output *yk* determined by a weighted sum of the inputs from *uk* to *uk*−*m*,

$$y\_k = b\_m u\_k + b\_{m-1} u\_{k-1} + \cdots + b\_0 u\_{k-m}.\tag{1}$$

More commonly, the output *yk* also contains a weighted sum of previous outputs, such as a weighted sum of samples from *yk*−<sup>1</sup> to *yk*−*n*,

$$y\_k = b\_m u\_k + b\_{m-1} u\_{k-1} + \cdots + b\_0 u\_{k-m} - a\_{n-1} y\_{k-1} - a\_{n-2} y\_{k-2} - \cdots - a\_0 y\_{k-n}. \tag{2}$$

The impulse response of a filter is the output sequence *gk* = *yk* generated from an input

$$u\_k = \begin{cases} 1 & k = 0, \\ 0 & k \neq 0. \end{cases} \tag{3}$$

The parameters *gk* are the impulse-response coefficients, and they completely describe the behavior of an LTI filter through the convolution


$$y\_k = \sum\_{j=0}^{\infty} g\_j u\_{k-j}.\tag{4}$$

Filters of type (1) are called finite-impulse response (FIR) filters because *gk* is a finite-length sequence that settles to 0 once *k* > *m*. Filters of type (2) are called infinite impulse response (IIR) filters since generally the impulse response will never completely settle to 0.
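The recursion (2) and the convolution (4) can be checked against one another numerically. The following sketch, with arbitrary illustrative coefficients not taken from the text, simulates a first-order filter, reads off its impulse response, and verifies that convolving that response with an input reproduces the recursive output:

```python
# A minimal sketch of equations (1)-(4): simulate the recursion (2), read off
# the impulse response g_k via the impulse input (3), and confirm the
# convolution (4) reproduces the recursive output. Coefficients are arbitrary.
def filter_output(b, a, u):
    """b = [b_m, ..., b_0], a = [a_{n-1}, ..., a_0] (a(z) monic)."""
    m, n = len(b) - 1, len(a)
    y = []
    for k in range(len(u)):
        yk = sum(b[j] * u[k - j] for j in range(m + 1) if k - j >= 0)
        yk -= sum(a[i] * y[k - 1 - i] for i in range(n) if k - 1 - i >= 0)
        y.append(yk)
    return y

N = 50
b, a = [0.5, 0.2], [-0.4]        # y_k = 0.5 u_k + 0.2 u_{k-1} + 0.4 y_{k-1}
g = filter_output(b, a, [1.0] + [0.0] * (N - 1))    # impulse response g_k

u = [1.0, -2.0, 0.5] + [0.0] * (N - 3)              # arbitrary input
y_rec = filter_output(b, a, u)                      # recursion (2)
y_conv = [sum(g[j] * u[k - j] for j in range(k + 1)) for k in range(N)]  # eq. (4)
assert all(abs(r - c) < 1e-12 for r, c in zip(y_rec, y_conv))
```

Since this example filter is IIR, *gk* never settles exactly to zero; it decays geometrically with ratio 0.4.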

A system is stable if a bounded *uk* results in a bounded *yk*. Because the output of LTI filters is a linear combination of the input and previous output, any input-output sequence can be formed from a linear superposition of other input-output sequences. Hence proving that the system has a bounded output for a single input sequence is necessary and sufficient to prove the stability of an LTI filter. The simplest input to consider is an impulse, and so a suitable definition of system stability is that the absolute sum of the impulse response is bounded,

$$\sum\_{k=0}^{\infty} |g\_k| < \infty. \tag{5}$$

Though the impulse response completely describes the behavior of an LTI filter, it does so with an infinite number of parameters. For this reason, discrete-time LTI filters are often written as transfer functions of a complex variable *z*. This enables analysis of filter stability and computation of the filter's frequency response in a finite number of calculations, and it simplifies convolution operations into basic polynomial algebra.

The transfer function is found by grouping output and input terms together and taking the *Z*-transform of both signals. Let $Y(z) = \sum\_{k=-\infty}^{\infty} y\_k z^{-k}$ be the *Z*-transform of *yk* and *U*(*z*) be the *Z*-transform of *uk*. From the property

$$\mathcal{Z}\left[y\_{k-1}\right] = Y(z)z^{-1},$$

the relationship between *Y*(*z*) and *U*(*z*) may be expressed in polynomials of *z* as

$$a(z)Y(z) = b(z)U(z).$$

The ratio of these two polynomials is the filter's transfer function

$$G(z) = \frac{b(z)}{a(z)} = \frac{b\_m z^m + b\_{m-1} z^{m-1} + \dots + b\_1 z + b\_0}{z^n + a\_{n-1} z^{n-1} + \dots + a\_1 z + a\_0}.\tag{6}$$

When *n* ≥ *m*, *G*(*z*) is *proper*. If the transfer function is not proper, then the difference equations will have *yk* dependent on future input samples such as *uk*+1. Proper transfer functions are required for causality, and thus all physical systems have proper transfer function representations. When *n* > *m*, the system is *strictly proper*. Filters with strictly proper transfer functions have no feed-through terms; the output *yk* does not depend on *uk*, only the preceding input *uk*−1, *uk*−2, . . . . In this chapter, we assume all systems are causal and all transfer functions proper.

If *a*(*z*) and *b*(*z*) have no common roots, then the rational function *G*(*z*) is *coprime*, and the order *n* of *G*(*z*) cannot be reduced. Fractional representations are not limited to single-input-single-output systems. For vector-valued input signals $u\_k \in \mathbb{R}^{n\_u}$ and output signals $y\_k \in \mathbb{R}^{n\_y}$, an LTI filter may be represented as an $n\_y \times n\_u$ matrix of rational functions *Gij*(*z*), and the system will have matrix-valued impulse-response coefficients. For simplicity, we will assume that transfer function representations are single-input-single-output, though all results presented here generalize to the multi-input-multi-output case.

#### **2.2. Stability of transfer function representations**

Because the effect of *b*(*z*) is equivalent to a finite-impulse response filter, the only requirement for *b*(*z*) to produce a stable system is that its coefficients be bounded, which we may safely assume is always the case. Thus the stability of a transfer function *G*(*z*) is determined entirely by *a*(*z*), or more precisely, the roots of *a*(*z*). To see this, suppose *a*(*z*) is factored into its roots, which are the poles *pi* of *G*(*z*),

$$G(z) = \frac{b(z)}{\prod\_{i=1}^{n} (z - p\_i)}.\tag{7}$$

To guarantee a bounded *yk*, it is sufficient to study a single pole, which we will denote simply as *p*. Thus we wish to determine necessary and sufficient conditions for stability of the system

$$G'(z) = \frac{1}{z - p}.\tag{8}$$

Note that *p* may be complex. Assume that |*z*| > |*p*|. *G*′(*z*) then has the Laurent-series expansion

$$G'(z) = z^{-1} \left(\frac{1}{1 - pz^{-1}}\right) = z^{-1} \sum\_{k=0}^{\infty} p^k z^{-k} = \sum\_{k=1}^{\infty} p^{k-1} z^{-k}.\tag{9}$$

From the time-shift property of the *z*-transform, it is immediately clear that the sequence

$$g'\_k = \begin{cases} 0 & k = 0, \\ p^{k-1} & k \ge 1, \end{cases} \tag{10}$$

is the impulse response of *G*′(*z*). If we require that (9) is absolutely summable and let |*z*| = 1, the result is the original stability requirement (5), which may be written in terms of *p* as

$$\sum\_{k=1}^{\infty} \left| p^{k-1} \right| < \infty.$$

This is true if and only if |*p*| < 1, and thus *G*′(*z*) is stable if and only if |*p*| < 1. Finally, from (7) we may deduce that a system is stable if and only if all the poles of *G*(*z*) satisfy the property |*pi*| < 1.
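The equivalence between |*p*| < 1 and the summability condition (5) is easy to observe numerically. The pole values below are arbitrary examples:

```python
# Partial sums of |g'_k| = |p|^{k-1} for the single-pole system (8):
# they approach 1/(1 - |p|) when |p| < 1 and grow without bound otherwise.
def abs_partial_sum(p, terms):
    return sum(abs(p) ** (k - 1) for k in range(1, terms + 1))

assert abs(abs_partial_sum(0.9, 500) - 10.0) < 1e-6   # stable: sum -> 1/(1 - 0.9)
assert abs_partial_sum(1.1, 500) > 1e9                # unstable: diverges
```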

#### **2.3. Construction of transfer functions from impulse responses**

Transfer functions are a convenient way of representing complex system dynamics in a finite number of parameters, but the coefficients of *a*(*z*) and *b*(*z*) cannot be measured directly. The impulse response of a system can be found experimentally by either direct measurement or from other means such as taking the inverse Fourier transform of a measured frequency response [11]. It cannot, however, be represented in a finite number of parameters. Thus the conversion between transfer functions and impulse responses is an extremely useful tool.

For a single-pole system such as (8), the expansion (9) provides an obvious means of reconstructing a transfer function from a measured impulse response: given any two sequential impulse-response coefficients *gk* and *gk*+1, the pole of *G*′(*z*) may be found from



$$p = g\_k^{-1} g\_{k+1}. \tag{11}$$

Notice that this is true for any *k*, and the impulse response can be said to have a *shift-invariant* property in this respect.
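A short sketch, with a hypothetical pole value, illustrates both (11) and this shift-invariance:

```python
# Recover the pole of G'(z) = 1/(z - p) from the ratio of any two sequential
# impulse-response coefficients, eq. (11). The value p = 0.75 is arbitrary.
p = 0.75
g = [p ** (k - 1) for k in range(1, 20)]                  # g'_k = p^{k-1}, from (9)

estimates = [g[k + 1] / g[k] for k in range(len(g) - 1)]  # g_k^{-1} g_{k+1}
assert all(abs(e - p) < 1e-12 for e in estimates)         # same estimate for every k
```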

Less clear is the case when an impulse response is generated by a system with higher-order *a*(*z*) and *b*(*z*). In fact, there is no guarantee that an arbitrary impulse response is the result of a linear system of difference equations at all. For an LTI filter, however, the coefficients of the impulse response exhibit a linear dependence which may be used to not only verify the linearity of the system, but to construct a transfer function representation as well. The exact nature of this linear dependence may be found by forming a structured matrix of impulse response coefficients and examining its behavior when the indices of the coefficients are shifted forward by a single increment, similar to the single-pole case in (11). The result is stated in the following theorem, originally due to Kronecker [4] and adopted from the English translation of [12].

**Theorem 1** (Kronecker's Theorem)**.** *Suppose G*(*z*) : **C** → **C** *is an infinite series of descending powers of z, starting with z*−1*,*

$$G(z) = g\_1 z^{-1} + g\_2 z^{-2} + g\_3 z^{-3} + \cdots = \sum\_{k=1}^{\infty} g\_k z^{-k}.\tag{12}$$

*Assume G(z) is analytic (the series converges) for all* |*z*| > 1*. Let H be an infinitely large matrix of the form*

$$H = \begin{bmatrix} g\_1 & g\_2 & g\_3 & \cdots \\ g\_2 & g\_3 & g\_4 & \cdots \\ g\_3 & g\_4 & g\_5 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{13}$$

*Then H has finite rank n if and only if G*(*z*) *is a strictly proper, coprime, rational function of degree n with poles inside the unit circle. That is, G*(*z*) *has an alternative representation*

$$G(z) = \frac{b(z)}{a(z)} = \frac{b\_m z^m + b\_{m-1} z^{m-1} + \dots + b\_1 z + b\_0}{z^n + a\_{n-1} z^{n-1} + \dots + a\_1 z + a\_0},\tag{14}$$

*in which m* < *n, all roots of a*(*z*) *satisfy* |*z*| < 1*, a*(*z*) *and b*(*z*) *have no common roots, and we have assumed without loss of generality that a*(*z*) *is monic.*

To prove Theorem 1, we first prove that for *k* > *n*, *gk* must be linearly dependent on the previous *n* terms of the series for *H* to have finite rank.

**Theorem 2.** *The infinitely large matrix H is of finite rank n if and only if there exists a finite sequence α*1, *α*2, ··· , *α<sup>n</sup> such that for k* ≥ *n,*

$$g\_{k+1} = \sum\_{j=1}^{n} \alpha\_j \, g\_{k-j+1}, \tag{15}$$

*and n is the smallest number with this property.*

*Proof.* Let *hk* be the row of *H* beginning with *gk*. If *H* has rank *n*, then the first *n* + 1 rows of *H* are linearly dependent. This implies that for some 1 ≤ *p* ≤ *n*, *hp*+<sup>1</sup> is a linear combination of *h*1,..., *hp*, and thus there exists some sequence *α<sup>k</sup>* such that

$$h\_{p+1} = \sum\_{j=1}^{p} \alpha\_j h\_{p-j+1}.\tag{16}$$

The structure and infinite size of *H* imply that such a relationship must hold for all following rows of *H*, so that for *q* ≥ 0

$$h\_{q+p+1} = \sum\_{j=1}^{p} \alpha\_j h\_{q+p-j+1}.$$

Hence any row *hk*, *k* > *p*, can be expressed as a linear combination of the previous *p* rows. Since *H* has at least *n* linearly independent rows, *p* = *n*, and since this applies element-wise, rank(*H*) = *n* implies (15).

Alternatively, (15) implies a relationship of the form (16) exists, and hence rank(*H*) = *p*. Since *n* is the smallest possible *p*, this implies rank(*H*) = *n*.

We now prove Theorem 1.

*Proof.* Suppose *G*(*z*) is a coprime rational function of the form (14) with series expansion (12), which we know exists, since *G*(*z*) is analytic for |*z*| > 1. Without loss of generality, let *m* = *n* − 1, since we may always let *bk* = 0 for some *k*. Hence

$$\frac{b\_{n-1}z^{n-1} + b\_{n-2}z^{n-2} + \cdots + b\_1z + b\_0}{z^n + a\_{n-1}z^{n-1} + \cdots + a\_1z + a\_0} = g\_1z^{-1} + g\_2z^{-2} + g\_3z^{-3} + \cdots.$$

Multiplying both sides by the denominator of the left,

$$b\_{n-1}z^{n-1} + b\_{n-2}z^{n-2} + \cdots + b\_1z + b\_0 = g\_1 z^{n-1} + (g\_2 + g\_1 a\_{n-1})z^{n-2} + (g\_3 + g\_2 a\_{n-1} + g\_1 a\_{n-2})z^{n-3} + \cdots,$$

and equating powers of *z*, we find

$$\begin{aligned} b\_{n-1} &= g\_1 \\ b\_{n-2} &= g\_2 + g\_1 a\_{n-1} \\ b\_{n-3} &= g\_3 + g\_2 a\_{n-1} + g\_1 a\_{n-2} \\ &\vdots \\ b\_1 &= g\_{n-1} + g\_{n-2} a\_{n-1} + \dots + g\_1 a\_2 \\ b\_0 &= g\_n + g\_{n-1} a\_{n-1} + \dots + g\_1 a\_1 \\ 0 &= g\_{k+1} + g\_k a\_{n-1} + \dots + g\_{k-n+1} a\_0 \qquad k \ge n. \end{aligned} \tag{17}$$

From this, we have, for *k* ≥ *n*,

$$g\_{k+1} = \sum\_{j=1}^{n} -a\_j \, g\_{k-j+1},$$

which not only shows that (15) holds, but also shows that *α<sup>j</sup>* = −*aj*. Hence by Theorem 2, *H* must have finite rank.

Conversely, suppose *H* has finite rank. Then (15) holds, and we may construct *a*(*z*) from *α<sup>k</sup>* and *b*(*z*) from (17) to create a rational function. This function must be coprime since its order *n* is the smallest possible.

The construction in Theorem 1 is simple to extend to the case in which *G*(*z*) is only proper and not strictly proper; the additional coefficient *bn* is simply the feed-through term in the impulse response, that is, *g*0.

A result of Theorem 2 is that given finite-dimensional, full-rank matrices

$$H\_k = \begin{bmatrix} g\_k & g\_{k+1} & \cdots & g\_{k+n-1} \\ g\_{k+1} & g\_{k+2} & \cdots & g\_{k+n} \\ \vdots & \vdots & & \vdots \\ g\_{k+n-1} & g\_{k+n} & \cdots & g\_{k+2n-2} \end{bmatrix} \tag{18}$$

and


$$H\_{k+1} = \begin{bmatrix} g\_{k+1} & g\_{k+2} & \cdots & g\_{k+n} \\ g\_{k+2} & g\_{k+3} & \cdots & g\_{k+n+1} \\ \vdots & \vdots & & \vdots \\ g\_{k+n} & g\_{k+n+1} & \cdots & g\_{k+2n-1} \end{bmatrix}, \tag{19}$$

the coefficients of *a*(*z*) may be calculated as

$$\begin{bmatrix} 0 & 0 & \cdots & 0 & -a\_0 \\ 1 & 0 & \cdots & 0 & -a\_1 \\ 0 & 1 & \cdots & 0 & -a\_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a\_{n-1} \end{bmatrix} = H\_k^{-1} H\_{k+1}. \tag{20}$$

Notice that (11) is in fact a special case of (20). Thus we need only know the first 2*n* + 1 impulse-response coefficients to reconstruct the transfer function *G*(*z*): 2*n* to form the matrices *Hk* and *Hk*<sup>+</sup><sup>1</sup> from (18) and (19), respectively, and possibly the initial coefficient *g*<sup>0</sup> in case of an *n*th-order *b*(*z*).
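To illustrate (18)–(20), the sketch below uses a hypothetical second-order system, generates its impulse response from the relations in (17), and recovers the denominator coefficients from the shifted Hankel matrices:

```python
import numpy as np

# Hypothetical second-order example (coefficients chosen for illustration):
# a(z) = z^2 - 0.2 z - 0.15 (roots 0.5 and -0.3), b(z) = z + 0.5, so n = 2.
a = [-0.15, -0.2]        # [a_0, a_1]
b1, b0 = 1.0, 0.5

# Impulse response via (17): g_1 = b_1, g_2 = b_0 - a_1 g_1, then the
# recurrence g_{k+1} = -a_1 g_k - a_0 g_{k-1}.
g = [b1, b0 - a[1] * b1]
for _ in range(10):
    g.append(-a[1] * g[-1] - a[0] * g[-2])

# Hankel matrices (18) and (19) for k = 1 (list index 0 holds g_1).
n = 2
Hk = np.array([[g[i + j] for j in range(n)] for i in range(n)])
Hk1 = np.array([[g[i + j + 1] for j in range(n)] for i in range(n)])

# Equation (20): the last column of Hk^{-1} Hk1 is [-a_0, -a_1]^T.
companion = np.linalg.solve(Hk, Hk1)
assert np.allclose(companion[:, -1], [-a[0], -a[1]])
```

Solving the linear system *Hk X* = *Hk*+1 rather than forming the inverse explicitly is the numerically preferable route, though the sensitivity to noisy coefficients remains.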

Matrices with the structure of *H* are useful enough to have a special name. A *Hankel matrix H* is a matrix constructed from a sequence {*hk*} so that each element satisfies $H(j,k) = h\_{j+k}$. For the Hankel matrix in (13), *hk* = *gk*−1. *Hk* also has an interesting property implied by (20): its row space is invariant under shifting of the index *k*. Because it is symmetric, this is also true for its column space. Thus this matrix is also often referred to as being *shift-invariant*.

While (20) provides a potential method of identifying a system from a measured impulse response, it is not reliable when the measured impulse-response coefficients are corrupted by noise. The exact linear dependence of the coefficients will then not hold for all *k*, and the structure of (20) will not be preserved. Inverting *Hk* will also invert any noise on *gk*, potentially amplifying high-frequency noise content. Finally, the system order *n* must be known beforehand, which is usually not the case if only an impulse response is available. Fortunately, these difficulties may all be overcome by reinterpreting the results of Kronecker's theorem in a state-space framework. First, however, we more carefully examine the role of the Hankel matrix in the behavior of LTI filters.

#### **2.4. Hankel and Toeplitz operators**

The Hankel matrix of impulse response coefficients (13) is more than a tool for computing the transfer function representation of a system from its impulse response. It also defines the mapping of past input signals to future output signals. To define exactly what this means, we write the convolution of (4) around sample *k* = 0 in matrix form as

$$\begin{bmatrix} \vdots \\ y_{-2} \\ y_{-1} \\ y_0 \\ y_1 \\ y_2 \\ \vdots \end{bmatrix} = \begin{bmatrix} \ddots & & & & & \\ \cdots & g_0 & & & & \\ \cdots & g_1 & g_0 & & & \\ \cdots & g_2 & g_1 & g_0 & & \\ \cdots & g_3 & g_2 & g_1 & g_0 & \\ \cdots & g_4 & g_3 & g_2 & g_1 & g_0 \\ & \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix} \begin{bmatrix} \vdots \\ u_{-2} \\ u_{-1} \\ u_0 \\ u_1 \\ u_2 \\ \vdots \end{bmatrix},$$

where the vectors and matrix have been partitioned into sections for *k* < 0 and *k* ≥ 0. The output for *k* ≥ 0 may then be split into two parts:

$$\underbrace{\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \end{bmatrix}}_{y_f} = \underbrace{\begin{bmatrix} g_1 & g_2 & g_3 & \cdots \\ g_2 & g_3 & g_4 & \cdots \\ g_3 & g_4 & g_5 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}}_{H} \underbrace{\begin{bmatrix} u_{-1} \\ u_{-2} \\ u_{-3} \\ \vdots \end{bmatrix}}_{u_p} + \underbrace{\begin{bmatrix} g_0 & & & \\ g_1 & g_0 & & \\ g_2 & g_1 & g_0 & \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}}_{T} \underbrace{\begin{bmatrix} u_0 \\ u_1 \\ u_2 \\ \vdots \end{bmatrix}}_{u_f},\tag{21}$$

where the subscripts *p* and *f* denote "past" and "future," respectively. The system Hankel matrix *H* has returned to describe the effects of the past input *up* on the future output *yf* . Also present is the matrix *T*, which represents the convolution of future input *uf* with the impulse response. Matrices such as *T* with constant diagonals are called *Toeplitz* matrices.

From (21), it can be seen that *H* defines the effects of past input on future output. One interpretation of this is that *H* represents the "memory" of the system. Because *H* is a linear mapping from *up* to *yf* , the induced matrix 2-norm of *H*, ||*H*||2, can be considered a function norm, and in a sense, ||*H*||<sup>2</sup> is a measure of the "gain" of the system. ||*H*||<sup>2</sup> is often called the *Hankel-norm* of a system, and it plays an important role in model reduction and in the analysis of anti-causal systems. More information on this aspect of linear systems can be found in the literature of robust control, for instance, [13].
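As a concrete illustration (an assumed first-order example, not from the text), the Hankel norm can be approximated by the largest singular value of a large finite section of *H*. For *gk* = 0.9<sup>*k*−1</sup>, i.e. *G*(*z*) = 1/(*z* − 0.9), the Hankel matrix has rank one and its norm is the geometric sum 1/0.19 ≈ 5.26:

```python
import numpy as np
from scipy.linalg import hankel

# Impulse response of the assumed system G(z) = 1/(z - 0.9): g_k = 0.9^(k-1)
g = 0.9 ** np.arange(200)            # g_1, g_2, ... as powers of 0.9

# 100 x 100 finite section of the system Hankel matrix H
H = hankel(g[:100], g[99:199])

# ||H||_2 = largest singular value; here H = v v^T with v_i = 0.9^i,
# so ||H||_2 = sum_i 0.81^i = 1/0.19
print(np.linalg.norm(H, 2))          # ~ 5.263
```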

#### **3. State-space representations**


Although transfer functions define system behavior completely with a finite number of parameters and simplify frequency-response calculations, they are cumbersome to manipulate when the input or output is multi-dimensional or when initial conditions must be considered. The other common representation of LTI filters is the state-space form

$$\begin{aligned} \mathbf{x}_{k+1} &= A\mathbf{x}_k + Bu_k \\ y_k &= C\mathbf{x}_k + Du_k, \end{aligned} \tag{22}$$

in which *xk* <sup>∈</sup> **<sup>R</sup>***<sup>n</sup>* is the system state. The matrices *<sup>A</sup>* <sup>∈</sup> **<sup>R</sup>***n*×*n*, *<sup>B</sup>* <sup>∈</sup> **<sup>R</sup>***n*×*nu* , *<sup>C</sup>* <sup>∈</sup> **<sup>R</sup>***ny*×*n*, and *<sup>D</sup>* <sup>∈</sup> **<sup>R</sup>***ny*×*nu* completely parameterize the system. These matrices are not unique for a given input-output behavior (only *D* is); any nonsingular matrix *T*′ may be used to change the state basis via the relationships

$$\mathbf{x}' = T'\mathbf{x} \qquad A' = T'AT'^{-1} \qquad B' = T'B \qquad C' = CT'^{-1}.$$

The *Z*-transform may also be applied to the state-space equations (22) to find

$$\begin{array}{lll} \mathcal{Z}[\mathbf{x}_{k+1}] = A\,\mathcal{Z}[\mathbf{x}_k] + B\,\mathcal{Z}[u_k] & \Rightarrow & zX(z) = AX(z) + BU(z) \\ \mathcal{Z}[y_k] = C\,\mathcal{Z}[\mathbf{x}_k] + D\,\mathcal{Z}[u_k] & \Rightarrow & Y(z) = CX(z) + DU(z), \end{array}$$

$$\frac{Y(z)}{U(z)} = G(z) \qquad \qquad G(z) = \mathcal{C} \left(zI - A\right)^{-1}B + D,\tag{23}$$

and thus, if (22) is the state-space representation of the single-variable system (6), then *a*(*z*) is the characteristic polynomial of *A*, det(*zI* − *A*).

Besides clarifying the effect of initial conditions on the output, state-space representations are inherently causal, and (23) will always result in a proper system (strictly proper if *D* = 0). For this reason, state-space representations are often called *realizable* descriptions; while the forward-time-shift of *z* is an inherently non-causal operation, state-space systems may always be constructed in reality.
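A quick numerical check of (23) on an assumed diagonal second-order example (not from the text): the resolvent formula and the truncated impulse-response series ∑ *gk z*<sup>−*k*</sup> agree at a test point, and det(*zI* − *A*) reproduces *a*(*z*).

```python
import numpy as np

A = np.array([[0.9, 0.0], [0.0, 0.5]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
D = 0.0
z = 2.0

# G(z) = C (zI - A)^{-1} B + D, evaluated via a linear solve
G_ss = (C @ np.linalg.solve(z * np.eye(2) - A, B)).item() + D

# G(z) = sum_k g_k z^{-k}, truncated: g_0 = D, g_k = C A^{k-1} B for k > 0
g = [D] + [(C @ np.linalg.matrix_power(A, k - 1) @ B).item() for k in range(1, 200)]
G_series = sum(gk * z ** -k for k, gk in enumerate(g))
print(G_ss, G_series)          # the two values agree

# a(z) = det(zI - A) is the characteristic polynomial of A
print(np.poly(A))              # coefficients of z^2 - 1.4 z + 0.45
```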

#### **3.1. Stability, controllability, and observability of state-space representations**

The system impulse response is simple to formulate in terms of the state-space parameters by calculation of the output to a unit impulse with *x*<sup>0</sup> = 0:

$$g_k = \begin{cases} D & k = 0, \\ CA^{k-1}B & k > 0. \end{cases} \tag{24}$$

Notice the similarity of (10) and (24). In fact, from the eigenvalue decomposition of *A*,

$$A = V\Lambda V^{-1},$$

we find

$$\sum_{k=1}^{\infty} |g_k| = \sum_{k=1}^{\infty} \left| CA^{k-1}B \right| \le \sum_{k=1}^{\infty} \left\| CV \right\| \left\| \Lambda^{k-1} \right\| \left\| V^{-1}B \right\|.$$

The terms Λ*<sup>k</sup>*−<sup>1</sup> decay, and the series converges, only if the largest eigenvalue of *A* is within the unit circle, and thus the condition that all eigenvalues *λ<sup>i</sup>* of *A* satisfy |*λi*| < 1 is a necessary and sufficient condition for stability.

For state-space representations, there is the possibility that a combination of *A* and *B* will result in a system for which *xk* cannot be entirely controlled by the input *uk*. Expressing *xk* in a matrix-form similar to (21) as

$$\mathbf{x}_k = \mathcal{C} \begin{bmatrix} u_{k-1} \\ u_{k-2} \\ u_{k-3} \\ \vdots \end{bmatrix}, \qquad \mathcal{C} = \begin{bmatrix} B & AB & A^2B & \cdots \end{bmatrix} \tag{25}$$

demonstrates that *xk* may take any value in **<sup>R</sup>***<sup>n</sup>* if and only if <sup>C</sup> has rank *<sup>n</sup>*. <sup>C</sup> is the *controllability matrix* and the system is *controllable* if it has full row rank.

Similarly, the state *xk* may not uniquely determine the output for some combinations of *A* and *C*. Expressing the evolution of the output as a function of the state in matrix-form as

$$\begin{bmatrix} y_k \\ y_{k+1} \\ y_{k+2} \\ \vdots \end{bmatrix} = \mathcal{O}\mathbf{x}_k, \qquad \mathcal{O} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \end{bmatrix}$$

demonstrates that there is no nontrivial null space in the mapping from *xk* to *yk* if and only if O has rank *n*. O is the *observability matrix* and the system is *observable* if it has full column rank.

Systems that are both controllable and observable are called *minimal*, and for minimal systems, the dimension *n* of the state variable cannot be reduced. In the next section we show that minimal state-space system representations convert to coprime transfer functions that are found through (23).
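These rank conditions are simple to check numerically. A minimal sketch on the same kind of assumed second-order example (by the Cayley-Hamilton theorem, *n* block columns of C and *n* block rows of O suffice to determine the ranks):

```python
import numpy as np

A = np.array([[0.9, 0.0], [0.0, 0.5]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
n = A.shape[0]

# Controllability matrix [B, AB, ...] and observability matrix [C; CA; ...]
ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
obsv = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

print(np.linalg.matrix_rank(ctrb), np.linalg.matrix_rank(obsv))  # full rank: minimal
```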

#### **3.2. Construction of state-space representations from impulse responses**

The fact that the denominator of *G*(*z*) is the characteristic polynomial of *A* not only allows for the calculation of a transfer function from a state-space representation, but provides an alternative version of Kronecker's theorem for state-space systems, known as the Ho-Kalman Algorithm [3]. From the Cayley-Hamilton theorem, if *a*(*z*) is the characteristic polynomial of *A*, then *a*(*A*) = 0, and

$$CA^k a(A)B = CA^k \left( A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 \right) B = CA^{k+n}B + \sum_{j=0}^{n-1} a_j CA^{k+j}B,$$

which implies

$$
CA^{k+n}B = -\sum_{j=0}^{n-1} a_j CA^{k+j}B.\tag{26}
$$

Indeed, substitution of (24) into (26) and rearrangement of the indices leads to (15). Additionally, substitution of (24) into the product of O and C shows that

$$\mathcal{O}\mathcal{C} = \begin{bmatrix} CB & CAB & CA^2B & \cdots \\ CAB & CA^2B & CA^3B & \cdots \\ CA^2B & CA^3B & CA^4B & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix} = \begin{bmatrix} g_1 & g_2 & g_3 & \cdots \\ g_2 & g_3 & g_4 & \cdots \\ g_3 & g_4 & g_5 & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix} = H,$$

which confirms our previous statement that *H* effectively represents the memory of the system. Because

$$\operatorname{rank}(H) = \min\{\operatorname{rank}(\mathcal{O}), \operatorname{rank}(\mathcal{C})\},$$

we see that rank(*H*) = *n* implies the state-space system (22) is minimal.

If the entries of *H* are shifted forward by one index to form

$$
\overline{H} = \begin{bmatrix} g_2 & g_3 & g_4 & \cdots \\ g_3 & g_4 & g_5 & \cdots \\ g_4 & g_5 & g_6 & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix},
$$

then once again substituting (24) reveals


$$
\overline{H} = \mathcal{O}A\mathcal{C}.\tag{27}
$$

Thus the row space and column space of *H* are invariant under a forward-shift of the indices, implying the same shift-invariant structure seen in (20).

The appearance of *A* in (27) hints at a method for constructing a state-space realization from an impulse response. Suppose the impulse response is known exactly, and let *Hr* be a finite slice of *H* with *r* block rows and *L* columns,

$$H_r = \begin{bmatrix} g_1 & g_2 & g_3 & \cdots & g_L \\ g_2 & g_3 & g_4 & \cdots & g_{L+1} \\ g_3 & g_4 & g_5 & \cdots & g_{L+2} \\ \vdots & \vdots & \vdots & & \vdots \\ g_r & g_{r+1} & g_{r+2} & \cdots & g_{r+L-1} \end{bmatrix}.$$

Then any appropriately dimensioned factorization

$$H_r = \mathcal{O}_r \mathcal{C}_L = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{r-1} \end{bmatrix} \begin{bmatrix} B & AB & A^2B & \cdots & A^{L-1}B \end{bmatrix} \tag{28}$$

may be used to find *A* for some arbitrary state basis as

$$A = \left(\mathcal{O}\_r\right)^\dagger \overline{H}\_r \left(\mathcal{C}\_L\right)^\dagger \tag{29}$$

where *H̄r* is *Hr* with the indices shifted forward once and (·)† is the Moore-Penrose pseudoinverse. *C* taken from the first block row of O*r*, *B* taken from the first block column of C*L*, and *D* taken from *g*<sup>0</sup> then provides a complete and minimal state-space realization from an impulse response. Because *Hr* has rank *n* and det(*zI* − *A*) has degree *n*, we know from Kronecker's theorem that *G*(*z*) taken from (23) will be coprime.
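A minimal sketch of this construction on an assumed second-order SISO system (not an example from the text), with the factorization (28) obtained from a thin SVD of *Hr*; the eigenvalues of the recovered *A* match the true poles regardless of the arbitrary state basis:

```python
import numpy as np
from scipy.linalg import hankel

A = np.array([[0.9, 0.0], [0.0, 0.5]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
n, r, L = 2, 5, 5

# Exact impulse response g_1, ..., g_{r+L}
g = [(C @ np.linalg.matrix_power(A, k - 1) @ B).item() for k in range(1, r + L + 1)]
Hr    = hankel(g[:r], g[r - 1 : r + L - 1])   # entries g_{i+j-1}
Hr_sh = hankel(g[1 : r + 1], g[r : r + L])    # indices shifted forward once

# Any rank-n factorization H_r = O_r C_L will do; use a thin SVD
U, s, Vt = np.linalg.svd(Hr)
Or = U[:, :n] * np.sqrt(s[:n])
Cl = np.sqrt(s[:n])[:, None] * Vt[:n, :]

A_hat = np.linalg.pinv(Or) @ Hr_sh @ np.linalg.pinv(Cl)   # eq. (29)
B_hat = Cl[:, :1]                                         # first block column
C_hat = Or[:1, :]                                         # first block row

print(np.sort(np.linalg.eigvals(A_hat).real))   # recovers the poles 0.5 and 0.9
```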

However, as mentioned before, the impulse response of the system is rarely known exactly. In this case only an estimate *H*ˆ*<sup>r</sup>* with a non-deterministic error term is available:

$$
\hat{H}\_r = H\_r + E.
$$

Because *E* is non-deterministic, *H*ˆ*<sup>r</sup>* will always have full rank, regardless of the number of rows *r*. Thus *n* cannot be determined from examining the rank of *H*ˆ*<sup>r</sup>*, and even if *n* is known beforehand, a factorization (28) for *r* > *n* will not exist. Thus we must find a way of reducing the rank of *H*ˆ*<sup>r</sup>* in order to find a state-space realization.

#### **3.3. Rank-reduction of the Hankel matrix estimate**

If *H*ˆ*<sup>r</sup>* has full rank, or if *n* is unknown, its rank must be reduced prior to factorization. The obvious tool for reducing the rank of matrices is the *singular-value decomposition* (SVD). Assume for now that *n* is known. The SVD of *H*ˆ*<sup>r</sup>* is

$$
\hat{H}_r = U\Sigma V^T,
$$

where *U* and *V<sup>T</sup>* are orthogonal matrices and Σ is a diagonal matrix containing the nonnegative *singular values σ<sup>i</sup>* ordered from largest to smallest. The SVD of a matrix is guaranteed to exist, its singular values are unique, and the number of nonzero singular values of a matrix is equal to its rank [14].

Because *U* and *V<sup>T</sup>* are orthogonal, the SVD satisfies

$$\left\|\hat{H}_r\right\|_2 = \left\|U\Sigma V^T\right\|_2 = \|\Sigma\|_2 = \sigma_1 \tag{30}$$

where ||·||<sup>2</sup> is the induced matrix 2-norm, and

$$\left\|\hat{H}_r\right\|_F = \left\|U\Sigma V^T\right\|_F = \|\Sigma\|_F = \left(\sum_i \sigma_i^2\right)^{1/2} \tag{31}$$

where ||·||*<sup>F</sup>* is the Frobenius norm. Equation (30) also shows that the Hankel norm of a system is the maximum singular value of *Hr*. From (30) and (31), we can directly see that if the SVD of *Hr* is partitioned into

$$
\hat{H}_r = \begin{bmatrix} U_n & U_s \end{bmatrix} \begin{bmatrix} \Sigma_n & 0 \\ 0 & \Sigma_s \end{bmatrix} \begin{bmatrix} V_n^T \\ V_s^T \end{bmatrix},
$$

where *Un* is the first *n* columns of *U*, Σ*<sup>n</sup>* is the upper-left *n* × *n* block of Σ, and *V<sup>T</sup><sub>n</sub>* is the first *n* rows of *V<sup>T</sup>*, the solution to the rank-reduction problem is [14]

$$Q = \underset{\operatorname{rank}(Q)=n}{\arg\min} \left\|Q - \hat{H}_r\right\|_2 = \underset{\operatorname{rank}(Q)=n}{\arg\min} \left\|Q - \hat{H}_r\right\|_F = U_n \Sigma_n V_n^T.$$

Additionally, the error resulting from the rank reduction is

$$e = \left\|Q - \hat{H}_r\right\|_2 = \sigma_{n+1},$$

which suggests that if the rank of *Hr* is not known beforehand, it can be determined by examining the nonzero singular values in the deterministic case or by searching for a significant drop-off in singular values if only a noise-corrupted estimate is available.
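A small sketch of this order-detection heuristic on assumed data (not from the text): the singular values of a noise-corrupted Hankel matrix of a second-order system show two dominant values followed by a noise floor.

```python
import numpy as np
from scipy.linalg import hankel

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.0], [0.0, 0.5]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])

g = np.array([(C @ np.linalg.matrix_power(A, k - 1) @ B).item() for k in range(1, 40)])
g_noisy = g + 1e-3 * rng.standard_normal(g.shape)   # "measured" coefficients

Hr_hat = hankel(g_noisy[:20], g_noisy[19:39])       # 20 x 20 noisy Hankel matrix
s = np.linalg.svd(Hr_hat, compute_uv=False)
print(s[:5])   # two dominant singular values, then a sharp drop to the noise floor
```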

#### **3.4. Identifying the state-space realization**

From a rank-reduced *H*ˆ*r*, any factorization

$$
\hat{H}\_r = \hat{\mathcal{O}}\_r \hat{\mathcal{C}}\_L
$$

can be used to estimate O*<sup>r</sup>* and C*L*. The error in the state-space realization, however, will depend on the chosen state basis. Generally we would like to have a state variable with a norm ||*xk*||<sup>2</sup> in between ||*uk*||<sup>2</sup> and ||*yk*||2. As first proposed in [5], choosing the factorization

$$\mathcal{O}\_r = \mathsf{U}\_n \Sigma\_n^{1/2} \qquad \text{and} \qquad \mathcal{C}\_L = \Sigma\_n^{1/2} V\_n^T \tag{32}$$

results in


$$\|\mathcal{O}_r\|_2 = \|\mathcal{C}_L\|_2 = \sqrt{\left\|\hat{H}_r\right\|_2} \tag{33}$$

and thus, from a functional perspective, the mappings from input to state and state to output will have equal magnitudes, and each entry of the state vector *xk* will have similar magnitudes. State-space realizations that satisfy (33) are sometimes called *internally balanced* realizations [11]. (Alternative definitions of a "balanced" realization exist, however, and it is generally wise to verify the definition in each context.)

Choosing the factorization (32) also simplifies computation of the estimate *A*ˆ, since

$$\begin{aligned} \hat{A} &= \left(\mathcal{O}_r\right)^\dagger \overline{H}_r \left(\mathcal{C}_L\right)^\dagger \\ &= \Sigma_n^{-1/2} U_n^T \overline{H}_r V_n \Sigma_n^{-1/2}. \end{aligned}$$

By estimating *<sup>B</sup>*<sup>ˆ</sup> as the first block column of <sup>C</sup><sup>ˆ</sup> *<sup>L</sup>*, *<sup>C</sup>*<sup>ˆ</sup> as the first block row of <sup>O</sup><sup>ˆ</sup> *<sup>r</sup>*, and *<sup>D</sup>*<sup>ˆ</sup> as *<sup>g</sup>*0, a complete state-space realization (*A*ˆ, *B*ˆ, *C*ˆ, *D*ˆ ) is identified from this method.
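Putting Sections 3.3 and 3.4 together, a sketch of the full procedure on an assumed second-order system with a mildly noisy impulse response (not an example from the text):

```python
import numpy as np
from scipy.linalg import hankel

rng = np.random.default_rng(1)
A = np.array([[0.9, 0.0], [0.0, 0.5]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
n, r, L = 2, 10, 10

g = np.array([(C @ np.linalg.matrix_power(A, k - 1) @ B).item()
              for k in range(1, r + L + 1)])
g = g + 1e-4 * rng.standard_normal(g.shape)          # noisy measurements

Hr_hat = hankel(g[:r], g[r - 1 : r + L - 1])
Hr_sh  = hankel(g[1 : r + 1], g[r : r + L])          # indices shifted forward once

# Rank-n truncated SVD and the internally balanced factorization (32)
U, s, Vt = np.linalg.svd(Hr_hat)
Un, sn, Vtn = U[:, :n], s[:n], Vt[:n, :]
Or = Un * np.sqrt(sn)                                # O_r = U_n Sigma_n^(1/2)
Cl = np.sqrt(sn)[:, None] * Vtn                      # C_L = Sigma_n^(1/2) V_n^T

# A_hat = Sigma_n^(-1/2) U_n^T Hbar_r V_n Sigma_n^(-1/2)
A_hat = (Un / np.sqrt(sn)).T @ Hr_sh @ (Vtn.T / np.sqrt(sn))
B_hat = Cl[:, :1]
C_hat = Or[:1, :]

print(np.sort(np.linalg.eigvals(A_hat).real))        # close to the poles 0.5, 0.9
```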

#### **3.5. Pitfalls of direct realization from an impulse response**

Even though the rank-reduction process allows for realization from a noise-corrupted estimate of an impulse response, identification methods that generate a system estimate from a Hankel matrix constructed from an estimated impulse response have numerous difficulties when applied to noisy measurements. Measuring an impulse response directly is often infeasible; high-frequency damping may result in a measurement that has a very brief response before the signal-to-noise ratio becomes prohibitively small, and a unit pulse will often excite high-frequency nonlinearities that degrade the quality of the resulting estimate.

Taking the inverse Fourier transform of the frequency response guarantees that the estimates of the Markov parameters will converge as the dataset grows only so long as the input is broadband. Generally input signals decay in magnitude at higher frequencies, and calculation of the frequency response by inversion of the input will amplify high-frequency noise. We would prefer an identification method that is guaranteed to provide a system estimate that converges to the true system as the amount of data measured increases and that avoids inverting the input. Fortunately, the relationship between input and output data in (21) may be used to formulate just such an identification procedure.

#### **4. Realization from input-output data**

To avoid the difficulties in constructing a system realization from an estimated impulse response, we will form a realization-based identification procedure applicable to measured input-output data. To sufficiently account for non-deterministic effects in measured data, we add a noise term *vk* <sup>∈</sup> **<sup>R</sup>***ny* to the output to form the noise-perturbed state-space equations

$$\begin{aligned} \mathbf{x}\_{k+1} &= A\mathbf{x}\_k + B\mathbf{u}\_k\\ \mathbf{y}\_k &= \mathbf{C}\mathbf{x}\_k + D\mathbf{u}\_k + \mathbf{v}\_k. \end{aligned} \tag{34}$$

We assume that the noise signal *vk* is generated by a stationary stochastic process, which may be either white or colored. This includes the case in which the state is disturbed by process noise, so that the noise process may have the same poles as the deterministic system. (See [15] for a thorough discussion of representations of noise in the identification context.)

#### **4.1. Data-matrix equations**

The goal is to construct a state-space realization using the relationships in (21), but doing so requires a complete characterization of the row space of *Hr*. To this end, we expand a finite-slice of the future output vector to form a block-Hankel matrix of output data with *r* block rows,

$$Y = \begin{bmatrix} y_0 & y_1 & y_2 & \cdots & y_L \\ y_1 & y_2 & y_3 & \cdots & y_{L+1} \\ y_2 & y_3 & y_4 & \cdots & y_{L+2} \\ \vdots & \vdots & \vdots & & \vdots \\ y_{r-1} & y_r & y_{r+1} & \cdots & y_{r+L-1} \end{bmatrix}.$$

This matrix is related to a block-Hankel matrix of future input data

$$\mathcal{U}_f = \begin{bmatrix} u_0 & u_1 & u_2 & \cdots & u_L \\ u_1 & u_2 & u_3 & \cdots & u_{L+1} \\ u_2 & u_3 & u_4 & \cdots & u_{L+2} \\ \vdots & \vdots & \vdots & & \vdots \\ u_{r-1} & u_r & u_{r+1} & \cdots & u_{r+L-1} \end{bmatrix},$$

a block-Toeplitz matrix of past input data

$$\mathcal{U}_p = \begin{bmatrix} u_{-1} & u_0 & u_1 & \cdots & u_{L-1} \\ u_{-2} & u_{-1} & u_0 & \cdots & u_{L-2} \\ u_{-3} & u_{-2} & u_{-1} & \cdots & u_{L-3} \\ \vdots & \vdots & \vdots & & \vdots \end{bmatrix},$$

a finite-dimensional block-Toeplitz matrix


$$T = \begin{bmatrix} \ \mathfrak{g}\_0 & \cdots & 0 \\ \mathfrak{g}\_1 & \mathfrak{g}\_0 & & \vdots \\ \mathfrak{g}\_2 & \mathfrak{g}\_1 & \mathfrak{g}\_0 \\ \vdots & \vdots & \vdots & \ddots \\ \mathfrak{g}\_{r-1} & \mathfrak{g}\_{r-2} & \mathfrak{g}\_{r-3} & \cdots & \mathfrak{g}\_0 \end{bmatrix} \cdot \mathbf{n}$$

the system Hankel matrix *H*, and a block-Hankel matrix *V* formed from noise data *vk* with the same indices as *Y* by the equation

$$Y = H\mathcal{U}\_p + T\mathcal{U}\_f + V.\tag{35}$$
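As a concrete sketch (not part of the original text), the block-Hankel structure above can be generated with a few lines of NumPy; for scalar signals ($n_y = n_u = 1$) each block entry is a single sample:

```python
import numpy as np

def block_hankel(signal, r, L):
    """Return the r x (L+1) Hankel matrix with (i, j) entry signal[i + j]."""
    return np.array([[signal[i + j] for j in range(L + 1)] for i in range(r)])

# Hypothetical scalar output sequence y_0, y_1, ...
y = np.arange(10.0)
Y = block_hankel(y, r=3, L=4)
# Each row of Y is the row above it shifted forward by one sample.
```

The same helper applies unchanged to the future-input matrix built from $u_k$.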

If the entries of $Y$ are shifted forward by one index to form

$$
\overline{Y} = \begin{bmatrix}
y_1 & y_2 & y_3 & \cdots & y_{L+1} \\
y_2 & y_3 & y_4 & \cdots & y_{L+2} \\
y_3 & y_4 & y_5 & \cdots & y_{L+3} \\
\vdots & \vdots & \vdots & & \vdots \\
y_r & y_{r+1} & y_{r+2} & \cdots & y_{r+L}
\end{bmatrix},
$$

then $\overline{Y}$ is related to the shifted system Hankel matrix $\overline{H}$, the past input data $\mathcal{U}_p$, $T$ with a block column appended to the left, and $\mathcal{U}_f$ with a block row appended to the bottom,

$$
\overline{T} = \left[\begin{array}{c|c}
\begin{matrix} g_1 \\ g_2 \\ \vdots \\ g_r \end{matrix} & T
\end{array}\right],
\qquad
\overline{\mathcal{U}}_f = \left[\begin{array}{ccccc}
\multicolumn{5}{c}{\mathcal{U}_f} \\ \hline
u_r & u_{r+1} & u_{r+2} & \cdots & u_{r+L}
\end{array}\right],
$$

and a block-Hankel matrix $\overline{V}$ of noise data $v_k$ with the same indices as $\overline{Y}$, by the equation

$$
\overline{Y} = \overline{H}\,\mathcal{U}_p + \overline{T}\,\overline{\mathcal{U}}_f + \overline{V}. \tag{36}
$$

From (25), the state is equal to the column vectors of *Up* multiplied by the entries of the controllability matrix C, which we may represent as the block-matrix

$$X = \begin{bmatrix} x_0 & x_1 & x_2 & \cdots & x_L \end{bmatrix} = \mathcal{C}\,\mathcal{U}_p,$$

which is an alternative means of representing the memory of the system at samples $0, 1, \ldots$. The two data-matrix equations (35) and (36) may then be written as

$$Y = \mathcal{O}_r X + T\mathcal{U}_f + V, \tag{37}$$

$$
\overline{Y} = \mathcal{O}_r A X + \overline{T}\,\overline{\mathcal{U}}_f + \overline{V}. \tag{38}
$$

Equation (37) is the basis for the field of system identification methods known as subspace methods. Subspace identification methods typically fall into one of two categories. First, because a shifted observability matrix

$$
\overline{\mathcal{O}} = \begin{bmatrix} CA \\ CA^2 \\ CA^3 \\ \vdots \end{bmatrix}
$$

satisfies

$$\operatorname{im}(\mathcal{O}) = \operatorname{im}(\overline{\mathcal{O}}),$$

where im(·) denotes the row space (often called the "image"), the row space of $\mathcal{O}$ is shift-invariant, and $A$ may be identified from estimates $\hat{\mathcal{O}}_r$ and $\hat{\overline{\mathcal{O}}}_r$ as

$$
\hat{A} = \hat{\mathcal{O}}_r^{\dagger}\, \hat{\overline{\mathcal{O}}}_r.
$$

Alternatively, because a forward-propagated sequence of states

$$
\overline{X} = AX
$$

satisfies

$$\operatorname{im}(X^T) = \operatorname{im}(\overline{X}^T),$$

the column space of $X$ is shift-invariant, and $A$ may be identified from estimates $\hat{X}$ and $\hat{\overline{X}}$ as

$$
\hat{A} = \hat{\overline{X}}\, \hat{X}^{\dagger}.
$$

In both instances, the system dynamics are estimated by propagating the indices forward by one step and examining a propagation of linear dynamics, not unlike (20) from Kronecker's theorem. Details of these methods may be found in [16] and [17]. In the next section we present a system identification method that constructs system estimates from the shift-invariant structure of *Y* itself.
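As an illustration of the first, observability-based shift-invariance idea, the following sketch recovers $A$ from a noise-free extended observability matrix built from an arbitrarily chosen $(A, C)$ pair (all numerical values here are hypothetical):

```python
import numpy as np

# Hypothetical 2-state, 1-output system (values chosen only for illustration).
A = np.array([[0.9, 0.2], [0.0, 0.5]])
C = np.array([[1.0, 0.0]])

r = 4
# Extended observability matrix O_r and its shifted counterpart, which equals O_r A.
O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(r)])
O_shift = np.vstack([C @ np.linalg.matrix_power(A, k + 1) for k in range(r)])

# Shift-invariance: A is recovered as the pseudoinverse product.
A_hat = np.linalg.pinv(O) @ O_shift
```

With noisy estimates in place of the exact matrices, the same pseudoinverse product yields a least-squares estimate of $A$.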

#### **4.2. Identification from shift-invariance of output measurements**

Equations (37) and (38) still contain the effects of the future input in *Uf* . To remove these effects from the output, we must first add some assumptions about *Uf* . First, we assume that *Uf* has full row rank. This is true for any *Uf* with a smooth frequency response or if *Uf* is generated from some pseudo-random sequence. Next, we assume that the initial conditions in *X* do not somehow cancel out the effects of future input. A sufficient condition for this is to require

$$\operatorname{rank}\left(\begin{bmatrix} X \\ \overline{\mathcal{U}}_f \end{bmatrix}\right) = n + r n_u,$$

that is, the stacked matrix of states and future inputs must have full row rank. Although this assumption might appear restrictive at first, since it is impossible to verify without knowledge of $X$, it is generally true with the exception of some pathological cases.

Next, we form the null-space projector matrix

$$
\Pi = I_{L+1} - \overline{\mathcal{U}}_f^T \left( \overline{\mathcal{U}}_f \overline{\mathcal{U}}_f^T \right)^{-1} \overline{\mathcal{U}}_f, \tag{39}
$$

which has the property


$$
\overline{\mathcal{U}}\_f \Pi = 0.
$$

We know the inverse of $(\overline{\mathcal{U}}_f \overline{\mathcal{U}}_f^T)$ exists, since we assume $\overline{\mathcal{U}}_f$ has full row rank. Projector matrices such as (39) have many interesting properties. Their eigenvalues are all 0 or 1, and if they are symmetric, they separate the space of real vectors (in this case, vectors in $\mathbb{R}^{L+1}$) into a subspace and its orthogonal complement. In fact, it is simple to verify that the null space of $\mathcal{U}_f$ contains the null space of $\overline{\mathcal{U}}_f$ as a subspace, since

$$
\overline{\mathcal{U}}_f \Pi = \begin{bmatrix} \mathcal{U}_f \\ \cdots \end{bmatrix} \Pi = 0.
$$

Thus multiplication of (37) and (38) on the right by Π results in

$$Y\Pi = \mathcal{O}_r X \Pi + V\Pi, \tag{40}$$

$$
\overline{Y}\Pi = \mathcal{O}_r A X \Pi + \overline{V}\Pi. \tag{41}
$$

It is also unnecessary to compute the projected products $Y\Pi$ and $\overline{Y}\Pi$ directly, since from the QR-decomposition

$$
\begin{bmatrix} \overline{\mathcal{U}}_f^T & Y^T \end{bmatrix} = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{bmatrix},
$$
we have

$$Y = R\_{12}^T Q\_1^T + R\_{22}^T Q\_2^T \tag{42}$$

and $\overline{\mathcal{U}}_f = R_{11}^T Q_1^T$. Substitution into (39) reveals

$$
\Pi = I - Q\_1 Q\_1^T. \tag{43}
$$

Because the columns of *Q*<sup>1</sup> and *Q*<sup>2</sup> are orthogonal, multiplication of (42) on the right by (43) results in

$$Y\Pi = R_{22}^T Q_2^T.$$

A similar result holds for $\overline{Y}\Pi$. Taking the QR-decomposition of the data can alternatively be thought of as using the principle of superposition to construct new sequences of input-output data through a Gram-Schmidt-type orthogonalization process. A detailed discussion of this interpretation can be found in [18].
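The equivalence between the projector (39) and the QR-based expression (43) is easy to check numerically; in this sketch, random matrices stand in for the full-row-rank input matrix and the output data matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
L = 20
Uf = rng.standard_normal((3, L + 1))  # stands in for the full-row-rank future-input matrix
Y = rng.standard_normal((2, L + 1))   # stands in for the output data matrix

# Null-space projector (39), computed directly.
Pi = np.eye(L + 1) - Uf.T @ np.linalg.solve(Uf @ Uf.T, Uf)

# The same projector obtained from the QR factorization of [Uf^T  Y^T].
Q, R = np.linalg.qr(np.hstack([Uf.T, Y.T]))
Q1 = Q[:, :3]                         # orthonormal basis for the row space of Uf
Pi_qr = np.eye(L + 1) - Q1 @ Q1.T
```

Both constructions annihilate the input data from the right, and the QR route avoids forming and inverting the Gram matrix explicitly.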

Thus we have successfully removed the effects of future input on the output while retaining the effects of the past, which is the foundation of the realization process. We still must account for non-deterministic effects in $V$ and $\overline{V}$. To do so, we look for some matrix $Z$ such that

$$V\Pi Z^T \to 0, \qquad \overline{V}\Pi Z^T \to 0.$$

This requires the content of $Z$ to be statistically independent of the process that generates $v_k$. The input $u_k$ is just such a signal, so long as the input is not a function of the filter output, that is, the data was measured in open-loop operation. If we begin measuring input before $k = 0$ at some sample $k = -\zeta$ and construct $Z$ as a block-Hankel matrix of past input data,

$$
Z = \frac{1}{L}\begin{bmatrix}
u_{-\zeta} & u_{-\zeta+1} & u_{-\zeta+2} & \cdots & u_{-\zeta+L} \\
u_{-\zeta+1} & u_{-\zeta+2} & u_{-\zeta+3} & \cdots & u_{-\zeta+L+1} \\
u_{-\zeta+2} & u_{-\zeta+3} & u_{-\zeta+4} & \cdots & u_{-\zeta+L+2} \\
\vdots & \vdots & \vdots & & \vdots \\
u_{-1} & u_0 & u_1 & \cdots & u_{L-1}
\end{bmatrix},
$$

then multiplication of (40) and (41) on the right by *Z<sup>T</sup>* results in

$$Y\Pi Z^T \to \mathcal{O}_r X \Pi Z^T, \tag{44}$$

$$
\overline{Y}\Pi Z^T \to \mathcal{O}_r A X \Pi Z^T, \tag{45}
$$

as $L \to \infty$. Note the term $\frac{1}{L}$ in $Z$ is necessary to keep (44) and (45) bounded.

Finally we are able to perform our rank-reduction technique directly on measured data without needing to first estimate the impulse response. From the SVD

$$Y\Pi Z^T = U\Sigma V^T,$$

we may estimate the order *n* by looking for a sudden decrease in singular values. From the partitioning

$$
Y\Pi Z^T = \begin{bmatrix} U_n & U_s \end{bmatrix} \begin{bmatrix} \Sigma_n & 0 \\ 0 & \Sigma_s \end{bmatrix} \begin{bmatrix} V_n^T \\ V_s^T \end{bmatrix},
$$

we may estimate $\mathcal{O}_r$ and $X\Pi Z^T$ from the factorization

$$
\hat{\mathcal{O}}_r = U_n \Sigma_n^{1/2} \qquad \text{and} \qquad \hat{X}\Pi Z^T = \Sigma_n^{1/2} V_n^T.
$$

*A* may then be estimated as

$$
\begin{split}
\hat{A} &= \left(\hat{\mathcal{O}}_r\right)^{\dagger} \overline{Y}\Pi Z^T \left(\hat{X}\Pi Z^T\right)^{\dagger} = \Sigma_n^{-1/2} U_n^T\, \overline{Y}\Pi Z^T\, V_n \Sigma_n^{-1/2} \\
&\approx \left(\mathcal{O}_r\right)^{\dagger} \left(\overline{H}\,\mathcal{U}_p \Pi\right) \left(\mathcal{C}_L\,\mathcal{U}_p \Pi\right)^{\dagger} \approx \left(\mathcal{O}_r\right)^{\dagger} \overline{H} \left(\mathcal{C}_L\right)^{\dagger}.
\end{split}
$$

And so we have returned to our original relationship (29).
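The rank-reduction step can be sketched on a synthetic low-rank matrix standing in for the projected data product (the true order $n = 2$ is known here only because the data is fabricated):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2                      # true order, known here because the data is synthetic
M = rng.standard_normal((6, n)) @ rng.standard_normal((n, 8))  # stands in for Y Pi Z^T

U, s, Vt = np.linalg.svd(M)
# Estimate the order from the sudden drop in singular values.
n_hat = int(np.sum(s > 1e-8 * s[0]))

# Factor M into observability-like and state-like terms, as in the text.
O_r = U[:, :n_hat] * np.sqrt(s[:n_hat])            # U_n Sigma_n^{1/2}
X_hat = np.sqrt(s[:n_hat])[:, None] * Vt[:n_hat]   # Sigma_n^{1/2} V_n^T
```

On noisy measured data the small singular values are nonzero, and the hard threshold used here becomes a judgment call about where the "sudden decrease" occurs.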

While $C$ may be estimated from the top block row of $\hat{\mathcal{O}}_r$, our projection has lost the column space of $H_r$ that we previously used to estimate $B$, and initial conditions in $X$ prevent us from estimating $D$ directly. Fortunately, if $A$ and $C$ are known, then the remaining terms $B$, $D$, and an initial condition $x_0$ are linear in the input-output data, and may be estimated by solving a linear least-squares problem.

#### **4.3. Estimation of** *B***,** *D***, and** *x*0

The input-to-state terms *B* and *D* may be estimated by examining the convolution with the state-space form of the impulse response. Expanding (24) with the input and including an initial condition *x*<sup>0</sup> results in

$$y\_k = \mathbb{C}A^k x\_0 + \sum\_{j=0}^{k-1} \mathbb{C}A^{k-j-1}Bu\_j + Du\_k + v\_k. \tag{46}$$

Factoring out *B* and *D* on the right provides


$$y_k = CA^k x_0 + \left(\sum_{j=0}^{k-1} u_j^T \otimes CA^{k-j-1}\right) \operatorname{vec}(B) + \left(u_k^T \otimes I_{n_y}\right) \operatorname{vec}(D) + v_k,$$

in which vec(·) is the operation that stacks the columns of a matrix on one another, ⊗ is the (coincidentally named) Kronecker product, and we have made use of the identity

$$\text{vec}(AXB) = (B^T \otimes A)\text{vec}(X).$$
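This identity is straightforward to verify numerically, taking vec(·) to be column-major stacking (the matrix dimensions below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
A_m = rng.standard_normal((2, 3))
X_m = rng.standard_normal((3, 4))
B_m = rng.standard_normal((4, 5))

def vec(M):
    return M.flatten(order="F")  # stack the columns of M into one vector

lhs = vec(A_m @ X_m @ B_m)
rhs = np.kron(B_m.T, A_m) @ vec(X_m)
```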

Grouping the unknown terms together results in

$$y_k = \begin{bmatrix} CA^k & \displaystyle\sum_{j=0}^{k-1} u_j^T \otimes CA^{k-j-1} & u_k^T \otimes I_{n_y} \end{bmatrix} \begin{bmatrix} x_0 \\ \operatorname{vec}(B) \\ \operatorname{vec}(D) \end{bmatrix} + v_k.$$

Thus by forming the regressor

$$\phi_k = \begin{bmatrix} \hat{C}\hat{A}^k & \displaystyle\sum_{j=0}^{k-1} u_j^T \otimes \hat{C}\hat{A}^{k-j-1} & u_k^T \otimes I_{n_y} \end{bmatrix},$$

from the estimates *A*ˆ and *C*ˆ, estimates of *B* and *D* may be found from the least-squares solution of the linear system of *N* equations

$$
\begin{bmatrix} y\_0 \\ y\_1 \\ y\_2 \\ \vdots \\ y\_N \end{bmatrix} = \begin{bmatrix} \phi\_0 \\ \phi\_1 \\ \phi\_2 \\ \vdots \\ \phi\_N \end{bmatrix} \begin{bmatrix} \hat{x}\_0 \\ \text{vec}(\hat{B}) \\ \text{vec}(\hat{D}) \end{bmatrix}.
$$

Note that $N$ is arbitrary and does not need to be related in any way to the indices of the data-matrix equations. This can be useful, since for large-dimensional systems the regressor $\phi_k$ may become very expensive to compute.
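A minimal sketch of the whole estimation step, using a hypothetical second-order system with known $A$ and $C$ (standing in for the estimates $\hat{A}$ and $\hat{C}$) and noise-free simulated data:

```python
import numpy as np

# Hypothetical system; A and C stand in for already-obtained estimates.
A = np.array([[0.8, 0.1], [0.0, 0.6]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 1.0]])
D = np.array([[0.2]])
x0 = np.array([0.3, -0.1])
n, nu, ny = 2, 1, 1

rng = np.random.default_rng(2)
N = 30
u = rng.standard_normal((N, nu))

# Simulate noise-free data: x_{k+1} = A x_k + B u_k, y_k = C x_k + D u_k.
y, x = [], x0.copy()
for k in range(N):
    y.append(C @ x + D @ u[k])
    x = A @ x + B @ u[k]
y = np.vstack(y)

# Regressor phi_k = [C A^k,  sum_j u_j^T (x) C A^{k-j-1},  u_k^T (x) I_ny].
rows = []
for k in range(N):
    if k == 0:
        conv = np.zeros((ny, nu * n))
    else:
        conv = sum(np.kron(u[j][None, :], C @ np.linalg.matrix_power(A, k - j - 1))
                   for j in range(k))
    rows.append(np.hstack([C @ np.linalg.matrix_power(A, k), conv,
                           np.kron(u[k][None, :], np.eye(ny))]))
Phi = np.vstack(rows)

# Least-squares solution for the stacked unknowns [x0; vec(B); vec(D)].
theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
x0_hat = theta[:n, 0]
B_hat = theta[n:n + n * nu].reshape(n, nu, order="F")
D_hat = theta[n + n * nu:]
```

With noise-free data and a full-column-rank regressor the recovery is exact; with measured data the same least-squares problem returns the best linear fit.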

#### **5. Conclusion**

Beginning with the construction of a transfer function from an impulse response, we have constructed a method for identification of state-space realizations of linear filters from measured input-output data, introducing the fundamental concepts of realization theory of linear systems along the way. Computing a state-space realization from measured input-output data requires many tools of linear algebra: projections and the QR-decomposition, rank reduction and the singular-value decomposition, and linear least squares. The principles of realization theory provide insight into the different representations of linear systems, as well as the role of rational functions and series expansions in linear algebra.

#### **Author details**

Daniel N. Miller and Raymond A. de Callafon *University of California, San Diego, USA*

### **6. References**

[1] Rudolf E. Kalman. On the General Theory of Control Systems. In *Proceedings of the First International Congress of Automatic Control*, Moscow, 1960. IRE.

[2] Rudolf E. Kalman. Mathematical Description of Linear Dynamical Systems. *Journal of the Society for Industrial and Applied Mathematics, Series A: Control*, 1(2):152, July 1963.

[3] B. L. Ho and Rudolf E. Kalman. Effective construction of linear state-variable models from input/output functions. *Regelungstechnik*, 14:545–548, 1966.

[4] Leopold Kronecker. Zur Theorie der Elimination einer Variabeln aus zwei Algebraischen Gleichungen. *Monatsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin*, pages 535–600, 1881.

[5] Paul H. Zeiger and Julia A. McEwen. Approximate Linear Realizations of Given Dimension via Ho's Algorithm. *IEEE Transactions on Automatic Control*, 19(2):153–153, 1971.

[6] Sun-Yuan Kung. A new identification and model reduction algorithm via singular value decomposition. In *Proceedings of the 12th Asilomar Conference on Circuits, Systems, and Computers*, pages 705–714. IEEE, 1978.

[7] Michel Verhaegen. Identification of the deterministic part of MIMO state space models given in innovations form from input-output data. *Automatica*, 30(1):61–74, January 1993.

[8] Peter Van Overschee and Bart De Moor. N4SID: Subspace algorithms for the identification of combined deterministic-stochastic systems. *Automatica*, 30(1):75–93, January 1994.

[9] Jer-Nan Juang and R. S. Pappa. An Eigensystem Realization Algorithm (ERA) for Modal Parameter Identification and Model Reduction. *JPL Proc. of the Workshop on Identification and Control of Flexible Space Structures*, 3:299–318, April 1985.

[10] Jer-Nan Juang, Minh Q. Phan, Lucas G. Horta, and Richard W. Longman. Identification of Observer/Kalman Filter Markov Parameters: Theory and Experiments. *Journal of Guidance, Control, and Dynamics*, 16(2):320–329, 1993.

[11] Chi-Tsong Chen. *Linear System Theory and Design*. Oxford University Press, New York, 1st edition, 1984.

[12] Felix R. Gantmacher. *The Theory of Matrices - Volume Two*. Chelsea Publishing Company, New York, 1960.

[13] Kemin Zhou, John C. Doyle, and Keith Glover. *Robust and Optimal Control*. Prentice Hall, August 1995.

[14] G.H. Golub and C.F. Van Loan. *Matrix Computations*. The Johns Hopkins University Press, Baltimore, Maryland, USA, third edition, 1996.

[15] Lennart Ljung. *System Identification: Theory for the User*. PTR Prentice Hall Information and System Sciences. Prentice Hall PTR, Upper Saddle River, NJ, 2nd edition, 1999.

[16] Michel Verhaegen and Vincent Verdult. *Filtering and System Identification: A Least Squares Approach*. Cambridge University Press, New York, 1st edition, May 2007.

[17] Peter Van Overschee and Bart De Moor. *Subspace Identification for Linear Systems: Theory, Implementation, Applications*. Kluwer Academic Publishers, London, 1996.

[18] Tohru Katayama. Role of LQ Decomposition in Subspace Identification Methods. In Alessandro Chiuso, Stefano Pinzoni, and Augusto Ferrante, editors, *Modeling, Estimation and Control*, pages 207–220. Springer Berlin / Heidelberg, 2007.

## **Partition-Matrix Theory Applied to the Computation of Generalized-Inverses for MIMO Systems in Rayleigh Fading Channels**

P. Cervantes, L. F. González, F. J. Ortiz and A. D. García

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48198

## **1. Introduction**

Partition-Matrix Theory and Generalized-Inverses are interesting topics explored in linear algebra and matrix computation. Partition-Matrix Theory is associated with the problem of properly partitioning a matrix into block matrices (i.e. an array of matrices), and is a matrix computation tool widely employed in several scientific-technological application areas. For instance, blockwise Toeplitz-based covariance matrices are used to model structural properties for space-time multivariate adaptive processing in radar applications [1], Jacobian response matrices are partitioned into several block-matrix instances in order to enhance medical images for Electrical-Impedance-Tomography [2], and the design of state regulators and partial observers for non-controllable/non-observable linear continuous systems contemplates matrix blocks for controllable/non-controllable and observable/non-observable eigenvalues [3]. The Generalized-Inverse is a common and natural problem found in a vast number of applications. In control robotics, non-collocated partial linearization is applied to underactuated mechanical systems through inertia-decoupling regulators which employ a pseudoinverse as part of a modified input control law [4]. In sliding-mode control structures, a Right-Pseudoinverse is incorporated into a state-feedback control law in order to stabilize electromechanical non-linear systems [5]. Under the topic of system identification, a Left-Pseudoinverse appears in auto-regressive moving-average models (ARMA) for matching dynamical properties of unknown systems [6]. An interesting approach arises whenever Partition-Matrix Theory and the Generalized-Inverse are combined, yielding attractive solutions to the problem of block matrix inversion [7-10]. Nevertheless, several assumptions and restrictions regarding numerical stability and structural properties are considered for these alternatives.
For example, an attractive pivot-free block matrix inversion algorithm is proposed in [7], which

unfortunately exhibits an overhead in the matrix multiplications required to guarantee full-rank properties for particular blocks within it. To circumvent the expense of rank deficiency, [8] offers block-matrix completion strategies in order to find the Generalized-Inverse of any non-singular block matrix (irrespective of the singularity of its constituting sub-blocks). However, the existence of intermediate matrix inverses and pseudoinverses throughout this algorithm still relies on full-rank assumptions, and introduces additional complexity to the problem. The proposals exposed in [9-10] avoid completion strategies and contemplate all possible scenarios for avoiding any rank deficiency among each matrix sub-block, yet demand full-rank assumptions for each scenario. In this chapter, an iterative-recursive algorithm for computing a Left-Pseudoinverse (LPI) of a MIMO channel matrix is developed by combining Partition-Matrix Theory and Generalized-Inverse concepts. For this approach, no matrix-operation overhead nor any particular block-matrix full-rank assumptions are needed, because of structural attributes of the MIMO channel matrix, which models dynamical properties of a Rayleigh fading channel (RFC) within wireless MIMO communication systems.
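To fix ideas, for a tall full-column-rank channel matrix H (more receive than transmit antennas) the Left-Pseudoinverse reduces to (H<sup>H</sup>H)<sup>−1</sup>H<sup>H</sup>; the following minimal check uses a random complex matrix standing in for an RFC realization and is illustrative only, not the chapter's algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)
nR, nT = 4, 2   # more receive than transmit antennas, so H is tall

# Complex circular Gaussian entries model one Rayleigh-fading channel realization.
H = (rng.standard_normal((nR, nT)) + 1j * rng.standard_normal((nR, nT))) / np.sqrt(2)

# Left-Pseudoinverse of H: LPI @ H = I, so LPI recovers s from the received H @ s.
LPI = np.linalg.solve(H.conj().T @ H, H.conj().T)
```

The chapter's contribution is computing this quantity blockwise, without the explicit Gram-matrix inversion used in this sketch.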

The content of this work is outlined as follows. Section 2 provides a description of the MIMO communication link, pointing out its principal physical effects and the mathematical model considered for RFC-based environments. Section 3 defines formally the problem of computing the Left-Pseudoinverse as the Generalized-Inverse for the MIMO channel matrix applying Partition-Matrix Theory concepts. Section 4 presents linear algebra and matrix computation concepts and tools needed for tracking a solution for the aforementioned problem. Section 5 analyzes important properties of the MIMO channel matrix derived from a Rayleigh fading channel scenario. Section 6 explains the proposed novel algorithm. Section 7 presents a brief analysis of VLSI (Very Large Scale of Integration) aspects towards implementation of arithmetic operations presented in this algorithm. Section 8 concludes the chapter. Due to the vast literature about MIMO systems, and to the best of the authors' knowledge, this chapter provides a nice and strategic list of references in order to easily correlate essential concepts between matrix theory and MIMO systems. For instance, [11-16] describe and analyze information and system aspects about MIMO communication systems, as well as studying MIMO channel matrix behavior under RFC-based environments; [17-18] contain all useful linear algebra and matrix computation theoretical concepts around the mathematical background immersed in MIMO systems; [19-21] provide practical guidelines and examples for MIMO channel matrix realizations comprising RFC scenarios; [22] treats the formulation and development of the algorithm presented in this chapter; [23-27] detail a splendid survey on architectural aspects for implementing several arithmetic operations.

### **2. MIMO systems**

In the context of wireless communication systems, MIMO (Multiple-Input Multiple-Output) is an extension of the classical SISO (Single-Input Single-Output) communication paradigm, where instead of having a communication link composed of a single transmitter-end and a single receiver-end element (or antenna), wireless MIMO communication systems (or just MIMO systems) consist of an array of multiple elements at both the transmission and reception parts [11-16,19-21]. Generally speaking, the MIMO communication link contains *n<sub>T</sub>* transmitter-end and *n<sub>R</sub>* receiver-end antennas sending and receiving information through a wireless channel. Extensive studies on MIMO systems and commercial devices already employing them reveal that these communication systems offer promising results in terms of: a) spectral efficiency and channel capacity enhancements (many user-end applications supporting high data rates at limited available bandwidth); b) improvements in Bit-Error-Rate (BER) performance; and c) practical feasibility already seen in several wireless communication standards. The conceptualization of this paradigm is illustrated in figure 1, where Tx is the transmitter-end, Rx the receiver-end, and Chx the channel.

138 Linear Algebra – Theorems and Applications

**2. MIMO systems** 

unfortunately exhibits an overhead in matrix multiplications that are required in order to guarantee full-rank properties for particular blocks within it. For circumventing the expense in rank deficiency, [8] offers block-matrix completion strategies in order to find the Generalized-Inverse of any non-singular block matrix (irrespective of the singularity of their constituting sub-blocks). However, the existence of intermediate matrix inverses and pseudoinverses throughout this algorithm still rely on full-rank assumptions, as well as introducing more hardness to the problem. The proposals exposed in [9-10] avoid completion strategies and contemplate all possible scenarios for avoiding any rank deficiency among each matrix sub-block, yet demanding full-rank assumptions for each scenario. In this chapter, an iterative-recursive algorithm for computing a Left-Pseudoinverse (LPI) of a MIMO channel matrix is developed by combining Partition-Matrix Theory and Generalized-Inverse concepts. For this approach, no matrix-operations' overhead nor any particular block matrix full-rank assumptions are needed because of structural attributes of the MIMO channel matrix, which models dynamical properties of a

Rayleigh fading channel (RFC) within wireless MIMO communication systems.

The content of this work is outlined as follows. Section 2 provides a description of the MIMO communication link, pointing out its principal physical effects and the mathematical model considered for RFC-based environments. Section 3 defines formally the problem of computing the Left-Pseudoinverse as the Generalized-Inverse for the MIMO channel matrix applying Partition-Matrix Theory concepts. Section 4 presents linear algebra and matrix computation concepts and tools needed for tracking a solution for the aforementioned problem. Section 5 analyzes important properties of the MIMO channel matrix derived from a Rayleigh fading channel scenario. Section 6 explains the proposed novel algorithm. Section 7 presents a brief analysis of VLSI (Very Large Scale of Integration) aspects towards implementation of arithmetic operations presented in this algorithm. Section 8 concludes the chapter. Due to the vast literature about MIMO systems, and to the best of the authors' knowledge, this chapter provides a nice and strategic list of references in order to easily correlate essential concepts between matrix theory and MIMO systems. For instance, [11-16] describe and analyze information and system aspects about MIMO communication systems, as well as studying MIMO channel matrix behavior under RFC-based environments; [17-18] contain all useful linear algebra and matrix computation theoretical concepts around the mathematical background immersed in MIMO systems; [19-21] provide practical guidelines and examples for MIMO channel matrix realizations comprising RFC scenarios; [22] treats the formulation and development of the algorithm presented in this chapter; [23-27] detail a splendid survey on architectural aspects for implementing several arithmetic operations.

In the context of wireless communication systems, MIMO (Multiple-Input Multiple-Output) is an extension of the classical SISO (Single-Input Single-Output) communication paradigm: instead of a communication link composed of a single transmitter-end and a single receiver-end element (or antenna), wireless MIMO communication systems (or just MIMO systems) consist of an array of multiple elements at both the transmission and reception parts, as depicted in Figure 1.

**Figure 1.** The MIMO system: conceptualization for the MIMO communication paradigm.

Notice that information sent from the transmission part (Tx label in Figure 1) will suffer from several degrading and distorting effects inherent in the channel (Chx label in Figure 1), forcing the reception part (Rx label in Figure 1) to decode information properly. Information at Rx will suffer from degradations caused by time, frequency, and spatial characteristics of the MIMO communication link [11-12,14]. These issues are directly related to: i) the presence of physical obstacles obstructing the Line-of-Sight (LOS) between Tx and Rx (existence of non-LOS); ii) time delays between received and transmitted information signals due to Tx and Rx dynamical properties (time-selectivity of Chx); iii) frequency distortion and interference among signal carriers through Chx (frequency-selectivity of Chx); and iv) correlation of information between receiver-end elements. Fading (or multipath fading) and noise are the most common destructive phenomena that significantly affect information at Rx [11-16]. Fading is a combination of time-frequency replicas of the transmitted information, a consequence of the MIMO system phenomena i)-iv) exposed before, whereas noise affects information at every receiver-end element in an additive or multiplicative way. As a consequence, degradation of signal information rests mainly upon magnitude attenuation and time-frequency shiftings. The simplest treatable MIMO communication link has a slow-flat quasi-static fading channel (proper of a non-LOS indoor environment). For this type of scenario, a well-known dynamical-stochastic model considers a Rayleigh fading channel (RFC) [13,15-16,19-21], which gives a quantitative clue of how information has been degraded by means of Chx. Moreover, this type of channel makes it possible to: a) distinguish among the information blocks transmitted from the $n_T$ elements at every Chx realization (i.e., the time during which the channel's properties remain invariant); and b) easily implement symbol-decoding tasks related to channel equalization (CE) techniques. Likewise, noise is commonly assumed to have additive effects over Rx. All of these assumptions provide a tractable information-decoding problem (referred to as MIMO demodulation [12]), and the mathematical model that suits the aforementioned MIMO communication link characteristics is represented by

$$y = Hx + \eta \tag{1}$$

where: $x = \left[x_1 \cdots x_{n_T}\right]^{\mathrm{T}} \in \mathbb{C}^{n_T \times 1}$ is a complex-valued $n_T$-dimensional transmitted vector with entries drawn from a Gaussian-integer finite-lattice constellation (digital modulators such as $q$-QAM or QPSK); $y \in \mathbb{C}^{n_R \times 1}$ is a complex-valued $n_R$-dimensional received vector; $\eta \in \mathbb{C}^{n_R \times 1}$ is an $n_R$-dimensional independent identically distributed (iid) complex circularly-symmetric (ccs) Additive White Gaussian Noise (AWGN) vector; and $H \in \mathbb{C}^{n_R \times n_T}$ is the $n_R \times n_T$ dimensional MIMO channel matrix whose entries model: a) the RFC-based environment behavior according to a Gaussian probability density function with zero-mean and 0.5-variance statistics; and b) the time-invariant transfer function (which measures the degradation of the signal information) between the i-th receiver-end and the j-th transmitter-end antennas [11-16,19-21]. Figure 2 gives a representation of (1). As shown therein, the MIMO communication link model stated in (1) can also be expressed as

$$\begin{bmatrix} y_1 \\ \vdots \\ y_{n_R} \end{bmatrix} = \begin{bmatrix} h_{11} & \cdots & h_{1n_T} \\ \vdots & & \vdots \\ h_{n_R 1} & \cdots & h_{n_R n_T} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_{n_T} \end{bmatrix} + \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_{n_R} \end{bmatrix} \tag{2}$$

Notice from (1)-(2) that an important requisite for CE purposes within RFC scenarios is that $H$ is somehow provided to the Rx. This MIMO system requirement is classically known as Channel State Information (CSI) [11-16]. In the sequel of this work, symbol-decoding efforts will consider the problem of finding $x$ from $y$ given CSI at the Rx part within a slow-flat quasi-static RFC-based environment as modeled in (1)-(2). In simpler words, Rx must find $x$ from the degraded information $y$ by calculating an inversion over $H$. Moreover, $n_R \geq n_T$ is commonly assumed for MIMO demodulation tasks [13-14] because it guarantees linear independence among the columns of $H$ in (2), yielding a nonhomogeneous overdetermined system of linear equations.
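To make the model in (1) concrete, the following NumPy sketch draws one RFC realization and forms $y = Hx + \eta$. The antenna counts, QPSK alphabet, and noise level are illustrative assumptions, not values prescribed by the chapter:

```python
import numpy as np

rng = np.random.default_rng(7)
n_R, n_T = 4, 2  # receiver/transmitter antenna counts (illustrative)

# Rayleigh fading channel: entries of H are complex Gaussian with
# zero mean and 0.5 variance per real/imaginary component.
H = rng.normal(0, np.sqrt(0.5), (n_R, n_T)) \
    + 1j * rng.normal(0, np.sqrt(0.5), (n_R, n_T))

# QPSK symbols drawn from a Gaussian-integer finite lattice.
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j])
x = rng.choice(qpsk, n_T)

# iid circularly-symmetric complex AWGN (illustrative noise level).
sigma = 0.05
eta = sigma * (rng.normal(0, 1, n_R) + 1j * rng.normal(0, 1, n_R)) / np.sqrt(2)

y = H @ x + eta  # received vector, Eq. (1)
print(y.shape)   # (4,)
```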

**Figure 2.** Representation of the MIMO communication link model according to $y = Hx + \eta$. Here, each dotted arrow represents an entry $h_{ij}$ of $H$, which determines the channel degradation between the j-th transmitter and the i-th receiver elements. AWGN appears additively at each receiver-end antenna.

#### **3. Problem definition**

140 Linear Algebra – Theorems and Applications

Recall for the moment the mathematical model provided in (1). Consider $\Theta^{\mathrm{r}}$ and $\Theta^{\mathrm{i}}$ to be the real and imaginary parts of a complex-valued matrix (or vector) $\Theta$, that is, $\Theta = \Theta^{\mathrm{r}} + j\Theta^{\mathrm{i}}$. Then, Equation (1) can be expanded as follows:

$$\mathbf{y}^{\mathrm{r}} + j\mathbf{y}^{\mathrm{i}} = \left( H^{\mathrm{r}}\mathbf{x}^{\mathrm{r}} - H^{\mathrm{i}}\mathbf{x}^{\mathrm{i}} + \boldsymbol{\eta}^{\mathrm{r}} \right) + j\left( H^{\mathrm{i}}\mathbf{x}^{\mathrm{r}} + H^{\mathrm{r}}\mathbf{x}^{\mathrm{i}} + \boldsymbol{\eta}^{\mathrm{i}} \right) \tag{3}$$

It can be noticed from Equation (3) that $x^{\mathrm{r}}, x^{\mathrm{i}} \in \mathbb{R}^{n_T \times 1}$; $y^{\mathrm{r}}, y^{\mathrm{i}} \in \mathbb{R}^{n_R \times 1}$; $\eta^{\mathrm{r}}, \eta^{\mathrm{i}} \in \mathbb{R}^{n_R \times 1}$; and $H^{\mathrm{r}}, H^{\mathrm{i}} \in \mathbb{R}^{n_R \times n_T}$. An alternative representation for the MIMO communication link model in (2) can be expressed as

$$
\begin{bmatrix} y^{\text{r}} \\ y^{\text{i}} \end{bmatrix} = \begin{bmatrix} H^{\text{r}} & -H^{\text{i}} \\ H^{\text{i}} & H^{\text{r}} \end{bmatrix} \begin{bmatrix} \mathbf{x}^{\text{r}} \\ \mathbf{x}^{\text{i}} \end{bmatrix} + \begin{bmatrix} \boldsymbol{\eta}^{\text{r}} \\ \boldsymbol{\eta}^{\text{i}} \end{bmatrix} \tag{4}
$$

where $\mathrm{Y} = \begin{bmatrix} y^{\mathrm{r}} \\ y^{\mathrm{i}} \end{bmatrix} \in \mathbb{R}^{2n_R \times 1}$, $\mathrm{h} = \begin{bmatrix} H^{\mathrm{r}} & -H^{\mathrm{i}} \\ H^{\mathrm{i}} & H^{\mathrm{r}} \end{bmatrix} \in \mathbb{R}^{2n_R \times 2n_T}$, $\mathrm{X} = \begin{bmatrix} x^{\mathrm{r}} \\ x^{\mathrm{i}} \end{bmatrix} \in \mathbb{R}^{2n_T \times 1}$, and $\mathrm{N} = \begin{bmatrix} \eta^{\mathrm{r}} \\ \eta^{\mathrm{i}} \end{bmatrix} \in \mathbb{R}^{2n_R \times 1}$.

CSI is still needed for MIMO demodulation purposes involving (4). Moreover, if $N_r = 2n_R$ and $N_t = 2n_T$, then $N_r \geq N_t$. Obviously, while seeking a solution for signal vector $\mathrm{X}$ from (4), the reception part Rx also provides the solution for signal vector $x$, and thus MIMO demodulation tasks will be fulfilled. This problem can be formally defined in the following manner:
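The equivalence between the complex model (1) and its real-valued block formulation (4) can be checked numerically. The sketch below (with illustrative dimensions) builds $\mathrm{h}$, $\mathrm{X}$, and $\mathrm{N}$ as defined above and verifies that stacking the real and imaginary parts of $y$ reproduces $\mathrm{Y}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n_R, n_T = 4, 2
H = rng.normal(0, np.sqrt(0.5), (n_R, n_T)) \
    + 1j * rng.normal(0, np.sqrt(0.5), (n_R, n_T))
x = rng.normal(size=n_T) + 1j * rng.normal(size=n_T)
eta = 0.01 * (rng.normal(size=n_R) + 1j * rng.normal(size=n_R))
y = H @ x + eta                      # complex model, Eq. (1)

# Real-valued block formulation of Eq. (4): h is 2n_R x 2n_T.
h = np.block([[H.real, -H.imag],
              [H.imag,  H.real]])
X = np.concatenate([x.real, x.imag])
N = np.concatenate([eta.real, eta.imag])
Y = h @ X + N

# Stacking real/imaginary parts of y must reproduce Y exactly.
assert np.allclose(Y, np.concatenate([y.real, y.imag]))
```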

*Definition 1.* Given parameters $N_r = 2n_R$ and $N_t = 2n_T$, and a block matrix $\mathrm{h} \in \mathbb{R}^{N_r \times N_t}$, there exists an operator $\Phi : \mathbb{R}^{N_r \times 1} \times \mathbb{R}^{N_r \times N_t} \to \mathbb{R}^{N_t \times 1}$ which solves the block-matrix equation $\mathrm{Y} = \mathrm{h}\mathrm{X} + \mathrm{N}$ so that $\Phi\left(\mathrm{Y}, \mathrm{h}\right) = \mathrm{X}$. ■

From Definition 1, the following affirmations hold: i) CSI over $\mathrm{h}$ is a necessary condition as an input argument for the operator $\Phi$; and ii) $\Phi$ can be naïvely defined as a Generalized-Inverse of the block matrix $\mathrm{h}$. In simpler terms, $\mathrm{X} = \mathrm{h}^{\dagger}\mathrm{Y}$<sup>1</sup> is associated with $\Phi\left(\mathrm{Y},\mathrm{h}\right)$, and $\mathrm{h}^{\dagger} \in \mathbb{R}^{N_t \times N_r}$ stands for the Generalized-Inverse of the block matrix $\mathrm{h}$, where $\mathrm{h}^{\dagger} = \left(\mathrm{h}^{\mathrm{T}}\mathrm{h}\right)^{-1}\mathrm{h}^{\mathrm{T}}$ [17-18]. Clearly, $(\cdot)^{-1}$ and $(\cdot)^{\mathrm{T}}$ represent the inverse and transpose operations over real-valued matrices. As a concluding remark, computing the Generalized-Inverse $\mathrm{h}^{\dagger}$ can be separated into two operations: 1) a block-matrix inversion $\left(\mathrm{h}^{\mathrm{T}}\mathrm{h}\right)^{-1}$<sup>2</sup>; and 2) a typical matrix multiplication $\left(\mathrm{h}^{\mathrm{T}}\mathrm{h}\right)^{-1}\mathrm{h}^{\mathrm{T}}$. For these tasks, Partition-Matrix Theory will be employed in order to find a novel algorithm for computing a Generalized-Inverse related to (4).
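A minimal sketch of the two operations named above, computing $\left(\mathrm{h}^{\mathrm{T}}\mathrm{h}\right)^{-1}\mathrm{h}^{\mathrm{T}}$ for an assumed full-column-rank random h and checking it against NumPy's Moore-Penrose routine:

```python
import numpy as np

rng = np.random.default_rng(1)
Nr, Nt = 8, 4          # N_r = 2n_R, N_t = 2n_T with n_R >= n_T
h = rng.normal(size=(Nr, Nt))   # stand-in for the RFC block matrix

# Left-Pseudoinverse h† = (hᵀh)⁻¹hᵀ, split into the two operations
# named in the text.
gram_inv = np.linalg.inv(h.T @ h)   # operation 1: block-matrix inversion
h_dag = gram_inv @ h.T              # operation 2: matrix multiplication

assert np.allclose(h_dag @ h, np.eye(Nt))      # LPI property: h†h = I
assert np.allclose(h_dag, np.linalg.pinv(h))   # agrees with Moore-Penrose
```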

### **4. Mathematical background**

#### **4.1. Partition-matrix theory**

Partition-Matrix Theory embraces structures related to block matrices (or partition matrices: arrays of matrices) [17-18]. Furthermore, a block matrix $L$ of dimension $(n+q) \times (m+p)$ can be constructed (or partitioned) consistently according to matrix sub-blocks $A$, $B$, $C$, and $D$ of dimensions $n \times m$, $n \times p$, $q \times m$, and $q \times p$, respectively, yielding

$$L = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \tag{5}$$

An interesting operation to be performed on the structure given in (5) is inversion, i.e. a blockwise inversion $L^{-1}$. For instance, let $L \in \mathbb{R}^{(n+m) \times (n+m)}$ be a full-rank real-valued block matrix (the subsequent treatment is also valid for complex-valued entities, i.e.

<sup>2</sup> Notice that $\mathrm{h}^{\mathrm{T}} \in \mathbb{R}^{N_t \times N_r}$ and $\mathrm{h}^{\mathrm{T}}\mathrm{h} \in \mathbb{R}^{N_t \times N_t}$.

<sup>1</sup> In the context of MIMO systems, this matrix operation is commonly found in Babai estimators for symbol-decoding purposes at the Rx part [12,13]. For the reader's interest, refer to [11-16] for other MIMO demodulation techniques.

$L \in \mathbb{C}^{(n+m) \times (n+m)}$). An alternative partition can be performed with $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{m \times n}$, and $D \in \mathbb{R}^{m \times m}$. Assume also $A$ and $D$ to be full-rank matrices. Then,

$$L^{-1} = \begin{bmatrix} \left(A - BD^{-1}C\right)^{-1} & -\left(A - BD^{-1}C\right)^{-1}BD^{-1} \\ -\left(D - CA^{-1}B\right)^{-1}CA^{-1} & \left(D - CA^{-1}B\right)^{-1} \end{bmatrix} \tag{6}$$

This strategy (to be proved in the next part) additionally and mandatorily requires full rank of the matrices $A - BD^{-1}C$ and $D - CA^{-1}B$. The simplest case is defined for $L = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ (indistinctly over $\mathbb{R}^{2 \times 2}$ or $\mathbb{C}^{2 \times 2}$). Once again, assuming $\det L \neq 0$, $a \neq 0$, and $d \neq 0$ (related to full-rank restrictions within the block matrix $L$):

$$L^{-1} = \begin{bmatrix} \left(a - bd^{-1}c\right)^{-1} & -\left(a - bd^{-1}c\right)^{-1}bd^{-1} \\ -\left(d - ca^{-1}b\right)^{-1}ca^{-1} & \left(d - ca^{-1}b\right)^{-1} \end{bmatrix} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

where evidently $ad - bc \neq 0$, $a - bd^{-1}c \neq 0$, and $d - ca^{-1}b \neq 0$.
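Formula (6) can be verified numerically. The sketch below implements the blockwise inverse for an assumed well-conditioned random $L$ (the diagonal shift is just a convenient way to keep all required sub-blocks non-singular) and compares it with a direct inverse:

```python
import numpy as np

def blockwise_inverse(A, B, C, D):
    """Blockwise inversion of L = [[A, B], [C, D]] per Eq. (6).
    Assumes A, D, A - BD⁻¹C, and D - CA⁻¹B are all non-singular."""
    S_A = np.linalg.inv(A - B @ np.linalg.inv(D) @ C)   # (A - BD⁻¹C)⁻¹
    S_D = np.linalg.inv(D - C @ np.linalg.inv(A) @ B)   # (D - CA⁻¹B)⁻¹
    return np.block([[S_A,                          -S_A @ B @ np.linalg.inv(D)],
                     [-S_D @ C @ np.linalg.inv(A),   S_D]])

rng = np.random.default_rng(2)
n, m = 3, 2
# Diagonal shift keeps L and its sub-blocks comfortably non-singular.
L = rng.normal(size=(n + m, n + m)) + (n + m) * np.eye(n + m)
A, B = L[:n, :n], L[:n, n:]
C, D = L[n:, :n], L[n:, n:]
assert np.allclose(blockwise_inverse(A, B, C, D), np.linalg.inv(L))
```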

#### **4.2. Matrix Inversion Lemma**


The Matrix Inversion Lemma is an indirect consequence of inverting non-singular block matrices [17-18], either real-valued or complex-valued, under certain restrictions<sup>3</sup>. Lemma 1 states this result.

*Lemma 1*. Let $\Psi$, $\Sigma$, $\Upsilon$, and $\Xi$ be real-valued or complex-valued matrices of dimensions $r \times r$, $r \times s$, $s \times s$, and $s \times r$, respectively. Assume the following matrices to be non-singular: $\Psi$, $\Upsilon$, $\Psi + \Sigma\Upsilon\Xi$, and $\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma$. Then,

$$\left(\Psi + \Sigma\Upsilon\Xi\right)^{-1} = \Psi^{-1} - \Psi^{-1}\Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi\Psi^{-1} \tag{7}$$

**Proof**. The validation of (7) must satisfy

$$\text{i)}\;\; \left(\Psi + \Sigma\Upsilon\Xi\right) \cdot \left(\Psi^{-1} - \Psi^{-1}\Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi\Psi^{-1}\right) = I_r, \;\text{and}$$

$$\text{ii)}\;\; \left(\Psi^{-1} - \Psi^{-1}\Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi\Psi^{-1}\right) \cdot \left(\Psi + \Sigma\Upsilon\Xi\right) = I_r,$$

where $I_r$ represents the $r \times r$ identity matrix. Notice the existence of the matrices $\Psi^{-1}$, $\Upsilon^{-1}$, $\left(\Psi + \Sigma\Upsilon\Xi\right)^{-1}$ and $\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}$.

Manipulating i) shows: 

<sup>3</sup> Refer to [3,7-10,17,18] to review lemmata exposed for these issues and related results.

$$
\left(\Psi + \Sigma\Upsilon\Xi\right) \cdot \left(\Psi^{-1} - \Psi^{-1}\Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi\Psi^{-1}\right)
$$

$$
= I\_r - \Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi\Psi^{-1} + \Sigma\Upsilon\Xi\Psi^{-1} - \Sigma\Upsilon\Xi\Psi^{-1}\Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi\Psi^{-1}
$$

$$
= I\_r + \Sigma\Upsilon\Xi\Psi^{-1} - \Sigma\Upsilon\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi\Psi^{-1}
$$

$$
= I\_r + \Sigma\Upsilon\Xi\Psi^{-1} - \Sigma\Upsilon\Xi\Psi^{-1} = I\_r.
$$

Likewise for ii):

$$\left(\Psi^{-1} - \Psi^{-1}\Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi\Psi^{-1}\right) \cdot \left(\Psi + \Sigma\Upsilon\Xi\right)$$

$$= I_r + \Psi^{-1}\Sigma\Upsilon\Xi - \Psi^{-1}\Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi - \Psi^{-1}\Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\Xi\Psi^{-1}\Sigma\Upsilon\Xi$$

$$= I_r + \Psi^{-1}\Sigma\Upsilon\Xi - \Psi^{-1}\Sigma\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)^{-1}\left(\Upsilon^{-1} + \Xi\Psi^{-1}\Sigma\right)\Upsilon\Xi$$

$$= I_r + \Psi^{-1}\Sigma\Upsilon\Xi - \Psi^{-1}\Sigma\Upsilon\Xi = I_r. \;\blacksquare$$
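Lemma 1 is the classical Matrix Inversion (Sherman-Morrison-Woodbury) Lemma, and identity (7) can be sanity-checked numerically. The sketch below uses randomly generated matrices with diagonal shifts so that the non-singularity assumptions hold; all dimensions are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(3)
r, s = 5, 2
Psi = rng.normal(size=(r, r)) + r * np.eye(r)   # non-singular Ψ
Sigma = rng.normal(size=(r, s))                 # Σ
Ups = rng.normal(size=(s, s)) + s * np.eye(s)   # non-singular Υ
Xi = rng.normal(size=(s, r))                    # Ξ

Psi_inv = np.linalg.inv(Psi)
lhs = np.linalg.inv(Psi + Sigma @ Ups @ Xi)
rhs = Psi_inv - Psi_inv @ Sigma @ np.linalg.inv(
    np.linalg.inv(Ups) + Xi @ Psi_inv @ Sigma) @ Xi @ Psi_inv

assert np.allclose(lhs, rhs)   # Eq. (7) holds numerically
```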

Now it is pertinent to demonstrate (6) with the aid of Lemma 1. It must be verified that both $LL^{-1}$ and $L^{-1}L$ equal the $(n+m) \times (n+m)$ identity block matrix $I_{(n+m)} = \begin{bmatrix} I_n & 0_{n \times m} \\ 0_{m \times n} & I_m \end{bmatrix}$, with consistent-dimensional identity and zero sub-blocks $I_n$, $I_m$, $0_{n \times m}$, and $0_{m \times n}$, respectively. We start by calculating

$$LL^{-1} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} \left(A - BD^{-1}C\right)^{-1} & -\left(A - BD^{-1}C\right)^{-1}BD^{-1} \\ -\left(D - CA^{-1}B\right)^{-1}CA^{-1} & \left(D - CA^{-1}B\right)^{-1} \end{bmatrix} \tag{8}$$

and

$$L^{-1}L = \begin{bmatrix} \left(A - BD^{-1}C\right)^{-1} & -\left(A - BD^{-1}C\right)^{-1}BD^{-1} \\ -\left(D - CA^{-1}B\right)^{-1}CA^{-1} & \left(D - CA^{-1}B\right)^{-1} \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} \tag{9}$$

Applying (7) in Lemma 1 to both the $n \times n$ matrix $\left(A - BD^{-1}C\right)^{-1}$ and the $m \times m$ matrix $\left(D - CA^{-1}B\right)^{-1}$, which are present in (8) and (9), and recalling the full-rank conditions not only on those matrices but also on $A$ and $D$, yields the relations

Partition-Matrix Theory Applied to the Computation of Generalized-Inverses for MIMO Systems in Rayleigh Fading Channels 145

$$\left(A - BD^{-1}\mathbb{C}\right)^{-1} = A^{-1} + A^{-1}B\left(D - CA^{-1}B\right)^{-1}CA^{-1} \tag{10}$$

$$\left(D - CA^{-1}B\right)^{-1} = D^{-1} + D^{-1}C\left(A - BD^{-1}C\right)^{-1}BD^{-1} \tag{11}$$

Using (10-11) in (8-9), the following results arise:


a. for operations involved in sub-blocks of $LL^{-1}$:

$$A\left(A - BD^{-1}C\right)^{-1} - B\left(D - CA^{-1}B\right)^{-1}CA^{-1}$$

$$= A\left[A^{-1} + A^{-1}B\left(D - CA^{-1}B\right)^{-1}CA^{-1}\right] - B\left(D - CA^{-1}B\right)^{-1}CA^{-1}$$

$$= I_n + B\left(D - CA^{-1}B\right)^{-1}CA^{-1} - B\left(D - CA^{-1}B\right)^{-1}CA^{-1} = I_n;$$

$$-A\left(A - BD^{-1}C\right)^{-1}BD^{-1} + B\left(D - CA^{-1}B\right)^{-1}$$

$$= -A\left[A^{-1} + A^{-1}B\left(D - CA^{-1}B\right)^{-1}CA^{-1}\right]BD^{-1} + B\left(D - CA^{-1}B\right)^{-1}$$

$$= -BD^{-1} - B\left(D - CA^{-1}B\right)^{-1}CA^{-1}BD^{-1} + B\left(D - CA^{-1}B\right)^{-1}$$

$$= -BD^{-1} + B\left(D - CA^{-1}B\right)^{-1}\left(D - CA^{-1}B\right)D^{-1} = 0_{n \times m};$$

$$C\left(A - BD^{-1}C\right)^{-1} - D\left(D - CA^{-1}B\right)^{-1}CA^{-1}$$

$$= C\left(A - BD^{-1}C\right)^{-1} - D\left[D^{-1} + D^{-1}C\left(A - BD^{-1}C\right)^{-1}BD^{-1}\right]CA^{-1}$$

$$= C\left(A - BD^{-1}C\right)^{-1} - CA^{-1} - C\left(A - BD^{-1}C\right)^{-1}BD^{-1}CA^{-1}$$

$$= C\left(A - BD^{-1}C\right)^{-1}\left(A - BD^{-1}C\right)A^{-1} - CA^{-1} = 0_{m \times n};$$

$$-C\left(A - BD^{-1}C\right)^{-1}BD^{-1} + D\left(D - CA^{-1}B\right)^{-1}$$

$$= -C\left(A - BD^{-1}C\right)^{-1}BD^{-1} + I_m + C\left(A - BD^{-1}C\right)^{-1}BD^{-1} = I_m;$$

thus, $LL^{-1} = I_{(n+m)}$.

b. for operations involved in sub-blocks of $L^{-1}L$:

$$\left(A - BD^{-1}C\right)^{-1}A - \left(A - BD^{-1}C\right)^{-1}BD^{-1}C = \left(A - BD^{-1}C\right)^{-1}\left[A - BD^{-1}C\right] = I_n;$$

$$\left(A - BD^{-1}C\right)^{-1}B - \left(A - BD^{-1}C\right)^{-1}BD^{-1}D = 0_{n \times m};$$

$$-\left(D - CA^{-1}B\right)^{-1}CA^{-1}A + \left(D - CA^{-1}B\right)^{-1}C = 0_{m \times n};$$

$$-\left(D - CA^{-1}B\right)^{-1}CA^{-1}B + \left(D - CA^{-1}B\right)^{-1}D = \left(D - CA^{-1}B\right)^{-1}\left[D - CA^{-1}B\right] = I_m;$$

thus, $L^{-1}L = I_{(n+m)}$.

#### **4.3. Generalized-Inverse**

The concept of Generalized-Inverse is an extension of the matrix inversion operation applied to non-singular rectangular matrices [17-18]. For notation purposes and without loss of generality, $\rho(G)$ and $G^{\mathrm{T}}$ denote the rank and the (conjugate) transpose of a rectangular matrix $G \in \mathbb{M}^{m \times n}$, where $G^{\mathrm{T}} = G^{\mathrm{H}}$ is the transpose-conjugate of $G$ when $\mathbb{M}^{m \times n} = \mathbb{C}^{m \times n}$, or $G^{\mathrm{T}}$ is the ordinary transpose of $G$ when $\mathbb{M}^{m \times n} = \mathbb{R}^{m \times n}$, respectively.

*Definition 2*. Let $G \in \mathbb{M}^{m \times n}$ with $\rho(G) = \min(m,n) \neq 0$. Then, there exists a matrix $G^{\dagger} \in \mathbb{M}^{n \times m}$ (identified as the Generalized-Inverse) such that it satisfies several conditions for the following cases:

**case i**: if $m > n$ and $\rho(G) = \min(m,n) = n$, then there exists a unique matrix $G^{\dagger} = G^{+} \in \mathbb{M}^{n \times m}$ (identified as Left-Pseudoinverse: LPI) such that $G^{+}G = I_n$, satisfying: a) $GG^{+}G = G$, and b) $G^{+}GG^{+} = G^{+}$. Therefore, the LPI matrix is proposed as $G^{+} = \left(G^{\mathrm{T}}G\right)^{-1}G^{\mathrm{T}}$.

**case ii**: if $m = n$ and $\det(G) \neq 0$ (i.e. $\rho(G) = n$), then there exists a unique matrix $G^{\dagger} = G^{-1} \in \mathbb{M}^{n \times n}$ (identified as Inverse) such that $G^{-1}G = GG^{-1} = I_n$.

**case iii**: if $m < n$ and $\rho(G) = \min(m,n) = m$, then there exists a unique matrix $G^{\dagger} = G^{+} \in \mathbb{M}^{n \times m}$ (identified as Right-Pseudoinverse: RPI) such that $GG^{+} = I_m$, satisfying: a) $GG^{+}G = G$, and b) $G^{+}GG^{+} = G^{+}$. Therefore, the RPI matrix is proposed as $G^{+} = G^{\mathrm{T}}\left(GG^{\mathrm{T}}\right)^{-1}$. ■

Given the mathematical structure for $G^{\dagger}$ provided in Definition 2, it can be easily validated that: 1) for the LPI matrix stipulated in case i, $GG^{\dagger}G = G$ and $G^{\dagger}GG^{\dagger} = G^{\dagger}$ with $G^{\dagger} = \left(G^{\mathrm{T}}G\right)^{-1}G^{\mathrm{T}}$; 2) for the RPI matrix stipulated in case iii, $GG^{\dagger}G = G$ and $G^{\dagger}GG^{\dagger} = G^{\dagger}$ with $G^{\dagger} = G^{\mathrm{T}}\left(GG^{\mathrm{T}}\right)^{-1}$; 3) for the Inverse in case ii, $G^{-1} = \left(G^{\mathrm{T}}G\right)^{-1}G^{\mathrm{T}} = G^{\mathrm{T}}\left(GG^{\mathrm{T}}\right)^{-1}$. For a uniqueness test in all cases, assume the existence of matrices $G_1^{\dagger} \in \mathbb{M}^{n \times m}$ and $G_2^{\dagger} \in \mathbb{M}^{n \times m}$ such that $G_1^{\dagger}G = I_n$ and $G_2^{\dagger}G = I_n$ (for case i), and $GG_1^{\dagger} = I_m$ and $GG_2^{\dagger} = I_m$ (for case iii). Notice immediately that $\left(G_1^{\dagger} - G_2^{\dagger}\right)G = 0_n$ (for case i) and $G\left(G_1^{\dagger} - G_2^{\dagger}\right) = 0_m$ (for case iii), which obligates $G_1^{\dagger} = G_2^{\dagger}$ in both cases, because of the full-rank properties of $G$. Clearly, case ii is a particular consequence of cases i and iii.
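The LPI and RPI constructions of Definition 2, together with the Penrose-type conditions a) and b), can be exercised numerically; the sketch below uses assumed random full-rank matrices:

```python
import numpy as np

rng = np.random.default_rng(4)

# case i (m > n): tall full-column-rank matrix, LPI G⁺ = (GᵀG)⁻¹Gᵀ.
G_tall = rng.normal(size=(6, 3))
lpi = np.linalg.inv(G_tall.T @ G_tall) @ G_tall.T
assert np.allclose(lpi @ G_tall, np.eye(3))           # G⁺G = Iₙ
assert np.allclose(G_tall @ lpi @ G_tall, G_tall)     # a) GG⁺G = G
assert np.allclose(lpi @ G_tall @ lpi, lpi)           # b) G⁺GG⁺ = G⁺

# case iii (m < n): wide full-row-rank matrix, RPI G⁺ = Gᵀ(GGᵀ)⁻¹.
G_wide = rng.normal(size=(3, 6))
rpi = G_wide.T @ np.linalg.inv(G_wide @ G_wide.T)
assert np.allclose(G_wide @ rpi, np.eye(3))           # GG⁺ = Iₘ
assert np.allclose(G_wide @ rpi @ G_wide, G_wide)     # a) GG⁺G = G
assert np.allclose(rpi @ G_wide @ rpi, rpi)           # b) G⁺GG⁺ = G⁺
```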

### **5. The MIMO channel matrix**


The MIMO channel matrix is the mathematical representation for modeling the degradation phenomena of the RFC scenario presented in (2). The elements $h_{ij}$ in $H \in \mathbb{C}^{n_R \times n_T}$ represent a time-invariant transfer function (possessing spectral information about magnitude and phase profiles) between the j-th transmitter and the i-th receiver antennas. Once again, the dynamical properties of physical phenomena<sup>4</sup> such as path loss, shadowing, multipath, Doppler spreading, coherence time, absorption, reflection, scattering, diffraction, base-station/user motion, antenna physical properties and dimensions, and information correlation, associated with a slow-flat quasi-static RFC scenario (proper of non-LOS indoor wireless environments), are condensed into a statistical model represented by the matrix $H$. For $H^{\dagger}$ purposes, CSI is a necessary feature required at the reception part in (2), as well as the $n_R \geq n_T$ condition. Table 1 provides several $n_R \times n_T$ MIMO channel matrix realizations for RFC-based environments [19-21]. In Table 1: a) MIMO$(n_R, n_T)$ refers to the MIMO communication link configuration, i.e. the number of receiver-end and transmitter-end elements; b) $H_{\mathrm{m}}$ refers to a MIMO channel matrix realization; c) $H_{\mathrm{m}}^{+}$ refers to the corresponding LPI, computed as $H_{\mathrm{m}}^{+} = \left(H_{\mathrm{m}}^{\mathrm{H}}H_{\mathrm{m}}\right)^{-1}H_{\mathrm{m}}^{\mathrm{H}}$; d) h is the blockwise matrix version of $H_{\mathrm{m}}$; e) $\mathrm{h}^{+}$ refers to the corresponding LPI, computed as $\mathrm{h}^{+} = \left(\mathrm{h}^{\mathrm{T}}\mathrm{h}\right)^{-1}\mathrm{h}^{\mathrm{T}}$. As an additional point of analysis, the full-rank properties of $H$ and h (and thus the existence of the matrices $H^{+}$, $H^{-1}$, $\mathrm{h}^{+}$, and $\mathrm{h}^{-1}$) are validated and corroborated through a MATLAB simulation-driven model regarding frequency-selective and time-invariant properties for several RFC-based scenarios at different MIMO configurations. Experimental data were generated from $10^6$ MIMO channel matrix realizations. As illustrated in Figure 3, a common pattern is found in the statistical evolution of the full-rank properties of $H$ and h with $n_R \geq n_T$ at several typical MIMO configurations, for instance MIMO(2,2), MIMO(4,2), and MIMO(4,4). Therein, REAL(H,h) is plotted against IMAG(H,h), where the axis labels denote respectively the real and imaginary parts of: a) $\det(H)$ and $\det(\mathrm{h})$ when $n_R = n_T$, and b) $\det\left(H^{\mathrm{H}}H\right)$ and $\det\left(\mathrm{h}^{\mathrm{T}}\mathrm{h}\right)$ when $n_R > n_T$. Blue crosses indicate the behavior of $\rho(H)$ related to $\det(H)$ and $\det\left(H^{\mathrm{H}}H\right)$

<sup>4</sup> We suggest the reader consult references [11-16] for a detailed and clear explanation of these narrowband and wideband physical phenomena present in wireless MIMO communication systems.

(det(H) legend on the top-left margin), while red crosses indicate the behavior of $\rho(\mathrm{h})$ related to $\det(\mathrm{h})$ and $\det\left(\mathrm{h}^{\mathrm{T}}\mathrm{h}\right)$ (det(h) legend on the top-left margin). The black-circled zone intersected with black-dotted lines locates the value $0 + 0j$. As depicted in Figures 4-5, a closer glance at this statistical behavior reveals a prevalence of the full-rank properties of $H$ and h, meaning that none of the determinants $\det(H)$, $\det(\mathrm{h})$, $\det\left(H^{\mathrm{H}}H\right)$, and $\det\left(\mathrm{h}^{\mathrm{T}}\mathrm{h}\right)$ is equal to zero (behavior enclosed by the light-blue region and delimited by blue/red-dotted lines).

**Figure 3.** MIMO channel matrix rank-determinant behavior for several realizations of $H$ and h. This statistical evolution is a common pattern found for several MIMO configurations involving slow-flat quasi-static RFC-based environments with $n_R \geq n_T$.
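The MATLAB rank experiment described above can be reproduced in outline with NumPy. This sketch (with far fewer trials than the chapter's $10^6$, and an assumed MIMO(4,2) configuration) checks that every generated realization of $H$ and h passes the determinant tests just described:

```python
import numpy as np

rng = np.random.default_rng(5)
n_R, n_T = 4, 2
trials = 10_000   # the chapter uses 10^6; reduced here for speed

min_det = np.inf
for _ in range(trials):
    H = rng.normal(0, np.sqrt(0.5), (n_R, n_T)) \
        + 1j * rng.normal(0, np.sqrt(0.5), (n_R, n_T))
    h = np.block([[H.real, -H.imag], [H.imag, H.real]])
    # Since n_R > n_T here, test det(H^H H) and det(hᵀh) instead of det(H).
    dH = abs(np.linalg.det(H.conj().T @ H))
    dh = abs(np.linalg.det(h.T @ h))
    min_det = min(min_det, dH, dh)

# Full rank in every realization (almost surely, for Gaussian entries).
print(min_det > 0.0)
```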



**Table 1.** MIMO channel matrix realizations for several MIMO communication link configurations at slow-flat quasi-static RFC scenarios.

**Figure 4.** MIMO channel matrix rank-determinant behavior for several realizations of $\mathbf{H}$. Full-rank properties for $\mathbf{H}$ and $\mathbf{H}^{H}\mathbf{H}$ prevail for RFC-based environments (light-blue region delimited by blue-dotted lines).

**Figure 5.** MIMO channel matrix rank-determinant behavior for several realizations of $\mathbf{h}$. Full-rank properties for $\mathbf{h}$ and $\mathbf{h}^{T}\mathbf{h}$ prevail for RFC-based environments (light-blue region delimited by red-dotted line).

### **6. Proposed algorithm**

150 Linear Algebra – Theorems and Applications


The proposal of a novel algorithm for computing an LPI matrix $\mathbf{h}^{+} \in \mathbb{R}^{2n_T \times 2n_R}$ (with $n_R \geq n_T$) is based on the block-matrix structure of $\mathbf{h}$ as exhibited in (4). This idea is an extension of the approach presented in [22]. The existence of this Generalized-Inverse matrix is supported by the statistical properties of the slow-flat quasi-static RFC scenario, which impact directly on the non-singularity of $\mathbf{H}$ at every MIMO channel matrix realization. Keeping in mind that other approaches attempting to solve the block-matrix inversion problem [7-10] require several constraints and conditions, the subsequent proposal does not require any restriction at all, mainly due to the aforementioned properties of $\mathbf{H}$. From (4), it is suggested

that

$$\begin{bmatrix} \hat{\mathbf{x}}_r \\ \hat{\mathbf{x}}_i \end{bmatrix} \text{ is somehow related to } \begin{bmatrix} \Re\left\{\mathbf{H}^{+}\right\} & -\Im\left\{\mathbf{H}^{+}\right\} \\ \Im\left\{\mathbf{H}^{+}\right\} & \Re\left\{\mathbf{H}^{+}\right\} \end{bmatrix} \cdot \begin{bmatrix} \mathbf{y}_r \\ \mathbf{y}_i \end{bmatrix};$$

hence, calculating $\mathbf{h}^{+}$ will lead to this

solution. Let $\mathbf{A} = \Re\{\mathbf{H}\}$ and $\mathbf{B} = \Im\{\mathbf{H}\}$. It is known a priori that $\rho(\mathbf{A} + j\mathbf{B}) = n_T$. Then $\mathbf{h} = \begin{bmatrix} \mathbf{A} & -\mathbf{B} \\ \mathbf{B} & \mathbf{A} \end{bmatrix}$

with $\mathbf{h} \in \mathbb{R}^{N_r \times N_t}$ and $N_t = 2n_T$. Define the matrix $\tilde{\boldsymbol{\Omega}} = \mathbf{h}^{T}\mathbf{h} \in \mathbb{R}^{N_t \times N_t}$, where $\tilde{\boldsymbol{\Omega}} = \begin{bmatrix} \mathbf{M} & -\mathbf{L} \\ \mathbf{L} & \mathbf{M} \end{bmatrix}$, with

$\mathbf{M} = \mathbf{A}^{T}\mathbf{A} + \mathbf{B}^{T}\mathbf{B} \in \mathbb{R}^{n_T \times n_T}$, $\mathbf{L} = \mathbf{A}^{T}\mathbf{B} - \mathbf{B}^{T}\mathbf{A} \in \mathbb{R}^{n_T \times n_T}$, and $N_r \geq N_t$ as a direct consequence of $N_r = 2n_R$ and $N_t = 2n_T$. It can be seen that

$$\mathbf{h}^{+} = \tilde{\mathbf{\Omega}}^{-1} \mathbf{h}^{\mathbf{T}} \in \mathbb{R}^{N\_{t} \times N\_{r}} \tag{12}$$
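A compact numerical check of this construction, assuming only the definitions above (a NumPy stand-in for the chapter's MATLAB framework): it builds $\mathbf{h}$ from a random realization, confirms the $\begin{bmatrix}\mathbf{M} & -\mathbf{L}\\ \mathbf{L} & \mathbf{M}\end{bmatrix}$ structure of $\tilde{\boldsymbol{\Omega}} = \mathbf{h}^{T}\mathbf{h}$, verifies that (12) reproduces the real/imaginary stacking of $\mathbf{H}^{+}$, and verifies the blockwise closed form of $\tilde{\boldsymbol{\Omega}}^{-1}$ invoked below.

```python
import numpy as np

rng = np.random.default_rng(2)
nR, nT = 4, 2
H = (rng.standard_normal((nR, nT)) + 1j * rng.standard_normal((nR, nT))) / np.sqrt(2)
A, B = H.real, H.imag

# Real-valued block embedding of the complex channel: h in R^{Nr x Nt}.
h = np.block([[A, -B], [B, A]])

# Omega = h^T h inherits the [[M, -L], [L, M]] structure.
Omega = h.T @ h
M = A.T @ A + B.T @ B
L = A.T @ B - B.T @ A
assert np.allclose(Omega, np.block([[M, -L], [L, M]]))

# Equation (12): the real LPI h+ = Omega^{-1} h^T stacks Re/Im parts of H+.
h_plus = np.linalg.inv(Omega) @ h.T
H_plus = np.linalg.inv(H.conj().T @ H) @ H.conj().T
assert np.allclose(h_plus, np.block([[H_plus.real, -H_plus.imag],
                                     [H_plus.imag, H_plus.real]]))

# Lemma-1 style closed form: Omega^{-1} = [[Q, P], [-P, Q]] with
# Q = (M + L M^{-1} L)^{-1}, X = L M^{-1}, and P = Q X.
Minv = np.linalg.inv(M)
Q = np.linalg.inv(M + L @ Minv @ L)
P = Q @ (L @ Minv)
assert np.allclose(np.linalg.inv(Omega), np.block([[Q, P], [-P, Q]]))
```

The last check works because $\mathbf{M}$ is symmetric positive definite and $\mathbf{L}$ is antisymmetric, so the Schur complement $\mathbf{M} + \mathbf{L}\mathbf{M}^{-1}\mathbf{L}$ is invertible whenever $\tilde{\boldsymbol{\Omega}}$ is.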

For simplicity, the matrix operations involved in (12) require classic multiply-and-accumulate operations between row-entries of $\tilde{\boldsymbol{\Omega}}^{-1} \in \mathbb{R}^{N_t \times N_t}$ and column-entries of $\mathbf{h}^{T} \in \mathbb{R}^{N_t \times N_r}$. Notice immediately that the critical and essential task of computing $\mathbf{h}^{+}$ relies on finding the block matrix inverse $\tilde{\boldsymbol{\Omega}}^{-1}$.<sup>5</sup> The strategy to be followed in order to solve $\tilde{\boldsymbol{\Omega}}^{-1}$ in (12) consists of the following steps: 1) the proposition of a partitioning without any restriction on rank-deficiency over inner matrix sub-blocks; 2) the definition of iterative multiply-and-accumulate operations within the sub-blocks comprised in $\tilde{\boldsymbol{\Omega}}$; 3) the recursive definition for compacting the overall blockwise matrix inversion. Keep in mind that the matrix $\tilde{\boldsymbol{\Omega}}$ can also be

viewed entrywise as $\tilde{\boldsymbol{\Omega}} = \begin{bmatrix} \tilde{\omega}_{1,1} & \cdots & \tilde{\omega}_{1,N_t} \\ \vdots & \ddots & \vdots \\ \tilde{\omega}_{N_t,1} & \cdots & \tilde{\omega}_{N_t,N_t} \end{bmatrix}$. The symmetry present in $\begin{bmatrix} \mathbf{M} & -\mathbf{L} \\ \mathbf{L} & \mathbf{M} \end{bmatrix}$ will motivate

the development of the pertinent LPI-based algorithm. From (12), and by the use of Lemma 1, it can be concluded that $\tilde{\boldsymbol{\Omega}}^{-1} = \begin{bmatrix} \mathbf{Q} & \mathbf{P} \\ -\mathbf{P} & \mathbf{Q} \end{bmatrix}$, where $\mathbf{Q} = \left(\mathbf{M} + \mathbf{L}\mathbf{M}^{-1}\mathbf{L}\right)^{-1} \in \mathbb{R}^{n_T \times n_T}$, $\mathbf{P} = \mathbf{Q}\mathbf{X} \in \mathbb{R}^{n_T \times n_T}$,

<sup>5</sup> Notice that $(\mathbf{A} + j\mathbf{B})^{+} = (\mathbf{M} + j\mathbf{L})^{-1}(\mathbf{A} + j\mathbf{B})^{H}$. Moreover, $(\mathbf{M} + j\mathbf{L})^{-1} = \left(\mathbf{M} + \mathbf{L}\mathbf{M}^{-1}\mathbf{L}\right)^{-1} - j\left(\mathbf{L} + \mathbf{M}\mathbf{L}^{-1}\mathbf{M}\right)^{-1} \in \mathbb{C}^{n_T \times n_T}$.

and $\mathbf{X} = \mathbf{L}\mathbf{M}^{-1} \in \mathbb{R}^{n_T \times n_T}$. Interestingly enough, full rank is identified at each matrix sub-block in the main diagonal of $\tilde{\boldsymbol{\Omega}}$ (besides $\rho(\mathbf{Q}) = n_T$). This structural behavior serves as the leitmotiv for the construction of an algorithm for computing the blockwise inverse $\tilde{\boldsymbol{\Omega}}^{-1}$. Basically speaking, and concerning step 1) of this strategy, the matrix partition procedure obeys the assignments (13)-(16) defined as:

$$\mathcal{W}_{k} = \begin{bmatrix} \tilde{\omega}_{N_t-(2k+1),\,N_t-(2k+1)} & \tilde{\omega}_{N_t-(2k+1),\,N_t-2k} \\ \tilde{\omega}_{N_t-2k,\,N_t-(2k+1)} & \tilde{\omega}_{N_t-2k,\,N_t-2k} \end{bmatrix} \in \mathbb{R}^{2 \times 2} \tag{13}$$

$$X_{k} = \begin{bmatrix} \tilde{\omega}_{N_t-(2k+1),\,N_t-(2k-1)} & \cdots & \tilde{\omega}_{N_t-(2k+1),\,N_t} \\ \tilde{\omega}_{N_t-2k,\,N_t-(2k-1)} & \cdots & \tilde{\omega}_{N_t-2k,\,N_t} \end{bmatrix} \in \mathbb{R}^{2 \times 2k} \tag{14}$$

$$Y_{k} = \begin{bmatrix} \tilde{\omega}_{N_t-(2k-1),\,N_t-(2k+1)} & \tilde{\omega}_{N_t-(2k-1),\,N_t-2k} \\ \vdots & \vdots \\ \tilde{\omega}_{N_t,\,N_t-(2k+1)} & \tilde{\omega}_{N_t,\,N_t-2k} \end{bmatrix} \in \mathbb{R}^{2k \times 2} \tag{15}$$

$$Z_{0} = \begin{bmatrix} \tilde{\omega}_{N_t-1,\,N_t-1} & \tilde{\omega}_{N_t-1,\,N_t} \\ \tilde{\omega}_{N_t,\,N_t-1} & \tilde{\omega}_{N_t,\,N_t} \end{bmatrix} \in \mathbb{R}^{2 \times 2} \tag{16}$$

The matrix partition over $\tilde{\boldsymbol{\Omega}}$ obeys the index $k = 1:1:(N_t/2 - 1)$. Because of the even dimensions of $\tilde{\boldsymbol{\Omega}}$, the matrix owns exactly $N_t/2 = n_T$ sub-block matrices of dimension $2 \times 2$ along its main diagonal. Interestingly enough, due to the RFC-based environment characteristics studied in (1) and (4), it is found that:

$$
\rho\left(\mathcal{W}\_k\right) = \rho\left(Z\_0\right) = 2 \tag{17}
$$
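The partition rules (13)-(16) translate directly into array slices. A hypothetical NumPy helper is sketched below (0-based indexing, whereas the chapter indexes entries from 1; the random $\mathbf{A}$, $\mathbf{B}$ stand in for a channel realization):

```python
import numpy as np

def partition(Omega, k):
    """Sub-blocks (13)-(15) of Omega (Nt x Nt, Nt even), for 1-based index k."""
    Nt = Omega.shape[0]
    r = Nt - 2 * k - 2                     # 0-based row of W_k's top-left corner
    W = Omega[r:r + 2, r:r + 2]            # 2 x 2,   eq. (13)
    X = Omega[r:r + 2, r + 2:Nt]           # 2 x 2k,  eq. (14)
    Y = Omega[r + 2:Nt, r:r + 2]           # 2k x 2,  eq. (15)
    return W, X, Y

rng = np.random.default_rng(4)
nT = 4
A = rng.standard_normal((6, nT)); B = rng.standard_normal((6, nT))
h = np.block([[A, -B], [B, A]])
Omega = h.T @ h                            # Nt x Nt with Nt = 2 nT = 8

Z0 = Omega[-2:, -2:]                       # eq. (16)
assert np.linalg.matrix_rank(Z0) == 2
for k in range(1, nT):                     # k = 1 .. Nt/2 - 1
    W, X, Y = partition(Omega, k)
    assert W.shape == (2, 2) and X.shape == (2, 2 * k) and Y.shape == (2 * k, 2)
    assert np.linalg.matrix_rank(W) == 2   # eq. (17) holds for full-rank h
```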

After establishing these structural characteristics of $\tilde{\boldsymbol{\Omega}}$, and with the use of (13)-(16), step 2) of the strategy consists of the following iterative operations, also indexed by $k = 1:1:(N_t/2 - 1)$:

$$
\phi\_k = \mathcal{W}\_k - X\_k Z\_{k-1}^{-1} Y\_k \tag{18}
$$

$$
\alpha\_k = \phi\_k^{-1} X\_k Z\_{k-1}^{-1} \tag{19}
$$

$$
\theta_k = Z_{k-1}^{-1} + Z_{k-1}^{-1} Y_k \alpha_k \tag{20}
$$

Here: $Z_{k-1}^{-1} \in \mathbb{R}^{2k \times 2k}$, $\phi_k \in \mathbb{R}^{2 \times 2}$, $\alpha_k \in \mathbb{R}^{2 \times 2k}$, and $\theta_k \in \mathbb{R}^{2k \times 2k}$. The steps stated in (18)-(20) help to construct intermediate sub-blocks as

Partition-Matrix Theory Applied to the Computation of Generalized-Inverses for MIMO Systems in Rayleigh Fading Channels 153

$$\tilde{\boldsymbol{\Omega}}_{k} = \begin{bmatrix} \underset{2 \times 2}{\mathcal{W}_{k}} & \underset{2 \times 2k}{X_{k}} \\ \underset{2k \times 2}{Y_{k}} & \underset{2k \times 2k}{Z_{k-1}} \end{bmatrix} \;\rightarrow\; \tilde{\boldsymbol{\Omega}}_{k}^{-1} = \begin{bmatrix} \underset{2 \times 2}{\phi_{k}^{-1}} & \underset{2 \times 2k}{-\alpha_{k}} \\ \underset{2k \times 2}{-\theta_{k} Y_{k} \mathcal{W}_{k}^{-1}} & \underset{2k \times 2k}{\theta_{k}} \end{bmatrix} \tag{21}$$
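Steps (18)-(21) can be assembled into a short sketch of the whole iterative-recursive inversion (a NumPy stand-in for the chapter's MATLAB implementation; the only explicit inversions are $2 \times 2$, apart from the reuse of the previous $Z_{k-1}^{-1}$, and `np.linalg.inv(Omega)` appears only to check the result):

```python
import numpy as np

rng = np.random.default_rng(5)
nT = 4
A = rng.standard_normal((6, nT)); B = rng.standard_normal((6, nT))
h = np.block([[A, -B], [B, A]])
Omega = h.T @ h
Nt = 2 * nT

# Start of the recursion: Z_0^{-1} is the trailing 2 x 2 block's inverse.
Z_inv = np.linalg.inv(Omega[-2:, -2:])

for k in range(1, nT):                     # k = 1 .. Nt/2 - 1
    r = Nt - 2 * k - 2
    W = Omega[r:r + 2, r:r + 2]            # eq. (13)
    X = Omega[r:r + 2, r + 2:Nt]           # eq. (14)
    Y = Omega[r + 2:Nt, r:r + 2]           # eq. (15)

    phi = W - X @ Z_inv @ Y                # eq. (18), 2 x 2
    phi_inv = np.linalg.inv(phi)
    alpha = phi_inv @ X @ Z_inv            # eq. (19)
    theta = Z_inv + Z_inv @ Y @ alpha      # eq. (20)

    # eq. (21): inverse of the enlarged trailing block, reused as Z_k^{-1}.
    Z_inv = np.block([[phi_inv, -alpha],
                      [-theta @ Y @ np.linalg.inv(W), theta]])

# After the last step Z_inv = Omega^{-1}, so h+ = Z_inv @ h^T per eq. (12).
assert np.allclose(Z_inv, np.linalg.inv(Omega))
h_plus = Z_inv @ h.T
assert np.allclose(h_plus @ h, np.eye(Nt))
```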


The dimensions of each real-valued sub-block in (21) are indicated consistently.<sup>6</sup> For step 3) of the strategy, a recursion step $Z_k^{-1}(Z_{k-1}^{-1})$ is provided in terms of the assignment $Z_k^{-1} = \tilde{\boldsymbol{\Omega}}_k^{-1} \in \mathbb{R}^{2(k+1) \times 2(k+1)}$. Clearly, only inversions of $\mathcal{W}_k$, $Z_0$, and $\phi_k$ (which are $2 \times 2$ matrices, yielding correspondingly $\mathcal{W}_k^{-1}$, $Z_0^{-1}$, and $\phi_k^{-1}$) are required to be performed throughout this iterative-recursive process, unlike the operation linked to $Z_{k-1}^{-1}$, which comes from a previous updating step associated with the recursion belonging to $Z_k^{-1}$. Although $\rho(\tilde{\boldsymbol{\Omega}}) = N_t$ assures the existence of $\tilde{\boldsymbol{\Omega}}^{-1}$, the full-rank requirements outlined in (17) and non-zero determinants for (18) are strongly needed for this iterative-recursive algorithm to work accordingly. Also, full rank is expected for every recursive outcome related to $Z_k^{-1}(Z_{k-1}^{-1})$. Again, thanks to the characteristics of the slow-flat quasi-static RFC-based environment in which these operations are involved at every MIMO channel matrix realization, the conditions in (17) and the full rank of (18) are always satisfied. These issues are corroborated with the aid of the same MATLAB-based simulation framework used to validate the full-rank properties of $\mathbf{H}$ and $\mathbf{h}$. The statistical evolution of the determinants of $\mathcal{W}_k$, $Z_0$, and $\phi_k$, and the behavior of singularity within the $Z_k^{-1}(Z_{k-1}^{-1})$ recursion, are respectively illustrated in Figures 6-8. MIMO$(2,2)$, MIMO$(4,2)$, and MIMO$(4,4)$ were the MIMO communication link configurations considered for these tests. These simulation-driven outcomes provide supportive evidence for the proper functionality of the proposed iterative-recursive algorithm for computing $\tilde{\boldsymbol{\Omega}}^{-1}$ involving matrix sub-block inversions.
In each figure, the statistical evolution of the determinants associated with $Z_0$, $\mathcal{W}_k$, $\phi_k$, and $Z_k^{-1}(Z_{k-1}^{-1})$ is respectively indicated by the labels det(Zo), det(Wk), det(Fik), and det(iZk,iZkm1), while the light-blue zone at the bottom delimited by a red-dotted line exhibits the gap which marks the avoidance of rank-deficiency over the involved matrices. The zero-determinant value is marked with a black circle.

The next point of analysis for the behavior of the $\mathbf{h}^{+}$ LPI-based iterative-recursive algorithm is complexity, which in essence will consist of a demand for matrix partitions (amount of matrix sub-blocks: PART) and arithmetic operations (amount of additions-subtractions: ADD-SUB; multiplications: MULT; and divisions: DIV). Let PART-mtx and ARITH-ops be the nomenclature for the complexity cost related to matrix partitions and arithmetic operations, respectively. Without loss of generality, define $\mathbb{C}[\cdot]$ as the complexity in terms of the

<sup>6</sup> The matrix structure given in (21) is directly derived from applying Equation (6), and by the use of Lemma 1, as $\tilde{\boldsymbol{\Omega}}_k^{-1} = \begin{bmatrix} \left(\mathcal{W}_k - X_k Z_{k-1}^{-1} Y_k\right)^{-1} & -\phi_k^{-1} X_k Z_{k-1}^{-1} \\ -Z_{k-1}^{-1} Y_k \phi_k^{-1} & Z_{k-1}^{-1} + Z_{k-1}^{-1} Y_k \phi_k^{-1} X_k Z_{k-1}^{-1} \end{bmatrix}$. See that this expansion is preferable to the alternative built around $\left(Z_{k-1} - Y_k \mathcal{W}_k^{-1} X_k\right)^{-1}$, which is undesirable due to an unnecessary matrix operation overhead: it would invert $Z_{k-1} - Y_k \mathcal{W}_k^{-1} X_k$ directly instead of reusing $Z_{k-1}^{-1}$, which comes preferably from the $Z_{k-1}^{-1}(Z_{k-2}^{-1})$ recursion.

costs PART-mtx and ARITH-ops belonging to the operations involved in $\tilde{\boldsymbol{\Omega}}$. Henceforth, $\mathbb{C}[\mathbf{h}^{+}] = \mathbb{C}[\tilde{\boldsymbol{\Omega}}^{-1}] + \mathbb{C}[\tilde{\boldsymbol{\Omega}}^{-1}\mathbf{h}^{T}]$ denotes the cost of computing $\mathbf{h}^{+}$ as the sum of the costs of inverting $\tilde{\boldsymbol{\Omega}}$ and multiplying $\tilde{\boldsymbol{\Omega}}^{-1}$ by $\mathbf{h}^{T}$. It is evident that: a) $\mathbb{C}[\tilde{\boldsymbol{\Omega}}^{-1}\mathbf{h}^{T}]$ implies PART $= 0$ and ARITH-ops itemized into MULT $= 8 n_R n_T^2$, ADD-SUB $= 4 n_R n_T (2 n_T - 1)$, and DIV $= 0$; b) $\mathbb{C}[\tilde{\boldsymbol{\Omega}}^{-1}] = \mathbb{C}[\mathbf{h}^{T}\mathbf{h}] + \mathbb{C}[(\mathbf{h}^{T}\mathbf{h})^{-1}]$. Clearly, $\mathbb{C}[\mathbf{h}^{T}\mathbf{h}]$ demands no partitions at all, but has an ARITH-ops cost of MULT $= 8 n_R n_T^2$ and ADD-SUB $= 4 n_T^2 (2 n_R - 1)$. However, the principal complexity relies critically on $\mathbb{C}[(\mathbf{h}^{T}\mathbf{h})^{-1}]$, which is the backbone for $\mathbf{h}^{+}$, as presented in [22]. Table 2 summarizes these complexity results. For this treatment, $\mathbb{C}[(\mathbf{h}^{T}\mathbf{h})^{-1}]$ consists of $3 n_T - 2$ partitions, MULT $= 6 + \sum_{k=1}^{n_T - 1} C_k^{I}$, ADD-SUB $= 1 + \sum_{k=1}^{n_T - 1} C_k^{II}$, and DIV $= 1 + \sum_{k=1}^{n_T - 1} C_k^{III}$. The ARITH-ops cost depends on $C_k^{I}$, $C_k^{II}$, and $C_k^{III}$; the constant factors for each one of these items are proper of the complexity presented in $\mathbb{C}[Z_0^{-1}]$. The remainder of the complexities, i.e. $C_k^{I}$, $C_k^{II}$, and $C_k^{III}$, are calculated according to the iterative steps defined in (18)-(20) and (21), particularly expressed in terms of

$$\mathbb{C}\left[\boldsymbol{\phi}\_{k}^{-1}\right] + \mathbb{C}\left[-\boldsymbol{\alpha}\_{k}\right] + \mathbb{C}\left[-\theta\_{k}\boldsymbol{Y}\_{k}\boldsymbol{W}\_{k}^{-1}\right] + \mathbb{C}\left[\boldsymbol{\theta}\_{k}\right] \tag{22}$$

It can be checked that: a) no PART-mtx cost is required; b) the ARITH-ops cost employs (22) for each item, yielding $C_k^{I} = 40k^2 + 24k + 12$ (for MULT), $C_k^{II} = 40k^2 + 2$ (for ADD-SUB), and $C_k^{III} = 2$ (for DIV).
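Taking the per-iteration counts above at face value (the closed-form constants are as quoted in the text, so the exact totals are illustrative), a small tally shows how the MULT cost grows on the order of $n_T^3$:

```python
def arith_ops(nT):
    """ARITH-ops tally for C[(h^T h)^{-1}] using the quoted per-iteration counts."""
    mult, add_sub, div = 6, 1, 1           # constant cost of the 2x2 inverse Z_0^{-1}
    for k in range(1, nT):                 # k = 1 .. n_T - 1
        mult += 40 * k**2 + 24 * k + 12    # C_k^I
        add_sub += 40 * k**2 + 2           # C_k^II
        div += 2                           # C_k^III
    return mult, add_sub, div

# MIMO(4,4) example: n_T = 4.
mult, add_sub, div = arith_ops(4)
assert (mult, add_sub, div) == (746, 567, 7)
```

The dominant $40k^2$ terms summed over $k = 1, \ldots, n_T - 1$ give the expected cubic growth in $n_T$.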

An illustrative application example is given next. It considers a MIMO channel matrix realization obeying the statistical behavior according to (1) and a MIMO$(4,4)$ configuration:

$$\mathbf{H} = \begin{bmatrix} 0.3059+0.7543j & 0.8107+0.2082j & 0.2314+0.4892j & 0.416+1.0189j \\ 1.1777+0.0419j & 0.8421+0.9448j & 0.1235+0.6067j & 1.5437+0.4039j \\ 0.0886+0.0676j & 0.8409+0.5051j & 0.132+0.8867j & 0.0964+0.2828j \\ 0.2034+0.5886j & 0.0266+1.148j & 0.5132+1.1269j & 0.0806+0.4879j \end{bmatrix} \in \mathbb{C}^{4 \times 4}$$

with $\rho(\mathbf{H}) = 4$. As a consequence,

$$\tilde{\boldsymbol{\Omega}} = \begin{bmatrix} 2.4516 & -1.2671 & 0.1362 & -2.7028 & 0 & -1.9448 & 0.6022 & -0.2002 \\ -1.2671 & 4.5832 & -1.7292 & 1.3776 & 1.9448 & 0 & -1.229 & -2.4168 \\ 0.1362 & -1.7292 & 3.0132 & 0.0913 & -0.6022 & 1.229 & 0 & 0.862 \\ -2.7028 & 1.3776 & 0.0913 & 4.0913 & 0.2002 & 2.4168 & -0.862 & 0 \\ 0 & 1.9448 & -0.6022 & 0.2002 & 2.4516 & -1.2671 & 0.1362 & -2.7028 \\ -1.9448 & 0 & 1.229 & 2.4168 & -1.2671 & 4.5832 & -1.7292 & 1.3776 \\ 0.6022 & -1.229 & 0 & -0.862 & 0.1362 & -1.7292 & 3.0132 & 0.0913 \\ -0.2002 & -2.4168 & 0.862 & 0 & -2.7028 & 1.3776 & 0.0913 & 4.0913 \end{bmatrix} \in \mathbb{R}^{8 \times 8}$$

with $\rho(\tilde{\boldsymbol{\Omega}}) = 8$.


**Figure 6.** Statistical evolution of the rank-determinant behaviour concerning $Z_0$, $\mathcal{W}_k$, $\phi_k$, and $Z_k^{-1}(Z_{k-1}^{-1})$ for a MIMO$(2,2)$ configuration.

**Figure 7.** Statistical evolution of the rank-determinant behaviour concerning $Z_0$, $\mathcal{W}_k$, $\phi_k$, and $Z_k^{-1}(Z_{k-1}^{-1})$ for a MIMO$(4,2)$ configuration.


**Figure 8.** Statistical evolution of the rank-determinant behaviour concerning $Z_0$, $\mathcal{W}_k$, $\phi_k$, and $Z_k^{-1}(Z_{k-1}^{-1})$ for a MIMO$(4,4)$ configuration.


**Table 2.** Complexity cost results of the LPI-based iterative-recursive algorithm for $\mathbf{h}^{+}$.

Applying the partition criteria (13)-(16) and given $k = 1:1:3$, the following matrix sub-blocks are generated:

$$W_1 = \begin{bmatrix} 2.4516 & -1.2671 \\ -1.2671 & 4.5832 \end{bmatrix}, \quad X_1 = \begin{bmatrix} 0.1362 & -2.7028 \\ -1.7292 & 1.3776 \end{bmatrix}, \quad Y_1 = \begin{bmatrix} 0.1362 & -1.7292 \\ -2.7028 & 1.3776 \end{bmatrix}, \quad Z_0 = \begin{bmatrix} 3.0132 & 0.0913 \\ 0.0913 & 4.0913 \end{bmatrix},$$

$$W_2 = \begin{bmatrix} 3.0132 & 0.0913 \\ 0.0913 & 4.0913 \end{bmatrix}, \quad X_2 = \begin{bmatrix} -0.6022 & 1.229 & 0 & 0.862 \\ 0.2002 & 2.4168 & -0.862 & 0 \end{bmatrix}, \quad Y_2 = \begin{bmatrix} -0.6022 & 0.2002 \\ 1.229 & 2.4168 \\ 0 & -0.862 \\ 0.862 & 0 \end{bmatrix},$$

$$W_3 = \begin{bmatrix} 2.4516 & -1.2671 \\ -1.2671 & 4.5832 \end{bmatrix}, \quad X_3 = \begin{bmatrix} 0.1362 & -2.7028 & 0 & -1.9448 & 0.6022 & -0.2002 \\ -1.7292 & 1.3776 & 1.9448 & 0 & -1.229 & -2.4168 \end{bmatrix},$$

$$\text{and} \quad Y_3 = \begin{bmatrix} 0.1362 & -1.7292 \\ -2.7028 & 1.3776 \\ 0 & 1.9448 \\ -1.9448 & 0 \\ 0.6022 & -1.229 \\ -0.2002 & -2.4168 \end{bmatrix}.$$


$$
\phi\_1 = \mathcal{W}\_1 - X\_1 Z\_0^{-1} Y\_1, \quad \alpha\_1 = \phi\_1^{-1} X\_1 Z\_0^{-1}, \quad \theta\_1 = Z\_0^{-1} + Z\_0^{-1} Y\_1 \alpha\_1 \tag{23}
$$

$$
\boldsymbol{\phi}\_{2} = \boldsymbol{W}\_{2} - \boldsymbol{X}\_{2}\boldsymbol{Z}\_{1}^{-1}\boldsymbol{Y}\_{2}, \quad \boldsymbol{\alpha}\_{2} = \boldsymbol{\phi}\_{2}^{-1}\boldsymbol{X}\_{2}\boldsymbol{Z}\_{1}^{-1}, \quad \boldsymbol{\theta}\_{2} = \boldsymbol{Z}\_{1}^{-1} + \boldsymbol{Z}\_{1}^{-1}\boldsymbol{Y}\_{2}\boldsymbol{\alpha}\_{2} \tag{24}
$$

$$
\phi_3 = W_3 - X_3 Z_2^{-1} Y_3, \quad \alpha_3 = \phi_3^{-1} X_3 Z_2^{-1}, \quad \theta_3 = Z_2^{-1} + Z_2^{-1} Y_3 \alpha_3 \tag{25}
$$

From (21), the matrix assignments related to the recursion $Z_k^{-1}(Z_{k-1}^{-1})$ produce the following intermediate blockwise matrix results:

$$Z_1^{-1}\left(Z_0^{-1}\right) = \tilde{\boldsymbol{\Omega}}_1^{-1} = \begin{bmatrix} \phi_1^{-1} & -\alpha_1 \\ -\theta_1 Y_1 W_1^{-1} & \theta_1 \end{bmatrix} = \begin{bmatrix} 1.5765 & 0.1235 & -0.0307 & 1.0005 \\ 0.1235 & 0.3332 & 0.1867 & -0.0348 \\ -0.0307 & 0.1867 & 0.4432 & -0.0930 \\ 1.0005 & -0.0348 & -0.0930 & 0.9191 \end{bmatrix},$$

$$Z_2^{-1}\left(Z_1^{-1}\right) = \tilde{\boldsymbol{\Omega}}_2^{-1} = \begin{bmatrix} \phi_2^{-1} & -\alpha_2 \\ -\theta_2 Y_2 W_2^{-1} & \theta_2 \end{bmatrix}$$


The last outcome from $Z_k^{-1}(Z_{k-1}^{-1})$ corresponds to $\tilde{\boldsymbol{\Omega}}^{-1} \in \mathbb{R}^{8 \times 8}$, and is further used for calculating $\mathbf{h}^{+} = \tilde{\boldsymbol{\Omega}}^{-1}\mathbf{h}^{T} \in \mathbb{R}^{8 \times 8}$. Moreover, notice that full-rank properties are always present in the matrices $Z_0$, $W_1$, $W_2$, $W_3$, $\phi_1$, $\phi_2$, $\phi_3$, $Z_1^{-1}$, $Z_2^{-1}$, and $Z_3^{-1}$.
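The $k = 1$ step of this example can be replayed numerically from the sub-blocks listed above. A NumPy sketch verifying the quoted $Z_1^{-1}$ to the printed four-decimal precision:

```python
import numpy as np

# Sub-blocks W_1, X_1, Y_1, Z_0 as listed in the example (k = 1 partition).
W1 = np.array([[2.4516, -1.2671], [-1.2671, 4.5832]])
X1 = np.array([[0.1362, -2.7028], [-1.7292, 1.3776]])
Y1 = np.array([[0.1362, -1.7292], [-2.7028, 1.3776]])
Z0 = np.array([[3.0132, 0.0913], [0.0913, 4.0913]])

Z0_inv = np.linalg.inv(Z0)
phi1 = W1 - X1 @ Z0_inv @ Y1               # eq. (23)
alpha1 = np.linalg.inv(phi1) @ X1 @ Z0_inv
theta1 = Z0_inv + Z0_inv @ Y1 @ alpha1

# Assemble Z_1^{-1} per eq. (21).
Z1_inv = np.block([[np.linalg.inv(phi1), -alpha1],
                   [-theta1 @ Y1 @ np.linalg.inv(W1), theta1]])

# Intermediate result quoted in the chapter (rounded to 4 decimal places).
Z1_inv_ref = np.array([[ 1.5765,  0.1235, -0.0307,  1.0005],
                       [ 0.1235,  0.3332,  0.1867, -0.0348],
                       [-0.0307,  0.1867,  0.4432, -0.0930],
                       [ 1.0005, -0.0348, -0.0930,  0.9191]])
assert np.allclose(Z1_inv, Z1_inv_ref, atol=5e-3)
```

The tolerance only absorbs the four-decimal rounding of the quoted values; the assembled block equals the exact inverse of $\begin{bmatrix} W_1 & X_1 \\ Y_1 & Z_0 \end{bmatrix}$ up to floating-point precision.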

### **7. VLSI implementation aspects**

The arithmetic operations present in the algorithm for computing $\mathbf{h}^{+}$ can be implemented in a modular-iterative fashion towards a VLSI (Very Large Scale of Integration) design. The partition strategy comprised in (13)-(16) provides modularity, while (18)-(20) is naturally associated with iterativeness; recursion is just used for constructing the matrix blocks in (21). Several well-studied aspects aid in implementing a further VLSI architecture [23-27] given the nature of the mathematical structure of the algorithm. For instance, systolic arrays [25-27] are a suitable choice for efficient, parallel-processing architectures concerning matrix multiplications-additions. Bidimensional processing arrays are typical architectural outcomes, whose design consists basically in interconnecting processing elements (PE) among different array layers. The configuration of each PE comes from projection or linear mapping techniques [25-27] derived from the multiplications and additions present in (18)-(20). Also, systolic arrays tend to concurrently perform the arithmetic operations dealing with the concatenated matrix multiplications $X_k Z_{k-1}^{-1} Y_k$, $\phi_k^{-1} X_k Z_{k-1}^{-1}$, $Z_{k-1}^{-1} Y_k \alpha_k$, and $\theta_k Y_k \mathcal{W}_k^{-1}$ present in (18)-(20). Consecutive additions inside every PE can be favourably implemented via Carry-Save-Adder (CSA) architectures [23-24], while multiplications may recur to Booth multipliers [23-24] in order to reduce latencies caused by adding accumulated partial products. The divisions present in $\mathcal{W}_k^{-1}$, $Z_0^{-1}$, and $\phi_k^{-1}$ can be built through regular shift-and-subtract modules or classic serial-parallel subtractors [23-24]; in fact, CORDIC (Coordinate Rotate Digital Computer) processors [23] are also employed and configured in order to solve numerical divisions.
The aforementioned architectural aspects offer an attractive and alternative framework for consolidating an ultimate VLSI design for implementing the $\mathbf{h}^{+}$ algorithm without compromising the overall system data throughput (intrinsically related to operation frequencies).

### **8. Conclusions**

This chapter presented the development of a novel iterative-recursive algorithm for computing a Left-Pseudoinverse (LPI) as a Generalized-Inverse for a MIMO channel matrix within a Rayleigh fading channel (RFC). The formulation of this algorithm consisted of the following steps: i) first, structural properties for the MIMO channel matrix acquired permanent full rank due to the statistical properties of the RFC scenario; ii) second, Partition-Matrix Theory was applied, allowing the generation of a block-matrix version of the MIMO channel matrix; iii) third, iterative addition-multiplication operations were applied to these matrix sub-blocks in order to construct blockwise sub-matrix inverses, recursively reusing them for obtaining the LPI. For accomplishing this purpose, the required mathematical background and MIMO systems concepts were provided to consolidate a solid scientific framework for understanding the context of the problem this algorithm attempts to solve. Proper functionality of this approach was validated through simulation-driven experiments, as well as through a worked example of the operation. As an additional remark, some VLSI aspects and architectures were outlined for implementing the basic arithmetic operations within the proposed LPI-based algorithm.

## **Author details**


P. Cervantes and L. F. González *Instituto Tecnológico y de Estudios Superiores de Monterrey, Campus Guadalajara, ITESM University, Mexico* 

F. J. Ortiz and A. D. García *Instituto Tecnológico y de Estudios Superiores de Monterrey, Campus Estado de México, ITESM University, Mexico* 

## **Acknowledgement**

This work was supported by CONACYT (National Council of Science and Technology) under the supervision, revision, and sponsorship of ITESM University (Instituto Tecnológico y de Estudios Superiores de Monterrey).

### **9. References**

	- [9] Choi Y (2009) New Form of Block Matrix Inversion. International Conference on Advanced Intelligent Mechatronics. July 2009: 1952-1957.
	- [10] Choi Y, and Cheong J. (2009) New Expressions of 2X2 Block Matrix Inversion and Their Application. IEEE Transactions on Automatic Control, vol. 54, no. 11. November 2009: 2648-2653.
	- [11] Fontán FP, and Espiñera PM (2008) Modeling the Wireless Propagation Channel. Wiley. 268 p.
	- [12] El-Hajjar M, and Hanzo L (2010) Multifunctional MIMO Systems: A Combined Diversity and Multiplexing Design Perspective. IEEE Wireless Communications. April 2010: 73-79.
	- [13] Biglieri E, *et al* (2007) MIMO Wireless Communications. Cambridge University Press: United Kingdom. 344 p.
	- [14] Jankiraman M (2004) Space-Time Codes and MIMO Systems. Artech House: United States. 327 p.
	- [15] Biglieri E, Proakis J, and Shamai S (1998) Fading Channels: Information-Theoretic and Communications Aspects. IEEE Transactions on Information Theory, vol. 44, no. 6. October 1998: 2619-2692.
	- [16] Almers P, Bonek E, Burr A, *et al* (2007) Survey of Channel and Radio Propagation Models for Wireless MIMO Systems. EURASIP Journal on Wireless Communications and Networking, vol. 2011, issue 1. January 2007: 19 p.
	- [17] Golub GH, and Van Loan CF (1996) Matrix Computations. The Johns Hopkins University Press. 694 p.
	- [18] Serre D (2001) Matrices: Theory and Applications. Springer Verlag. 202 p.
	- [19] R&S®. Rohde & Schwarz GmbH & Co. KG. WLAN 802.11n: From SISO to MIMO. Application Note: 1MA179\_9E. Available: www.rohde-schwarz.com: 59 p.
	- [20] © Agilent Technologies, Inc. (2008) Agilent MIMO Wireless LAN PHY Layer [RF] : Operation & Measurement: Application Note: 1509. Available: www.agilent.com: 48 p.
	- [21] Paul T, and Ogunfunmi T (2008) Wireless LAN Comes of Age : Understanding the IEEE 802.11n Amendment. IEEE Circuits and Systems Magazine. First Quarter 2008: 28-54.
	- [22] Cervantes P, González VM, and Mejía PA (2009) Left-Pseudoinverse MIMO Channel Matrix Computation. 19th International Conference on Electronics, Communications, and Computers (CONIELECOMP 2009). July 2009: 134-138.
	- [23] Milos E, and Tomas L (2004) Digital Arithmetic. Morgan Kauffmann Publishers. 709 p.
	- [24] Parhi KK (1999) VLSI Digital Signal Processing Systems: Design and Implementation. John Wiley & Sons. 784 p.
	- [25] Song SW (1994) Systolic Algorithms: Concepts, Synthesis, and Evolution. Institute of Mathematics, University of Sao Paulo, Brazil. Available: http://www.ime.usp.br/~song/ papers/cimpa.pdf. DOI number: 10.1.1.160.4057: 40 p.
	- [26] Kung SY (1985) VLSI Array Processors. IEEE ASSP Magazine. July 1985: 4-22.
	- [27] Jagadish HV, Rao SK, and Kailath T (1987) Array Architectures for Iterative Algorithms. Proceedings of the IEEE, vol. 75, no. 9. September 1987: 1304-1321.

## **Operator Means and Applications**

### Pattrawut Chansangiam



http://dx.doi.org/10.5772/46479

### **1. Introduction**

The theory of scalar means was developed from the time of the ancient Greek Pythagoreans up to the last century by many famous mathematicians; see the survey article [24] for the development of this subject. In the Pythagorean school, various means were defined via the method of proportions (in fact, they are solutions of certain algebraic equations). The theory of matrix and operator means started from the notion of the parallel sum, introduced as a tool for analyzing multi-port electrical networks in engineering; see [1]. Three classical means, namely the arithmetic mean, the harmonic mean and the geometric mean for matrices and operators, were then considered, e.g., in [3, 4, 11, 12, 23]. These means play crucial roles in matrix and operator theory as tools for studying the monotonicity and concavity of many interesting maps between algebras of operators; see the original idea in [3]. Another important mean in mathematics, namely the power mean, is considered in [6]. The parallel sum is characterized by certain properties in [22]. The parallel sum and these means share some common properties, which leads naturally to the definitions of the so-called connection and mean in the seminal paper [17]. This class of means covers many operator means used in practice. A major result of Kubo and Ando states that there are one-to-one correspondences between connections, operator monotone functions on the non-negative reals, and finite Borel measures on the extended half-line. The mean-theoretic approach has many applications in operator inequalities (see more information in Section 8), matrix and operator equations (see e.g. [2, 19]) and operator entropy. The concept of operator entropy plays an important role in mathematical physics. The *relative operator entropy* is defined in [13] for invertible positive operators *A*, *B* by

$$S(A|B) = A^{1/2} \log(A^{-1/2}BA^{-1/2})A^{1/2}.\tag{1}$$

In fact, this formula comes from the Kubo-Ando theory: *S*(·|·) is the connection corresponding to the operator monotone function *t* ↦ log *t*. See more information in [7, Chapter IV] and its references.

In this chapter, we treat the theory of operator means by weakening the original definition of connection in such a way that the same theory is obtained. Moreover, there is a one-to-one correspondence between connections and finite Borel measures on the unit interval. Each connection can be regarded as a weighted series of weighted harmonic means. Hence, every mean in Kubo-Ando's sense corresponds to a probability Borel measure on the unit interval.

©2012 Chansangiam, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Various characterizations of means are obtained; one of them is a usual property of scalar mean, namely, the betweenness property. We provide some new properties of abstract operator connections, involving operator monotonicity and concavity, which include specific operator means as special cases.

For the benefit of readers, we provide the development of the theory of operator means. In Section 2, we set up basic notation and state some background about operator monotone functions, which play important roles in the theory of operator means. In Section 3, we consider the parallel sum together with its physical interpretation in electrical circuits. The arithmetic mean, the geometric mean and the harmonic mean of positive operators are investigated and characterized in Section 4. The original definition of connection is improved in Section 5 in such a way that the same theory is obtained. In Section 6, several characterizations and examples of Kubo-Ando means are given. We provide some new properties of general operator connections, related to operator monotonicity and concavity, in Section 7. Many operator versions of classical inequalities are obtained via the mean-theoretic approach in Section 8.

## **2. Preliminaries**

Throughout, let *B*(H) be the von Neumann algebra of bounded linear operators acting on a Hilbert space H. Let *B*(H)*sa* be the real vector space of self-adjoint operators on H. Equip *B*(H) with a natural partial order as follows. For *A*, *B* ∈ *B*(H)*sa*, we write *A* ⩽ *B* if *B* − *A* is a positive operator. The notation *T* ∈ *B*(H)<sup>+</sup> or *T* ⩾ 0 means that *T* is a positive operator. The case that *T* ⩾ 0 and *T* is invertible is denoted by *T* > 0 or *T* ∈ *B*(H)<sup>++</sup>. Unless otherwise stated, every limit in *B*(H) is taken in the strong-operator topology. Write *An* → *A* to indicate that *An* converges strongly to *A*. If *An* is a sequence in *B*(H)*sa*, the expression *An* ↓ *A* means that *An* is a decreasing sequence and *An* → *A*. Similarly, *An* ↑ *A* tells us that *An* is increasing and *An* → *A*. We always reserve *A*, *B*, *C*, *D* for positive operators. The set of non-negative real numbers is denoted by **R**<sup>+</sup>.

**Remark 0.1.** It is important to note that if *An* is a decreasing sequence in *B*(H)*sa* such that *An* ⩾ *A*, then *An* → *A* if and only if ⟨*Anx*, *x*⟩ → ⟨*Ax*, *x*⟩ for all *x* ∈ H. Note first that this sequence is convergent by the order completeness of *B*(H). For the sufficiency, if *x* ∈ H, then

$$\|(A_n - A)^{1/2}x\|^2 = \langle (A_n - A)^{1/2}x, (A_n - A)^{1/2}x \rangle = \langle (A_n - A)x, x \rangle \to 0$$

and hence ‖(*An* − *A*)*x*‖ → 0.

The spectrum of *T* ∈ *B*(H) is defined by

Sp(*T*) = {*λ* ∈ **C** : *T* − *λI* is not invertible}.

Then Sp(*T*) is a nonempty compact Hausdorff space. Denote by *C*(Sp(*T*)) the C∗-algebra of continuous functions from Sp(*T*) to **C**. Let *T* ∈ *B*(H) be a normal operator and *z* : Sp(*T*) → **C** the inclusion. Then there exists a unique unital ∗-homomorphism *φ* : *C*(Sp(*T*)) → *B*(H) such that *φ*(*z*) = *T*, i.e.,


• *φ* is linear

• *φ*(*f g*) = *φ*(*f*)*φ*(*g*) for all *f*, *g* ∈ *C*(Sp(*T*))

• *φ*( *f̄* ) = (*φ*(*f*))<sup>∗</sup> for all *f* ∈ *C*(Sp(*T*))

• *φ*(1) = *I*.

Moreover, *φ* is isometric. We call the unique isometric ∗-homomorphism which sends *f* ∈ *C*(Sp(*T*)) to *φ*(*f*) ∈ *B*(H) the *continuous functional calculus* of *T*. We write *f*(*T*) for *φ*(*f*).


A continuous real-valued function *f* on an interval *I* is called an *operator monotone function* if one of the following equivalent conditions holds:


This concept is introduced in [20]; see also [7, 10, 15, 16]. Every operator monotone function is always continuously differentiable and monotone increasing. Here are examples of operator monotone functions:


The next result is the Löwner-Heinz inequality [20].

**Theorem 0.3.** *For A*, *B* ∈ *B*(H)<sup>+</sup> *and r* ∈ [0, 1]*, if A* ⩽ *B, then A<sup>r</sup>* ⩽ *B<sup>r</sup>. That is, the map t* ↦ *t<sup>r</sup> is an operator monotone function on* **R**<sup>+</sup> *for any r* ∈ [0, 1]*.*
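The Löwner-Heinz inequality can be observed numerically in finite dimensions. The sketch below (NumPy assumed; `mat_pow` is a helper defined here, not part of the text) draws a random pair A ⩽ B of positive semidefinite matrices and checks that all eigenvalues of B^r − A^r are nonnegative up to rounding:

```python
import numpy as np

def mat_pow(A, r):
    """r-th power of a symmetric PSD matrix via spectral decomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.clip(w, 0, None) ** r) @ V.T

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 5))
A = X @ X.T                    # positive semidefinite
P = rng.standard_normal((5, 5))
B = A + P @ P.T                # B - A >= 0, i.e. A <= B
for r in (0.25, 0.5, 0.9):
    D = mat_pow(B, r) - mat_pow(A, r)
    # Loewner-Heinz: every eigenvalue of B^r - A^r is >= 0 (up to rounding)
    print(r, np.linalg.eigvalsh(D).min() >= -1e-9)
```

Note that the same experiment with r > 1 would fail for suitable A, B: the map t ↦ t^r is not operator monotone outside [0, 1].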

A key result about operator monotone functions is that there is a one-to-one correspondence between nonnegative operator monotone functions on **R**<sup>+</sup> and finite Borel measures on [0, ∞] via integral representations. We give a variation of this result in the next proposition.

**Proposition 0.4.** *A continuous function f* : **<sup>R</sup>**<sup>+</sup> <sup>→</sup> **<sup>R</sup>**<sup>+</sup> *is operator monotone if and only if there exists a finite Borel measure μ on* [0, 1] *such that*

$$f(x) = \int_{[0,1]} 1 \,!_t\, x \; d\mu(t), \quad x \in \mathbb{R}^+. \tag{2}$$

*Here, the weighted harmonic mean* !*<sup>t</sup>* *is defined for a*, *b* > 0 *by*

$$a \, !\_{t} b = [(1 - t)a^{-1} + tb^{-1}]^{-1} \tag{3}$$

*and extended to a*, *b* ⩾ 0 *by continuity. Moreover, the measure μ is unique. Hence, there is a one-to-one correspondence between operator monotone functions on the non-negative reals and finite Borel measures on the unit interval.*

*Proof.* Recall that a continuous function *<sup>f</sup>* : **<sup>R</sup>**<sup>+</sup> <sup>→</sup> **<sup>R</sup>**<sup>+</sup> is operator monotone if and only if there exists a unique finite Borel measure *ν* on [0, ∞] such that

$$f(\mathbf{x}) = \int\_{[0,\infty]} \phi\_{\mathbf{x}}(\lambda) \, d\nu(\lambda), \quad \mathbf{x} \in \mathbb{R}^+$$

where

$$\phi_x(\lambda) = \frac{x(\lambda + 1)}{x + \lambda} \text{ for } \lambda > 0, \quad \phi_x(0) = 1, \quad \phi_x(\infty) = x.$$

Consider the Borel measurable function *ψ* : [0, 1] → [0, ∞], *t* ↦ *t*/(1 − *t*). Then, for each *x* ∈ **R**<sup>+</sup>,

$$\begin{aligned} \int_{[0,\infty]} \phi_x(\lambda) \, d\nu(\lambda) &= \int_{[0,1]} \phi_x \circ \psi(t) \, d\nu\psi(t) \\ &= \int_{[0,1]} \frac{x}{x - xt + t} \, d\nu\psi(t) \\ &= \int_{[0,1]} 1 \,!_t\, x \; d\nu\psi(t). \end{aligned}$$

Now, set *μ* = *νψ*. Since *ψ* is bijective, there is a one-to-one correspondence between the finite Borel measures on [0, ∞] of the form *ν* and the finite Borel measures on [0, 1] of the form *νψ*. The map *f* ↦ *μ* is clearly well-defined and bijective.
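The scalar identity at the heart of this substitution, 1 !_t x = x/(x − xt + t), together with the endpoint behaviour of the weighted harmonic mean, is easy to check numerically (a small sketch; `whmean` is an illustrative helper name, NumPy assumed):

```python
import numpy as np

def whmean(a, b, t):
    """Weighted harmonic mean a !_t b = [(1-t)/a + t/b]^{-1} for a, b > 0."""
    return 1.0 / ((1 - t) / a + t / b)

# endpoints: !_0 gives the left argument, !_1 the right one,
# and !_{1/2} the ordinary harmonic mean 2ab/(a+b)
print(np.isclose(whmean(3.0, 6.0, 0.0), 3.0))
print(np.isclose(whmean(3.0, 6.0, 1.0), 6.0))
print(np.isclose(whmean(3.0, 6.0, 0.5), 4.0))
# the substitution step in the proof: 1 !_t x = x / (x - x*t + t)
x, t = 2.5, 0.3
print(np.isclose(whmean(1.0, x, t), x / (x - x * t + t)))
```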

#### **3. Parallel sum: A notion from electrical networks**

In connections with electrical engineering, Anderson and Duffin [1] defined the *parallel sum* of two positive definite matrices *A* and *B* by

$$A:B=(A^{-1}+B^{-1})^{-1}.\tag{4}$$

The impedance of an electrical network can be represented by a positive (semi)definite matrix. If *A* and *B* are impedance matrices of multi-port networks, then the parallel sum *A* : *B* indicates the total impedance of two electrical networks connected in parallel. This notion plays a crucial role for analyzing multi-port electrical networks because many physical interpretations of electrical circuits can be viewed in a form involving parallel sums. This is a starting point of the study of matrix and operator means. This notion can be extended to invertible positive operators by the same formula.
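For positive definite matrices, formula (4) can be applied directly. The following illustrative sketch (NumPy assumed; names are ours) checks the scalar two-resistor case and the algebraic identity A : B = B − B(A + B)⁻¹B, which reappears in the proof of Lemma 0.6 below:

```python
import numpy as np

def parallel_sum(A, B):
    """A : B = (A^{-1} + B^{-1})^{-1} for positive definite A, B."""
    return np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))

# scalar case: resistors of 2 and 3 ohms in parallel give 6/5 = 1.2 ohms
print(np.isclose(parallel_sum(np.array([[2.0]]), np.array([[3.0]]))[0, 0], 1.2))

# matrix case: A : B equals B - B (A + B)^{-1} B
rng = np.random.default_rng(2)
X, Y = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
A = X @ X.T + np.eye(4)        # random positive definite matrices
B = Y @ Y.T + np.eye(4)
P = parallel_sum(A, B)
print(np.allclose(P, B - B @ np.linalg.inv(A + B) @ B))
```

The second identity shows that A : B is defined even without inverting A and B separately, which is what makes the extension to non-invertible positive operators in (5) natural.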

**Lemma 0.5.** *Let A*, *B*, *C*, *D*, *An*, *Bn* ∈ *B*(H)<sup>++</sup> *for all n* ∈ **N***.*

*(1) If An* ↓ *A, then An*<sup>−1</sup> ↑ *A*<sup>−1</sup>*. If An* ↑ *A, then An*<sup>−1</sup> ↓ *A*<sup>−1</sup>*.*

*(2) If A* ⩽ *C and B* ⩽ *D, then A* : *B* ⩽ *C* : *D.*

*(3) If An* ↓ *A and Bn* ↓ *B, then An* : *Bn* ↓ *A* : *B.*

*(4) If An* ↓ *A and Bn* ↓ *B, then the strong limit of An* : *Bn exists and does not depend on the choice of the sequences An*, *Bn.*
*Proof.* (1) Assume *An* ↓ *A*. Then *An*<sup>−1</sup> is increasing and, for each *x* ∈ H,

$$\langle (A_n^{-1} - A^{-1})x, x \rangle = \langle (A - A_n)A^{-1}x, A_n^{-1}x \rangle \leqslant \|(A - A_n)A^{-1}x\| \, \|A_n^{-1}\| \, \|x\| \to 0.$$

(2) This follows from (1).
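Part (1) can be observed in finite dimensions with the decreasing sequence An = A + n⁻¹I (an illustrative NumPy check; the semidefinite ordering is tested through eigenvalues of differences):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((3, 3))
A = X @ X.T + np.eye(3)                   # A > 0
prev_inv = None
for n in (1, 2, 4, 8):
    An = A + np.eye(3) / n                # An decreases to A as n grows
    An_inv = np.linalg.inv(An)
    if prev_inv is not None:              # inverses increase: An^{-1} up to A^{-1}
        print(np.linalg.eigvalsh(An_inv - prev_inv).min() >= -1e-12)
    prev_inv = An_inv
# the increasing inverses stay below the limit A^{-1}
print(np.linalg.eigvalsh(np.linalg.inv(A) - prev_inv).min() >= -1e-12)
```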


(3) Let *An*, *Bn* ∈ *B*(H)<sup>++</sup> be such that *An* ↓ *A* and *Bn* ↓ *B* where *A*, *B* > 0. Then *An*<sup>−1</sup> ↑ *A*<sup>−1</sup> and *Bn*<sup>−1</sup> ↑ *B*<sup>−1</sup>. So, *An*<sup>−1</sup> + *Bn*<sup>−1</sup> is an increasing sequence in *B*(H)<sup>+</sup> such that

$$A_n^{-1} + B_n^{-1} \to A^{-1} + B^{-1},$$

i.e. *An*<sup>−1</sup> + *Bn*<sup>−1</sup> ↑ *A*<sup>−1</sup> + *B*<sup>−1</sup>. By (1), we thus have (*An*<sup>−1</sup> + *Bn*<sup>−1</sup>)<sup>−1</sup> ↓ (*A*<sup>−1</sup> + *B*<sup>−1</sup>)<sup>−1</sup>.

(4) Let *An*, *Bn* ∈ *B*(H)<sup>++</sup> be such that *An* ↓ *A* and *Bn* ↓ *B*. Then, by (2), *An* : *Bn* is a decreasing sequence of positive operators. The order completeness of *B*(H) guarantees the existence of the strong limit of *An* : *Bn*. Let *A*′*n* and *B*′*n* be other sequences such that *A*′*n* ↓ *A* and *B*′*n* ↓ *B*. Note that for each *n*, *m* ∈ **N**, we have *An* ⩽ *An* + *A*′*m* − *A* and *Bn* ⩽ *Bn* + *B*′*m* − *B*. Then

$$A_n : B_n \leqslant (A_n + A'_m - A) : (B_n + B'_m - B).$$

Note that as *n* → ∞, *An* + *A*′*m* − *A* → *A*′*m* and *Bn* + *B*′*m* − *B* → *B*′*m*. We have that as *n* → ∞,

$$(A_n + A'_m - A) : (B_n + B'_m - B) \to A'_m : B'_m.$$

Hence, lim*n*→∞ *An* : *Bn* ⩽ *A*′*m* : *B*′*m* for each *m*, so lim*n*→∞ *An* : *Bn* ⩽ lim*m*→∞ *A*′*m* : *B*′*m*. By symmetry, lim*n*→∞ *An* : *Bn* ⩾ lim*m*→∞ *A*′*m* : *B*′*m*.

We define the *parallel sum* of *A*, *B* ⩾ 0 to be

$$A:B = \lim\_{\epsilon \downarrow 0} (A + \epsilon I):(B + \epsilon I) \tag{5}$$

where the limit is taken in the strong-operator topology.

**Lemma 0.6.** *For each x* ∈ H*,*

$$\langle (A:B)x, x \rangle = \inf \{ \langle Ay, y \rangle + \langle Bz, z \rangle : y, z \in \mathcal{H}, \; y + z = x \}. \tag{6}$$

*Proof.* First, assume that *A*, *B* are invertible. Then for all *x*, *y* ∈ H,

$$\begin{aligned} &\langle Ay, y \rangle + \langle B(x - y), x - y \rangle - \langle (A : B)x, x \rangle \\ &= \langle Ay, y \rangle + \langle Bx, x \rangle - 2\text{Re}\langle Bx, y \rangle + \langle By, y \rangle - \langle (B - B(A + B)^{-1}B)x, x \rangle \\ &= \langle (A + B)y, y \rangle - 2\text{Re}\langle Bx, y \rangle + \langle (A + B)^{-1}Bx, Bx \rangle \\ &= \|(A + B)^{1/2}y\|^2 - 2\text{Re}\langle Bx, y \rangle + \|(A + B)^{-1/2}Bx\|^2 \\ &\ge 0. \end{aligned}$$

With *y* = (*A* + *B*)−1*Bx*, we have

⟨*Ay*, *y*⟩ + ⟨*B*(*x* − *y*), *x* − *y*⟩ − ⟨(*A* : *B*)*x*, *x*⟩ = 0.

Hence, we have the claim for *A*, *B* > 0. For *A*, *B* ⩾ 0, consider *A* + *εI* and *B* + *εI* and let *ε* ↓ 0.

**Remark 0.7.** This lemma has a physical interpretation, called *Maxwell's minimum power principle*. Recall that a positive operator represents the impedance of an electrical network, while the power dissipation of a network with impedance *A* and current *x* is the inner product ⟨*Ax*, *x*⟩. Consider two electrical networks connected in parallel. For a given input current *x*, the current divides as *x* = *y* + *z*, where *y* and *z* are the currents of the two networks, in such a way that the power dissipation is minimized.
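Maxwell's principle can be checked numerically: the split y = (A + B)⁻¹Bx from the proof of Lemma 0.6 attains the power ⟨(A : B)x, x⟩, and any other split dissipates at least that much (an illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
X, Y = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
A = X @ X.T + np.eye(3)     # impedance of the first network
B = Y @ Y.T + np.eye(3)     # impedance of the second network
AB = np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))  # parallel sum A : B

def power(y, z):
    """Total dissipated power <Ay, y> + <Bz, z> for the split x = y + z."""
    return y @ A @ y + z @ B @ z

x = rng.standard_normal(3)            # total input current
y_opt = np.linalg.inv(A + B) @ B @ x  # the optimal split from the proof
print(np.isclose(power(y_opt, x - y_opt), x @ AB @ x))  # minimum is attained
y = rng.standard_normal(3)            # any other split dissipates more power
print(power(y, x - y) >= x @ AB @ x - 1e-12)
```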

**Theorem 0.8.** *The parallel sum satisfies*

*(1) monotonicity: A*1 ⩽ *A*2, *B*1 ⩽ *B*2 ⇒ *A*1 : *B*1 ⩽ *A*2 : *B*2*.*

*(2) transformer inequality: S*<sup>∗</sup>(*A* : *B*)*S* ⩽ (*S*<sup>∗</sup>*AS*) : (*S*<sup>∗</sup>*BS*) *for every S* ∈ *B*(H)*.*

*(3) continuity from above: if An* ↓ *A and Bn* ↓ *B, then An* : *Bn* ↓ *A* : *B.*


*Proof.* (1) The monotonicity follows from the formula (5) and Lemma 0.5(2).

(2) For each *x*, *y*, *z* ∈ H such that *x* = *y* + *z*, by the previous lemma,

$$\begin{aligned} \langle S^*(A:B)Sx, x \rangle &= \langle (A:B)Sx, Sx \rangle \\ &\leqslant \langle ASy, Sy \rangle + \langle BSz, Sz \rangle \\ &= \langle S^*ASy, y \rangle + \langle S^*BSz, z \rangle. \end{aligned}$$

Again, the previous lemma assures *S*<sup>∗</sup>(*A* : *B*)*S* ⩽ (*S*<sup>∗</sup>*AS*) : (*S*<sup>∗</sup>*BS*).

(3) Let *An* and *Bn* be decreasing sequences in *B*(H)<sup>+</sup> such that *An* ↓ *A* and *Bn* ↓ *B*. Then *An* : *Bn* is decreasing and *A* : *B* ⩽ *An* : *Bn* for all *n* ∈ **N**. By the joint monotonicity of the parallel sum, for all *ε* > 0

$$A\_n: B\_n \leqslant (A\_n + \epsilon I): (B\_n + \epsilon I).$$

Since *An* + *εI* ↓ *A* + *εI* and *Bn* + *εI* ↓ *B* + *εI*, by Lemma 0.5(3) we have *An* : *Bn* ↓ *A* : *B*.

**Remark 0.9.** The positive operator *S*<sup>∗</sup>*AS* represents the impedance of a network connected to a transformer. The transformer inequality means that the impedance of a parallel connection with the transformer first is greater than that with the transformer last.

**Proposition 0.10.** *The set of positive operators on* H *is a partially ordered commutative semigroup with respect to the parallel sum.*

*Proof.* For *A*, *B*, *C* > 0, we have (*A* : *B*) : *C* = *A* : (*B* : *C*) and *A* : *B* = *B* : *A*. The continuity from above in Theorem 0.8 implies that (*A* : *B*) : *C* = *A* : (*B* : *C*) and *A* : *B* = *B* : *A* for all *A*, *B*, *C* ⩾ 0. The monotonicity of the parallel sum means that the positive operators form a partially ordered semigroup.
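The semigroup laws are easy to sanity-check for random positive definite matrices; for A, B, C > 0 the iterated parallel sum is just (A⁻¹ + B⁻¹ + C⁻¹)⁻¹ (an illustrative NumPy sketch):

```python
import numpy as np

def psum(A, B):
    """Parallel sum A : B = (A^{-1} + B^{-1})^{-1} for A, B > 0."""
    return np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))

def rand_pd(rng, n=3):
    """A random positive definite matrix."""
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

rng = np.random.default_rng(6)
A, B, C = rand_pd(rng), rand_pd(rng), rand_pd(rng)
print(np.allclose(psum(psum(A, B), C), psum(A, psum(B, C))))  # associativity
print(np.allclose(psum(A, B), psum(B, A)))                    # commutativity
```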

**Theorem 0.11.** *For A*, *B*, *C*, *D* ⩾ 0*, we have the series-parallel inequality*

$$(A+B):(C+D) \geqslant A:C + B:D. \tag{7}$$

*In other words, the parallel sum is concave.*


*Proof.* For each *x*, *y*, *z* ∈ H such that *x* = *y* + *z*, we have by the previous lemma that

$$
\begin{aligned}
\langle (A:\mathbb{C} + B:D)\mathbf{x}, \mathbf{x} \rangle &= \langle (A:\mathbb{C})\mathbf{x}, \mathbf{x} \rangle + \langle (B:D)\mathbf{x}, \mathbf{x} \rangle \\
&\leqslant \langle Ay, y \rangle + \langle \mathbb{C}z, z \rangle + \langle By, y \rangle + \langle Dz, z \rangle \\
&= \langle (A + B)y, y \rangle + \langle (\mathbb{C} + D)z, z \rangle.
\end{aligned}
$$

Applying the previous lemma yields (*A* + *B*) : (*C* + *D*) *A* : *C* + *B* : *D*.

**Remark 0.12.** The ordinary sum of operators represents the total impedance of two networks with series connection while the parallel sum indicates the total impedance of two networks with parallel connection. So, the series-parallel inequality means that the impedance of a series-parallel connection is greater than that of a parallel-series connection.
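The series-parallel inequality of Remark 0.12 can be sanity-checked on random positive definite matrices (a NumPy sketch; sizes, seed and tolerance are our own choices). Positivity of `rhs - lhs` is tested through its smallest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_pd(n):
    # random symmetric positive definite matrix
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

def parallel(A, B):
    # A : B = (A^{-1} + B^{-1})^{-1} for invertible positive A, B
    return np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))

A, B, C, D = (rand_pd(4) for _ in range(4))

lhs = parallel(A, C) + parallel(B, D)   # parallel-series impedance
rhs = parallel(A + B, C + D)            # series-parallel impedance

# series-parallel inequality: rhs - lhs is positive semidefinite
assert np.linalg.eigvalsh(rhs - lhs).min() >= -1e-9
```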

#### **4. Classical means: arithmetic, harmonic and geometric means**

Some desired properties that any object called a "mean" *M* on *B*(H)+ should have are listed here.


(A4). *transformer inequality*: *X*∗*M*(*A*, *B*)*X* ⩽ *M*(*X*∗*AX*, *X*∗*BX*) for *X* ∈ *B*(H);


In order to study matrix or operator means in general, the first step is to consider three classical means in mathematics, namely, arithmetic, geometric and harmonic means.

The *arithmetic mean* of *<sup>A</sup>*, *<sup>B</sup>* <sup>∈</sup> *<sup>B</sup>*(H)<sup>+</sup> is defined by

$$A \,\nabla\, B = \frac{1}{2}(A+B). \tag{8}$$

Then the arithmetic mean satisfies the properties (A1)–(A9). In fact, the properties (A5) and (A6) can be replaced by a stronger condition:

*X*∗*M*(*A*, *B*)*X* = *M*(*X*∗*AX*, *X*∗*BX*) for all *X* ∈ *B*(H).

Moreover, the arithmetic mean satisfies

*affinity*: *<sup>M</sup>*(*kA* <sup>+</sup> *<sup>C</sup>*, *kB* <sup>+</sup> *<sup>C</sup>*) = *kM*(*A*, *<sup>B</sup>*) + *<sup>C</sup>* for *<sup>k</sup>* <sup>∈</sup> **<sup>R</sup>**+.

Define the *harmonic mean* of positive operators *<sup>A</sup>*, *<sup>B</sup>* <sup>∈</sup> *<sup>B</sup>*(H)<sup>+</sup> by

$$A \,!\, B = 2(A : B) = \lim\_{\epsilon \downarrow 0} 2(A\_{\epsilon}^{-1} + B\_{\epsilon}^{-1})^{-1} \tag{9}$$

where *A*ε ≡ *A* + ε*I* and *B*ε ≡ *B* + ε*I*. Then the harmonic mean satisfies the properties (A1)–(A9).

The geometric mean of matrices is defined in [23] and studied in detail in [3]. A usage of congruence transformations for treating geometric means is given in [18]. For a given invertible operator *C* ∈ *B*(H), define

$$
\Gamma\_C : B(\mathcal{H})^{sa} \to B(\mathcal{H})^{sa}, \quad A \mapsto C^\* A C.
$$

Then each Γ*C* is a linear isomorphism with inverse Γ*C*−1 and is called a *congruence transformation*. The set of congruence transformations is a group under multiplication. Each congruence transformation preserves positivity, invertibility and, hence, strict positivity. In fact, Γ*C* maps *B*(H)+ and *B*(H)++ onto themselves. Note also that Γ*C* is order-preserving.

Define the *geometric mean* of *A*, *B* > 0 by

$$A \# B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} = \Gamma\_{A^{1/2}} \circ \Gamma\_{A^{-1/2}}^{1/2} (B). \tag{10}$$

Then *A* # *B* > 0 for *A*, *B* > 0. This formula comes from two natural requirements. The first is that the definition should coincide with the usual geometric mean in **R**+: *A* # *B* = (*AB*)1/2 provided that *AB* = *BA*. The second is that, for any invertible *T* ∈ *B*(H),

$$T^\*(A \# B)T = (T^\*AT) \# (T^\*BT). \tag{11}$$

The next theorem characterizes the geometric mean of *A* and *B* in terms of the solution of a certain operator equation.

**Theorem 0.13.** *For each A*, *B* > 0*, the Riccati equation* Γ*X*(*A*−1) := *XA*−1*X* = *B has a unique positive solution, namely, X* = *A* # *B.*

*Proof.* A direct computation shows that (*A* # *B*)*A*−1(*A* # *B*) = *B*. Suppose *Y* ⩾ 0 is another positive solution and write *X* = *A* # *B*. Then

$$(A^{-1/2}XA^{-1/2})^2 = A^{-1/2}XA^{-1}XA^{-1/2} = A^{-1/2}YA^{-1}YA^{-1/2} = (A^{-1/2}YA^{-1/2})^2.$$

The uniqueness of positive square roots implies that *A*−1/2*XA*−1/2 = *A*−1/2*YA*−1/2, i.e., *X* = *Y*.
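Theorem 0.13 is easy to test numerically. The sketch below (NumPy; `psd_sqrt` via eigendecomposition is our own helper, not from the text) computes *A* # *B* from formula (10) and checks the Riccati equation, together with two properties proved later (symmetry and the commuting scalar case):

```python
import numpy as np

rng = np.random.default_rng(2)

def rand_pd(n):
    # random symmetric positive definite matrix
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

def psd_sqrt(A):
    # square root of a symmetric positive semidefinite matrix
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0, None))) @ V.T

def geomean(A, B):
    # A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}
    Ah = psd_sqrt(A)
    Ahi = np.linalg.inv(Ah)
    return Ah @ psd_sqrt(Ahi @ B @ Ahi) @ Ah

A, B = rand_pd(3), rand_pd(3)
X = geomean(A, B)

# X solves the Riccati equation X A^{-1} X = B (Theorem 0.13)
assert np.allclose(X @ np.linalg.solve(A, X), B)

# symmetry, and the scalar case A # B = (AB)^{1/2} when A, B commute
assert np.allclose(X, geomean(B, A))
assert np.isclose(geomean(4.0 * np.eye(1), 9.0 * np.eye(1))[0, 0], 6.0)
```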

**Theorem 0.14** (Maximum property of geometric mean)**.** *For A*, *B* > 0*,*

$$A \# B = \max \{ X \geqslant 0 \, : \, XA^{-1}X \leqslant B \} \tag{12}$$

*where the maximum is taken with respect to the positive semidefinite ordering.*

*Proof.* If *XA*−1*X* ⩽ *B*, then


$$(A^{-1/2}XA^{-1/2})^2 = A^{-1/2}XA^{-1}XA^{-1/2} \leqslant A^{-1/2}BA^{-1/2}$$

and *A*−1/2*XA*−1/2 ⩽ (*A*−1/2*BA*−1/2)1/2, i.e. *X* ⩽ *A* # *B* by Theorem 0.3.

Recall the fact that if *f* : [*a*, *b*] → **C** is continuous and *An* → *A* with Sp(*An*) ⊆ [*a*, *b*] for all *n* ∈ **N**, then Sp(*A*) ⊆ [*a*, *b*] and *f*(*An*) → *f*(*A*).

**Lemma 0.15.** *Let A*, *B*, *C*, *D*, *An*, *Bn* ∈ *B*(H)++ *for all n* ∈ **N***.*

*(1) If A* ⩽ *C and B* ⩽ *D, then A* # *B* ⩽ *C* # *D.*

*(2) If An* ↓ *A and Bn* ↓ *B, then An* # *Bn* ↓ *A* # *B.*

*(3) For A*, *B* ⩾ 0*, the limit* limε↓0 (*A* + ε*I*) # (*B* + ε*I*) *exists in the strong-operator topology.*

*Proof.* (1) By the extremal characterization (12), it suffices to show that (*A* # *B*)*C*−1(*A* # *B*) ⩽ *D*. Indeed,

$$\begin{aligned} (A \# B) C^{-1} (A \# B) &= A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} C^{-1} A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} \\ &\leqslant A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} A^{-1} A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} \\ &= B \\ &\leqslant D. \end{aligned}$$

(2) Assume *An* ↓ *A* and *Bn* ↓ *B*. Then *An* # *Bn* is a decreasing sequence of strictly positive operators which is bounded below by 0. The order completeness of *B*(H) implies that this sequence converges strongly to a positive operator. Since *An*−1 ⩽ *A*−1, the Löwner-Heinz inequality assures that *An*−1/2 ⩽ *A*−1/2 and hence ‖*An*−1/2‖ ⩽ ‖*A*−1/2‖ for all *n* ∈ **N**. Note also that ‖*Bn*‖ ⩽ ‖*B*1‖ for all *n* ∈ **N**. Recall that multiplication is jointly continuous in the strong-operator topology if the first variable is bounded in norm. So, *An*−1/2*BnAn*−1/2 converges strongly to *A*−1/2*BA*−1/2. It follows that

$$(A\_n^{-1/2}B\_nA\_n^{-1/2})^{1/2} \to (A^{-1/2}BA^{-1/2})^{1/2}.$$

Since *An* ⩽ *A*1, the Löwner-Heinz inequality shows that the sequence *An*1/2 is norm-bounded, and we conclude that

$$A\_n^{1/2} (A\_n^{-1/2} \mathcal{B}\_n A\_n^{-1/2})^{1/2} A\_n^{1/2} \to A^{1/2} (A^{-1/2} \mathcal{B} A^{-1/2})^{1/2} A^{1/2}.$$

The proof of (3) is just the same as the case of harmonic mean.

We define the *geometric mean* of *A*, *B* ⩾ 0 by

$$A \# B = \lim\_{\epsilon \downarrow 0} (A + \epsilon I) \# (B + \epsilon I). \tag{13}$$

Then *A* # *B* ⩾ 0 for any *A*, *B* ⩾ 0.
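Definition (13) matters precisely when *A* or *B* fails to be invertible. A minimal sketch (diagonal matrices, so the regularized geometric mean reduces to entrywise scalar geometric means; the chosen ε values are arbitrary):

```python
import numpy as np

# Rank-deficient pair: A = diag(1, 0) and B = diag(0, 1) commute, so
# (A + eI) # (B + eI) is diagonal with entries sqrt((a_i + e)(b_i + e)).
A = np.array([1.0, 0.0])
B = np.array([0.0, 1.0])

limits = [np.sqrt((A + eps) * (B + eps)) for eps in (1e-1, 1e-3, 1e-6)]

# the regularized means decrease toward 0: here A # B is the zero matrix,
# even though neither A nor B is zero
assert all(np.all(limits[i + 1] <= limits[i]) for i in range(len(limits) - 1))
assert limits[-1].max() < 2e-3
```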

**Theorem 0.16.** *The geometric mean enjoys the following properties:*

*(1) monotonicity: A*1 ⩽ *A*2, *B*1 ⩽ *B*2 ⇒ *A*1 # *B*1 ⩽ *A*2 # *B*2*.*

*(2) continuity from above: An* ↓ *A*, *Bn* ↓ *B* ⇒ *An* # *Bn* ↓ *A* # *B.*

*(3) fixed point property: A* # *A* = *A.*

*(4) self-duality:* (*A* # *B*)−1 = *A*−1 # *B*−1*.*

*(5) symmetry: A* # *B* = *B* # *A.*

*(6) congruence invariance:* Γ*C*(*A*) # Γ*C*(*B*) = Γ*C*(*A* # *B*) *for all invertible C.*

*Proof.* (1) Use the formula (13) and Lemma 0.15 (1).

(2) Follows from Lemma 0.15 and the definition of the geometric mean.

(3) The unique positive solution to the equation *XA*−1*X* = *A* is *X* = *A*.

(4) The unique positive solution to the equation *X*−1*A*−1*X*−1 = *B* is *X*−1 = *A* # *B*. But this equation is equivalent to *XAX* = *B*−1. So, *A*−1 # *B*−1 = *X* = (*A* # *B*)−1.

(5) The equation *XA*−1*X* = *B* has the same positive solution as the equation *XB*−1*X* = *A*, as is seen by taking inverses on both sides.

(6) We have

$$\begin{aligned} \Gamma\_C(A \# B)(\Gamma\_C(A))^{-1} \Gamma\_C(A \# B) &= \Gamma\_C(A \# B) \Gamma\_{C^{-1}}(A^{-1}) \Gamma\_C(A \# B) \\ &= \Gamma\_C((A \# B)A^{-1}(A \# B)) \\ &= \Gamma\_C(B). \end{aligned}$$

Then apply Theorem 0.13.

The congruence invariance asserts that <sup>Γ</sup>*<sup>C</sup>* is an isomorphism on *<sup>B</sup>*(H)++ with respect to the operation of taking the geometric mean.

**Lemma 0.17.** *For A* > 0 *and B* ⩾ 0*, the operator*

$$
\begin{pmatrix} A & C \\ C^\* & B \end{pmatrix}
$$

*is positive if and only if B* − *C*∗*A*−1*C is positive, i.e., B* ⩾ *C*∗*A*−1*C.*

*Proof.* By setting

$$X = \begin{pmatrix} I & -A^{-1}C \\ 0 & I \end{pmatrix},$$

we compute

$$\begin{aligned} \Gamma\_X \begin{pmatrix} A & C \\ C^\* & B \end{pmatrix} &= \begin{pmatrix} I & 0 \\ -C^\*A^{-1} & I \end{pmatrix} \begin{pmatrix} A & C \\ C^\* & B \end{pmatrix} \begin{pmatrix} I & -A^{-1}C \\ 0 & I \end{pmatrix} \\ &= \begin{pmatrix} A & 0 \\ 0 & B - C^\*A^{-1}C \end{pmatrix}. \end{aligned}$$

Since Γ*X* preserves positivity, we obtain the desired result.
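Lemma 0.17 is the familiar Schur-complement criterion for block positivity; a quick numerical check (NumPy sketch; random data, scales and tolerance are our own choices):

```python
import numpy as np

rng = np.random.default_rng(3)

def rand_pd(n):
    # random symmetric positive definite matrix
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

def is_psd(M, tol=1e-9):
    # positive semidefiniteness via the smallest eigenvalue of the symmetric part
    return np.linalg.eigvalsh((M + M.T) / 2).min() >= -tol

A, B = rand_pd(3), rand_pd(3)
C = rng.standard_normal((3, 3))

# scaling C moves the block matrix in and out of the positive cone,
# but the equivalence with the Schur complement B - C* A^{-1} C persists
for scale in (0.1, 10.0):
    Cs = scale * C
    block = np.block([[A, Cs], [Cs.T, B]])
    schur = B - Cs.T @ np.linalg.solve(A, Cs)
    assert is_psd(block) == is_psd(schur)
```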

**Theorem 0.18.** *The geometric mean A* # *B of A*, *<sup>B</sup>* <sup>∈</sup> *<sup>B</sup>*(H)<sup>+</sup> *is the largest operator X* <sup>∈</sup> *<sup>B</sup>*(H)*sa for which the operator*

$$
\begin{pmatrix} A & X \\ X^\* & B \end{pmatrix} \tag{14}
$$

*is positive.*


*Proof.* By a continuity argument, we may assume that *A*, *B* > 0. If *X* = *A* # *B*, then the operator (14) is positive by Lemma 0.17. Let *X* ∈ *B*(H)*sa* be such that the operator (14) is positive. Then Lemma 0.17 again implies that *XA*−1*X* ⩽ *B* and

$$(A^{-1/2}XA^{-1/2})^2 = A^{-1/2}XA^{-1}XA^{-1/2} \leqslant A^{-1/2}BA^{-1/2}.$$

The Löwner-Heinz inequality forces *A*−1/2*XA*−1/2 ⩽ (*A*−1/2*BA*−1/2)1/2. Now, applying Γ*A*1/2 yields *X* ⩽ *A* # *B*.
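The maximality in Theorem 0.18 can be observed numerically: with *X* = *A* # *B* the block matrix (14) sits on the boundary of the positive cone, so inflating *X* even slightly breaks positivity (NumPy sketch; the helpers and the 1% inflation are our own choices):

```python
import numpy as np

rng = np.random.default_rng(4)

def rand_pd(n):
    # random symmetric positive definite matrix
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

def psd_sqrt(A):
    # square root of a symmetric positive semidefinite matrix
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0, None))) @ V.T

def geomean(A, B):
    # A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}
    Ah = psd_sqrt(A)
    Ahi = np.linalg.inv(Ah)
    return Ah @ psd_sqrt(Ahi @ B @ Ahi) @ Ah

def min_eig(M):
    return np.linalg.eigvalsh((M + M.T) / 2).min()

A, B = rand_pd(3), rand_pd(3)
G = geomean(A, B)

# X = A # B makes the block matrix (14) positive ...
assert min_eig(np.block([[A, G], [G, B]])) >= -1e-9

# ... and maximally so: the Schur complement B - G A^{-1} G vanishes,
# so any inflation of G destroys positivity
assert min_eig(np.block([[A, 1.01 * G], [1.01 * G, B]])) < 0
```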

**Remark 0.19.** The arithmetic mean and the harmonic mean can be easily defined for multivariable positive operators. The case of the geometric mean is not easy, even for matrices. Many authors tried to define geometric means for multivariable positive semidefinite matrices, but there was no satisfactory definition until 2004 in [5].

#### **5. Operator connections**

We see that the arithmetic, harmonic and geometric means share the properties (A1)–(A9). A mean in general should have algebraic, order and topological properties. Kubo and Ando [17] proposed the following definition:

**Definition 0.20.** A *connection* on *B*(H)+ is a binary operation *σ* on *B*(H)+ satisfying the following axioms for all *A*, *A*′, *B*, *B*′, *C* ∈ *B*(H)+:

(M1) *monotonicity*: *A* ⩽ *A*′ and *B* ⩽ *B*′ imply *A σ B* ⩽ *A*′ *σ B*′;

(M2) *transformer inequality*: *C*(*A σ B*)*C* ⩽ (*CAC*) *σ* (*CBC*);

(M3) *continuity from above*: if *An* ↓ *A* and *Bn* ↓ *B*, then *An σ Bn* ↓ *A σ B*.

The term "connection" comes from the study of electrical network connections.

**Example 0.21.** The following are examples of connections: the sum (*A*, *B*) ↦ *A* + *B*, the parallel sum, and the arithmetic, geometric and harmonic means.


From now on, assume dim H = ∞. Consider the following property:

(M3') *separate continuity from above*: if *An*, *Bn* <sup>∈</sup> *<sup>B</sup>*(H)<sup>+</sup> satisfy *An* <sup>↓</sup> *<sup>A</sup>* and *Bn* <sup>↓</sup> *<sup>B</sup>*, then *An σ B* ↓ *A σ B* and *A σ Bn* ↓ *A σ B*.

The condition (M3') is clearly weaker than (M3). The next theorem asserts that we can improve the definition of Kubo-Ando by replacing (M3) with (M3') and still get the same theory. This theorem also provides an easier way of checking that a binary operation is a connection.

**Theorem 0.22.** *If a binary operation <sup>σ</sup> on B*(H)<sup>+</sup> *satisfies (M1), (M2) and (M3'), then <sup>σ</sup> satisfies (M3), that is, σ is a connection.*

Denote by *OM*(**R**+) the set of operator monotone functions from **R**<sup>+</sup> to **R**+. If a binary operation *σ* has a property (A), we write *σ* ∈ *BO*(*A*). The following properties for a binary operation *<sup>σ</sup>* and a function *<sup>f</sup>* : **<sup>R</sup>**<sup>+</sup> <sup>→</sup> **<sup>R</sup>**<sup>+</sup> play important roles:

(P) : If a projection *<sup>P</sup>* <sup>∈</sup> *<sup>B</sup>*(H)<sup>+</sup> commutes with *<sup>A</sup>*, *<sup>B</sup>* <sup>∈</sup> *<sup>B</sup>*(H)+, then

*P*(*A σ B*)=(*PA*) *σ* (*PB*)=(*A σ B*)*P*;

(F) : *<sup>f</sup>*(*t*)*<sup>I</sup>* <sup>=</sup> *<sup>I</sup> <sup>σ</sup>* (*tI*) for any *<sup>t</sup>* <sup>∈</sup> **<sup>R</sup>**+.

**Proposition 0.23.** *The transformer inequality (M2) implies*

*• congruence invariance: for A*, *B* ⩾ 0 *and C* > 0*, C*(*AσB*)*C* = (*CAC*) *σ* (*CBC*)*;*

*• positive homogeneity: for A*, *B* ⩾ 0 *and α* ∈ (0, ∞)*,* α(*A σ B*) = (α*A*) *σ* (α*B*)*.*

*Proof.* For *A*, *B* ⩾ 0 and *C* > 0, we have

$$C^{-1}[(CAC)\,\sigma\,(CBC)]C^{-1}\leqslant (C^{-1}CACC^{-1})\,\sigma\,(C^{-1}CBCC^{-1})=A\,\sigma\,B$$

and hence (*CAC*) *σ* (*CBC*) ⩽ *C*(*A σ B*)*C*. The positive homogeneity comes from the congruence invariance by setting *C* = √α*I*.

**Lemma 0.24.** *Let f* : **<sup>R</sup>**<sup>+</sup> <sup>→</sup> **<sup>R</sup>**<sup>+</sup> *be an increasing function. If <sup>σ</sup> satisfies the positive homogeneity, (M3') and (F), then f is continuous.*

*Proof.* To show that *<sup>f</sup>* is right continuous at each *<sup>t</sup>* <sup>∈</sup> **<sup>R</sup>**+, consider a sequence *tn* in **<sup>R</sup>**<sup>+</sup> such that *tn* ↓ *t*. Then by (M3')

$$f(t\_n)I = I\,\sigma\, t\_n I \downarrow I\,\sigma\, tI = f(t)I,$$

i.e. *f*(*tn*) ↓ *f*(*t*). To show that *f* is left continuous at each *t* > 0, consider a sequence *tn* > 0 such that *tn* ↑ *t*. Then *tn*−1 ↓ *t*−1 and

$$\begin{aligned} \lim t\_n^{-1} f(t\_n) I &= \lim t\_n^{-1} (I \, \sigma \, t\_n I) = \lim (t\_n^{-1} I) \, \sigma I = (t^{-1} I) \, \sigma I \\ &= t^{-1} (I \, \sigma \, tI) = t^{-1} f(t) I \end{aligned}$$

Since *f* is increasing, *tn*−1 *f*(*tn*) is decreasing. So, *t* ↦ *t*−1 *f*(*t*) and *f* are left continuous.

**Lemma 0.25.** *Let σ be a binary operation on B*(H)+ *satisfying (M3') and (P). If f* : **R**+ → **R**+ *is an increasing continuous function such that σ and f satisfy (F), then f*(*A*) = *I σ A for any A* ∈ *B*(H)+*.*


*Proof.* First consider *A* ∈ *B*(H)+ in the form of a finite sum ∑ λ*iPi*, where {*Pi*} is an orthogonal family of projections with sum *I* and λ*i* > 0 for all *i* = 1, . . . , *m*. Since each *Pi* commutes with *A*, we have by the property (P) that

$$\begin{aligned} I\,\sigma\,A &= \sum P\_i(I\,\sigma\,A) = \sum P\_i\,\sigma\,P\_iA = \sum P\_i\,\sigma\,\lambda\_i P\_i \\ &= \sum P\_i(I\,\sigma\,\lambda\_i I) = \sum f(\lambda\_i)P\_i = f(A). \end{aligned}$$

Now, consider *<sup>A</sup>* <sup>∈</sup> *<sup>B</sup>*(H)+. Then there is a sequence *An* of strictly positive operators in the above form such that *An* ↓ *A*. Then *I σ An* ↓ *I σ A* and *f*(*An*) converges strongly to *f*(*A*). Hence, *I σ A* = lim *I σ An* = lim *f*(*An*) = *f*(*A*).
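Lemma 0.25 reduces a connection on pairs (*I*, *A*) to the functional calculus of its representing function. A NumPy sketch for the three classical means, with representing functions *f*(*t*) = (1 + *t*)/2, 2*t*/(1 + *t*) and √*t* read off from the scalar formulas (the random matrix and seed are our own choices):

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((3, 3))
A = M @ M.T + 3 * np.eye(3)    # random strictly positive operator
I = np.eye(3)

def fun_calc(A, f):
    # functional calculus f(A) for a symmetric matrix A
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

# arithmetic mean: I nabla A = (I + A)/2 agrees with f(A), f(t) = (1 + t)/2
assert np.allclose((I + A) / 2, fun_calc(A, lambda t: (1 + t) / 2))

# harmonic mean: I ! A = 2(I + A^{-1})^{-1} agrees with f(A), f(t) = 2t/(1 + t)
har = 2 * np.linalg.inv(I + np.linalg.inv(A))
assert np.allclose(har, fun_calc(A, lambda t: 2 * t / (1 + t)))

# geometric mean: I # A = A^{1/2}, i.e. f(t) = sqrt(t)
root = fun_calc(A, np.sqrt)
assert np.allclose(root @ root, A)
```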

*Proof of Theorem 0.22:* Let *σ* ∈ *BO*(*M*1, *M*2, *M*3′). As in [17], the conditions (M1) and (M2) imply that *σ* satisfies (P) and there is a function *f* : **R**+ → **R**+ subject to (F). If 0 ⩽ *t*1 ⩽ *t*2, then by (M1)

$$f(t\_1)I = I\,\sigma\,(t\_1I) \leqslant I\,\sigma\,(t\_2I) = f(t\_2)I,$$

i.e. *f*(*t*1) ⩽ *f*(*t*2). The assumption (M3') is enough to guarantee that *f* is continuous by Lemma 0.24. Then Lemma 0.25 results in *f*(*A*) = *IσA* for all *A* ⩾ 0. Now, (M1) and the fact that dim H = ∞ yield that *f* is operator monotone. If there is another *g* ∈ *OM*(**R**+) satisfying (F), then *f*(*t*)*I* = *I σ tI* = *g*(*t*)*I* for each *t* ⩾ 0, i.e. *f* = *g*. Thus, we establish a well-defined map *σ* ∈ *BO*(*M*1, *M*2, *M*3′) ↦ *f* ∈ *OM*(**R**+) such that *σ* and *f* satisfy (F).

Now, given *<sup>f</sup>* <sup>∈</sup> *OM*(**R**+), we construct *<sup>σ</sup>* from the integral representation (2) in Proposition 0.4. Define a binary operation *<sup>σ</sup>* : *<sup>B</sup>*(H)<sup>+</sup> <sup>×</sup> *<sup>B</sup>*(H)<sup>+</sup> <sup>→</sup> *<sup>B</sup>*(H)<sup>+</sup> by

$$A \,\sigma\, B = \int\_{[0,1]} A \,!\_t\, B \,d\mu(t) \tag{15}$$

where the integral is taken in the sense of Bochner. Consider *A*, *B* ∈ *B*(H)+ and set *Ft* = *A* !*t B* for each *t* ∈ [0, 1]. Since *A* ⩽ ‖*A*‖*I* and *B* ⩽ ‖*B*‖*I*, we get

$$A \,!\_t\, B \leqslant \|A\|I \,!\_t\, \|B\|I = \frac{\|A\|\|B\|}{t\|A\| + (1 - t)\|B\|}\, I.$$

By the Banach-Steinhaus theorem, there is an *M* > 0 such that ‖*Ft*‖ ⩽ *M* for all *t* ∈ [0, 1]. Hence,

$$\int\_{[0,1]} \|F\_t\| \, d\mu(t) \leqslant \int\_{[0,1]} M \, d\mu(t) < \infty.$$

So, *Ft* is Bochner integrable. Since *Ft* ⩾ 0 for all *t* ∈ [0, 1], we have ∫[0,1] *Ft dμ*(*t*) ⩾ 0. Thus, *A σ B* is a well-defined element in *B*(H)+. The monotonicity (M1) and the transformer inequality (M2) come from passing the monotonicity and the transformer inequality of the weighted harmonic mean through the Bochner integral. To show (M3'), let *An* ↓ *A* and *Bn* ↓ *B*. Then *An* !*t B* ↓ *A* !*t B* for *t* ∈ [0, 1] by the monotonicity and the separate continuity from above of the weighted harmonic mean. Let *ξ* ∈ H. Define a bounded linear map Φ : *B*(H) → **C** by Φ(*T*) = ⟨*Tξ*, *ξ*⟩.


For each *n* ∈ **N**, set *Tn*(*t*) = *An* !*<sup>t</sup> B* and put *T*∞(*t*) = *A* !*<sup>t</sup> B*. Then for each *n* ∈ **N** ∪ {∞}, Φ ◦ *Tn* is Bochner integrable and

$$\langle \int T\_n(t) \, d\mu(t)\, \xi, \xi \rangle = \Phi\Big(\int T\_n(t) \, d\mu(t)\Big) = \int \Phi \circ T\_n(t) \, d\mu(t).$$

Since *Tn*(*t*) ↓ *T*∞(*t*), we have that ⟨*Tn*(*t*)*ξ*, *ξ*⟩ → ⟨*T*∞(*t*)*ξ*, *ξ*⟩ as *n* → ∞ for each *t* ∈ [0, 1]. We obtain from the dominated convergence theorem that

$$\begin{aligned} \lim\_{n \to \infty} \langle (A\_n \,\sigma\, B)\xi, \xi \rangle &= \lim\_{n \to \infty} \langle \int T\_n(t) \, d\mu(t)\, \xi, \xi \rangle \\ &= \lim\_{n \to \infty} \int \langle T\_n(t)\xi, \xi \rangle \, d\mu(t) \\ &= \int \langle T\_\infty(t)\xi, \xi \rangle \, d\mu(t) \\ &= \langle \int T\_\infty(t)\, d\mu(t)\, \xi, \xi \rangle \\ &= \langle (A \,\sigma\, B)\xi, \xi \rangle. \end{aligned}$$

So, *An σ B* ↓ *A σ B*. Similarly, *A σ Bn* ↓ *A σ B*. Thus, *σ* satisfies (M3'). It is easy to see that *f*(*t*)*I* = *I σ* (*tI*) for *t* ⩾ 0. This shows that the map *σ* ↦ *f* is surjective.

To show the injectivity of this map, let *σ*1, *σ*2 ∈ *BO*(*M*1, *M*2, *M*3′) be such that *σi* ↦ *f* where, for each *t* ⩾ 0, *I σi* (*tI*) = *f*(*t*)*I*, *i* = 1, 2. Since *σi* satisfies the property (P), we have *I σi A* = *f*(*A*) for *A* ⩾ 0 by Lemma 0.25. Since *σi* satisfies the congruence invariance, we have that for *A* > 0 and *B* ⩾ 0,

$$A\,\sigma\_{\mathrm{i}}\,\mathrm{B} = A^{1/2}(\mathrm{I}\,\sigma\_{\mathrm{i}}\,\mathrm{A}^{-1/2}\mathrm{BA}^{-1/2})A^{1/2} = A^{1/2}f(\mathrm{A}^{-1/2}\mathrm{BA}^{-1/2})A^{1/2}, \quad \mathrm{i} = 1,2.$$

For each *A*, *B* ⩾ 0, we obtain by (M3') that

$$\begin{aligned} A \,\sigma\_1\, B &= \lim\_{\epsilon \downarrow 0} A\_\epsilon \,\sigma\_1\, B \\ &= \lim\_{\epsilon \downarrow 0} A\_\epsilon^{1/2} (I \,\sigma\_1\, A\_\epsilon^{-1/2} B A\_\epsilon^{-1/2}) A\_\epsilon^{1/2} \\ &= \lim\_{\epsilon \downarrow 0} A\_\epsilon^{1/2} f(A\_\epsilon^{-1/2} B A\_\epsilon^{-1/2}) A\_\epsilon^{1/2} \\ &= \lim\_{\epsilon \downarrow 0} A\_\epsilon^{1/2} (I \,\sigma\_2\, A\_\epsilon^{-1/2} B A\_\epsilon^{-1/2}) A\_\epsilon^{1/2} \\ &= \lim\_{\epsilon \downarrow 0} A\_\epsilon \,\sigma\_2\, B \\ &= A \,\sigma\_2\, B, \end{aligned}$$

where *A*ε ≡ *A* + ε*I*. That is, *σ*1 = *σ*2. Therefore, there is a bijection between *OM*(**R**+) and *BO*(*M*1, *M*2, *M*3′). Every element in *BO*(*M*1, *M*2, *M*3′) admits an integral representation (15). Since the weighted harmonic mean possesses the joint continuity (M3), so does every element in *BO*(*M*1, *M*2, *M*3′).

The next theorem is a fundamental result of [17].

**Theorem 0.26.** *There is a one-to-one correspondence between connections σ and operator monotone functions f on the non-negative reals satisfying*

$$f(t)I = I\sigma\left(tI\right), \quad t \in \mathbb{R}^+.\tag{16}$$

*There is a one-to-one correspondence between connections σ and finite Borel measures ν on* [0, ∞] *satisfying*

$$A\,\sigma\,B = \int\_{\left[0,\infty\right]} \frac{t+1}{t} (tA\,:\,B) \,d\nu(t), \quad A, B \geqslant 0. \tag{17}$$

*Moreover, the map σ* ↦ *f is an affine order-isomorphism between connections and non-negative operator monotone functions on* **R**+*. Here, the order-isomorphism means that when σi* ↦ *fi for i* = 1, 2*, we have A σ*1 *B* ⩽ *A σ*2 *B for all A*, *B* ∈ *B*(H)+ *if and only if f*1 ⩽ *f*2*.*

Each connection *<sup>σ</sup>* on *<sup>B</sup>*(H)<sup>+</sup> produces a unique scalar function on **<sup>R</sup>**+, denoted by the same notation, satisfying

$$(s\,\sigma\,t)I = (sI)\,\sigma\,(tI), \quad s, t \in \mathbb{R}^+.\tag{18}$$

Let *<sup>s</sup>*, *<sup>t</sup>* <sup>∈</sup> **<sup>R</sup>**+. If *<sup>s</sup>* <sup>&</sup>gt; 0, then *<sup>s</sup> <sup>σ</sup> <sup>t</sup>* <sup>=</sup> *s f*(*t*/*s*). If *<sup>t</sup>* <sup>&</sup>gt; 0, then *<sup>s</sup> <sup>σ</sup> <sup>t</sup>* <sup>=</sup> *t f*(*s*/*t*).


**Theorem 0.27.** *There is a one-to-one correspondence between connections and finite Borel measures on the unit interval. In fact, every connection takes the form*

$$A\,\sigma\,B = \int_{[0,1]} A\,!_t\,B\,d\mu(t), \quad A, B \geqslant 0 \tag{19}$$

*for some finite Borel measure μ on* [0, 1]*. Moreover, the map μ* ↦ *σ is affine and order-preserving. Here, order-preserving means that when μi* ↦ *σi (i = 1, 2), if μ*1(*E*) ⩽ *μ*2(*E*) *for all Borel sets E in* [0, 1]*, then A σ*1 *B* ⩽ *A σ*2 *B for all A*, *B* ∈ *B*(H)+*.*

*Proof.* The proof of the first part is contained in the proof of Theorem 0.22. This map is affine because of the linearity of the map *μ* ↦ ∫ *f dμ* on the set of finite positive measures and the bijective correspondence between connections and Borel measures. It is straightforward to show that this map is order-preserving.

**Remark 0.28.** Let us consider operator connections from electrical circuit viewpoint. A general connection represents a formulation of making a new impedance from two given impedances. The integral representation (19) shows that such a formulation can be described as a weighed series connection of (infinite) weighed harmonic means. From this point of view, the theory of operator connections can be regarded as a mathematical theory of electrical circuits.

**Definition 0.29.** Let *σ* be a connection. The operator monotone function *f* in (16) is called the *representing function* of *σ*. If *μ* is the measure corresponding to *σ* in Theorem 0.27, the measure *μψ*−1 that takes a Borel set *E* in [0, ∞] to *μ*(*ψ*−1(*E*)) is called the *representing measure* of *σ* in the Kubo-Ando theory. Here, *ψ* : [0, 1] → [0, ∞] is the homeomorphism *t* ↦ *t*/(1 − *t*).

Since every connection *σ* has an integral representation (19), properties of weighed harmonic means reflect properties of a general connection. Hence, every connection *σ* satisfies the following properties for all *A*, *B* ⩾ 0, *T* ∈ *B*(H) and invertible *X* ∈ *B*(H):

• *transformer inequality*: *T*\*(*A σ B*)*T* ⩽ (*T*\**AT*) *σ* (*T*\**BT*);
• *congruence invariance*: *X*\*(*A σ B*)*X* = (*X*\**AX*) *σ* (*X*\**BX*);
• *concavity*: (*tA* + (1 − *t*)*B*) *σ* (*tA*′ + (1 − *t*)*B*′) ⩾ *t*(*A σ A*′) + (1 − *t*)(*B σ B*′) for *t* ∈ [0, 1].
Moreover, if *A*, *B* > 0,

$$A\,\sigma\,B = A^{1/2}f(A^{-1/2}BA^{-1/2})A^{1/2} \tag{20}$$

and, in general, for each *A*, *B* ⩾ 0,

$$A \,\sigma \, B = \lim\_{\epsilon \downarrow 0} A\_{\epsilon} \,\sigma \, B\_{\epsilon} \tag{21}$$

where *Aε* ≡ *A* + *εI* and *Bε* ≡ *B* + *εI*. These properties are useful tools for deriving operator inequalities involving connections. The formulas (20) and (21) give a way of computing the formula of a connection from its representing function.

**Example 0.30.** 1. The left- and the right-trivial means have representing functions given by *t* ↦ 1 and *t* ↦ *t*, respectively. The representing measures of the left- and the right-trivial means are given respectively by *δ*0 and *δ*∞ where *δx* is the Dirac measure at *x*. So, the *α*-weighed arithmetic mean has the representing function *t* ↦ (1 − *α*) + *αt* and it has (1 − *α*)*δ*0 + *αδ*∞ as the representing measure.

2. The geometric mean has the representing function *t* ↦ *t*1/2.

3. The harmonic mean has the representing function *t* ↦ 2*t*/(1 + *t*) while *t* ↦ *t*/(1 + *t*) corresponds to the parallel sum.

**Remark 0.31.** The map *σ* ↦ *μ*, where *μ* is the representing measure of *σ*, is not order-preserving in general. Indeed, the representing measure of ∇ is given by *μ* = (*δ*0 + *δ*∞)/2 while the representing measure of ! is given by *δ*1. We have ! ⩽ ∇ but *δ*1 ⩽̸ *μ*.
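Formula (20) translates directly into a numerical procedure. The following NumPy sketch (our own illustration; the helper names are ours) evaluates *A σ B* for *A*, *B* > 0 from a representing function *f* via the spectral theorem, and checks the symmetry of the geometric mean, whose representing function is *f*(*t*) = *t*1/2:

```python
import numpy as np

def apply_fun(A, f):
    # f(A) for a symmetric positive definite matrix A, via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def connection(A, B, f):
    # Formula (20): A sigma B = A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{1/2}, A, B > 0.
    A_half = apply_fun(A, np.sqrt)
    A_ihalf = apply_fun(A, lambda t: 1.0 / np.sqrt(t))
    return A_half @ apply_fun(A_ihalf @ B @ A_ihalf, f) @ A_half

A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 1.0]])

# Geometric mean: representing function f(t) = sqrt(t); it is symmetric in A, B.
G_AB = connection(A, B, np.sqrt)
G_BA = connection(B, A, np.sqrt)
assert np.allclose(G_AB, G_BA)
```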

#### **6. Operator means**

According to [24], a (scalar) mean is a binary operation *M* on (0, ∞) such that *M*(*s*, *t*) lies between *s* and *t* for any *s*, *t* > 0. For a connection, this property is equivalent to various properties in the next theorem.

**Theorem 0.32.** *The following are equivalent for a connection σ on B*(H)+*:*

*(i) σ satisfies the* betweenness property*, i.e. A ⩽ B* ⇒ *A ⩽ A σ B ⩽ B.*
*(ii) σ satisfies the* fixed point property*, i.e. A σ A* = *A for all A* ∈ *B*(H)+*.*
*(iii) σ is normalized, i.e. I σ I* = *I.*
*(iv) the representing function f of σ is normalized, i.e. f*(1) = 1*.*
*(v) the representing measure μ of σ is normalized, i.e. μ is a probability measure.*
*Proof.* Clearly, (i) ⇒ (iii) ⇒ (iv). The implication (iii) ⇒ (ii) follows from the congruence invariance and the continuity from above of *σ*. The monotonicity of *σ* is used to prove (ii) ⇒ (i). Since

$$I\,\sigma\,I = \int_{[0,1]} I\,!_t\,I\,d\mu(t) = \mu([0,1])I,$$

we obtain that (iv) ⇒ (v) ⇒ (iii).


**Definition 0.33.** A *mean* is a connection satisfying one, and thus all, of the properties in the previous theorem.

Hence, every mean in Kubo-Ando's sense satisfies the desired properties (A1)–(A9) in Section 3. As a consequence of Theorem 0.32, a convex combination of means is a mean.

**Theorem 0.34.** *Given a Hilbert space* H*, there exist affine bijections between any pair of the following objects:*

*(i) the means on B*(H)+*;*
*(ii) the normalized operator monotone functions on* **R**+*;*
*(iii) the probability Borel measures on* [0, 1]*.*

*Moreover, the correspondence between (i) and (ii) is order isomorphic. Hence, there exists an affine order isomorphism between the means on the positive operators acting on different Hilbert spaces.*

*Proof.* Follow from Theorems 0.27 and 0.32.

**Example 0.35.** The left- and right-trivial means, weighed arithmetic means, the geometric mean and the harmonic mean are means. The parallel sum is not a mean since its representing function is not normalized.

**Example 0.36.** The function *t* �→ *t <sup>α</sup>* is an operator monotone function on **<sup>R</sup>**<sup>+</sup> for each *<sup>α</sup>* <sup>∈</sup> [0, 1] by the Löwner-Heinz's inequality. So it produces a mean, denoted by #*α*, on *<sup>B</sup>*(H)+. By the direct computation,

$$s\,\#_{\alpha}\,t = s^{1-\alpha}t^{\alpha}, \tag{22}$$

i.e. #*<sup>α</sup>* is the *α*-weighed geometric mean on **R**+. So the *α*-weighed geometric mean on **R**<sup>+</sup> is really a Kubo-Ando mean. The *<sup>α</sup>-weighed geometric mean* on *<sup>B</sup>*(H)<sup>+</sup> is defined to be the mean corresponding to that mean on **R**+. Since *t <sup>α</sup>* has an integral expression

$$t^{\alpha} = \frac{\sin \alpha\pi}{\pi}\int_0^{\infty} \frac{t\lambda^{\alpha-1}}{t+\lambda}\,dm(\lambda) \tag{23}$$

(see [7]) where *m* denotes the Lebesgue measure, the representing measure of #*<sup>α</sup>* is given by

$$d\mu(\lambda) = \frac{\sin \alpha \pi}{\pi} \frac{\lambda^{\alpha - 1}}{\lambda + 1} \, dm(\lambda). \tag{24}$$
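The integral expression (23) can be verified numerically. The sketch below (assuming SciPy is available; the test values *α* = 1/2 and *t* = 2 are arbitrary) evaluates the integral by quadrature and compares it with *t^α*:

```python
import numpy as np
from scipy.integrate import quad

# Numerical check of the integral representation (23) of t^alpha.
alpha, t = 0.5, 2.0
integrand = lambda lam: t * lam**(alpha - 1) / (t + lam)
# Split at lam = 1 to help the quadrature with the lam^(alpha-1)
# singularity at 0 and the infinite upper limit.
val = quad(integrand, 0, 1)[0] + quad(integrand, 1, np.inf)[0]
approx = np.sin(alpha * np.pi) / np.pi * val
assert abs(approx - t**alpha) < 1e-6
```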

**Example 0.37.** Consider the operator monotone function

$$t \mapsto \frac{t}{(1-\alpha)t + \alpha}, \quad t \geqslant 0,\ \alpha \in [0, 1].$$

The direct computation shows that

$$s\,!_{\alpha}\,t = \begin{cases} ((1-\alpha)s^{-1} + \alpha t^{-1})^{-1}, & s, t > 0; \\ 0, & \text{otherwise,} \end{cases} \tag{25}$$

which is the *<sup>α</sup>*-weighed harmonic mean. We define the *<sup>α</sup>*-*weighed harmonic mean* on *<sup>B</sup>*(H)<sup>+</sup> to be the mean corresponding to this operator monotone function.

**Example 0.38.** Consider the operator monotone function *f*(*t*)=(*t* − 1)/ log *t* for *t* > 0, *t* �= 1, *<sup>f</sup>*(0) <sup>≡</sup> 0 and *<sup>f</sup>*(1) <sup>≡</sup> 1. Then it gives rise to a mean, denoted by *<sup>λ</sup>*, on *<sup>B</sup>*(H)+. By the direct computation,

$$s\,\lambda\,t = \begin{cases} \dfrac{s-t}{\log s - \log t}, & s, t > 0,\ s \neq t; \\ s, & s = t; \\ 0, & \text{otherwise,} \end{cases} \tag{26}$$

i.e. *λ* is the logarithmic mean on **R**+. So the logarithmic mean on **R**<sup>+</sup> is really a mean in Kubo-Ando's sense. The *logarithmic mean* on *<sup>B</sup>*(H)<sup>+</sup> is defined to be the mean corresponding to this operator monotone function.
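A small scalar sketch of formula (26) (our own illustration), checking that the logarithmic mean lies between the geometric and arithmetic means and that the piecewise definition is continuous at *s* = *t*:

```python
import numpy as np

def log_mean(s, t):
    # Formula (26), restricted to s, t > 0.
    return s if s == t else (s - t) / (np.log(s) - np.log(t))

# The logarithmic mean lies between the geometric and arithmetic means:
s, t = 4.0, 9.0
assert np.sqrt(s * t) <= log_mean(s, t) <= (s + t) / 2
# and the definition is continuous across s = t:
assert abs(log_mean(2.0 + 1e-9, 2.0) - 2.0) < 1e-6
```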

**Example 0.39.** The map *t* ↦ (*t^r* + *t*^(1−*r*))/2 is operator monotone for any *r* ∈ [0, 1]. This function produces a mean on *B*(H)+. The computation shows that

$$(s, t) \mapsto \frac{s^r t^{1-r} + s^{1-r}t^r}{2}.$$

However, the corresponding mean on *<sup>B</sup>*(H)<sup>+</sup> is not given by the formula

$$(A, B) \mapsto \frac{A^r B^{1-r} + A^{1-r} B^r}{2} \tag{27}$$

since it is not a binary operation on *<sup>B</sup>*(H)+. In fact, the formula (27) is considered in [8], called the *Heinz mean* of *A* and *B*.

**Example 0.40.** For each *p* ∈ [−1, 1] and *α* ∈ [0, 1], the map

$$t \mapsto [(1-\alpha) + \alpha t^p]^{1/p}$$

is an operator monotone function on **R**+. Here, the case *p* = 0 is understood as the limit *p* → 0. Then

$$s\,\#_{p,\alpha}\,t = [(1-\alpha)s^p + \alpha t^p]^{1/p}. \tag{28}$$

The corresponding mean on *B*(H)+ is called the *quasi-arithmetic power mean* with parameter (*p*, *α*), defined for *A* > 0 and *B* ⩾ 0 by

$$A\,\#_{p,\alpha}\,B = A^{1/2}\big[(1-\alpha)I + \alpha (A^{-1/2}BA^{-1/2})^p\big]^{1/p}A^{1/2}. \tag{29}$$

The class of quasi-arithmetic power means contains many kinds of means: The mean #1,*α* is the *α*-weighed arithmetic mean. The case #0,*α* is the *α*-weighed geometric mean. The case #−1,*α* is the *α*-weighed harmonic mean. The mean #*p*,1/2 is the *power mean* or *binomial mean* of order *p*. These means satisfy the property that

$$A \#\_{p,a} B = B \#\_{p,1-a} A. \tag{30}$$

Moreover, they are interpolated in the sense that for all *p*, *q*, *α* ∈ [0, 1],

$$(A\,\#_{r,p}\,B)\ \#_{r,\alpha}\ (A\,\#_{r,q}\,B) = A\,\#_{r,(1-\alpha)p+\alpha q}\,B. \tag{31}$$
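A NumPy sketch of formula (29) (our own illustration; helper names are ours), checking that the cases *p* = 1 and *p* = −1 reduce to the weighed arithmetic and harmonic means, and spot-checking property (30):

```python
import numpy as np

def apply_fun(A, f):
    # f(A) for a symmetric positive definite A via the spectral theorem.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def power_mean(A, B, p, alpha):
    # Formula (29): A #_{p,alpha} B for A > 0 and p != 0.
    Ah = apply_fun(A, np.sqrt)
    Aih = apply_fun(A, lambda t: t**-0.5)
    C = apply_fun(Aih @ B @ Aih, lambda t: t**p)
    M = (1 - alpha) * np.eye(len(A)) + alpha * C
    return Ah @ apply_fun(M, lambda t: t**(1.0 / p)) @ Ah

A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 1.0]])
alpha = 0.3

# p = 1: the alpha-weighed arithmetic mean.
assert np.allclose(power_mean(A, B, 1, alpha), (1 - alpha) * A + alpha * B)
# p = -1: the alpha-weighed harmonic mean.
harm = np.linalg.inv((1 - alpha) * np.linalg.inv(A) + alpha * np.linalg.inv(B))
assert np.allclose(power_mean(A, B, -1, alpha), harm)
# Property (30): A #_{p,alpha} B = B #_{p,1-alpha} A.
assert np.allclose(power_mean(A, B, -1, alpha), power_mean(B, A, -1, 1 - alpha))
```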

**Example 0.41.** If *σ*1, *σ*2 are means such that *σ*1 ⩽ *σ*2, then there is a family of means that interpolates between *σ*1 and *σ*2, namely, (1 − *α*)*σ*1 + *ασ*2 for *α* ∈ [0, 1]. Note that the map *α* ↦ (1 − *α*)*σ*1 + *ασ*2 is increasing. For instance, the *Heron mean* with weight *α* ∈ [0, 1] is defined to be *hα* = (1 − *α*) # + *α* ∇. This family is the linear interpolation between the geometric mean and the arithmetic mean. The representing function of *hα* is given by

$$t \mapsto (1 - \alpha)t^{1/2} + \frac{\alpha}{2}(1 + t).$$

The case *α* = 2/3 is called the *Heronian mean* in the literature.
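A scalar sketch (our own illustration) of the Heron representing function, checking its endpoints and its monotonicity in *α*:

```python
import numpy as np

# Representing function of the Heron mean h_alpha:
# t -> (1 - alpha) t^{1/2} + (alpha/2)(1 + t).
h = lambda a, t: (1 - a) * np.sqrt(t) + (a / 2) * (1 + t)

t = 3.0
assert abs(h(0.0, t) - np.sqrt(t)) < 1e-12    # alpha = 0: geometric mean
assert abs(h(1.0, t) - (1 + t) / 2) < 1e-12   # alpha = 1: arithmetic mean
assert h(0.2, t) <= h(0.8, t)                 # increasing in alpha, since AM >= GM
```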


#### **7. Applications to operator monotonicity and concavity**

In this section, we generalize the matrix and operator monotonicity and concavity in the literature (see e.g. [3, 9]) in such a way that the geometric mean, the harmonic mean or specific operator means are replaced by general connections. Recall the following terminology. A continuous function *f* : *I* → **R** is called an *operator concave function* if

$$f(tA + (1 - t)B) \geqslant tf(A) + (1 - t)f(B)$$

for any *t* ∈ [0, 1] and Hermitian operators *A*, *B* ∈ *B*(H) whose spectra are contained in the interval *I* and for all Hilbert spaces H. A well-known result is that a continuous function *f* : **R**+ → **R**+ is operator monotone if and only if it is operator concave. Hence, the maps *t* ↦ *t^r* and *t* ↦ log *t* are operator concave for *r* ∈ [0, 1]. Let H and K be Hilbert spaces. A map Φ : *B*(H) → *B*(K) is said to be *positive* if Φ(*A*) ⩾ 0 whenever *A* ⩾ 0. It is called *unital* if Φ(*I*) = *I*. We say that a positive map Φ is *strictly positive* if Φ(*A*) > 0 when *A* > 0. A map Ψ from a convex subset C of *B*(H)*sa* to *B*(K)*sa* is called *concave* if for each *A*, *B* ∈ C and *t* ∈ [0, 1],

$$
\Psi(tA + (1-t)B) \geqslant t\Psi(A) + (1-t)\Psi(B).
$$

A map Ψ : *B*(H)*sa* → *B*(K)*sa* is called *monotone* if *A* ⩽ *B* implies Ψ(*A*) ⩽ Ψ(*B*). So, in particular, the map *A* ↦ *A^r* is monotone and concave on *B*(H)+ for each *r* ∈ [0, 1]. The map *A* ↦ log *A* is monotone and concave on *B*(H)++.

Note first that, from the previous section, the quasi-arithmetic power mean (*A*, *B*) ↦ *A* #*p*,*α* *B* is monotone and concave for any *p* ∈ [−1, 1] and *α* ∈ [0, 1]; in particular, the weighed arithmetic, geometric and harmonic means are monotone and concave.


Recall the following lemma from [9].

**Lemma 0.42** (Choi's inequality)**.** *If* Φ : *B*(H) → *B*(K) *is linear, strictly positive and unital, then for every A* > 0*,* Φ(*A*)−1 ⩽ Φ(*A*−1)*.*

**Proposition 0.43.** *If* Φ : *B*(H) → *B*(K) *is linear and strictly positive, then for any A*, *B* > 0

$$
\Phi(A)\Phi(B)^{-1}\Phi(A) \leqslant \Phi(AB^{-1}A).\tag{32}
$$

*Proof.* For each *X* ∈ *B*(H), set Ψ(*X*) = Φ(*A*)−1/2Φ(*A*1/2*XA*1/2)Φ(*A*)−1/2. Then Ψ is a unital strictly positive linear map. So, by Choi's inequality, Ψ(*A*)−1 ⩽ Ψ(*A*−1) for all *A* > 0. For each *A*, *B* > 0, we have by Lemma 0.42 that

$$\Phi(A)^{1/2}\Phi(B)^{-1}\Phi(A)^{1/2} = \Psi(A^{-1/2}BA^{-1/2})^{-1}$$

$$\leqslant \Psi\left((A^{-1/2}BA^{-1/2})^{-1}\right)$$

$$= \Phi(A)^{-1/2}\Phi(AB^{-1}A)\Phi(A)^{-1/2}.$$

So, we have the claim.

**Theorem 0.44.** *If* Φ : *B*(H) → *B*(K) *is a positive linear map which is norm-continuous, then for any connection <sup>σ</sup> on B*(K)<sup>+</sup> *and for each A*, *<sup>B</sup>* <sup>&</sup>gt; <sup>0</sup>*,*

$$
\Phi(A\,\sigma\,B) \leqslant \Phi(A)\,\sigma\,\Phi(B). \tag{33}
$$

*If, in addition,* Φ *is strongly continuous, then* (33) *holds for any A*, *B* ⩾ 0*.*

*Proof.* First, consider *A*, *B* > 0. Assume that Φ is strictly positive. For each *X* ∈ *B*(H), set

$$\Psi(X) = \Phi(B)^{-1/2} \Phi(B^{1/2} X B^{1/2}) \Phi(B)^{-1/2}.$$

Then Ψ is a unital strictly positive linear map. So, by Choi's inequality, Ψ(*C*)−1 ⩽ Ψ(*C*−1) for all *C* > 0. For each *t* ∈ [0, 1], put *Xt* = *B*−1/2(*A* !*t* *B*)*B*−1/2 > 0. We obtain from the previous proposition that

$$\begin{aligned} \Phi(A\,!_t\,B) &= \Phi(B)^{1/2}\Psi(X_t)\Phi(B)^{1/2} \\ &\leqslant \Phi(B)^{1/2}[\Psi(X_t^{-1})]^{-1}\Phi(B)^{1/2} \\ &= \Phi(B)[\Phi(B((1-t)A^{-1} + tB^{-1})B)]^{-1}\Phi(B) \\ &= \Phi(B)[(1-t)\Phi(BA^{-1}B) + t\Phi(B)]^{-1}\Phi(B) \\ &\leqslant \Phi(B)[(1-t)\Phi(B)\Phi(A)^{-1}\Phi(B) + t\Phi(B)]^{-1}\Phi(B) \\ &= \Phi(A)\,!_t\,\Phi(B). \end{aligned}$$

For the general case of Φ, consider the family Φ*ε*(*A*) = Φ(*A*) + *εI* where *ε* > 0. Since the map (*A*, *B*) ↦ *A* !*t* *B* = [(1 − *t*)*A*−1 + *tB*−1]−1 is norm-continuous, we arrive at

$$
\Phi(A\,!_t\,B) \leqslant \Phi(A)\,!_t\,\Phi(B).
$$

For each connection *σ*, since Φ is a bounded linear operator, we have

$$\begin{aligned} \Phi(A \,\sigma \, B) &= \Phi(\int\_{[0,1]} A \, !\_t \, B \, d\mu(t)) = \int\_{[0,1]} \Phi(A \, !\_t \, B) \, d\mu(t) \\ &\leqslant \int\_{[0,1]} \Phi(A) \, !\_t \, \Phi(B) \, d\mu(t) = \Phi(A) \, \sigma \, \Phi(B) .\end{aligned}$$

Suppose further that Φ is strongly continuous. Then, for each *A*, *B* ⩾ 0,

$$\begin{aligned} \Phi(A \,\, \sigma \,\, B) &= \Phi(\lim\_{\epsilon \downarrow 0} (A + \epsilon I) \,\, \sigma \,(B + \epsilon I)) = \lim\_{\epsilon \downarrow 0} \Phi((A + \epsilon I) \,\, \sigma \,(B + \epsilon I)) \\ &\leqslant \lim\_{\epsilon \downarrow 0} \Phi(A + \epsilon I) \,\, \sigma \, \Phi(B + \epsilon I) = \Phi(A) \,\, \sigma \, \Phi(B). \end{aligned}$$

The proof is complete.


As a special case, if Φ : *Mn*(**C**) → *Mn*(**C**) is a positive linear map, then for any connection *σ* and for any positive semidefinite matrices *A*, *B* ∈ *Mn*(**C**), we have

$$
\Phi(A\,\sigma\,B) \leqslant \Phi(A)\,\sigma\,\Phi(B).
$$

In particular, Φ(*A* #*p*,*α* *B*) ⩽ Φ(*A*) #*p*,*α* Φ(*B*) for any *p* ∈ [−1, 1] and *α* ∈ [0, 1].
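Theorem 0.44 can be observed numerically. The sketch below (our own illustration) takes Φ to be the pinching onto the diagonal, a positive unital linear map on *Mn*(**C**), takes *σ* = #, and checks that Φ(*A* # *B*) ⩽ Φ(*A*) # Φ(*B*) on random positive definite matrices:

```python
import numpy as np

def apply_fun(A, f):
    # f(A) for a symmetric positive definite A via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def geom_mean(A, B):
    # A # B = A^{1/2}(A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}, formula (20).
    Ah, Aih = apply_fun(A, np.sqrt), apply_fun(A, lambda t: t**-0.5)
    return Ah @ apply_fun(Aih @ B @ Aih, np.sqrt) @ Ah

# Phi = pinching onto the diagonal: a positive unital linear map.
pinch = lambda X: np.diag(np.diag(X))

rng = np.random.default_rng(1)
for _ in range(100):
    M, N = rng.normal(size=(2, 3, 3))
    A = M @ M.T + 1e-3 * np.eye(3)
    B = N @ N.T + 1e-3 * np.eye(3)
    # Phi(A # B) <= Phi(A) # Phi(B): the gap must be positive semidefinite.
    gap = geom_mean(pinch(A), pinch(B)) - pinch(geom_mean(A, B))
    assert np.linalg.eigvalsh(gap).min() >= -1e-8
```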

**Theorem 0.45.** *If* <sup>Φ</sup>1, <sup>Φ</sup><sup>2</sup> : *<sup>B</sup>*(H)<sup>+</sup> <sup>→</sup> *<sup>B</sup>*(K)<sup>+</sup> *are concave, then the map*

$$(A\_1, A\_2) \mapsto \Phi\_1(A\_1) \,\sigma \,\Phi\_2(A\_2) \tag{34}$$

*is concave for any connection <sup>σ</sup> on B*(K)+*.*

*Proof.* Let *A*1, *A*′1, *A*2, *A*′2 ⩾ 0 and *t* ∈ [0, 1]. The concavity of Φ1 and Φ2 means that for *i* = 1, 2

$$
\Phi\_i(tA\_i + (1-t)A\_i') \geqslant t\Phi\_i(A\_i) + (1-t)\Phi\_i(A\_i').
$$

It follows from the monotonicity and concavity of *σ* that

$$\begin{aligned} \Phi_1(tA_1 + (1-t)A_1')\,\sigma\,\Phi_2(tA_2 + (1-t)A_2') &\geqslant \left[t\Phi_1(A_1) + (1-t)\Phi_1(A_1')\right]\sigma\left[t\Phi_2(A_2) + (1-t)\Phi_2(A_2')\right] \\ &\geqslant t\left[\Phi_1(A_1)\,\sigma\,\Phi_2(A_2)\right] + (1-t)\left[\Phi_1(A_1')\,\sigma\,\Phi_2(A_2')\right]. \end{aligned}$$

This shows the concavity of the map (*A*1, *A*2) �→ Φ1(*A*1) *σ* Φ2(*A*2) .

In particular, if Φ<sup>1</sup> and Φ<sup>2</sup> are concave, then so is (*A*, *B*) �→ Φ1(*A*) #*p*,*α*Φ2(*B*) for *p* ∈ [−1, 1] and *α* ∈ [0, 1].

**Corollary 0.46.** *Let σ be a connection. Then, for any operator monotone functions f*, *g* : **R**+ → **R**+*, the map* (*A*, *B*) ↦ *f*(*A*) *σ g*(*B*) *is concave. In particular,*

*(1) the map* (*A*, *B*) ↦ *A^r σ B^s is concave on B*(H)+ *for any r*, *s* ∈ [0, 1]*,*
*(2) the map* (*A*, *B*) ↦ (log *A*) *σ* (log *B*) *is concave on B*(H)++*.*
**Theorem 0.47.** *If* <sup>Φ</sup>1, <sup>Φ</sup><sup>2</sup> : *<sup>B</sup>*(H)<sup>+</sup> <sup>→</sup> *<sup>B</sup>*(K)<sup>+</sup> *are monotone, then the map*

$$(A\_1, A\_2) \mapsto \Phi\_1(A\_1) \,\sigma \,\Phi\_2(A\_2) \tag{35}$$

*is monotone for any connection <sup>σ</sup> on B*(K)+*.*

*Proof.* Let *A*1 ⩽ *A*′1 and *A*2 ⩽ *A*′2. Then Φ1(*A*1) ⩽ Φ1(*A*′1) and Φ2(*A*2) ⩽ Φ2(*A*′2) by the monotonicity of Φ1 and Φ2. Now, the monotonicity of *σ* forces Φ1(*A*1) *σ* Φ2(*A*2) ⩽ Φ1(*A*′1) *σ* Φ2(*A*′2).

In particular, if Φ<sup>1</sup> and Φ<sup>2</sup> are monotone, then so is (*A*, *B*) �→ Φ1(*A*) #*p*,*α*Φ2(*B*) for *p* ∈ [−1, 1] and *α* ∈ [0, 1].

**Corollary 0.48.** *Let <sup>σ</sup> be a connection. Then, for any operator monotone functions f* , *<sup>g</sup>* : **<sup>R</sup>**<sup>+</sup> <sup>→</sup> **<sup>R</sup>**+*, the map* (*A*, *B*) �→ *f*(*A*) *σ g*(*B*) *is monotone. In particular,*

*(1) the map* (*A*, *<sup>B</sup>*) �→ *<sup>A</sup><sup>r</sup> <sup>σ</sup> <sup>B</sup><sup>s</sup> is monotone on B*(H)<sup>+</sup> *for any r*,*<sup>s</sup>* <sup>∈</sup> [0, 1]*,*

*(2) the map* (*A*, *<sup>B</sup>*) �→ (log *<sup>A</sup>*) *<sup>σ</sup>* (log *<sup>B</sup>*) *is monotone on B*(H)++*.*

**Corollary 0.49.** *Let σ be a connection on B*(H)+*. If* Φ1, Φ2 : *B*(H)+ → *B*(H)+ *are monotone and strongly continuous, then the map*

$$(A, B) \mapsto \Phi\_1(A) \,\sigma \,\Phi\_2(B) \tag{36}$$

*is a connection on B*(H)+*. Hence, the map*

$$(A, B) \mapsto f(A)\,\sigma\,g(B) \tag{37}$$

*is a connection for any operator monotone functions f* , *<sup>g</sup>* : **<sup>R</sup>**<sup>+</sup> <sup>→</sup> **<sup>R</sup>**+*.*

*Proof.* The monotonicity of this map follows from the previous result. It is easy to see that this map satisfies the transformer inequality. Since Φ1 and Φ2 are strongly continuous, this binary operation satisfies the (separate or joint) continuity from above. The last statement follows from the fact that if *An* ↓ *A*, then Sp(*An*) ⊆ [0, ‖*A*1‖] for all *n* and hence *f*(*An*) → *f*(*A*).

#### **8. Applications to operator inequalities**

In this section, we apply Kubo-Ando's theory in order to get simple proofs of many classical inequalities in the context of operators.

**Theorem 0.50** (AM-LM-GM-HM inequalities)**.** *For A*, *B* ⩾ 0*, we have*

$$A\,!\,B \leqslant A\,\#\,B \leqslant A\,\lambda\,B \leqslant A\,\nabla\,B. \tag{38}$$

*Proof.* It is easy to see that, for each *t* > 0, *t* �= 1,

$$\frac{2t}{1+t} \le t^{1/2} \le \frac{t-1}{\log t} \le \frac{1+t}{2}.$$

Now, we apply the order isomorphism which converts inequalities of operator monotone functions to inequalities of the associated operator connections.

**Theorem 0.51** (Weighed AM-GM-HM inequalities)**.** *For A*, *B* ⩾ 0 *and α* ∈ [0, 1]*, we have*

$$A\,!_{\alpha}\,B \leqslant A\,\#_{\alpha}\,B \leqslant A\,\nabla_{\alpha}\,B. \tag{39}$$

*Proof.* Apply the order isomorphism to the following inequalities:

$$\frac{t}{(1-\alpha)t+\alpha} \leqslant t^{\alpha} \leqslant 1-\alpha+\alpha t, \quad t \geqslant 0.$$
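The scalar inequalities underlying Theorems 0.50 and 0.51 are easy to spot-check numerically (our own sketch; the sampling ranges are arbitrary):

```python
import numpy as np

# HM <= GM <= LM <= AM, and the weighed versions, for random t > 0.
rng = np.random.default_rng(0)
for t in rng.uniform(0.01, 10.0, 1000):
    if abs(t - 1.0) < 1e-6:
        continue  # the logarithmic-mean formula needs t != 1
    hm, gm = 2 * t / (1 + t), np.sqrt(t)
    lm, am = (t - 1) / np.log(t), (1 + t) / 2
    assert hm <= gm + 1e-12 and gm <= lm + 1e-12 and lm <= am + 1e-12
    a = rng.uniform(0.0, 1.0)
    assert t / ((1 - a) * t + a) <= t**a + 1e-12
    assert t**a <= 1 - a + a * t + 1e-12
```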

The next two theorems are given in [21].

**Theorem 0.52.** *For each i* <sup>=</sup> 1, ··· , *n, let Ai*, *Bi* <sup>∈</sup> *<sup>B</sup>*(H)+*. Then for each connection <sup>σ</sup>*

$$\sum_{i=1}^{n} (A_i\,\sigma\,B_i) \leqslant \sum_{i=1}^{n} A_i\ \sigma\ \sum_{i=1}^{n} B_i. \tag{40}$$

*Proof.* Use the concavity of *σ* together with the induction.

By replacing *σ* with appropriate connections, we get some interesting inequalities. (1) Cauchy-Schwarz's inequality: For *Ai*, *Bi* <sup>∈</sup> *<sup>B</sup>*(H)*sa*,

$$\sum_{i=1}^{n} (A_i^2\,\#\,B_i^2) \leqslant \sum_{i=1}^{n} A_i^2\ \#\ \sum_{i=1}^{n} B_i^2. \tag{41}$$

(2) Hölder's inequality: For *Ai*, *Bi* <sup>∈</sup> *<sup>B</sup>*(H)<sup>+</sup> and *<sup>p</sup>*, *<sup>q</sup>* <sup>&</sup>gt; 0 such that 1/*<sup>p</sup>* <sup>+</sup> 1/*<sup>q</sup>* <sup>=</sup> 1,

$$\sum_{i=1}^{n} (A_i^p\,\#_{1/p}\,B_i^q) \leqslant \sum_{i=1}^{n} A_i^p\ \#_{1/p}\ \sum_{i=1}^{n} B_i^q. \tag{42}$$

(3) Minkowski's inequality: For *Ai*, *Bi* <sup>∈</sup> *<sup>B</sup>*(H)++,

$$
\left(\sum\_{i=1}^{n} (A\_i + B\_i)^{-1}\right)^{-1} \geqslant \left(\sum\_{i=1}^{n} A\_i^{-1}\right)^{-1} + \left(\sum\_{i=1}^{n} B\_i^{-1}\right)^{-1}.\tag{43}
$$
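Inequality (40) with *σ* = # (the Cauchy-Schwarz-type case) can be checked numerically; the sketch below (our own illustration) verifies that the sum of geometric means is dominated by the geometric mean of the sums:

```python
import numpy as np

def apply_fun(A, f):
    # f(A) for a symmetric positive definite A via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def geom_mean(A, B):
    # A # B = A^{1/2}(A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}, formula (20).
    Ah, Aih = apply_fun(A, np.sqrt), apply_fun(A, lambda t: t**-0.5)
    return Ah @ apply_fun(Aih @ B @ Aih, np.sqrt) @ Ah

rng = np.random.default_rng(7)
Ms = rng.normal(size=(4, 3, 3))
As = [M @ M.T + 0.1 * np.eye(3) for M in Ms[:2]]
Bs = [M @ M.T + 0.1 * np.eye(3) for M in Ms[2:]]

# (40) with sigma = #: sum of geometric means <= geometric mean of sums,
# i.e. the difference must be positive semidefinite.
lhs = sum(geom_mean(A, B) for A, B in zip(As, Bs))
rhs = geom_mean(sum(As), sum(Bs))
assert np.linalg.eigvalsh(rhs - lhs).min() >= -1e-10
```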

**Theorem 0.53.** *Let Ai*, *Bi* <sup>∈</sup> *<sup>B</sup>*(H)+*, i* <sup>=</sup> 1, ··· , *n, be such that*

$$A\_1 - A\_2 - \dots - A\_n \geqslant 0 \quad \text{and} \quad B\_1 - B\_2 - \dots - B\_n \geqslant 0.$$

*Then*


$$A\_1\, \sigma\, B\_1 - \sum\_{i=2}^n A\_i\, \sigma\, B\_i \geqslant \left( A\_1 - \sum\_{i=2}^n A\_i \right) \sigma \left( B\_1 - \sum\_{i=2}^n B\_i \right). \tag{44}$$

*Proof.* Substitute *A*1 − *A*2 − ··· − *An* for *A*1 and *B*1 − *B*2 − ··· − *Bn* for *B*1 in (40).

Here are some consequences.

(1) Aczél's inequality: For *Ai*, *Bi* ∈ *B*(H)*sa*, if

$$A\_1^2 - A\_2^2 - \dots - A\_n^2 \geqslant 0 \quad \text{and} \quad B\_1^2 - B\_2^2 - \dots - B\_n^2 \geqslant 0,$$

then

$$A\_1^2 \# B\_1^2 - \sum\_{i=2}^n A\_i^2 \# B\_i^2 \geqslant \left(A\_1^2 - \sum\_{i=2}^n A\_i^2\right) \# \left(B\_1^2 - \sum\_{i=2}^n B\_i^2\right). \tag{45}$$

(2) Popoviciu's inequality: For *Ai*, *Bi* ∈ *B*(H)+, if *p*, *q* > 0 are such that 1/*p* + 1/*q* = 1 and

$$A\_1^p - A\_2^p - \dots - A\_n^p \geqslant 0 \quad \text{and} \quad B\_1^q - B\_2^q - \dots - B\_n^q \geqslant 0,$$

then

$$A\_1^p \#\_{1/p} B\_1^q - \sum\_{i=2}^n A\_i^p \#\_{1/p} B\_i^q \geqslant \left( A\_1^p - \sum\_{i=2}^n A\_i^p \right) \#\_{1/p} \left( B\_1^q - \sum\_{i=2}^n B\_i^q \right). \tag{46}$$

(3) Bellman's inequality: For *Ai*, *Bi* ∈ *B*(H)++, if

$$A\_1^{-1} - A\_2^{-1} - \dots - A\_n^{-1} > 0 \quad \text{and} \quad B\_1^{-1} - B\_2^{-1} - \dots - B\_n^{-1} > 0,$$

then

$$\left[ (A\_1 + B\_1)^{-1} - \sum\_{i=2}^n (A\_i + B\_i)^{-1} \right]^{-1} \leqslant \left( A\_1^{-1} - \sum\_{i=2}^n A\_i^{-1} \right)^{-1} + \left( B\_1^{-1} - \sum\_{i=2}^n B\_i^{-1} \right)^{-1}.\tag{47}$$
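As another finite-dimensional sanity check, the sketch below (illustrative, not from the chapter) verifies Aczél's inequality (45) for random matrices whose squares satisfy the dominance hypotheses:

```python
import numpy as np

def sqrtm_psd(A):
    # square root of a symmetric PSD matrix via eigendecomposition
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0, None))) @ V.T

def gmean(A, B):
    # geometric mean A # B = A^{1/2}(A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}
    R = sqrtm_psd(A); Rinv = np.linalg.inv(R)
    return R @ sqrtm_psd(Rinv @ B @ Rinv) @ R

def spd(rng, n, shift=0.0):
    X = rng.standard_normal((n, n))
    return X @ X.T + shift * np.eye(n)

rng = np.random.default_rng(5)
n = 3
# A1^2 dominates A2^2 + A3^2 (and likewise for B), so the hypotheses of (45) hold
A = [spd(rng, n, 20.0), 0.3 * spd(rng, n), 0.3 * spd(rng, n)]
B = [spd(rng, n, 20.0), 0.3 * spd(rng, n), 0.3 * spd(rng, n)]
A2 = [M @ M for M in A]; B2 = [M @ M for M in B]
assert np.linalg.eigvalsh(A2[0] - A2[1] - A2[2]).min() > 0

lhs = gmean(A2[0], B2[0]) - gmean(A2[1], B2[1]) - gmean(A2[2], B2[2])
rhs = gmean(A2[0] - A2[1] - A2[2], B2[0] - B2[1] - B2[2])
# (45): lhs - rhs should be positive semidefinite
print(np.linalg.eigvalsh(lhs - rhs).min() >= -1e-6)
```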

The mean-theoretic approach can be used to prove the famous Furuta inequality as follows. We cite [14] for the proof.

**Theorem 0.54** (Furuta's inequality)**.** *For A* ⩾ *B* ⩾ 0*, we have*

$$(B^r A^p B^r)^{1/q} \geqslant B^{(p+2r)/q} \tag{48}$$

$$A^{(p+2r)/q} \geqslant (A^r B^p A^r)^{1/q} \tag{49}$$

*where r* ⩾ 0*, p* ⩾ 0*, q* ⩾ 1 *and* (1 + 2*r*)*q* ⩾ *p* + 2*r.*

*Proof.* By the continuity argument, we may assume that *A*, *B* > 0. Note that (48) and (49) are equivalent. Indeed, if (48) holds, then (49) follows by applying (48) to the pair *B*<sup>−1</sup> ⩾ *A*<sup>−1</sup> and taking inverses on both sides. To prove (48), first consider the case 0 ⩽ *p* ⩽ 1. We have *B*<sup>*p*+2*r*</sup> = *B<sup>r</sup>B<sup>p</sup>B<sup>r</sup>* ⩽ *B<sup>r</sup>A<sup>p</sup>B<sup>r</sup>* and the Löwner-Heinz inequality (LH) implies the desired result. Now, consider the case *p* ⩾ 1 and *q* = (*p* + 2*r*)/(1 + 2*r*), since (48) for *q* > (*p* + 2*r*)/(1 + 2*r*) can be obtained from (LH). Let *f*(*t*) = *t*<sup>1/*q*</sup> and let *σ* be the associated connection (in fact, *σ* = #<sub>1/*q*</sub>). We must show that, for any *r* ⩾ 0,


$$B^{-2r}\, \sigma\, A^p \geqslant B. \tag{50}$$

For 0 ⩽ *r* ⩽ 1/2, we have by (LH) that *A*<sup>2*r*</sup> ⩾ *B*<sup>2*r*</sup>, hence *B*<sup>−2*r*</sup> ⩾ *A*<sup>−2*r*</sup>, and

$$B^{-2r}\sigma A^p \geqslant A^{-2r}\sigma A^p = A^{-2r(1-1/q)}A^{p/q} = A \geqslant B = B^{-2r}\sigma B^p.$$

Now, set *s* = 2*r* + 1/2 and *q*<sub>1</sub> = (*p* + 2*s*)/(1 + 2*s*) ⩾ 1. Let *f*<sub>1</sub>(*t*) = *t*<sup>1/*q*<sub>1</sub></sup> and consider the associated connection *σ*<sub>1</sub>. The previous step, the monotonicity and the congruence invariance of connections imply that

$$\begin{aligned} B^{-2s}\, \sigma\_1\, A^p &= B^{-r} [B^{-(2r+1)}\, \sigma\_1\, (B^r A^p B^r)] B^{-r} \\ &\geqslant B^{-r} [(B^r A^p B^r)^{-1/q}\, \sigma\_1\, (B^r A^p B^r)] B^{-r} \\ &= B^{-r} (B^r A^p B^r)^{1/q} B^{-r} \\ &\geqslant B^{-r} B^{1+2r} B^{-r} \\ &= B. \end{aligned}$$

Note that the above result holds for *A*, *B* ⩾ 0 via the continuity of a connection. The desired inequality (50) holds for all *r* ⩾ 0 by repeating this process.
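A quick numerical check of Furuta's inequality (48) in matrix form may be illuminating; the sketch below (illustrative and finite-dimensional only, not from the chapter) uses the critical exponent *q* = (*p* + 2*r*)/(1 + 2*r*):

```python
import numpy as np

def powm(A, t):
    # fractional power of a symmetric positive definite matrix
    w, V = np.linalg.eigh(A)
    return (V * w**t) @ V.T

rng = np.random.default_rng(1)
n = 4
X = rng.standard_normal((n, n))
B = X @ X.T + np.eye(n)           # B > 0
Y = rng.standard_normal((n, n))
A = B + Y @ Y.T                   # A >= B

p, r = 3.0, 1.0
q = (p + 2*r) / (1 + 2*r)         # critical exponent, (1 + 2r)q = p + 2r
lhs = powm(powm(B, r) @ powm(A, p) @ powm(B, r), 1/q)
rhs = powm(B, (p + 2*r) / q)
# (48): lhs - rhs should be positive semidefinite
print(np.linalg.eigvalsh(lhs - rhs).min() >= -1e-6)
```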

#### **Acknowledgement**

The author thanks the referees for their careful processing of the article.

#### **Author details**

Pattrawut Chansangiam *King Mongkut's Institute of Technology Ladkrabang, Thailand*

#### **9. References**

	- [9] Bhatia, R. (2007). *Positive Definite Matrices*, Princeton University Press, New Jersey.
	- [10] Donoghue, W. (1974). *Monotone matrix functions and analytic continuation*, Springer-Verlag New York Inc., New York.
	- [11] Fujii, J. (1978). Arithmetico-geometric mean of operators, *Mathematica Japonica*, Vol. 23, 667–669.
	- [12] Fujii, J. (1979). On geometric and harmonic means of positive operators, *Mathematica Japonica*, Vol. 24, No. 2, 203–207.
	- [13] Fujii, J. (1992). Operator means and the relative operator entropy, *Operator Theory: Advances and Applications*, Vol. 59, 161–172.
	- [14] Furuta, T. (1989). A proof via operator means of an order preserving inequality, *Linear Algebra and its Applications*, Vol. 113, 129–130.
	- [15] Hiai, F. (2010). Matrix analysis: matrix monotone functions, matrix means, and majorizations, *Interdisciplinary Information Sciences*, Vol. 16, No. 2, 139–248.
	- [16] Hiai, F. & Yanagi, K. (1995). *Hilbert spaces and linear operators*, Makino Pub. Ltd.
	- [17] Kubo, F. & Ando, T. (1980). Means of positive linear operators, *Mathematische Annalen*, Vol. 246, 205–224.
	- [18] Lawson, J. & Lim, Y. (2001). The geometric mean, matrices, metrics and more, *The American Mathematical Monthly*, Vol. 108, 797–812.
	- [19] Lim, Y. (2008). On Ando–Li–Mathias geometric mean equations, *Linear Algebra and its Applications*, Vol. 428, 1767–1777.
	- [20] Löwner, C. (1934). Über monotone matrix funktionen. *Mathematische Zeitschrift*, Vol. 38, 177–216.
	- [21] Mond, B. & Pečarić, J. & Šunde, J. & Varošanec, S. (1997). Operator versions of some classical inequalities, *Linear Algebra and its Applications*, Vol. 264, 117–126.
	- [22] Nishio, K. & Ando, T. (1976). Characterizations of operations derived from network connections. *Journal of Mathematical Analysis and Applications*, Vol. 53, 539–549.
	- [23] Pusz, W. & Woronowicz, S. (1975). Functional calculus for sesquilinear forms and the purification map, *Reports on Mathematical Physics*, Vol. 8, 159–170.
	- [24] Toader, G. & Toader, S. (2005). Greek means and the arithmetic-geometric mean, RGMIA Monographs, Victoria University, (ONLINE: http://rgmia.vu.edu.au/monographs).

## **Recent Research on Jensen's Inequality for Operators**

Jadranka Mićić and Josip Pečarić

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48468


## **1. Introduction**

The self-adjoint operators on Hilbert spaces, with their numerous applications, play an important part in operator theory. The study of bounds for self-adjoint operators is a very useful area of this theory, and no inequality is more useful in such examinations than Jensen's inequality, which is extensively used in various fields of mathematics.

Let *I* be a real interval of any type. A continuous function *f* : *I* → **R** is said to be operator convex if

$$f\left(\lambda\mathbf{x} + (1-\lambda)y\right) \le \lambda f(\mathbf{x}) + (1-\lambda)f(y) \tag{1}$$

holds for each *λ* ∈ [0, 1] and every pair of self-adjoint operators *x* and *y* acting on an infinite dimensional Hilbert space *H* with spectra in *I* (the ordering is defined by setting *x* ≤ *y* if *y* − *x* is positive semi-definite).
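For instance, *f*(*t*) = *t*<sup>2</sup> is operator convex, since *λx*<sup>2</sup> + (1 − *λ*)*y*<sup>2</sup> − (*λx* + (1 − *λ*)*y*)<sup>2</sup> = *λ*(1 − *λ*)(*x* − *y*)<sup>2</sup> ≥ 0. A small numerical illustration of (1) for this *f* (a sketch with made-up data, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((4, 4)); x = (X + X.T) / 2   # self-adjoint
Y = rng.standard_normal((4, 4)); y = (Y + Y.T) / 2
lam = 0.3
z = lam * x + (1 - lam) * y
# operator convexity (1) for f(t) = t^2: the gap is lam(1-lam)(x-y)^2 >= 0
gap = lam * (x @ x) + (1 - lam) * (y @ y) - z @ z
print(np.linalg.eigvalsh(gap).min() >= -1e-10)
```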

Let *f* be an operator convex function defined on an interval *I*. Ch. Davis [1] proved<sup>1</sup> a Schwarz inequality

$$f\left(\phi(\mathbf{x})\right) \le \phi\left(f(\mathbf{x})\right) \tag{2}$$

where *φ*: A → *B*(*K*) is a unital completely positive linear mapping from a *C*∗-algebra A to linear operators on a Hilbert space *K*, and *x* is a self-adjoint element in A with spectrum in *I*. Subsequently M. D. Choi [2] noted that it is enough to assume that *φ* is unital and positive. In fact, the restriction of *φ* to the commutative *C*∗-algebra generated by *x* is automatically completely positive by a theorem of Stinespring.

F. Hansen and G. K. Pedersen [3] proved a Jensen type inequality

$$f\left(\sum\_{i=1}^{n}a\_i^\*\mathbf{x}\_ia\_i\right) \le \sum\_{i=1}^{n}a\_i^\*f(\mathbf{x}\_i)a\_i \tag{3}$$

<sup>1</sup> There is a small typo in the proof. Davis states that *φ*, by Stinespring's theorem, can be written in the form *φ*(*x*) = *Pρ*(*x*)*P*, where *ρ* is a ∗-homomorphism to *B*(*H*) and *P* is a projection on *H*. In fact, *H* may be embedded in a Hilbert space *K* on which *ρ* and *P* act. The theorem then follows from the calculation *f*(*φ*(*x*)) = *f*(*Pρ*(*x*)*P*) ≤ *P f*(*ρ*(*x*))*P* = *Pρ*(*f*(*x*))*P* = *φ*(*f*(*x*)), where the pinching inequality, proved by Davis in the same paper, is applied.

©2012 Mićić and Pečarić, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

for operator convex functions *f* defined on an interval *I* = [0, *α*) (with *α* ≤ ∞ and *f*(0) ≤ 0) and self-adjoint operators *x*1,..., *xn* with spectra in *I*, assuming that ∑<sup>*n*</sup><sub>*i*=1</sub> *a*<sup>∗</sup><sub>*i*</sub> *a<sub>i</sub>* = **1**. The restriction on the interval and the requirement *f*(0) ≤ 0 were subsequently removed by B. Mond and J. Pečarić in [4], cf. also [5].

The inequality (3) is in fact just a reformulation of (2) although this was not noticed at the time. It is nevertheless important to note that the proof given in [3] and thus the statement of the theorem, when restricted to *n* × *n* matrices, holds for the much richer class of 2*n* × 2*n* matrix convex functions. Hansen and Pedersen used (3) to obtain elementary operations on functions, which leave invariant the class of operator monotone functions. These results then served as the basis for a new proof of Löwner's theorem applying convexity theory and Krein-Milman's theorem.

B. Mond and J. Pečarić [6] proved the inequality

$$f\left(\sum\_{i=1}^{n} w\_i \phi\_i(\mathbf{x}\_i)\right) \le \sum\_{i=1}^{n} w\_i \phi\_i(f(\mathbf{x}\_i)) \tag{4}$$

for operator convex functions *f* defined on an interval *I*, where *φ<sub>i</sub>* : *B*(*H*) → *B*(*K*) are unital positive linear mappings, *x*1,..., *xn* are self-adjoint operators with spectra in *I* and *w*1,..., *wn* are non-negative real numbers with sum one.
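A finite-dimensional sanity check of (4) (an illustrative sketch, not from the chapter; the maps *φ<sub>i</sub>*(*x*) = *U*<sup>∗</sup><sub>*i*</sub>*xU<sub>i</sub>* with *U<sub>i</sub>* unitary are one convenient choice of unital positive linear mappings):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 4, 3

def rand_sym(rng, n):
    X = rng.standard_normal((n, n))
    return (X + X.T) / 2

# unital positive maps phi_i(x) = U_i^* x U_i with U_i (real) orthogonal
Us = [np.linalg.qr(rng.standard_normal((n, n)))[0] for _ in range(k)]
xs = [rand_sym(rng, n) for _ in range(k)]
w = rng.random(k); w /= w.sum()                # weights summing to one

phi = [lambda x, U=U: U.T @ x @ U for U in Us]
s = sum(wi * p(x) for wi, p, x in zip(w, phi, xs))
lhs = s @ s                                    # f(sum) with f(t) = t^2, operator convex
rhs = sum(wi * p(x @ x) for wi, p, x in zip(w, phi, xs))
# (4): rhs - lhs should be positive semidefinite
print(np.linalg.eigvalsh(rhs - lhs).min() >= -1e-8)
```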

Also, B. Mond, J. Pečarić, T. Furuta et al. [6–11] obtained converses of some special cases of Jensen's inequality. In particular, [10] presents the following generalized converse of the Schwarz inequality (2):

$$F\left[\phi(f(A)), g(\phi(A))\right] \le \max\_{m \le t \le M} F\left[f(m) + \frac{f(M) - f(m)}{M - m}(t - m), g(t)\right] \mathbf{1}\_{\tilde{n}} \tag{5}$$

for convex functions *f* defined on an interval [*m*, *M*], *m* < *M*, where *g* is a real valued continuous function on [*m*, *M*], *F*(*u*, *v*) is a real valued function defined on *U* × *V*, matrix non-decreasing in *u*, *U* ⊃ *f*[*m*, *M*], *V* ⊃ *g*[*m*, *M*], *φ* : *H<sub>n</sub>* → *H*<sub>*ñ*</sub> is a unital positive linear mapping and *A* is a Hermitian matrix with spectrum contained in [*m*, *M*].

There is a lot of new research on the classical Jensen inequality (4) and its reverse inequalities. For example, J. I. Fujii et al. in [12, 13] expressed these inequalities by externally dividing points.

#### **2. Classic results**

In this section we present a form of Jensen's inequality which contains (2), (3) and (4) as special cases. Since the inequality in (4) was the motivating step for obtaining converses of Jensen's inequality using the so-called Mond-Pečarić method, we also give some results pertaining to converse inequalities in the new formulation.

We recall some definitions. Let *T* be a locally compact Hausdorff space and let A be a *C*∗-algebra of operators on some Hilbert space *H*. We say that a field (*xt*)*t*∈*T* of operators in A is continuous if the function *t* ↦ *xt* is norm continuous on *T*. If in addition *μ* is a Radon measure on *T* and the function *t* ↦ ‖*xt*‖ is integrable, then we can form *the Bochner integral* ∫*<sub>T</sub> xt dμ*(*t*), which is the unique element in A such that

$$\varphi\left(\int\_T x\_t \, d\mu(t)\right) = \int\_T \varphi(x\_t) \, d\mu(t)$$

for every linear functional *ϕ* in the norm dual A∗.


Assume furthermore that there is a field (*φt*)*t*∈*T* of positive linear mappings *φt* : A→B from A to another *C*∗-algebra B of operators on a Hilbert space *K*. We recall that a linear mapping *φ* : A→B is said to be positive if *φ*(*x*) ≥ 0 for all *x* ≥ 0. We say that such a field is continuous if the function *t* ↦ *φt*(*x*) is continuous for every *x* ∈ A. Let the *C*∗-algebras include the identity operators and let the function *t* ↦ *φt*(1*H*) be integrable with ∫*<sub>T</sub> φt*(1*H*) *dμ*(*t*) = *k*1*K* for some positive scalar *k*. In particular, if ∫*<sub>T</sub> φt*(1*H*) *dμ*(*t*) = 1*K*, we say that the *field* (*φt*)*t*∈*T* is *unital*.

Let *B*(*H*) be the *C*∗-algebra of all bounded linear operators on a Hilbert space *H*. We define bounds of an operator *x* ∈ *B*(*H*) by

$$m\_{\mathbf{x}} = \inf\_{\|\boldsymbol{\xi}\|=1} \langle \mathbf{x} \boldsymbol{\xi}, \boldsymbol{\xi} \rangle \quad \text{and} \quad M\_{\mathbf{x}} = \sup\_{\|\boldsymbol{\xi}\|=1} \langle \mathbf{x} \boldsymbol{\xi}, \boldsymbol{\xi} \rangle \tag{6}$$

for *ξ* ∈ *H*. If Sp(*x*) denotes the spectrum of *x*, then Sp(*x*) ⊆ [*mx*, *Mx*].

For an operator *<sup>x</sup>* <sup>∈</sup> *<sup>B</sup>*(*H*) we define operators <sup>|</sup>*x*|, *<sup>x</sup>*+, *<sup>x</sup>*<sup>−</sup> by

$$|\mathbf{x}| = (\mathbf{x}^\* \mathbf{x})^{1/2}, \qquad \mathbf{x}^+ = (|\mathbf{x}| + \mathbf{x})/2, \qquad \mathbf{x}^- = (|\mathbf{x}| - \mathbf{x})/2$$

Obviously, if *x* is self-adjoint, then |*x*| = (*x*<sup>2</sup>)<sup>1/2</sup> and *x*<sup>+</sup>, *x*<sup>−</sup> ≥ 0 (called the positive and negative parts of *x*, since *x* = *x*<sup>+</sup> − *x*<sup>−</sup>).
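In finite dimensions these objects are directly computable from an eigendecomposition: *mx* and *Mx* in (6) are the extreme eigenvalues, and |*x*| is obtained by taking absolute values of the eigenvalues. A small sketch (illustrative data, not from the chapter):

```python
import numpy as np

x = np.array([[1.0, 2.0], [2.0, -3.0]])       # self-adjoint
w, V = np.linalg.eigh(x)
m_x, M_x = w.min(), w.max()                   # bounds (6): extreme eigenvalues
absx = (V * np.abs(w)) @ V.T                  # |x| = (x*x)^{1/2}
x_plus, x_minus = (absx + x) / 2, (absx - x) / 2
print(np.allclose(x, x_plus - x_minus))       # x = x+ − x−
```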

#### **2.1. Jensen's inequality with operator convexity**

Firstly, we give a general formulation of Jensen's operator inequality for a unital field of positive linear mappings (see [14]).

**Theorem 1.** *Let f* : *I* → **R** *be an operator convex function defined on an interval I and let* A *and* B *be unital C*∗*-algebras acting on Hilbert spaces H and K, respectively. If* (*φt*)*t*∈*T is a unital field of positive linear mappings φ<sup>t</sup>* : A→B *defined on a locally compact Hausdorff space T with a bounded Radon measure μ*, *then the inequality*

$$f\left(\int\_{T} \phi\_{t}(\mathbf{x}\_{t}) \, d\mu(t)\right) \le \int\_{T} \phi\_{t}(f(\mathbf{x}\_{t})) \, d\mu(t) \tag{7}$$

*holds for every bounded continuous field* (*xt*)*t*∈*<sup>T</sup> of self-adjoint elements in* A *with spectra contained in I*.

*Proof.* We first note that the function *t* ↦ *φt*(*xt*) ∈ B is continuous and bounded, hence integrable with respect to the bounded Radon measure *μ*. Furthermore, the integral is an element in the multiplier algebra *M*(B) acting on *K*. We may organize the set *CB*(*T*, A) of bounded continuous functions on *T* with values in A as a normed involutive algebra by applying the point-wise operations and setting

$$\left\|(y\_t)\_{t\in T}\right\| = \sup\_{t\in T} \left\|y\_t\right\| \qquad (y\_t)\_{t\in T} \in \mathcal{CB}(T, \mathcal{A}),$$

and it is not difficult to verify that the norm is already complete and satisfies the *C*∗-identity. In fact, this is a standard construction in *C*∗-algebra theory. It follows that *f*((*xt*)*t*∈*T*) = (*f*(*xt*))*t*∈*T*. We then consider the mapping

$$
\pi \colon \mathsf{CB}(T, \mathcal{A}) \to M(\mathcal{B}) \subseteq B(K).
$$

defined by setting

$$\pi\left((x\_{t})\_{t\in T}\right) = \int\_{T} \phi\_{t}(x\_{t})\, d\mu(t)$$

and note that it is a unital positive linear map. Setting *x* = (*xt*)*t*∈*<sup>T</sup>* ∈ *CB*(*T*, A), we use inequality (2) to obtain

$$f\left(\pi\left((x\_{t})\_{t\in T}\right)\right) = f\left(\pi(x)\right) \le \pi\left(f(x)\right) = \pi\left(f\left((x\_{t})\_{t\in T}\right)\right) = \pi\left(\left(f(x\_{t})\right)\_{t\in T}\right)$$

but this is just the statement of the theorem.

#### **2.2. Converses of Jensen's inequality**

In the present context we may obtain results of the Li-Mathias type cf. [15, Chapter 3] and [16, 17].

**Theorem 2.** *Let T be a locally compact Hausdorff space equipped with a bounded Radon measure μ. Let* (*xt*)*t*∈*T be a bounded continuous field of self-adjoint elements in a unital C*∗*-algebra* A *with spectra in* [*m*, *M*]*, m* < *M. Furthermore, let* (*φt*)*t*∈*T be a field of positive linear mappings φ<sup>t</sup>* : A → B *from* A *to another unital C*∗*-algebra* B*, such that the function t* ↦ *φt*(1*H*) *is integrable with* ∫*<sub>T</sub> φt*(1*H*) *dμ*(*t*) = *k*1*<sub>K</sub> for some positive scalar k. Let mx and Mx, mx* ≤ *Mx, be the bounds of the self-adjoint operator x* = ∫*<sub>T</sub> φt*(*xt*) *dμ*(*t*) *and let f* : [*m*, *M*] → **R***, g* : [*mx*, *Mx*] → **R***, F* : *U* × *V* → **R** *be functions such that* (*k f*)([*m*, *M*]) ⊂ *U*, *g*([*mx*, *Mx*]) ⊂ *V and F is bounded. If F is operator monotone in the first variable, then*

$$\begin{aligned} \inf\_{m\_x \le z \le M\_x} F\left[k \cdot h\_{1}\left(\frac{1}{k} z\right), g(z)\right] \mathbf{1}\_{K} &\le F\left[\int\_{T} \phi\_{t}\left(f(x\_{t})\right) d\mu(t), g\left(\int\_{T} \phi\_{t}(x\_{t})\, d\mu(t)\right)\right] \\ &\le \sup\_{m\_x \le z \le M\_x} F\left[k \cdot h\_{2}\left(\frac{1}{k} z\right), g(z)\right] \mathbf{1}\_{K} \end{aligned} \tag{8}$$

*holds for every operator convex function h*<sup>1</sup> *on* [*m*, *M*] *such that h*<sup>1</sup> ≤ *f and for every operator concave function h*<sup>2</sup> *on* [*m*, *M*] *such that h*<sup>2</sup> ≥ *f .*

*Proof.* We prove only the RHS of (8). Let *h*2 be an operator concave function on [*m*, *M*] such that *f*(*z*) ≤ *h*2(*z*) for every *z* ∈ [*m*, *M*]. By using the functional calculus, it follows that *f*(*xt*) ≤ *h*2(*xt*) for every *t* ∈ *T*. Applying the positive linear mappings *φt* and integrating, we obtain

$$\int\_{T} \phi\_{t}\left(f(x\_{t})\right) d\mu(t) \le \int\_{T} \phi\_{t}\left(h\_{2}(x\_{t})\right) d\mu(t).$$

Furthermore, replacing *φt* by (1/*k*)*φt* in Theorem 1, we obtain

$$\frac{1}{k}\int\_T \phi\_t\left(h\_2(x\_t)\right) d\mu(t) \le h\_2\left(\frac{1}{k}\int\_T \phi\_t(x\_t)\, d\mu(t)\right),$$

which gives

$$\int\_T \phi\_t\left(f(x\_t)\right) d\mu(t) \le k \cdot h\_2\left(\frac{1}{k}\int\_T \phi\_t(x\_t)\, d\mu(t)\right).$$

Since *mx*1*<sub>K</sub>* ≤ ∫*<sub>T</sub> φt*(*xt*) *dμ*(*t*) ≤ *Mx*1*<sub>K</sub>*, then using the operator monotonicity of *F*(·, *v*) we obtain

$$F\left[\int\_{T} \phi\_{t}\left(f(x\_{t})\right) d\mu(t), g\left(\int\_{T} \phi\_{t}(x\_{t})\, d\mu(t)\right)\right] \tag{9}$$

$$\le F\left[k \cdot h\_2\left(\frac{1}{k} \int\_T \phi\_t(x\_t)\, d\mu(t)\right), g\left(\int\_T \phi\_t(x\_t)\, d\mu(t)\right)\right] \le \sup\_{m\_x \le z \le M\_x} F\left[k \cdot h\_2\left(\frac{1}{k} z\right), g(z)\right] \mathbf{1}\_K.$$

Applying the RHS of (8) to a convex function *f* (or the LHS of (8) to a concave function *f*), we obtain the following generalization of (5).

**Theorem 3.** *Let* (*xt*)*t*∈*T, mx, Mx and* (*φt*)*t*∈*T be as in Theorem 2. Let f* : [*m*, *M*] → **R***, g* : [*mx*, *Mx*] → **R***, F* : *U* × *V* → **R** *be functions such that* (*k f*)([*m*, *M*]) ⊂ *U*, *g*([*mx*, *Mx*]) ⊂ *V and F is bounded. If F is operator monotone in the first variable and f is convex on the interval* [*m*, *M*]*, then*

$$\begin{aligned} &F\left[\int\_{T} \phi\_{t}\left(f(x\_{t})\right) d\mu(t), g\left(\int\_{T} \phi\_{t}(x\_{t})\, d\mu(t)\right)\right] \\ &\le \sup\_{m\_{x} \le z \le M\_{x}} F\left[\frac{Mk-z}{M-m}f(m) + \frac{z-km}{M-m}f(M), g(z)\right] \mathbf{1}\_{K}. \end{aligned} \tag{10}$$

*In the dual case (when f is concave) the opposite inequalities hold in* (10) *with* inf *instead of* sup*.*

*Proof.* We prove only the convex case. For convex *f* the inequality

$$f(z) \le \frac{M-z}{M-m}\, f(m) + \frac{z-m}{M-m}\, f(M)$$

holds for every *z* ∈ [*m*, *M*]. Thus, by putting *h*2(*z*) = (*M* − *z*)/(*M* − *m*) *f*(*m*) + (*z* − *m*)/(*M* − *m*) *f*(*M*) in (9) we obtain (10).
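The bound (10) can be checked numerically for a discrete measure. In the sketch below (illustrative, not from the chapter), *φt* is taken as a scalar weight times the identity map, so *k* = 1, with *F*(*u*, *v*) = *u* − *v* and *g* = *f*; the supremum over [*mx*, *Mx*] is approximated on a grid:

```python
import numpy as np

rng = np.random.default_rng(3)
n, T = 4, 5
m, M = -1.0, 2.0

def rand_sym_in(rng, n, m, M):
    # random self-adjoint matrix with spectrum inside [m, M]
    Q = np.linalg.qr(rng.standard_normal((n, n)))[0]
    w = rng.uniform(m, M, n)
    return (Q * w) @ Q.T

xs = [rand_sym_in(rng, n, m, M) for _ in range(T)]
wts = rng.random(T); wts /= wts.sum()           # discrete measure, phi_t = wts[t]*id, k = 1

f = lambda t: t**2                              # convex on [m, M]
mean = sum(w * x for w, x in zip(wts, xs))      # ∫ phi_t(x_t) dμ
lhs = sum(w * x @ x for w, x in zip(wts, xs))   # ∫ phi_t(f(x_t)) dμ
mw, Mw = np.linalg.eigvalsh(mean)[[0, -1]]      # m_x, M_x

# grid approximation of the sup in (10) with F(u, v) = u - v and g = f
zs = np.linspace(mw, Mw, 1001)
C = max((M - z)/(M - m)*f(m) + (z - m)/(M - m)*f(M) - f(z) for z in zs)
diff = lhs - mean @ mean
# (10): C*I - diff should be (numerically) positive semidefinite
print(np.linalg.eigvalsh(C*np.eye(n) - diff).min() >= -1e-5)
```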

Numerous applications of the previous theorem can be given (see [15]). Applying Theorem 3 for the function *F*(*u*, *v*) = *u* − *αv* and *k* = 1, we obtain the following generalization of [15, Theorem 2.4].

**Corollary 4.** *Let* (*xt*)*t*∈*T, mx, Mx be as in Theorem 2 and* (*φt*)*t*∈*<sup>T</sup> be a unital field of positive linear mappings φ<sup>t</sup>* : A→B*. If f* : [*m*, *M*] → **R** *is convex on the interval* [*m*, *M*]*, m* < *M, and g* : [*m*, *M*] → **R***, then for any α* ∈ **R**

$$\int\_{T} \phi\_{l} \left( f(\mathbf{x}\_{l}) \right) d\mu(t) \le \mathfrak{a} \left( \int\_{T} \phi\_{l}(\mathbf{x}\_{l}) d\mu(t) \right) + \mathfrak{C} \mathbf{1}\_{K} \tag{11}$$

*where*

4 Will-be-set-by-IN-TECH

bounded continuous functions on *T* with values in A as a normed involutive algebra by

and it is not difficult to verify that the norm is already complete and satisfy the *C*∗-identity. In fact, this is a standard construction in *C*∗-algebra theory. It follows that *f*((*xt*)*t*∈*T*) =

*π*: *CB*(*T*, A) → *M*(B) ⊆ *B*(*K*)

 *T*

and note that it is a unital positive linear map. Setting *x* = (*xt*)*t*∈*<sup>T</sup>* ∈ *CB*(*T*, A), we use

In the present context we may obtain results of the Li-Mathias type cf. [15, Chapter 3] and

**Theorem 2.** *Let T be a locally compact Hausdorff space equipped with a bounded Radon measure μ. Let* (*xt*)*t*∈*<sup>T</sup> be a bounded continuous field of self-adjoint elements in a unital C*∗*-algebra* A *with spectra in* [*m*, *M*]*, m* < *M. Furthermore, let* (*φt*)*t*∈*<sup>T</sup> be a field of positive linear mappings φ<sup>t</sup>* : A → B *from* A *to another unital C*∗−*algebra* B*, such that the function t* �→ *φt*(1*H*) *is integrable with*

*<sup>T</sup> φt*(1*H*) *dμ*(*t*) = *k*1*<sup>K</sup> for some positive scalar k. Let mx and Mx, mx* ≤ *Mx, be the bounds of the*

*be functions such that* (*k f*)([*m*, *M*]) ⊂ *U*, *g* ([*mx*, *Mx*]) ⊂ *V and F is bounded. If F is operator*

 *T*

*holds for every operator convex function h*<sup>1</sup> *on* [*m*, *M*] *such that h*<sup>1</sup> ≤ *f and for every operator concave*

*Proof.* We prove only RHS of (8). Let *h*<sup>2</sup> be operator concave function on [*m*, *M*] such that *f*(*z*) ≤ *h*2(*z*) for every *z* ∈ [*m*, *M*]. By using the functional calculus, it follows that *f*(*xt*) ≤ *h*2(*xt*) for every *t* ∈ *T*. Applying the positive linear mappings *φ<sup>t</sup>* and integrating, we obtain

> *T*

1 *k z* , *g*(*z*) 1*K*

1*<sup>K</sup>* ≤ *F*

*F k* · *h*<sup>2</sup>

*φ<sup>t</sup>* (*f*(*xt*)) *dμ*(*t*) ≤

<sup>≤</sup> sup *mx*≤*z*≤*Mx*

*<sup>T</sup> φt*(*xt*) *dμ*(*t*) *and f* : [*m*, *M*] → **R***, g* : [*mx*, *Mx*] → **R***, F* : *U* × *V* → **R**

*φ<sup>t</sup>* (*f*(*xt*)) *dμ*(*t*), *g*

*φ<sup>t</sup>* (*h*2(*xt*)) *dμ*(*t*)

 *T*

*φt*(*xt*)*dμ*(*t*)

(8)

�*yt*� (*yt*)*t*∈*<sup>T</sup>* ∈ *CB*(*T*, A)

*φt*(*xt*) *dμ*(*t*)

*f* (*xt*)*t*∈*<sup>T</sup>* = *π*

 *f*(*xt*) *t*∈*T* 

applying the point-wise operations and setting

(*f*(*xt*))*t*∈*T*. We then consider the mapping

but this is just the statement of the theorem.

**2.2. Converses of Jensen's inequality**

defined by setting

[16, 17].

inequality (2) to obtain

*self-adjoint operator x* =

inf *mx*≤*z*≤*Mx*

*monotone in the first variable, then*

*F k* · *h*<sup>1</sup>

*function h*<sup>2</sup> *on* [*m*, *M*] *such that h*<sup>2</sup> ≥ *f .*

1 *k z* , *g*(*z*) 

> *T*

�(*yt*)*t*∈*T*� = sup

*<sup>f</sup>* (*<sup>π</sup>* ((*xt*)*t*∈*T*)) <sup>=</sup> *<sup>f</sup>*(*π*(*x*)) <sup>≤</sup> *<sup>π</sup>*(*f*(*x*)) = *<sup>π</sup>*

*t*∈*T*

*π* ((*xt*)*t*∈*T*) =

$$\begin{aligned} C &= \max_{m_x \le z \le M_x}\left\{\frac{M - z}{M - m} f(m) + \frac{z - m}{M - m} f(M) - \alpha g(z)\right\} \\ &\le \max_{m \le z \le M}\left\{\frac{M - z}{M - m} f(m) + \frac{z - m}{M - m} f(M) - \alpha g(z)\right\} \end{aligned}$$

*If furthermore $\alpha g$ is strictly convex differentiable, then the constant $C \equiv C(m, M, f, g, \alpha)$ can be written more precisely as*

$$C = \frac{M - z_0}{M - m} f(m) + \frac{z_0 - m}{M - m} f(M) - \alpha g(z_0)$$

*where*

$$z_0 = \begin{cases} g'^{-1}\!\left(\frac{f(M) - f(m)}{\alpha(M - m)}\right) & \text{if } \alpha g'(m_x) \le \frac{f(M) - f(m)}{M - m} \le \alpha g'(M_x) \\ m_x & \text{if } \alpha g'(m_x) \ge \frac{f(M) - f(m)}{M - m} \\ M_x & \text{if } \alpha g'(M_x) \le \frac{f(M) - f(m)}{M - m} \end{cases}$$

*In the dual case (when f is concave and αg is strictly concave differentiable) the opposite inequalities hold in* (11) *with* min *instead of* max *with the opposite condition while determining z*0*.*

#### **3. Inequalities with conditions on spectra**

In this section we present Jensen's operator inequality for real-valued continuous convex functions with conditions on the spectra of the operators. A discrete version of this result is given in [18]. We also obtain generalized converses of Jensen's inequality under the same conditions.

Operator convexity plays an essential role in (2). In fact, the inequality (2) will be false if we replace an operator convex function by a general convex function. For example, M.D. Choi in [2, Remark 2.6] considered the function $f(t) = t^4$, which is convex but not operator convex. He demonstrated that it is sufficient to put $\dim H = 3$, so we have the matrix case as follows. Let $\Phi \colon M_3(\mathbf{C}) \to M_2(\mathbf{C})$ be the contraction mapping $\Phi\big((a_{ij})_{1\le i,j\le 3}\big) = (a_{ij})_{1\le i,j\le 2}$. If

$$A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}, \quad\text{then}\quad \Phi(A)^4 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \not\le \begin{pmatrix} 9 & 5 \\ 5 & 3 \end{pmatrix} = \Phi(A^4)$$

and there is no relation between $\Phi(A)^4$ and $\Phi(A^4)$ under the operator order.
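Choi's computation is easy to reproduce numerically; the following sketch (assuming NumPy) verifies that the difference $\Phi(A^4) - \Phi(A)^4$ is indefinite, so neither operator inequality can hold.

```python
import numpy as np

# Choi's example: f(t) = t^4 is convex but not operator convex.
# Phi maps a 3x3 matrix to its leading 2x2 principal submatrix.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])

phi = lambda X: X[:2, :2]

lhs = np.linalg.matrix_power(phi(A), 4)   # Phi(A)^4  = [[1, 0], [0, 0]]
rhs = phi(np.linalg.matrix_power(A, 4))   # Phi(A^4)  = [[9, 5], [5, 3]]

# Both differences have a negative eigenvalue, so Phi(A)^4 and Phi(A^4)
# are incomparable in the operator order.
print(np.linalg.eigvalsh(rhs - lhs).min() < 0)   # True
print(np.linalg.eigvalsh(lhs - rhs).min() < 0)   # True
```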

**Example 5.** *It appears that the inequality* (7) *will be false if we replace the operator convex function by a general convex function. We give a small example for the matrix case and $T = \{1, 2\}$. We define mappings $\Phi_1, \Phi_2 \colon M_3(\mathbf{C}) \to M_2(\mathbf{C})$ by $\Phi_1\big((a_{ij})_{1\le i,j\le 3}\big) = \frac{1}{2}(a_{ij})_{1\le i,j\le 2}$, $\Phi_2 = \Phi_1$. Then $\Phi_1(I_3) + \Phi_2(I_3) = I_2$.*

*I) If*

$$X_1 = 2\begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix} \quad\text{and}\quad X_2 = 2\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

*then*

$$\big(\Phi_1(X_1) + \Phi_2(X_2)\big)^4 = \begin{pmatrix} 16 & 0 \\ 0 & 0 \end{pmatrix} \not\le \begin{pmatrix} 80 & 40 \\ 40 & 24 \end{pmatrix} = \Phi_1\big(X_1^4\big) + \Phi_2\big(X_2^4\big)$$

*Given the above, there is no relation between $\big(\Phi_1(X_1) + \Phi_2(X_2)\big)^4$ and $\Phi_1\big(X_1^4\big) + \Phi_2\big(X_2^4\big)$ under the operator order. We observe that in this case $X = \Phi_1(X_1) + \Phi_2(X_2) = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$ and $[m_x, M_x] = [0, 2]$, $[m_1, M_1] \subset [-1.60388, 4.49396]$, $[m_2, M_2] = [0, 2]$, i.e.*

$$(m_x, M_x) \subset [m_1, M_1] \cup [m_2, M_2]$$

*(see Fig. 1.a).*
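Case I can be checked numerically as well; a minimal sketch assuming NumPy, using the corner mappings $\Phi_1 = \Phi_2$ from Example 5:

```python
import numpy as np

# Example 5, case I: Phi_1 and Phi_2 each halve the leading 2x2 corner.
phi = lambda X: X[:2, :2] / 2

X1 = 2 * np.array([[1.0, 0.0, 1.0],
                   [0.0, 0.0, 1.0],
                   [1.0, 1.0, 1.0]])
X2 = 2 * np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])

X = phi(X1) + phi(X2)               # = [[2, 0], [0, 0]]
lhs = np.linalg.matrix_power(X, 4)  # = [[16, 0], [0, 0]]
rhs = phi(np.linalg.matrix_power(X1, 4)) + phi(np.linalg.matrix_power(X2, 4))
                                    # = [[80, 40], [40, 24]]

# The spectral condition fails: (m_x, M_x) = (0, 2) meets [m_1, M_1] and
# [m_2, M_2], and indeed the two sides are incomparable.
print(np.linalg.eigvalsh(rhs - lhs).min() < 0)   # True
print(np.linalg.eigvalsh(lhs - rhs).min() < 0)   # True
```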


**Figure 1.** Spectral conditions for a convex function *f*

*II) If*

$$X_1 = \begin{pmatrix} -14 & 0 & 1 \\ 0 & -2 & -1 \\ 1 & -1 & -1 \end{pmatrix} \quad\text{and}\quad X_2 = \begin{pmatrix} 15 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 15 \end{pmatrix}$$

*then*

$$\big(\Phi_1(X_1) + \Phi_2(X_2)\big)^4 = \begin{pmatrix} \frac{1}{16} & 0 \\ 0 & 0 \end{pmatrix} \le \frac{1}{2}\begin{pmatrix} 89660 & -247 \\ -247 & 51 \end{pmatrix} = \Phi_1\big(X_1^4\big) + \Phi_2\big(X_2^4\big)$$

*So we have that an inequality of type* (7) *is now valid. In the above case the following holds: $X = \Phi_1(X_1) + \Phi_2(X_2) = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & 0 \end{pmatrix}$ and $[m_x, M_x] = [0, 0.5]$, $[m_1, M_1] \subset [-14.077, -0.328566]$, $[m_2, M_2] = [2, 15]$, i.e.*

$$(m_x, M_x) \cap [m_1, M_1] = \emptyset \quad\text{and}\quad (m_x, M_x) \cap [m_2, M_2] = \emptyset$$

*(see Fig. 1.b).*
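Case II is also easy to verify numerically; the sketch below (assuming NumPy) confirms that the difference is positive semidefinite, that the spectrum of $X_1$ is entirely negative, and that the evaluated right-hand side equals one half of the displayed integer matrix, in line with the definition of $\Phi_1$ and $\Phi_2$:

```python
import numpy as np

# Example 5, case II: the spectra now satisfy the disjointness condition.
phi = lambda X: X[:2, :2] / 2

X1 = np.array([[-14.0,  0.0,  1.0],
               [  0.0, -2.0, -1.0],
               [  1.0, -1.0, -1.0]])
X2 = np.diag([15.0, 2.0, 15.0])

X = phi(X1) + phi(X2)               # = [[1/2, 0], [0, 0]]
lhs = np.linalg.matrix_power(X, 4)  # = [[1/16, 0], [0, 0]]
rhs = phi(np.linalg.matrix_power(X1, 4)) + phi(np.linalg.matrix_power(X2, 4))

# (m_x, M_x) = (0, 1/2) avoids Sp(X1) (all negative) and Sp(X2) = {2, 15},
# so a Jensen-type inequality lhs <= rhs holds in the operator order.
print(np.linalg.eigvalsh(X1).max() < 0)          # True
print(np.linalg.eigvalsh(rhs - lhs).min() >= 0)  # True
```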

#### **3.1. Jensen's inequality without operator convexity**

It is no coincidence that the inequality (7) is valid in Example 5, II). In the following theorem we prove a general result in which Jensen's operator inequality (7) holds for convex functions.

**Theorem 6.** *Let $(x_t)_{t\in T}$ be a bounded continuous field of self-adjoint elements in a unital $C^*$-algebra $\mathcal{A}$ defined on a locally compact Hausdorff space $T$ equipped with a bounded Radon measure $\mu$. Let $m_t$ and $M_t$, $m_t \le M_t$, be the bounds of $x_t$, $t \in T$. Let $(\phi_t)_{t\in T}$ be a unital field of positive linear mappings $\phi_t \colon \mathcal{A} \to \mathcal{B}$ from $\mathcal{A}$ to another unital $C^*$-algebra $\mathcal{B}$. If*

$$(m_x, M_x) \cap [m_t, M_t] = \emptyset, \qquad t \in T$$

*where $m_x$ and $M_x$, $m_x \le M_x$, are the bounds of the self-adjoint operator $x = \int_T \phi_t(x_t)\, d\mu(t)$, then*

$$f\left(\int\_{T} \phi\_{l}(\mathbf{x}\_{l}) \, d\mu(t)\right) \le \int\_{T} \phi\_{l}(f(\mathbf{x}\_{l})) \, d\mu(t) \tag{12}$$

*holds for every continuous convex function f* : *I* → **R** *provided that the interval I contains all mt*, *Mt. If f* : *I* → **R** *is concave, then the reverse inequality is valid in* (12)*.*


*Proof.* We prove only the case when $f$ is a convex function. If we denote $m = \inf_{t\in T}\{m_t\}$ and $M = \sup_{t\in T}\{M_t\}$, then $[m, M] \subseteq I$ and $m1_H \le x_t \le M1_H$, $t \in T$. It follows that $m\mathbf{1}_K \le \int_T \phi_t(x_t)\, d\mu(t) \le M\mathbf{1}_K$, and therefore $[m_x, M_x] \subseteq [m, M] \subseteq I$.

**a)** Let $m_x < M_x$. Since $f$ is convex on $[m_x, M_x]$, then

$$f(z) \le \frac{M\_{\text{X}} - z}{M\_{\text{X}} - m\_{\text{X}}} f(m\_{\text{X}}) + \frac{z - m\_{\text{X}}}{M\_{\text{X}} - m\_{\text{X}}} f(M\_{\text{X}}), \quad z \in [m\_{\text{X}}, M\_{\text{X}}] \tag{13}$$

but since *f* is convex on [*mt*, *Mt*] and since (*mx*, *Mx*) ∩ [*mt*, *Mt*] = ∅, then

$$f(z) \ge \frac{M\_{\mathbf{x}} - z}{M\_{\mathbf{x}} - m\_{\mathbf{x}}} f(m\_{\mathbf{x}}) + \frac{z - m\_{\mathbf{x}}}{M\_{\mathbf{x}} - m\_{\mathbf{x}}} f(M\_{\mathbf{x}}), \quad z \in [m\_{l}, M\_{l}], \quad t \in T \tag{14}$$

Since $m_x\mathbf{1}_K \le \int_T \phi_t(x_t)\, d\mu(t) \le M_x\mathbf{1}_K$, then by using functional calculus, it follows from (13) that

$$f\left(\int\_{T} \phi\_{l}(\mathbf{x}\_{l}) \, d\mu(t)\right) \le \frac{M\_{\mathbf{x}} \mathbf{1}\_{K} - \int\_{T} \phi\_{l}(\mathbf{x}\_{l}) \, d\mu(t)}{M\_{\mathbf{x}} - m\_{\mathbf{x}}} f(m\_{\mathbf{x}}) + \frac{\int\_{T} \phi\_{l}(\mathbf{x}\_{l}) \, d\mu(t) - m\_{\mathbf{x}} \mathbf{1}\_{K}}{M\_{\mathbf{x}} - m\_{\mathbf{x}}} f(M\_{\mathbf{x}}) \tag{15}$$

On the other hand, since *mt*1*<sup>H</sup>* ≤ *xt* ≤ *Mt*1*H*, *t* ∈ *T*, then by using functional calculus, it follows from (14)

$$f\left(\mathbf{x}\_{t}\right) \ge \frac{M\_{\mathbf{x}}\mathbf{1}\_{H} - \mathbf{x}\_{t}}{M\_{\mathbf{x}} - m\_{\mathbf{x}}} f(m\_{\mathbf{x}}) + \frac{\mathbf{x}\_{t} - m\_{\mathbf{x}}\mathbf{1}\_{H}}{M\_{\mathbf{x}} - m\_{\mathbf{x}}} f(M\_{\mathbf{x}}), \qquad t \in T$$

Applying a positive linear mapping $\phi_t$ and integrating, we obtain

$$\int\_{T} \phi\_{l} \left( f(\mathbf{x}\_{l}) \right) \, d\mu(t) \ge \frac{M\_{\mathbf{x}} \mathbf{1}\_{K} - \int\_{T} \phi\_{l}(\mathbf{x}\_{l}) \, d\mu(t)}{M\_{\mathbf{x}} - m\_{\mathbf{x}}} f(m\_{\mathbf{x}}) + \frac{\int\_{T} \phi\_{l}(\mathbf{x}\_{l}) \, d\mu(t) - m\_{\mathbf{x}} \mathbf{1}\_{K}}{M\_{\mathbf{x}} - m\_{\mathbf{x}}} f(M\_{\mathbf{x}}) \tag{16}$$

since $\int_T \phi_t(1_H)\, d\mu(t) = \mathbf{1}_K$. Combining the two inequalities (15) and (16), we have the desired inequality (12).

**b)** Let *mx* = *Mx*. Since *f* is convex on [*m*, *M*], we have

$$f(z) \ge f(m\_{\ge}) + l(m\_{\ge})(z - m\_{\ge}) \quad \text{for every } z \in [m\_{\prime}M] \tag{17}$$

where $l$ is the subdifferential of $f$. Since $m1_H \le x_t \le M1_H$, $t \in T$, then by using functional calculus, applying a positive linear mapping $\phi_t$ and integrating, we obtain from (17)

$$\int\_{T} \phi\_{t} \left( f(\mathbf{x}\_{t}) \right) \, d\mu(\mathbf{t}) \ge f(m\_{\mathbf{x}}) \mathbf{1}\_{\mathcal{K}} + l(m\_{\mathbf{x}}) \left( \int\_{T} \phi\_{t}(\mathbf{x}\_{t}) \, d\mu(\mathbf{t}) - m\_{\mathbf{x}} \mathbf{1}\_{\mathcal{K}} \right).$$

Since $m_x\mathbf{1}_K = \int_T \phi_t(x_t)\, d\mu(t)$, it follows that

$$\int\_{T} \phi\_{t} \left( f(\mathbf{x}\_{t}) \right) \, d\mu(t) \ge f(m\_{\mathbf{x}}) \mathbf{1}\_{K} = f \left( \int\_{T} \phi\_{t}(\mathbf{x}\_{t}) \, d\mu(t) \right).$$

This is the desired inequality (12).

Putting *φt*(*y*) = *aty* for every *y* ∈ A, where *at* ≥ 0 is a real number, we obtain the following obvious corollary of Theorem 6.

**Corollary 7.** *Let $(x_t)_{t\in T}$ be a bounded continuous field of self-adjoint elements in a unital $C^*$-algebra $\mathcal{A}$ defined on a locally compact Hausdorff space $T$ equipped with a bounded Radon measure $\mu$. Let $m_t$ and $M_t$, $m_t \le M_t$, be the bounds of $x_t$, $t \in T$. Let $(a_t)_{t\in T}$ be a continuous field of nonnegative real numbers such that $\int_T a_t\, d\mu(t) = 1$. If*

$$(m_x, M_x) \cap [m_t, M_t] = \emptyset, \qquad t \in T$$

*where $m_x$ and $M_x$, $m_x \le M_x$, are the bounds of the self-adjoint operator $x = \int_T a_t x_t\, d\mu(t)$, then*

$$f\left(\int\_{T} a\_t \mathbf{x}\_t \, d\mu(t)\right) \le \int\_{T} a\_t f(\mathbf{x}\_t) \, d\mu(t) \tag{18}$$

*holds for every continuous convex function f* : *I* → **R** *provided that the interval I contains all mt*, *Mt.*
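A small numeric illustration of Corollary 7, with hypothetical data: $T = \{1, 2\}$, $a_1 = a_2 = \frac{1}{2}$, and two non-commuting Hermitian matrices chosen so that their spectra avoid $(m_x, M_x)$. The convex (but not operator convex) function $f(t) = t^4$ then satisfies (18):

```python
import numpy as np

# Hypothetical data for Corollary 7 with T = {1, 2}, a_1 = a_2 = 1/2.
x1 = np.array([[-2.0, 0.3],
               [ 0.3, -1.0]])      # spectrum inside [-2.1, -0.9]
x2 = np.array([[ 3.0, 0.5],
               [ 0.5,  4.0]])      # spectrum inside [ 2.7,  4.3]

x = 0.5 * x1 + 0.5 * x2
mx, Mx = np.linalg.eigvalsh(x)[[0, -1]]   # roughly 0.36 and 1.64

# Spectral condition: (mx, Mx) is disjoint from Sp(x1) and Sp(x2).
assert np.linalg.eigvalsh(x1).max() < mx
assert np.linalg.eigvalsh(x2).min() > Mx

f = lambda y: np.linalg.matrix_power(y, 4)   # convex, not operator convex

# Jensen's inequality (18): f(sum a_t x_t) <= sum a_t f(x_t).
gap = 0.5 * f(x1) + 0.5 * f(x2) - f(x)
print(np.linalg.eigvalsh(gap).min() >= 0)    # True
```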

#### **3.2. Converses of Jensen's inequality with conditions on spectra**


Using the condition on spectra we obtain the following extension of Theorem 3.

**Theorem 8.** *Let $(x_t)_{t\in T}$ be a bounded continuous field of self-adjoint elements in a unital $C^*$-algebra $\mathcal{A}$ defined on a locally compact Hausdorff space $T$ equipped with a bounded Radon measure $\mu$. Furthermore, let $(\phi_t)_{t\in T}$ be a field of positive linear mappings $\phi_t \colon \mathcal{A} \to \mathcal{B}$ from $\mathcal{A}$ to another unital $C^*$-algebra $\mathcal{B}$, such that the function $t \mapsto \phi_t(1_H)$ is integrable with $\int_T \phi_t(1_H)\, d\mu(t) = k\mathbf{1}_K$ for some positive scalar $k$. Let $m_t$ and $M_t$, $m_t \le M_t$, be the bounds of $x_t$, $t \in T$, $m = \inf_{t\in T}\{m_t\}$, $M = \sup_{t\in T}\{M_t\}$,*

*and $m_x$ and $M_x$, $m_x < M_x$, be the bounds of $x = \int_T \phi_t(x_t)\, d\mu(t)$. If*

$$(m_x, M_x) \cap [m_t, M_t] = \emptyset, \qquad t \in T$$

*and f* : [*m*, *M*] → **R***, g* : [*mx*, *Mx*] → **R***, F* : *U* × *V* → **R** *are functions such that* (*k f*)([*m*, *M*]) ⊂ *U*, *g* ([*mx*, *Mx*]) ⊂ *V, f is convex, F is bounded and operator monotone in the first variable, then*

$$\begin{aligned} \inf_{m_x \le z \le M_x} F\left[\frac{M_x k - z}{M_x - m_x} f(m_x) + \frac{z - k m_x}{M_x - m_x} f(M_x),\, g(z)\right] \mathbf{1}_K \\ \le F\left[\int_T \phi_t\big(f(x_t)\big)\, d\mu(t),\, g\!\left(\int_T \phi_t(x_t)\, d\mu(t)\right)\right] \\ \le \sup_{m_x \le z \le M_x} F\left[\frac{Mk - z}{M - m} f(m) + \frac{z - km}{M - m} f(M),\, g(z)\right] \mathbf{1}_K \end{aligned} \tag{19}$$

*In the dual case (when f is concave) the opposite inequalities hold in* (19) *by replacing* inf *and* sup *with* sup *and* inf*, respectively.*

*Proof.* We prove only LHS of (19). It follows from (14) (compare it to (16))

$$\int_T \phi_t\big(f(x_t)\big)\, d\mu(t) \ge \frac{M_x k\mathbf{1}_K - \int_T \phi_t(x_t)\, d\mu(t)}{M_x - m_x} f(m_x) + \frac{\int_T \phi_t(x_t)\, d\mu(t) - m_x k\mathbf{1}_K}{M_x - m_x} f(M_x)$$

since $\int_T \phi_t(1_H)\, d\mu(t) = k\mathbf{1}_K$. By using operator monotonicity of $F(\cdot, v)$ we obtain

$$\begin{aligned} &F\left[\int_T \phi_t\big(f(x_t)\big)\, d\mu(t),\, g\!\left(\int_T \phi_t(x_t)\, d\mu(t)\right)\right] \\ &\quad \ge F\left[\frac{M_x k\mathbf{1}_K - \int_T \phi_t(x_t)\, d\mu(t)}{M_x - m_x} f(m_x) + \frac{\int_T \phi_t(x_t)\, d\mu(t) - m_x k\mathbf{1}_K}{M_x - m_x} f(M_x),\, g\!\left(\int_T \phi_t(x_t)\, d\mu(t)\right)\right] \\ &\quad \ge \inf_{m_x \le z \le M_x} F\left[\frac{M_x k - z}{M_x - m_x} f(m_x) + \frac{z - k m_x}{M_x - m_x} f(M_x),\, g(z)\right] \mathbf{1}_K \end{aligned}$$

Putting $F(u, v) = u - \alpha v$ or $F(u, v) = v^{-1/2} u v^{-1/2}$ in Theorem 8, we obtain the next corollary.

**Corollary 9.** *Let $(x_t)_{t\in T}$, $m_t$, $M_t$, $m_x$, $M_x$, $m$, $M$, $(\phi_t)_{t\in T}$ be as in Theorem 8 and $f \colon [m, M] \to \mathbf{R}$, $g \colon [m_x, M_x] \to \mathbf{R}$ be continuous functions. If*

$$(m_x, M_x) \cap [m_t, M_t] = \emptyset, \qquad t \in T$$

*and f is convex, then for any α* ∈ **R**

$$\begin{aligned} \min_{m_x \le z \le M_x}\left\{\frac{M_x k - z}{M_x - m_x} f(m_x) + \frac{z - k m_x}{M_x - m_x} f(M_x) - \alpha g(z)\right\} \mathbf{1}_K + \alpha g\!\left(\int_T \phi_t(x_t)\, d\mu(t)\right) \\ \le \int_T \phi_t\big(f(x_t)\big)\, d\mu(t) \\ \le \alpha g\!\left(\int_T \phi_t(x_t)\, d\mu(t)\right) + \max_{m_x \le z \le M_x}\left\{\frac{Mk - z}{M - m} f(m) + \frac{z - km}{M - m} f(M) - \alpha g(z)\right\} \mathbf{1}_K \end{aligned} \tag{20}$$

*If additionally g* > 0 *on* [*mx*, *Mx*]*, then*

$$\begin{aligned} \min_{m_x \le z \le M_x}\left\{\frac{\frac{M_x k - z}{M_x - m_x} f(m_x) + \frac{z - k m_x}{M_x - m_x} f(M_x)}{g(z)}\right\} g\!\left(\int_T \phi_t(x_t)\, d\mu(t)\right) \\ \le \int_T \phi_t\big(f(x_t)\big)\, d\mu(t) \le \max_{m_x \le z \le M_x}\left\{\frac{\frac{Mk - z}{M - m} f(m) + \frac{z - km}{M - m} f(M)}{g(z)}\right\} g\!\left(\int_T \phi_t(x_t)\, d\mu(t)\right) \end{aligned} \tag{21}$$

*In the dual case (when f is concave) the opposite inequalities hold in* (20) *by replacing* min *and* max *with* max *and* min*, respectively. If additionally g* > 0 *on* [*mx*, *Mx*]*, then the opposite inequalities also hold in* (21) *by replacing* min *and* max *with* max *and* min*, respectively.*

#### **4. Refined Jensen's inequality**

In this section we present a refinement of Jensen's inequality for real valued continuous convex functions given in Theorem 6. A discrete version of this result is given in [19].

To obtain our result we need the following two lemmas.

**Lemma 10.** *Let f be a convex function on an interval I, m*, *M* ∈ *I and p*1, *p*<sup>2</sup> ∈ [0, 1] *such that p*<sup>1</sup> + *p*<sup>2</sup> = 1*. Then*

$$\min\{p\_1, p\_2\} \left[ f(m) + f(M) - 2f\left(\frac{m+M}{2}\right) \right] \le p\_1 f(m) + p\_2 f(M) - f(p\_1 m + p\_2 M) \tag{22}$$

*Proof.* This result follows from [20, Theorem 1, p. 717].
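Lemma 10 is a scalar statement and can be spot-checked directly; a minimal sketch, assuming the sample convex function $f(t) = t^4$ on $[m, M] = [0, 3]$ and a grid of weights:

```python
# Spot-check of inequality (22) for the convex function f(t) = t^4.
f = lambda t: t ** 4
m, M = 0.0, 3.0

for i in range(101):
    p1 = i / 100.0
    p2 = 1.0 - p1
    lhs = min(p1, p2) * (f(m) + f(M) - 2 * f((m + M) / 2))
    rhs = p1 * f(m) + p2 * f(M) - f(p1 * m + p2 * M)
    assert lhs <= rhs + 1e-12   # inequality (22); equality at p1 = p2 = 1/2
```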

**Lemma 11.** *Let $x$ be a bounded self-adjoint element in a unital $C^*$-algebra $\mathcal{A}$ of operators on some Hilbert space $H$. If the spectrum of $x$ is in $[m, M]$, for some scalars $m < M$, then*

$$f\left(\mathbf{x}\right) \le \frac{M\mathbf{1}\_H - \mathbf{x}}{M - m} f(m) + \frac{\mathbf{x} - m\mathbf{1}\_H}{M - m} f(M) - \delta\_f \tilde{\mathbf{x}} \tag{23}$$

$$\left(\text{resp.}\quad f(x) \ge \frac{M\mathbf{1}_H - x}{M - m} f(m) + \frac{x - m\mathbf{1}_H}{M - m} f(M) + \delta_f \tilde{x}\right)$$

*holds for every continuous convex* (*resp. concave*) *function f* : [*m*, *M*] → **R***, where*

$$\delta\_f = f(m) + f(M) - 2f\left(\frac{m+M}{2}\right) \quad \text{(resp. } \delta\_f = 2f\left(\frac{m+M}{2}\right) - f(m) - f(M)\text{)}$$

*and*

$$\tilde{x} = \frac{1}{2}\mathbf{1}_H - \frac{1}{M - m}\left|x - \frac{m + M}{2}\mathbf{1}_H\right|$$

*Proof.* We prove only the convex case. It follows from (22) that

$$\begin{split} f\left(p\_1 m + p\_2 M\right) &\leq p\_1 f(m) + p\_2 f(M) \\ &- \min\{p\_1, p\_2\} \left(f(m) + f(M) - 2f\left(\frac{m+M}{2}\right)\right) \end{split} \tag{24}$$

for every $p_1, p_2 \in [0, 1]$ such that $p_1 + p_2 = 1$. For any $z \in [m, M]$ we can write

$$f\left(z\right) = f\left(\frac{M-z}{M-m}m + \frac{z-m}{M-m}M\right).$$

Then by using (24) for $p_1 = \frac{M - z}{M - m}$ and $p_2 = \frac{z - m}{M - m}$ we obtain

$$\begin{split} f(z) &\leq \frac{M-z}{M-m} f(m) + \frac{z-m}{M-m} f(M) \\ &\quad - \left(\frac{1}{2} - \frac{1}{M-m} \left| z - \frac{m+M}{2} \right| \right) \left( f(m) + f(M) - 2f\left(\frac{m+M}{2}\right) \right) \end{split} \tag{25}$$

since


$$\min\left\{\frac{M-z}{M-m}, \frac{z-m}{M-m}\right\} = \frac{1}{2} - \frac{1}{M-m} \left| z - \frac{m+M}{2} \right|^2$$

Finally we use the continuous functional calculus for a self-adjoint operator *x*: *f* , *g* ∈ C(*I*), *Sp*(*x*) ⊆ *I* and *f* ≤ *g* on *I* implies *f*(*x*) ≤ *g*(*x*); and *h*(*z*) = |*z*| implies *h*(*x*) = |*x*|. Then by using (25) we obtain the desired inequality (23).
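Both lemmas are easy to exercise numerically. The following short Python script (ours, purely illustrative, with the sample convex function f(t) = t⁴ on [m, M] = [−1, 2]) checks Lemma 10 and the min-identity used in the proof of Lemma 11 on a grid:

```python
# Numerical sanity check of Lemma 10 and the min-identity from the proof of
# Lemma 11, for the sample convex function f(t) = t^4 on [m, M] = [-1, 2].
# Illustrative only; not part of the original chapter.

def f(t):
    return t ** 4

m, M = -1.0, 2.0
delta_f = f(m) + f(M) - 2 * f((m + M) / 2)  # the refinement term

for k in range(101):
    p1 = k / 100.0
    p2 = 1.0 - p1

    # Lemma 10: min{p1, p2} * delta_f <= p1 f(m) + p2 f(M) - f(p1 m + p2 M)
    lhs = min(p1, p2) * delta_f
    rhs = p1 * f(m) + p2 * f(M) - f(p1 * m + p2 * M)
    assert lhs <= rhs + 1e-12

    # Identity used to pass from Lemma 10 to (25), with z = p1*m + p2*M:
    # min{(M-z)/(M-m), (z-m)/(M-m)} = 1/2 - |z - (m+M)/2| / (M-m)
    z = p1 * m + p2 * M
    left = min((M - z) / (M - m), (z - m) / (M - m))
    right = 0.5 - abs(z - (m + M) / 2) / (M - m)
    assert abs(left - right) < 1e-12

print("Lemma 10 and the min-identity hold on the sample grid")
```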

**Theorem 12.** *Let* (*xt*)*t*∈*<sup>T</sup> be a bounded continuous field of self-adjoint elements in a unital C*∗*-algebra* A *defined on a locally compact Hausdorff space T equipped with a bounded Radon measure μ. Let mt and Mt, mt* ≤ *Mt, be the bounds of xt, t* ∈ *T. Let* (*φt*)*t*∈*<sup>T</sup> be a unital field of positive linear mappings φ<sup>t</sup>* : A→B *from* A *to another unital C*∗−*algebra* B*. Let*

$$(m_x, M_x) \cap [m_t, M_t] = \emptyset, \quad t \in T, \qquad \text{and} \qquad m < M$$

*where mx and Mx, mx* ≤ *Mx, are the bounds of the operator x* = ∫*T φt*(*xt*) *dμ*(*t*)*, and*

$$m = \sup\left\{M_t : M_t \le m_x,\ t \in T\right\}, \qquad M = \inf\left\{m_t : m_t \ge M_x,\ t \in T\right\}$$

*If f* : *I* → **R** *is a continuous convex* (*resp. concave*) *function provided that the interval I contains all mt*, *Mt, then*

$$f\left(\int\_{T} \phi\_{t}(\mathbf{x}\_{t}) \, d\mu(t)\right) \le \int\_{T} \phi\_{t}(f(\mathbf{x}\_{t})) \, d\mu(t) - \delta\_{f} \tilde{\mathbf{x}} \le \int\_{T} \phi\_{t}(f(\mathbf{x}\_{t})) \, d\mu(t) \tag{26}$$

 *resp.*

$$f\left(\int_{T} \phi_t(x_t)\, d\mu(t)\right) \ge \int_{T} \phi_t(f(x_t))\, d\mu(t) + \delta_f \tilde{x} \ge \int_{T} \phi_t(f(x_t))\, d\mu(t) \tag{27}$$

*holds, where*

$$\delta_f \equiv \delta_f(\bar{m}, \bar{M}) = f(\bar{m}) + f(\bar{M}) - 2 f\!\left(\frac{\bar{m} + \bar{M}}{2}\right)$$

$$\left(\text{resp. } \delta_f \equiv \delta_f(\bar{m}, \bar{M}) = 2 f\!\left(\frac{\bar{m} + \bar{M}}{2}\right) - f(\bar{m}) - f(\bar{M})\right) \tag{28}$$

$$\tilde{x} \equiv \tilde{x}_x(\bar{m}, \bar{M}) = \frac{1}{2}\, 1_K - \frac{1}{\bar{M} - \bar{m}}\left| x - \frac{\bar{m} + \bar{M}}{2}\, 1_K \right|$$

*and m̄* ∈ [*m*, *mx*]*, M̄* ∈ [*Mx*, *M*]*, m̄* < *M̄, are arbitrary numbers.*

*Proof.* We prove only the convex case. Since *x* = ∫*T φt*(*xt*) *dμ*(*t*) ∈ B is a self-adjoint element such that *m̄*1*K* ≤ *mx*1*K* ≤ *x* ≤ *Mx*1*K* ≤ *M̄*1*K* and *f* is convex on [*m̄*, *M̄*] ⊆ *I*, then by Lemma 11 we obtain

$$f\left(\int_{T} \phi_t(x_t)\, d\mu(t)\right) \leq \frac{\bar{M}1_K - \int_{T} \phi_t(x_t)\, d\mu(t)}{\bar{M} - \bar{m}} f(\bar{m}) + \frac{\int_{T} \phi_t(x_t)\, d\mu(t) - \bar{m}1_K}{\bar{M} - \bar{m}} f(\bar{M}) - \delta_f \tilde{x} \tag{29}$$

where *δf* and *x̃* are defined by (28).

But since *<sup>f</sup>* is convex on [*mt*, *Mt*] and (*mx*, *Mx*) <sup>∩</sup> [*mt*, *Mt*] = <sup>∅</sup> implies (*m*¯ , *<sup>M</sup>*¯ ) <sup>∩</sup> [*mt*, *Mt*] = <sup>∅</sup>, then

$$f\left(\mathbf{x}\_{t}\right) \geq \frac{\bar{M}\mathbf{1}\_{H} - \mathbf{x}\_{t}}{\bar{M} - \bar{m}} f(\bar{m}) + \frac{\mathbf{x}\_{t} - \bar{m}\mathbf{1}\_{H}}{\bar{M} - \bar{m}} f(\bar{M}), \quad t \in T$$

Applying a positive linear mapping *φt*, integrating and adding −*δf x̃*, we obtain

$$\int_{T} \phi_t\left(f(x_t)\right) d\mu(t) - \delta_f \tilde{x} \geq \frac{\bar{M}1_K - \int_{T} \phi_t(x_t)\, d\mu(t)}{\bar{M} - \bar{m}} f(\bar{m}) + \frac{\int_{T} \phi_t(x_t)\, d\mu(t) - \bar{m}1_K}{\bar{M} - \bar{m}} f(\bar{M}) - \delta_f \tilde{x} \tag{30}$$

since ∫*T φt*(1*H*) *dμ*(*t*) = 1*K*. Combining the two inequalities (29) and (30), we obtain the LHS of (26). Since *δf* ≥ 0 and *x̃* ≥ 0, we obtain the RHS of (26).
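The two estimates used in the last step can be made explicit; a one-line derivation (ours, not the chapter's) of *δf* ≥ 0 and 0 ≤ *x̃* ≤ ½1*K*:

```latex
% Convexity of f at the midpoint of [\bar m, \bar M] gives
f\Big(\frac{\bar m + \bar M}{2}\Big) \le \frac{f(\bar m) + f(\bar M)}{2}
\quad\Longrightarrow\quad \delta_f \ge 0,
% and \bar m 1_K \le x \le \bar M 1_K implies
\Big| x - \frac{\bar m + \bar M}{2}\,1_K \Big| \le \frac{\bar M - \bar m}{2}\,1_K
\quad\Longrightarrow\quad 0 \le \tilde{x} \le \tfrac12\, 1_K .
```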

If *m* < *M* and *mx* = *Mx*, then the inequality (26) holds, but *δf*(*mx*, *Mx*) *x̃*(*mx*, *Mx*) is not defined (see Example 13 I) and II)).

**Example 13.** *We give examples for the matrix cases and T* = {1, 2}*. Then we have refined inequalities given in Fig. 2. We put f*(*t*) = *t*<sup>4</sup>*, which is convex but not operator convex, in* (26)*. Also, we define mappings* Φ1, Φ2 : *M*3(**C**) → *M*2(**C**) *as follows:* Φ1((*aij*)1≤*i*,*j*≤3) = ½(*aij*)1≤*i*,*j*≤2*,* Φ2 = Φ1 *(then* Φ1(*I*3) + Φ2(*I*3) = *I*2*).*

*I) First, we observe an example when δf x̃ equals the difference between the RHS and LHS of Jensen's inequality. If X*1 = −3*I*3 *and X*2 = 2*I*3*, then X* = Φ1(*X*1) + Φ2(*X*2) = −0.5*I*2*, so m* = −3*, M* = 2*. We also put m̄* = −3 *and M̄* = 2*. We obtain*

$$\left(\Phi\_1(X\_1) + \Phi\_2(X\_2)\right)^4 = 0.0625I\_2 < 48.5I\_2 = \Phi\_1\left(X\_1^4\right) + \Phi\_2\left(X\_2^4\right)$$

$$f\left(\Phi_1(x_1) + \Phi_2(x_2)\right) \le \Phi_1(f(x_1)) + \Phi_2(f(x_2)) - \delta_f \tilde{x}, \quad \text{where}$$

$$\delta_f = f(\bar{m}) + f(\bar{M}) - 2 f\!\left(\frac{\bar{m} + \bar{M}}{2}\right), \qquad \tilde{x} = \frac{1}{2}\, 1_K - \frac{1}{\bar{M} - \bar{m}}\left|\Phi_1(x_1) + \Phi_2(x_2) - \frac{\bar{m} + \bar{M}}{2}\, 1_K\right|$$

**Figure 2.** Refinement for two operators and a convex function *f*

*and its improvement*


$$\left(\Phi\_1(X\_1) + \Phi\_2(X\_2)\right)^4 = 0.0625I\_2 = \Phi\_1\left(X\_1^4\right) + \Phi\_2\left(X\_2^4\right) - 48.4375I\_2$$

*since δf* = 96.875*, X̃* = 0.5*I*2*. We remark that in this case mx* = *Mx* = −1/2 *and X̃*(*mx*, *Mx*) *is not defined.*
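Because *X*1 and *X*2 are scalar matrices, every quantity in case I) reduces to scalar arithmetic; a quick illustrative Python check (ours, not part of the chapter):

```python
# Scalar verification of Example 13 I): X1 = -3*I3 and X2 = 2*I3 are scalar
# matrices, so each matrix quantity reduces to a number. Illustrative only.

f = lambda t: t ** 4

x = 0.5 * (-3) + 0.5 * 2          # X = Phi1(X1) + Phi2(X2) = -0.5 * I2
lhs = f(x)                        # (Phi1(X1) + Phi2(X2))^4
rhs = 0.5 * f(-3) + 0.5 * f(2)    # Phi1(X1^4) + Phi2(X2^4)

m_bar, M_bar = -3.0, 2.0
delta_f = f(m_bar) + f(M_bar) - 2 * f((m_bar + M_bar) / 2)
x_tilde = 0.5 - abs(x - (m_bar + M_bar) / 2) / (M_bar - m_bar)

assert lhs == 0.0625
assert rhs == 48.5
assert delta_f == 96.875
assert x_tilde == 0.5
assert rhs - delta_f * x_tilde == lhs   # the refinement is exact in this case
```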

*II) Next, we observe an example when δf X̃ is not equal to the difference between the RHS and LHS of Jensen's inequality and mx* = *Mx. If*

$$X_1 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \quad X_2 = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{pmatrix}, \quad \text{then } X = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \text{ and } m = -1,\ M = 2.$$

*In this case x̃*(*mx*, *Mx*) *is not defined, since mx* = *Mx* = 1/2*. We have*

$$\left(\Phi_1(X_1) + \Phi_2(X_2)\right)^4 = \frac{1}{16}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} < \begin{pmatrix} \frac{17}{2} & 0 \\ 0 & \frac{97}{2} \end{pmatrix} = \Phi_1\left(X_1^4\right) + \Phi_2\left(X_2^4\right)$$

*and putting m̄* = −1*, M̄* = 2 *we obtain δf* = 135/8*, X̃* = *I*2/2*, which gives the following improvement*

$$\left(\Phi_1(X_1) + \Phi_2(X_2)\right)^4 = \frac{1}{16}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} < \frac{1}{16}\begin{pmatrix} 1 & 0 \\ 0 & 641 \end{pmatrix} = \Phi_1\left(X_1^4\right) + \Phi_2\left(X_2^4\right) - \frac{135}{16}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
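Since *X*1 and *X*2 are diagonal and Φ1 = Φ2 halves the leading 2×2 block, case II) can be verified entrywise; an illustrative Python check (ours, not part of the chapter):

```python
# Entrywise verification of Example 13 II): X1, X2 are diagonal, so every
# matrix quantity is determined by its two leading diagonal entries.
# Illustrative only.

f = lambda t: t ** 4

d1 = [-1.0, -2.0]                 # leading 2x2 diagonal of X1
d2 = [2.0, 3.0]                   # leading 2x2 diagonal of X2

x = [0.5 * (a + b) for a, b in zip(d1, d2)]          # X = diag(1/2, 1/2)
lhs = [f(t) for t in x]                              # X^4
rhs = [0.5 * (f(a) + f(b)) for a, b in zip(d1, d2)]  # Phi1(X1^4) + Phi2(X2^4)

m_bar, M_bar = -1.0, 2.0
delta_f = f(m_bar) + f(M_bar) - 2 * f((m_bar + M_bar) / 2)   # 135/8
x_tilde = [0.5 - abs(t - (m_bar + M_bar) / 2) / (M_bar - m_bar) for t in x]

assert lhs == [1 / 16, 1 / 16]
assert rhs == [17 / 2, 97 / 2]
assert delta_f == 135 / 8
assert x_tilde == [0.5, 0.5]

improved = [r - delta_f * t for r, t in zip(rhs, x_tilde)]
assert improved == [1 / 16, 641 / 16]    # matches the improved bound above
```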

*III) Next, we observe an example with matrices that are not special. If*

$$X\_1 = \begin{pmatrix} -4 & 1 & 1 \\ 1 & -2 & -1 \\ 1 & -1 & -1 \end{pmatrix} \quad \text{and} \quad X\_2 = \begin{pmatrix} 5 & -1 & -1 \\ -1 & 2 & 1 \\ -1 & 1 & 3 \end{pmatrix}, \quad \text{then} \quad X = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$

*so m*<sup>1</sup> = −4.8662*, M*<sup>1</sup> = −0.3446*, m*<sup>2</sup> = 1.3446*, M*<sup>2</sup> = 5.8662*, m* = −0.3446*, M* = 1.3446 *and we put m*¯ = *m, M*¯ = *M (rounded to four decimal places). We have*

$$\left(\Phi_1(X_1) + \Phi_2(X_2)\right)^4 = \frac{1}{16}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} < \begin{pmatrix} \frac{1283}{2} & -255 \\ -255 & \frac{237}{2} \end{pmatrix} = \Phi_1\left(X_1^4\right) + \Phi_2\left(X_2^4\right)$$

*and its improvement*

$$\left(\Phi_1(X_1) + \Phi_2(X_2)\right)^4 = \frac{1}{16}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} < \begin{pmatrix} 639.9213 & -255 \\ -255 & 117.8559 \end{pmatrix} = \Phi_1\left(X_1^4\right) + \Phi_2\left(X_2^4\right) - \begin{pmatrix} 1.5787 & 0 \\ 0 & 0.6441 \end{pmatrix}$$

*(rounded to four decimal places), since δf* = 3.1574 *and X̃* = $\begin{pmatrix} 0.5 & 0 \\ 0 & 0.2040 \end{pmatrix}$*. But, if we put m̄* = *mx* = 0*, M̄* = *Mx* = 0.5*, then X̃* = **0***, so we do not have an improvement of Jensen's inequality. Also, if we put m̄* = 0*, M̄* = 1*, then X̃* = 0.5 $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$*, δf* = 7/8 *and δf X̃* = 0.4375 $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$*, which is worse than the above improvement.*
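A partial numerical check of case III) with plain-list arithmetic (ours, not part of the chapter; the eigenvalue bounds *mi*, *Mi* are taken from the text as given):

```python
# Partial verification of Example 13 III): recompute X from the leading
# 2x2 blocks, and delta_f from the rounded bounds quoted in the text.
# Illustrative only.

f = lambda t: t ** 4

X1_block = [[-4.0, 1.0], [1.0, -2.0]]   # leading 2x2 block of X1
X2_block = [[5.0, -1.0], [-1.0, 2.0]]   # leading 2x2 block of X2

# X = Phi1(X1) + Phi2(X2) = (X1_block + X2_block) / 2
X = [[0.5 * (X1_block[i][j] + X2_block[i][j]) for j in range(2)]
     for i in range(2)]
assert X == [[0.5, 0.0], [0.0, 0.0]]

m_bar, M_bar = -0.3446, 1.3446          # rounded values quoted in the text
delta_f = f(m_bar) + f(M_bar) - 2 * f((m_bar + M_bar) / 2)
assert abs(delta_f - 3.1574) < 1e-3     # agrees with the quoted delta_f up to rounding
```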

Putting Φ*t*(*y*) = *aty* for every *y* ∈ A, where *at* ≥ 0 is a real number, we obtain the following obvious corollary of Theorem 12.

**Corollary 14.** *Let* (*xt*)*t*∈*T be a bounded continuous field of self-adjoint elements in a unital C*∗*-algebra* A *defined on a locally compact Hausdorff space T equipped with a bounded Radon measure μ. Let mt and Mt, mt* ≤ *Mt, be the bounds of xt, t* ∈ *T. Let* (*at*)*t*∈*T be a continuous field of nonnegative real numbers such that* ∫*T at dμ*(*t*) = 1*. Let*

$$(m_x, M_x) \cap [m_t, M_t] = \emptyset, \quad t \in T, \qquad \text{and} \qquad m < M$$

*where mx and Mx, mx* ≤ *Mx, are the bounds of the operator x* = ∫*T at xt dμ*(*t*) *and*

*m* = sup {*Mt* : *Mt* ≤ *mx*, *t* ∈ *T*} , *M* = inf {*mt* : *mt* ≥ *Mx*, *t* ∈ *T*}

*If f* : *I* → **R** *is a continuous convex* (*resp. concave*) *function provided that the interval I contains all mt*, *Mt, then*

$$f\left(\int_{T} a_t x_t\, d\mu(t)\right) \le \int_{T} a_t f(x_t)\, d\mu(t) - \delta_f \tilde{x} \le \int_{T} a_t f(x_t)\, d\mu(t)$$

$$\left(\text{resp. } f\left(\int_{T} a_t x_t\, d\mu(t)\right) \ge \int_{T} a_t f(x_t)\, d\mu(t) + \delta_f \tilde{x} \ge \int_{T} a_t f(x_t)\, d\mu(t)\right)$$

*holds, where δf is defined by* (28)*,*

$$\tilde{x} = \frac{1}{2}\, 1_H - \frac{1}{\bar{M} - \bar{m}}\left| \int_{T} a_t x_t\, d\mu(t) - \frac{\bar{m} + \bar{M}}{2}\, 1_H \right|$$

*and m̄* ∈ [*m*, *mx*]*, M̄* ∈ [*Mx*, *M*]*, m̄* < *M̄, are arbitrary numbers.*

#### **5. Extension of Jensen's inequality**

In this section we present an extension of Jensen's operator inequality for *n*−tuples of self-adjoint operators, unital *n*−tuples of positive linear mappings and real valued continuous convex functions with conditions on the spectra of the operators.

In a discrete version of Theorem 6 we prove that Jensen's operator inequality holds for every continuous convex function, for every *n*−tuple of self-adjoint operators (*A*1,..., *An*) and for every *n*−tuple of positive linear mappings (Φ1,..., Φ*n*) in the case when the interval with the bounds of the operator *A* = ∑<sup>*n*</sup><sub>*i*=1</sub> Φ*i*(*Ai*) has no intersection points with the interval with the bounds of the operator *Ai* for each *i* = 1, . . . , *n*, i.e. when (*mA*, *MA*) ∩ [*mi*, *Mi*] = ∅ for *i* = 1, . . . , *n*, where *mA* and *MA*, *mA* ≤ *MA*, are the bounds of *A*, and *mi* and *Mi*, *mi* ≤ *Mi*, are the bounds of *Ai*, *i* = 1, . . . , *n*. It is interesting to consider the case when (*mA*, *MA*) ∩ [*mi*, *Mi*] = ∅ is valid for several *i* ∈ {1, . . . , *n*}, but not for all *i*. We study this case in the following theorem (see [21]).

**Theorem 15.** *Let* (*A*1,..., *An*) *be an n*−*tuple of self-adjoint operators Ai* ∈ *B*(*H*) *with the bounds mi and Mi, mi* ≤ *Mi, i* = 1, . . . , *n. Let* (Φ1,..., Φ*n*) *be an n*−*tuple of positive linear mappings* Φ*i* : *B*(*H*) → *B*(*K*)*, such that* ∑<sup>*n*</sup><sub>*i*=1</sub> Φ*i*(1*H*) = 1*K. For* 1 ≤ *n*1 < *n, we denote m* = min{*m*1,..., *mn*1}*, M* = max{*M*1,..., *Mn*1} *and* ∑<sup>*n*1</sup><sub>*i*=1</sub> Φ*i*(1*H*) = *α*1*K,* ∑<sup>*n*</sup><sub>*i*=*n*1+1</sub> Φ*i*(1*H*) = *β*1*K, where α*, *β* > 0*, α* + *β* = 1*. If*

$$(m, M) \cap [m_i, M_i] = \emptyset, \qquad i = n_1 + 1, \ldots, n,$$

*and one of two equalities*

$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(A\_i) = \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(A\_i) = \sum\_{i=1}^n \Phi\_i(A\_i)$$

*is valid, then*


$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) \le \sum\_{i=1}^n \Phi\_i(f(A\_i)) \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) \tag{31}$$

*holds for every continuous convex function f* : *I* → **R** *provided that the interval I contains all mi*, *Mi, i* = 1, . . . , *n. If f* : *I* → **R** *is concave, then the reverse inequality is valid in* (31)*.*

*Proof.* We prove only the case when *f* is a convex function. Let us denote

$$A = \frac{1}{\alpha} \sum_{i=1}^{n_1} \Phi_i(A_i), \qquad B = \frac{1}{\beta} \sum_{i=n_1+1}^{n} \Phi_i(A_i), \qquad C = \sum_{i=1}^{n} \Phi_i(A_i).$$

It is easy to verify that *A* = *B* or *B* = *C* or *A* = *C* implies *A* = *B* = *C*.

**a)** Let *m* < *M*. Since *f* is convex on [*m*, *M*] and [*mi*, *Mi*] ⊆ [*m*, *M*] for *i* = 1, . . . , *n*1, then
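The first claim can be checked directly from the definitions; a short derivation (ours, not the chapter's):

```latex
% Since \alpha + \beta = 1 with \alpha, \beta > 0, the three operators satisfy
C = \sum_{i=1}^{n}\Phi_i(A_i)
  = \alpha\cdot\frac{1}{\alpha}\sum_{i=1}^{n_1}\Phi_i(A_i)
  + \beta\cdot\frac{1}{\beta}\sum_{i=n_1+1}^{n}\Phi_i(A_i)
  = \alpha A + \beta B .
% Hence A = B gives C = (\alpha + \beta)A = A; A = C gives
% (1-\alpha)A = \beta B, i.e. \beta A = \beta B, so A = B; similarly for B = C.
```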

$$f(z) \le \frac{M-z}{M-m} f(m) + \frac{z-m}{M-m} f(M), \quad z \in [m\_i, M\_i] \text{ for } i = 1, \dots, n\_1 \tag{32}$$

but since *f* is convex on all [*mi*, *Mi*] and (*m*, *M*) ∩ [*mi*, *Mi*] = ∅ for *i* = *n*<sup>1</sup> + 1, . . . , *n*, then

$$f(z) \ge \frac{M-z}{M-m} f(m) + \frac{z-m}{M-m} f(M), \quad z \in [m\_i, M\_i] \text{ for } i = n\_1 + 1, \dots, n \tag{33}$$

Since *mi*1*<sup>H</sup>* ≤ *Ai* ≤ *Mi*1*H*, *i* = 1, . . . , *n*1, it follows from (32)

$$f\left(A\_{i}\right) \le \frac{M\mathbf{1}\_{H} - A\_{i}}{M - m} f(m) + \frac{A\_{i} - m\mathbf{1}\_{H}}{M - m} f(M), \qquad i = 1, \dots, n\_{1}$$

Applying a positive linear mapping Φ*<sup>i</sup>* and summing, we obtain

$$\sum_{i=1}^{n_1} \Phi_i\left(f(A_i)\right) \le \frac{M\alpha 1_K - \sum_{i=1}^{n_1} \Phi_i(A_i)}{M - m} f(m) + \frac{\sum_{i=1}^{n_1} \Phi_i(A_i) - m\alpha 1_K}{M - m} f(M)$$


since ∑<sup>*n*1</sup><sub>*i*=1</sub> Φ*i*(1*H*) = *α*1*K*. It follows

$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i \left( f(A\_i) \right) \le \frac{M \mathbf{1}\_K - A}{M - m} f(m) + \frac{A - m \mathbf{1}\_K}{M - m} f(M) \tag{34}$$

Similarly to (34) in the case *mi*1*<sup>H</sup>* ≤ *Ai* ≤ *Mi*1*H*, *i* = *n*<sup>1</sup> + 1, . . . , *n*, it follows from (33)

$$\frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i \left( f(A\_i) \right) \ge \frac{M \mathbf{1}\_K - B}{M - m} f(m) + \frac{B - m \mathbf{1}\_K}{M - m} f(M) \tag{35}$$

Combining (34) and (35) and taking into account that *A* = *B*, we obtain

$$\frac{1}{\alpha} \sum_{i=1}^{n_1} \Phi_i\left(f(A_i)\right) \le \frac{1}{\beta} \sum_{i=n_1+1}^{n} \Phi_i\left(f(A_i)\right) \tag{36}$$

It follows

$$\begin{aligned}
\frac{1}{\alpha} \sum_{i=1}^{n_1} \Phi_i(f(A_i))
&= \sum_{i=1}^{n_1} \Phi_i(f(A_i)) + \frac{\beta}{\alpha} \sum_{i=1}^{n_1} \Phi_i(f(A_i)) && (\text{by } \alpha + \beta = 1) \\
&\le \sum_{i=1}^{n_1} \Phi_i(f(A_i)) + \sum_{i=n_1+1}^{n} \Phi_i(f(A_i)) && (\text{by } (36)) \\
&= \sum_{i=1}^{n} \Phi_i(f(A_i)) \\
&\le \frac{\alpha}{\beta} \sum_{i=n_1+1}^{n} \Phi_i(f(A_i)) + \sum_{i=n_1+1}^{n} \Phi_i(f(A_i)) && (\text{by } (36)) \\
&= \frac{1}{\beta} \sum_{i=n_1+1}^{n} \Phi_i(f(A_i)) && (\text{by } \alpha + \beta = 1)
\end{aligned}$$

which gives the desired double inequality (31).

**b)** Let *m* = *M*. Since [*mi*, *Mi*] ⊆ [*m*, *M*] for *i* = 1, . . . , *n*1, then *Ai* = *m*1*<sup>H</sup>* and *f*(*Ai*) = *f*(*m*)1*<sup>H</sup>* for *i* = 1, . . . , *n*1. It follows

$$\frac{1}{\alpha}\sum_{i=1}^{n_1} \Phi_i(A_i) = m 1_K \qquad \text{and} \qquad \frac{1}{\alpha}\sum_{i=1}^{n_1} \Phi_i\left(f(A_i)\right) = f(m) 1_K \tag{37}$$

On the other hand, since *f* is convex on *I*, we have

$$f(z) \ge f(m) + l(m)(z - m) \quad \text{for every } z \in I \tag{38}$$

where *l* is the subdifferential of *f* . Replacing *z* by *Ai* for *i* = *n*<sup>1</sup> + 1, . . . , *n*, applying Φ*<sup>i</sup>* and summing, we obtain from (38) and (37)

$$\begin{aligned} \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i\left(f(A\_i)\right) &\geq f(m)\mathbf{1}\_K + l(m) \left(\frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(A\_i) - m\mathbf{1}\_K\right) \\ &= f(m)\mathbf{1}\_K = \frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i\left(f(A\_i)\right) \end{aligned}$$

So (36) holds again. The remaining part of the proof is the same as in the case a).

**Remark 16.** *We obtain the equivalent inequality to the one in Theorem 15 in the case when* ∑<sup>*n*</sup><sub>*i*=1</sub> Φ*i*(1*H*) = *γ*1*K for some positive scalar γ. If α* + *β* = *γ and one of two equalities*

$$\frac{1}{\alpha} \sum_{i=1}^{n_1} \Phi_i(A_i) = \frac{1}{\beta} \sum_{i=n_1+1}^{n} \Phi_i(A_i) = \frac{1}{\gamma} \sum_{i=1}^{n} \Phi_i(A_i)$$

*is valid, then*


$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) \le \frac{1}{\gamma} \sum\_{i=1}^n \Phi\_i(f(A\_i)) \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i))$$

*holds for every continuous convex function f .*

**Remark 17.** *Let the assumptions of Theorem 15 be valid. 1. We observe that the following inequality*

$$f\left(\frac{1}{\beta}\sum\_{i=n\_1+1}^n \Phi\_i(A\_i)\right) \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i))$$

*holds for every continuous convex function f* : *I* → **R***.*

*Indeed, by the assumptions of Theorem 15 we have*

$$m\alpha 1_K \le \sum_{i=1}^{n_1} \Phi_i(A_i) \le M\alpha 1_K \quad \text{and} \quad \frac{1}{\alpha} \sum_{i=1}^{n_1} \Phi_i(A_i) = \frac{1}{\beta} \sum_{i=n_1+1}^{n} \Phi_i(A_i)$$

*which implies*

$$m 1_K \le \sum_{i=n_1+1}^{n} \frac{1}{\beta} \Phi_i(A_i) \le M 1_K.$$

*Also* (*m*, *M*) ∩ [*mi*, *Mi*] = ∅ *for i* = *n*1 + 1, . . . , *n and* ∑<sup>*n*</sup><sub>*i*=*n*1+1</sub> (1/*β*)Φ*i*(1*H*) = 1*K hold. So we can apply Theorem 6 to the operators An*1+1,..., *An and the mappings* (1/*β*)Φ*i, and obtain the desired inequality.*

*2. We denote by mC and MC the bounds of C* = ∑<sup>*n*</sup><sub>*i*=1</sub> Φ*i*(*Ai*)*. If* (*mC*, *MC*) ∩ [*mi*, *Mi*] = ∅*, i* = 1, . . . , *n*1*, or f is an operator convex function on* [*m*, *M*]*, then the double inequality* (31) *can be extended from the left side if we use Jensen's operator inequality (see [16, Theorem 2.1])*

$$\begin{aligned} f\left(\sum\_{i=1}^n \Phi\_i(A\_i)\right) &= f\left(\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(A\_i)\right) \\ &\le \frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) \le \sum\_{i=1}^n \Phi\_i(f(A\_i)) \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) \end{aligned}$$

**Example 18.** *If neither of the assumptions* $(m\_C, M\_C) \cap [m\_i, M\_i] = \emptyset$*,* $i = 1, \ldots, n\_1$*, nor "f is operator convex" in Remark 17 - 2. is satisfied, and if* $1 < n\_1 < n$*, then* (31) *cannot be extended by Jensen's operator inequality, since that inequality is not valid. Indeed, for* $n\_1 = 2$ *we define mappings* $\Phi\_1, \Phi\_2 : M\_3(\mathbf{C}) \to M\_2(\mathbf{C})$ *by* $\Phi\_1((a\_{ij})\_{1 \le i,j \le 3}) = \frac{\alpha}{2}(a\_{ij})\_{1 \le i,j \le 2}$*,* $\Phi\_2 = \Phi\_1$*. Then* $\Phi\_1(I\_3) + \Phi\_2(I\_3) = \alpha I\_2$*. If*

$$A\_1 = 2\begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix} \quad \text{and} \quad A\_2 = 2\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

*then*

$$\left(\frac{1}{\alpha}\Phi\_1(A\_1) + \frac{1}{\alpha}\Phi\_2(A\_2)\right)^4 = \begin{pmatrix} 16 & 0 \\ 0 & 0 \end{pmatrix} \not\le \begin{pmatrix} 80 & 40 \\ 40 & 24 \end{pmatrix} = \frac{1}{\alpha}\Phi\_1\left(A\_1^4\right) + \frac{1}{\alpha}\Phi\_2\left(A\_2^4\right)$$

*for every* $\alpha \in (0, 1)$*. We observe that* $f(t) = t^4$ *is not operator convex and* $(m\_C, M\_C) \cap [m\_i, M\_i] \ne \emptyset$*, since* $C = A = \frac{1}{\alpha}\Phi\_1(A\_1) + \frac{1}{\alpha}\Phi\_2(A\_2) = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$*,* $[m\_C, M\_C] = [0, 2]$*,* $[m\_1, M\_1] \subset [-1.60388, 4.49396]$ *and* $[m\_2, M\_2] = [0, 2]$*.*
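The failure of the operator inequality in this example can be checked numerically. The following sketch (not from the text) fixes α = 1/2; the factors of α cancel, so the resulting matrices are the same for every α ∈ (0, 1).

```python
import numpy as np

# Numerical check of Example 18; alpha is arbitrary in (0, 1) and cancels,
# so we fix alpha = 0.5 for concreteness.
alpha = 0.5
A1 = 2 * np.array([[1., 0., 1.], [0., 0., 1.], [1., 1., 1.]])
A2 = 2 * np.array([[1., 0., 0.], [0., 0., 0.], [0., 0., 0.]])

def phi(X):
    # Phi_1 = Phi_2: leading 2x2 corner scaled by alpha/2
    return (alpha / 2) * X[:2, :2]

pow4 = lambda X: np.linalg.matrix_power(X, 4)  # f(t) = t^4 on matrices

lhs = pow4(phi(A1) / alpha + phi(A2) / alpha)          # [[16, 0], [0, 0]]
rhs = phi(pow4(A1)) / alpha + phi(pow4(A2)) / alpha    # [[80, 40], [40, 24]]

# rhs - lhs has a negative eigenvalue, so lhs <= rhs fails in the
# operator (Loewner) order: Jensen's operator inequality does not extend.
min_eig = np.linalg.eigvalsh(rhs - lhs).min()
print(min_eig < 0)  # True
```

The difference `rhs - lhs` equals [[64, 40], [40, 24]], whose determinant is negative, so it has one negative eigenvalue and is not positive semidefinite.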

With respect to Remark 16, we obtain the following obvious corollary of Theorem 15.

**Corollary 19.** *Let* $(A\_1, \ldots, A\_n)$ *be an n-tuple of self-adjoint operators* $A\_i \in B(H)$ *with the bounds* $m\_i$ *and* $M\_i$*,* $m\_i \le M\_i$*,* $i = 1, \ldots, n$*. For some* $1 \le n\_1 < n$*, we denote* $m = \min\{m\_1, \ldots, m\_{n\_1}\}$*,* $M = \max\{M\_1, \ldots, M\_{n\_1}\}$*. Let* $(p\_1, \ldots, p\_n)$ *be an n-tuple of non-negative numbers such that* $0 < \sum\_{i=1}^{n\_1} p\_i = \mathbf{p}\_{\mathbf{n}\_1} < \mathbf{p}\_{\mathbf{n}} = \sum\_{i=1}^{n} p\_i$*. If*

$$(m, M) \cap [m\_i, M\_i] = \emptyset, \qquad i = n\_1 + 1, \ldots, n$$

*and one of two equalities*

$$\frac{1}{\mathbf{p}\_{\mathbf{n}\_{1}}} \sum\_{i=1}^{n\_1} p\_i A\_i = \frac{1}{\mathbf{p}\_{\mathbf{n}}} \sum\_{i=1}^n p\_i A\_i = \frac{1}{\mathbf{p}\_{\mathbf{n}} - \mathbf{p}\_{\mathbf{n}\_{1}}} \sum\_{i=n\_1+1}^n p\_i A\_i$$

*is valid, then*

$$\frac{1}{\mathbf{p}\_{\mathbf{n}\_{1}}}\sum\_{i=1}^{n\_{1}}p\_{i}f(A\_{i}) \le \frac{1}{\mathbf{p}\_{\mathbf{n}}}\sum\_{i=1}^{n}p\_{i}f(A\_{i}) \le \frac{1}{\mathbf{p}\_{\mathbf{n}} - \mathbf{p}\_{\mathbf{n}\_{1}}}\sum\_{i=n\_{1}+1}^{n}p\_{i}f(A\_{i})\tag{39}$$

*holds for every continuous convex function f* : *I* → **R** *provided that the interval I contains all mi*, *Mi, i* = 1, . . . , *n.*

*If f* : *I* → **R** *is concave, then the reverse inequality is valid in* (39)*.*

As a special case of Corollary 19 we can obtain a discrete version of Corollary 7 as follows.

**Corollary 20** (Discrete version of Corollary 7)**.** *Let* $(A\_1, \ldots, A\_n)$ *be an n-tuple of self-adjoint operators* $A\_i \in B(H)$ *with the bounds* $m\_i$ *and* $M\_i$*,* $m\_i \le M\_i$*,* $i = 1, \ldots, n$*. Let* $(\alpha\_1, \ldots, \alpha\_n)$ *be an n-tuple of nonnegative real numbers such that* $\sum\_{i=1}^{n} \alpha\_i = 1$*. If*

$$(m\_A, M\_A) \cap [m\_i, M\_i] = \emptyset, \qquad i = 1, \ldots, n \tag{40}$$

*where* $m\_A$ *and* $M\_A$*,* $m\_A \le M\_A$*, are the bounds of* $A = \sum\_{i=1}^{n} \alpha\_i A\_i$*, then*

$$f\left(\sum\_{i=1}^{n} \alpha\_i A\_i\right) \le \sum\_{i=1}^{n} \alpha\_i f(A\_i) \tag{41}$$

*holds for every continuous convex function f* : *I* → **R** *provided that the interval I contains all mi*, *Mi.*

*Proof.* We prove only the convex case. We define an $(n+1)$-tuple of operators $(B\_1, \ldots, B\_{n+1})$, $B\_i \in B(H)$, by $B\_1 = A = \sum\_{i=1}^n \alpha\_i A\_i$ and $B\_i = A\_{i-1}$, $i = 2, \ldots, n+1$. Then $m\_{B\_1} = m\_A$, $M\_{B\_1} = M\_A$ are the bounds of $B\_1$ and $m\_{B\_i} = m\_{i-1}$, $M\_{B\_i} = M\_{i-1}$ are those of $B\_i$, $i = 2, \ldots, n+1$. Also, we define an $(n+1)$-tuple of non-negative numbers $(p\_1, \ldots, p\_{n+1})$ by $p\_1 = 1$ and $p\_i = \alpha\_{i-1}$, $i = 2, \ldots, n+1$. Then $\sum\_{i=1}^{n+1} p\_i = 2$ and by using (40) we have

$$(m\_{B\_1}, M\_{B\_1}) \cap [m\_{B\_i}, M\_{B\_i}] = \emptyset, \qquad i = 2, \ldots, n+1 \tag{42}$$

Since

$$\sum\_{i=1}^{n+1} p\_i B\_i = B\_1 + \sum\_{i=2}^{n+1} p\_i B\_i = \sum\_{i=1}^n \alpha\_i A\_i + \sum\_{i=1}^n \alpha\_i A\_i = 2B\_1$$

then

$$p\_1 B\_1 = \frac{1}{2} \sum\_{i=1}^{n+1} p\_i B\_i = \sum\_{i=2}^{n+1} p\_i B\_i \tag{43}$$

Taking into account (42) and (43), we can apply Corollary 19 for *n*<sup>1</sup> = 1 and *Bi*, *pi* as above, and we get

$$p\_1 f(\mathcal{B}\_1) \le \frac{1}{2} \sum\_{i=1}^{n+1} p\_i f(\mathcal{B}\_i) \le \sum\_{i=2}^{n+1} p\_i f(\mathcal{B}\_i)$$

which gives the desired inequality (41).
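The bookkeeping in the proof above is easy to mirror numerically. The sketch below (not from the text) uses arbitrary symmetric 2×2 matrices as illustrative data to check the construction of the tuples $(B\_i)$ and $(p\_i)$ and the equality (43).

```python
import numpy as np

# Sketch of the construction in the proof of Corollary 20 with arbitrary
# symmetric 2x2 matrices (illustrative data, not from the text).
rng = np.random.default_rng(0)
alphas = [0.2, 0.3, 0.5]            # nonnegative weights summing to 1
As = []
for _ in alphas:
    M = rng.standard_normal((2, 2))
    As.append((M + M.T) / 2)        # symmetrize -> self-adjoint

# B_1 = A = sum alpha_i A_i, B_i = A_{i-1};  p_1 = 1, p_i = alpha_{i-1}
B = [sum(a * Ai for a, Ai in zip(alphas, As))] + As
p = [1.0] + alphas

total = sum(pi * Bi for pi, Bi in zip(p, B))   # sum p_i B_i = 2 B_1
ok = (np.allclose(p[0] * B[0], total / 2)
      and np.allclose(total / 2, sum(pi * Bi for pi, Bi in zip(p[1:], B[1:]))))
print(ok)  # True: p_1 B_1 = (1/2) sum p_i B_i = sum_{i>=2} p_i B_i, i.e. (43)
```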

#### **6. Extension of the refined Jensen's inequality**

There is an extensive literature devoted to refinements and extensions of Jensen's inequality; see, for example, [22–29].

In this section we present an extension of the refined Jensen's inequality obtained in Section 4 and a refinement of the same inequality obtained in Section 5.

**Theorem 21.** *Let* $(A\_1, \ldots, A\_n)$ *be an n-tuple of self-adjoint operators* $A\_i \in B(H)$ *with the bounds* $m\_i$ *and* $M\_i$*,* $m\_i \le M\_i$*,* $i = 1, \ldots, n$*. Let* $(\Phi\_1, \ldots, \Phi\_n)$ *be an n-tuple of positive linear mappings* $\Phi\_i : B(H) \to B(K)$ *such that* $\sum\_{i=1}^{n\_1} \Phi\_i(\mathbf{1}\_H) = \alpha\,\mathbf{1}\_K$ *and* $\sum\_{i=n\_1+1}^{n} \Phi\_i(\mathbf{1}\_H) = \beta\,\mathbf{1}\_K$*, where* $1 \le n\_1 < n$*,* $\alpha, \beta > 0$ *and* $\alpha + \beta = 1$*. Let* $m\_L = \min\{m\_1, \ldots, m\_{n\_1}\}$*,* $M\_R = \max\{M\_1, \ldots, M\_{n\_1}\}$ *and*

$$\begin{aligned} m &= \max\left\{ M\_i \colon M\_i \le m\_L,\ i \in \{n\_1+1, \ldots, n\} \right\} \\ M &= \min\left\{ m\_i \colon m\_i \ge M\_R,\ i \in \{n\_1+1, \ldots, n\} \right\} \end{aligned}$$

*If*

$$(m\_L, M\_R) \cap [m\_i, M\_i] = \emptyset, \quad i = n\_1+1, \ldots, n, \qquad \text{and} \qquad m < M$$

*and one of two equalities*

$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(A\_i) = \sum\_{i=1}^n \Phi\_i(A\_i) = \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(A\_i)$$

*is valid, then*

$$\begin{split} \frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) &\le \frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) + \beta \delta\_f \tilde{A} \le \sum\_{i=1}^n \Phi\_i(f(A\_i)) \\ &\le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) - \alpha \delta\_f \tilde{A} \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) \end{split} \tag{44}$$

*holds for every continuous convex function f* : *I* → **R** *provided that the interval I contains all mi*, *Mi, i* = 1, . . . , *n, where*

$$\begin{aligned} \delta\_f \equiv \delta\_f(\bar{m}, \bar{M}) &= f(\bar{m}) + f(\bar{M}) - 2f\left(\frac{\bar{m} + \bar{M}}{2}\right) \\ \tilde{A} \equiv \tilde{A}\_{A, \Phi, n\_1, \alpha}(\bar{m}, \bar{M}) &= \frac{1}{2} \mathbf{1}\_K - \frac{1}{\alpha(\bar{M} - \bar{m})} \sum\_{i=1}^{n\_1} \Phi\_i\left(\left|A\_i - \frac{\bar{m} + \bar{M}}{2} \mathbf{1}\_H\right|\right) \end{aligned} \tag{45}$$

*and* $\bar{m} \in [m, m\_L]$*,* $\bar{M} \in [M\_R, M]$*,* $\bar{m} < \bar{M}$*, are arbitrary numbers. If* $f : I \to \mathbf{R}$ *is concave, then the reverse inequality is valid in* (44)*.*

*Proof.* We prove only the convex case. Let us denote

$$A = \frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(A\_i), \qquad B = \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(A\_i), \qquad C = \sum\_{i=1}^n \Phi\_i(A\_i).$$

It is easy to verify that *A* = *B* or *B* = *C* or *A* = *C* implies *A* = *B* = *C*: indeed, $C = \alpha A + \beta B$ is a convex combination of *A* and *B*, so the equality of any two of the three operators forces the equality of all three.

Since $f$ is convex on $[\bar{m}, \bar{M}]$ and $\mathrm{Sp}(A\_i) \subseteq [m\_i, M\_i] \subseteq [\bar{m}, \bar{M}]$ for $i = 1, \ldots, n\_1$, it follows from Lemma 11 that

$$f(A\_i) \le \frac{\bar{M}\mathbf{1}\_H - A\_i}{\bar{M} - \bar{m}} f(\bar{m}) + \frac{A\_i - \bar{m}\mathbf{1}\_H}{\bar{M} - \bar{m}} f(\bar{M}) - \delta\_f \tilde{A}\_i, \qquad i = 1, \ldots, n\_1$$

holds, where $\delta\_f = f(\bar{m}) + f(\bar{M}) - 2f\left(\frac{\bar{m} + \bar{M}}{2}\right)$ and $\tilde{A}\_i = \frac{1}{2}\mathbf{1}\_H - \frac{1}{\bar{M} - \bar{m}}\left|A\_i - \frac{\bar{m} + \bar{M}}{2}\mathbf{1}\_H\right|$. Applying a positive linear mapping $\Phi\_i$ and summing, we obtain

$$\begin{aligned} \sum\_{i=1}^{n\_1} \Phi\_i\left(f(A\_i)\right) &\le \frac{\bar{M}\alpha\mathbf{1}\_K - \sum\_{i=1}^{n\_1}\Phi\_i(A\_i)}{\bar{M} - \bar{m}} f(\bar{m}) + \frac{\sum\_{i=1}^{n\_1}\Phi\_i(A\_i) - \bar{m}\alpha\mathbf{1}\_K}{\bar{M} - \bar{m}} f(\bar{M}) \\ &\quad - \delta\_f\left(\frac{\alpha}{2}\mathbf{1}\_K - \frac{1}{\bar{M} - \bar{m}} \sum\_{i=1}^{n\_1} \Phi\_i\left(\left|A\_i - \frac{\bar{m} + \bar{M}}{2}\mathbf{1}\_H\right|\right)\right) \end{aligned}$$

since $\sum\_{i=1}^{n\_1} \Phi\_i(\mathbf{1}\_H) = \alpha\mathbf{1}\_K$. It follows that

$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i \left( f(A\_i) \right) \le \frac{\bar{M} \mathbf{1}\_K - A}{\bar{M} - \bar{m}} f(\bar{m}) + \frac{A - \bar{m} \mathbf{1}\_K}{\bar{M} - \bar{m}} f(\bar{M}) - \delta\_f \tilde{A} \tag{46}$$

where $\tilde{A} = \frac{1}{2}\mathbf{1}\_K - \frac{1}{\alpha(\bar{M} - \bar{m})} \sum\_{i=1}^{n\_1} \Phi\_i\left(\left|A\_i - \frac{\bar{m} + \bar{M}}{2}\mathbf{1}\_H\right|\right)$.

Additionally, since $f$ is convex on all $[m\_i, M\_i]$ and $(\bar{m}, \bar{M}) \cap [m\_i, M\_i] = \emptyset$, $i = n\_1+1, \ldots, n$, then

$$f(A\_i) \ge \frac{\bar{M}\mathbf{1}\_H - A\_i}{\bar{M} - \bar{m}} f(\bar{m}) + \frac{A\_i - \bar{m}\mathbf{1}\_H}{\bar{M} - \bar{m}} f(\bar{M}), \qquad i = n\_1 + 1, \dots, n$$

It follows

$$\frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i\left(f(A\_i)\right) - \delta\_f \tilde{A} \ge \frac{\bar{M}\mathbf{1}\_K - B}{\bar{M} - \bar{m}} f(\bar{m}) + \frac{B - \bar{m}\mathbf{1}\_K}{\bar{M} - \bar{m}} f(\bar{M}) - \delta\_f \tilde{A} \tag{47}$$

Combining (46) and (47) and taking into account that *A* = *B*, we obtain

$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i\left(f(A\_i)\right) \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i\left(f(A\_i)\right) - \delta\_f \tilde{A} \tag{48}$$

Next, we obtain


$$\begin{aligned} \frac{1}{\alpha}\sum\_{i=1}^{n\_1}\Phi\_i(f(A\_i)) &= \sum\_{i=1}^{n\_1}\Phi\_i(f(A\_i)) + \frac{\beta}{\alpha}\sum\_{i=1}^{n\_1}\Phi\_i(f(A\_i)) \quad (\text{by } \alpha + \beta = 1) \\ &\le \sum\_{i=1}^{n\_1}\Phi\_i(f(A\_i)) + \sum\_{i=n\_1+1}^{n}\Phi\_i(f(A\_i)) - \beta\delta\_f\tilde{A} \quad (\text{by (48)}) \\ &\le \frac{\alpha}{\beta}\sum\_{i=n\_1+1}^{n}\Phi\_i(f(A\_i)) - \alpha\delta\_f\tilde{A} + \sum\_{i=n\_1+1}^{n}\Phi\_i(f(A\_i)) - \beta\delta\_f\tilde{A} \quad (\text{by (48)}) \\ &= \frac{1}{\beta}\sum\_{i=n\_1+1}^{n}\Phi\_i(f(A\_i)) - \delta\_f\tilde{A} \quad (\text{by } \alpha + \beta = 1) \end{aligned}$$

which gives the following double inequality

$$\frac{1}{\alpha}\sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) \le \sum\_{i=1}^n \Phi\_i(f(A\_i)) - \beta \delta\_f \tilde{A} \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) - \delta\_f \tilde{A}$$

Adding $\beta \delta\_f \tilde{A}$ to the above inequalities, we get

$$\frac{1}{\alpha}\sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) + \beta \delta\_f \tilde{A} \le \sum\_{i=1}^n \Phi\_i(f(A\_i)) \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) - \alpha \delta\_f \tilde{A} \tag{49}$$

Now, we remark that $\delta\_f \ge 0$ and $\tilde{A} \ge 0$. (Indeed, since $f$ is convex, $f\left((\bar{m} + \bar{M})/2\right) \le (f(\bar{m}) + f(\bar{M}))/2$, which implies that $\delta\_f \ge 0$. Also, since

$$\mathrm{Sp}(A\_i) \subseteq [\bar{m}, \bar{M}] \quad \Rightarrow \quad \left|A\_i - \frac{\bar{M} + \bar{m}}{2}\mathbf{1}\_H\right| \le \frac{\bar{M} - \bar{m}}{2}\mathbf{1}\_H, \qquad i = 1, \ldots, n\_1$$

then

$$\sum\_{i=1}^{n\_1} \Phi\_i\left(\left|A\_i - \frac{\bar{M} + \bar{m}}{2}\mathbf{1}\_H\right|\right) \le \frac{\bar{M} - \bar{m}}{2}\alpha\mathbf{1}\_K$$

which gives

$$0 \le \frac{1}{2}\mathbf{1}\_K - \frac{1}{\alpha(\bar{M} - \bar{m})} \sum\_{i=1}^{n\_1} \Phi\_i\left(\left|A\_i - \frac{\bar{M} + \bar{m}}{2}\mathbf{1}\_H\right|\right) = \tilde{A}.)$$


Consequently, the following inequalities

$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) \le \frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) + \beta \delta\_f \tilde{A}$$

$$\frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) - \alpha \delta\_f \tilde{A} \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i))$$

hold, which with (49) proves the desired series inequalities (44).

**Example 22.** *We observe the matrix case of Theorem 21 for* $f(t) = t^4$*, which is convex but not operator convex, with* $n = 4$*,* $n\_1 = 2$ *and the bounds of the matrices as in Fig. 3. We show an example*

**Figure 3.** An example of a convex function and the bounds of four operators

*such that*

$$\begin{aligned} \frac{1}{\alpha}\left(\Phi\_1(A\_1^4) + \Phi\_2(A\_2^4)\right) &< \frac{1}{\alpha}\left(\Phi\_1(A\_1^4) + \Phi\_2(A\_2^4)\right) + \beta \delta\_f \tilde{A} \\ &< \Phi\_1(A\_1^4) + \Phi\_2(A\_2^4) + \Phi\_3(A\_3^4) + \Phi\_4(A\_4^4) \\ &< \frac{1}{\beta}\left(\Phi\_3(A\_3^4) + \Phi\_4(A\_4^4)\right) - \alpha \delta\_f \tilde{A} < \frac{1}{\beta}\left(\Phi\_3(A\_3^4) + \Phi\_4(A\_4^4)\right) \end{aligned} \tag{50}$$

*holds, where* $\delta\_f = \bar{M}^4 + \bar{m}^4 - (\bar{M} + \bar{m})^4/8$ *and*

$$\tilde{A} = \frac{1}{2}I\_2 - \frac{1}{\alpha(\bar{M} - \bar{m})}\left(\Phi\_1\left(\left|A\_1 - \frac{\bar{M} + \bar{m}}{2}I\_3\right|\right) + \Phi\_2\left(\left|A\_2 - \frac{\bar{M} + \bar{m}}{2}I\_3\right|\right)\right)$$

*We define mappings* $\Phi\_i : M\_3(\mathbf{C}) \to M\_2(\mathbf{C})$ *as follows:* $\Phi\_i((a\_{jk})\_{1 \le j,k \le 3}) = \frac{1}{4}(a\_{jk})\_{1 \le j,k \le 2}$*,* $i = 1, \ldots, 4$*. Then* $\sum\_{i=1}^{4} \Phi\_i(I\_3) = I\_2$ *and* $\alpha = \beta = \frac{1}{2}$*.*

*Let*

$$A\_1 = 2\begin{pmatrix} 2 & 9/8 & 1 \\ 9/8 & 2 & 0 \\ 1 & 0 & 3 \end{pmatrix}, \quad A\_2 = 2\begin{pmatrix} 2 & 9/8 & 0 \\ 9/8 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \quad A\_3 = -3\begin{pmatrix} 4 & 1/2 & 1 \\ 1/2 & 4 & 0 \\ 1 & 0 & 2 \end{pmatrix}, \quad A\_4 = 12\begin{pmatrix} 5/3 & 1/2 & 0 \\ 1/2 & 3/2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$$

*Then* $m\_1 = 1.28607$*,* $M\_1 = 7.70771$*,* $m\_2 = 0.53777$*,* $M\_2 = 5.46221$*,* $m\_3 = -14.15050$*,* $M\_3 = -4.71071$*,* $m\_4 = 12.91724$*,* $M\_4 = 36$*, so* $m\_L = m\_2$*,* $M\_R = M\_1$*,* $m = M\_3$ *and* $M = m\_4$ *(rounded to*


*five decimal places). Also,*

$$\frac{1}{\alpha}\left(\Phi\_1(A\_1) + \Phi\_2(A\_2)\right) = \frac{1}{\beta}\left(\Phi\_3(A\_3) + \Phi\_4(A\_4)\right) = \begin{pmatrix} 4 & 9/4 \\ 9/4 & 3 \end{pmatrix}.$$

*and*


$$A\_f \equiv \frac{1}{\alpha}\left(\Phi\_1(A\_1^4) + \Phi\_2(A\_2^4)\right) = \begin{pmatrix} 989.00391 & 663.46875 \\ 663.46875 & 526.12891 \end{pmatrix}$$

$$C\_f \equiv \Phi\_1(A\_1^4) + \Phi\_2(A\_2^4) + \Phi\_3(A\_3^4) + \Phi\_4(A\_4^4) = \begin{pmatrix} 68093.14258 & 48477.98437 \\ 48477.98437 & 51335.39258 \end{pmatrix}$$

$$B\_f \equiv \frac{1}{\beta}\left(\Phi\_3(A\_3^4) + \Phi\_4(A\_4^4)\right) = \begin{pmatrix} 135197.28125 & 96292.5 \\ 96292.5 & 102144.65625 \end{pmatrix}$$

*Then*

$$A\_f < \mathbb{C}\_f < \mathbb{B}\_f \tag{51}$$

*holds (which is consistent with* (31)*).*
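These values can be reproduced directly. The sketch below (not from the text) recomputes $A\_f$, $C\_f$ and $B\_f$ from the matrices and mappings of Example 22 (note the factor 2 in $A\_2$, which reproduces both the stated bounds $m\_2$, $M\_2$ and the displayed entries of $A\_f$) and confirms the strict operator ordering (51), testing the Loewner order via the smallest eigenvalue of the difference.

```python
import numpy as np

# Recompute A_f, C_f, B_f of Example 22 and check A_f < C_f < B_f in the
# Loewner (operator) order; alpha = beta = 1/2 and Phi_i is the leading
# 2x2 corner divided by 4.
As = [
    2 * np.array([[2, 9/8, 1], [9/8, 2, 0], [1, 0, 3]]),
    2 * np.array([[2, 9/8, 0], [9/8, 1, 0], [0, 0, 2]]),
    -3 * np.array([[4, 1/2, 1], [1/2, 4, 0], [1, 0, 2]]),
    12 * np.array([[5/3, 1/2, 0], [1/2, 3/2, 0], [0, 0, 3]]),
]
phi = lambda X: X[:2, :2] / 4
f = lambda X: np.linalg.matrix_power(X, 4)       # f(t) = t^4
alpha = beta = 0.5

Af = (phi(f(As[0])) + phi(f(As[1]))) / alpha
Cf = sum(phi(f(A)) for A in As)
Bf = (phi(f(As[2])) + phi(f(As[3]))) / beta

less = lambda X, Y: np.linalg.eigvalsh(Y - X).min() > 0   # strict Loewner order
print(less(Af, Cf) and less(Cf, Bf))  # True, i.e. (51)
```

Since $\alpha = \beta = 1/2$, one also sees $C\_f = (A\_f + B\_f)/2$, which the recomputed matrices confirm.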

*We will choose three pairs of numbers* $(\bar{m}, \bar{M})$*,* $\bar{m} \in [-4.71071, 0.53777]$*,* $\bar{M} \in [7.70771, 12.91724]$*, as follows:*

*i)* $\bar{m} = m\_L = 0.53777$*,* $\bar{M} = M\_R = 7.70771$*; then*

$$\tilde{\Delta}\_1 = \beta \delta\_f \tilde{A} = 0.5 \cdot 2951.69249 \cdot \begin{pmatrix} 0.15678 & 0.09030 \\ 0.09030 & 0.15943 \end{pmatrix} = \begin{pmatrix} 231.38908 & 133.26139 \\ 133.26139 & 235.29515 \end{pmatrix}$$

*ii)* $\bar{m} = m = -4.71071$*,* $\bar{M} = M = 12.91724$*; then*

$$\tilde{\Delta}\_2 = \beta \delta\_f \tilde{A} = 0.5 \cdot 27766.07963 \cdot \begin{pmatrix} 0.36022 & 0.03573 \\ 0.03573 & 0.36155 \end{pmatrix} = \begin{pmatrix} 5000.89860 & 496.04498 \\ 496.04498 & 5019.50711 \end{pmatrix}$$

*iii)* $\bar{m} = -1$*,* $\bar{M} = 10$*; then*

$$\tilde{\Delta}\_3 = \beta \delta\_f \tilde{A} = 0.5 \cdot 9180.875 \cdot \begin{pmatrix} 0.28203 & 0.08975 \\ 0.08975 & 0.27557 \end{pmatrix} = \begin{pmatrix} 1294.66 & 411.999 \\ 411.999 & 1265. \end{pmatrix}$$

*Now, we obtain the following improvement of* (51) *(see* (50)*):*

*i)*

$$A\_f < A\_f + \tilde{\Delta}\_1 = \begin{pmatrix} 1220.39299 & 796.73014 \\ 796.73014 & 761.42406 \end{pmatrix} < C\_f < \begin{pmatrix} 134965.89217 & 96159.23861 \\ 96159.23861 & 101909.36110 \end{pmatrix} = B\_f - \tilde{\Delta}\_1 < B\_f$$

*ii)*

$$A\_f < A\_f + \tilde{\Delta}\_2 = \begin{pmatrix} 5989.90251 & 1159.51373 \\ 1159.51373 & 5545.63601 \end{pmatrix} < C\_f < \begin{pmatrix} 130196.38265 & 95796.45502 \\ 95796.45502 & 97125.14914 \end{pmatrix} = B\_f - \tilde{\Delta}\_2 < B\_f$$

*iii)*

$$A\_f < A\_f + \tilde{\Delta}\_3 = \begin{pmatrix} 2283.66362 & 1075.46746 \\ 1075.46746 & 1791.12874 \end{pmatrix} < C\_f < \begin{pmatrix} 133902.62153 & 95880.50129 \\ 95880.50129 & 100879.65641 \end{pmatrix} = B\_f - \tilde{\Delta}\_3 < B\_f$$
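As a spot check, case i) can be verified from the displayed numerical matrices alone (a sketch, not from the text; only the Loewner ordering of the printed values is checked, $\tilde{A}$ itself is not recomputed):

```python
import numpy as np

# Check case i) of Example 22 from the printed matrices:
# A_f < A_f + D1 < C_f < B_f - D1 < B_f in the Loewner order.
Af = np.array([[989.00391, 663.46875], [663.46875, 526.12891]])
Cf = np.array([[68093.14258, 48477.98437], [48477.98437, 51335.39258]])
Bf = np.array([[135197.28125, 96292.5], [96292.5, 102144.65625]])
D1 = np.array([[231.38908, 133.26139], [133.26139, 235.29515]])  # beta*delta_f*A~

less = lambda X, Y: np.linalg.eigvalsh(Y - X).min() > 0  # strict Loewner order
chain = [Af, Af + D1, Cf, Bf - D1, Bf]
print(all(less(X, Y) for X, Y in zip(chain, chain[1:])))  # True
```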


Using Theorem 21 we get the following result.

**Corollary 23.** *Let the assumptions of Theorem 21 hold. Then*

$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) \le \frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) + \gamma\_1 \delta\_f \tilde{A} \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) \tag{52}$$

*and*

$$\frac{1}{\alpha}\sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) \le \frac{1}{\beta}\sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) - \gamma\_2 \delta\_f \tilde{A} \le \frac{1}{\beta}\sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i)) \tag{53}$$

*holds for every* $\gamma\_1, \gamma\_2$ *in the closed interval joining α and β, where* $\delta\_f$ *and* $\tilde{A}$ *are defined by* (45)*.*

*Proof.* Adding $\alpha \delta\_f \tilde{A}$ to (44) and noticing that $\delta\_f \tilde{A} \ge 0$, we obtain

$$\frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) \le \frac{1}{\alpha} \sum\_{i=1}^{n\_1} \Phi\_i(f(A\_i)) + \alpha \delta\_f \tilde{A} \le \frac{1}{\beta} \sum\_{i=n\_1+1}^n \Phi\_i(f(A\_i))$$

Taking into account the above inequality and the left hand side of (44) we obtain (52).

Similarly, subtracting $\beta \delta\_f \tilde{A}$ in (44) we obtain (53).

**Remark 24.** *We can obtain extensions of the inequalities given in Remarks 16 and 17. Also, we can obtain a special case of Theorem 21 with a convex combination of the operators* $A\_i$ *by putting* $\Phi\_i(B) = \alpha\_i B$ *for* $i = 1, \ldots, n$*, similarly as in Corollary 19. Finally, applying this result, we can give another proof of Corollary 14. The interested reader can find the details in [30].*

#### **Author details**

Jadranka Mićić

*Faculty of Mechanical Engineering and Naval Architecture, University of Zagreb, Ivana Lučića 5, 10000 Zagreb, Croatia*

Josip Pečarić

*Faculty of Textile Technology, University of Zagreb, Prilaz baruna Filipovića 30, 10000 Zagreb, Croatia*

#### **7. References**


[1] Davis C (1957) A Schwarz inequality for convex operator functions. Proc. Amer. Math. Soc. 8: 42-44.

[2] Choi M.D (1974) A Schwarz inequality for positive linear maps on *C*∗-algebras. Illinois J. Math. 18: 565-574.

[3] Hansen F, Pedersen G.K (1982) Jensen's inequality for operators and Löwner's theorem. Math. Ann. 258: 229-241.

[4] Mond B, Pečarić J (1995) On Jensen's inequality for operator convex functions. Houston J. Math. 21: 739-754.

[5] Hansen F, Pedersen G.K (2003) Jensen's operator inequality. Bull. London Math. Soc. 35: 553-564.

[6] Mond B, Pečarić J (1994) Converses of Jensen's inequality for several operators. Rev. Anal. Numér. Théor. Approx. 23: 179-183.

[28] Xiao Z.G, Srivastava H.M, Zhang Z.H (2010) Further refinements of the Jensen inequalities based upon samples with repetitions. Math. Comput. Modelling 51: 592-600.

[29] Wang L.C, Ma X.F, Liu L.H (2009) A note on some new refinements of Jensen's inequality for convex functions. J. Inequal. Pure Appl. Math. 10. 2. Art. 48: 6 p.

[30] Mićić J, Pečarić J, Perić J (2012) Extension of the refined Jensen's operator inequality with condition on spectra. Ann. Funct. Anal. 3: 67-85.

## **A Linear System of Both Equations and Inequalities in Max-Algebra**

Abdulhadi Aminu


Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48195

## **1. Introduction**

The aim of this chapter is to present a system of linear equations and inequalities in max-algebra. Max-algebra is an analogue of linear algebra developed on the pair of operations (⊕, ⊗) extended to matrices and vectors, where *a* ⊕ *b* = *max*(*a*, *b*) and *a* ⊗ *b* = *a* + *b* for *a*, *b* ∈ **R**. The system of equations *A* ⊗ *x* = *c* and the system of inequalities *B* ⊗ *x* ≤ *d* have each been studied in the literature. We will present necessary and sufficient conditions for the solvability of a system consisting of these two systems, and also develop a polynomial algorithm for solving a max-linear program whose constraints are max-linear equations and inequalities. Moreover, some solvability concepts of an interval system of linear equations and inequalities will also be presented.

Max-algebraic linear systems were investigated in the first publications which deal with the introduction of algebraic structures called (max,+) algebras. Systems of equations with variables only on one side were considered in these publications [1, 2] and [3]. Other systems with a special structure were investigated in the context of solving eigenvalue problems in correspondence with algebraic structures or synchronisation of discrete event systems, see [4] and also [1] for additional information. Given a matrix *A* and a vector *b* of appropriate size, and using the notation ⊕ = max, ⊗ = plus, the systems studied had one of the following forms: *A* ⊗ *x* = *b*, *A* ⊗ *x* = *x* or *A* ⊗ *x* = *x* ⊕ *b*. An infinite dimensional generalisation can be found in [5].

In [1] Cuninghame-Green showed that the problem *A* ⊗ *x* = *b* can be solved using residuation [6]. That is, the equality in *A* ⊗ *x* = *b* is relaxed to the inequality *A* ⊗ *x* ≤ *b* and the set of its sub-solutions is studied. It was shown that the greatest solution of *A* ⊗ *x* ≤ *b* is given by *x*¯, where

$$\bar{x}\_j = \min\_{i \in M} (b\_i \otimes a\_{ij}^{-1}) \text{ for all } j \in N$$

The equation *A* ⊗ *x* = *b* is then solved using the above result as follows: the equation *A* ⊗ *x* = *b* has a solution if and only if *A* ⊗ *x*¯ = *b*. Also, Gaubert [7] proposed a method for solving the one-sided system *x* = *A* ⊗ *x* ⊕ *b* using rational calculus.
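As a hedged illustration, the residuation recipe above can be sketched in a few lines of Python (the function names are mine, not the chapter's; all entries are assumed finite):

```python
def principal_solution(A, b):
    """x̄_j = min_i (b_i - a_ij): the greatest solution of A ⊗ x <= b (finite entries assumed)."""
    m, n = len(A), len(A[0])
    return [min(b[i] - A[i][j] for i in range(m)) for j in range(n)]

def max_plus_apply(A, x):
    """(A ⊗ x)_i = max_j (a_ij + x_j)."""
    return [max(a + xj for a, xj in zip(row, x)) for row in A]

def solvable(A, b):
    """A ⊗ x = b has a solution iff A ⊗ x̄ = b."""
    return max_plus_apply(A, principal_solution(A, b)) == b

# A small system with an exact solution x = (0, 1):
A = [[0, 1], [2, 0]]
b = [2, 2]
print(principal_solution(A, b), solvable(A, b))  # [0, 1] True
```

Here `principal_solution` computes *x*¯ and `solvable` implements the test "*A* ⊗ *x* = *b* has a solution iff *A* ⊗ *x*¯ = *b*".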

> ©2012 Aminu, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Zimmermann [3] developed a method for solving *A* ⊗ *x* = *b* by set covering and also presented an algorithm for solving max-linear programs with one-sided constraints. This method is proved to have a computational complexity of *O*(*mn*), where *m*, *n* are the numbers of rows and columns of the input matrices, respectively. Akian, Gaubert and Kolokoltsov [5] extended Zimmermann's solution method by set covering to the case of functional Galois connections.

Butkovič [8] developed a max-algebraic method for finding all solutions to a system of inequalities *xi* − *xj* > *bij*, *i*, *j* = 1, ..., *n* using *n* generators. Using this method, Butkovič [8] developed a pseudopolynomial algorithm which either finds a bounded mixed-integer solution or decides that no such solution exists. A summary of these results can be found in [9] and [10].

Cechlárová and Diko [11] proposed a method for resolving infeasibility of the system *A* ⊗ *x* = *b*. The techniques presented in this method are to modify the right-hand side as little as possible or to omit some of the equations. It was shown that the problem of finding the minimum number of such equations is NP-complete.

## **2. Max-algebra and some basic definitions**

In this section we introduce max-algebra, give the essential definitions and show how the operations of max-algebra can be extended to matrices and vectors.

In max-algebra, we replace addition and multiplication, the binary operations in conventional linear algebra, by maximum and addition respectively. For any problem that involves adding numbers together and taking the maximum of numbers, it may be possible to describe it in max-algebra. A problem that is nonlinear when described in conventional terms may be converted to a max-algebraic problem that is linear with respect to (⊕, ⊗)=(max, +).

**Definition 1.** The max-plus semiring **R** is the set **R** ∪ {−∞}, equipped with the addition (*a*, *b*) ↦ max(*a*, *b*) and the multiplication (*a*, *b*) ↦ *a* + *b*, denoted by ⊕ and ⊗ respectively. That is, *a* ⊕ *b* = max(*a*, *b*) and *a* ⊗ *b* = *a* + *b*. The identity element for the addition (or zero) is −∞, and the identity element for the multiplication (or unit) is 0.

**Definition 2.** The min-plus semiring **R**min is the set **R** ∪ {+∞}, equipped with the addition (*a*, *b*) ↦ min(*a*, *b*) and the multiplication (*a*, *b*) ↦ *a* + *b*, denoted by ⊕′ and ⊗′ respectively. The zero is +∞, and the unit is 0. The name tropical semiring is also used as a synonym of min-plus when the ground set is **N**.

The completed max-plus semiring **R***max* is the set **R** ∪ {±∞}, equipped with the addition (*a*, *b*) ↦ max(*a*, *b*) and multiplication (*a*, *b*) ↦ *a* + *b*, with the convention that −∞ + (+∞) = +∞ + (−∞) = −∞. The completed min-plus semiring **R**min is defined in the dual way.

**Proposition 1.** The following properties hold for all *a*, *b*, *c* ∈ **R**:

$$\begin{aligned} a \oplus b &= b \oplus a \\ a \otimes b &= b \otimes a \\ a \oplus (b \oplus c) &= (a \oplus b) \oplus c \\ a \otimes (b \otimes c) &= (a \otimes b) \otimes c \\ a \otimes (b \oplus c) &= a \otimes b \oplus a \otimes c \\ a \otimes (-\infty) &= -\infty = (-\infty) \otimes a \\ a \otimes 0 &= a = 0 \otimes a \\ a \otimes a^{-1} &= 0, \quad a, a^{-1} \in \mathbb{R} \end{aligned}$$

*Proof.*


The statements follow from the definitions.

**Proposition 2.** For all *a*, *b*, *c* ∈ **R** the following properties hold:

$$\begin{array}{l} a \le b \implies a \oplus c \le b \oplus c \\\\ a \le b \iff a \otimes c \le b \otimes c, c \in \mathbb{R} \\\\ a \le b \iff a \oplus b = b \\\\ a > b \iff a \otimes c > b \otimes c, -\infty < c < +\infty \end{array}$$

*Proof.* The statements follow from definitions.
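The scalar identities of Propositions 1 and 2 can be verified mechanically. A minimal sketch in Python (the names `oplus` and `otimes` are mine), relying on IEEE floats so that `float('-inf')` plays the role of the max-plus zero:

```python
NEG_INF = float("-inf")  # the max-plus zero element

def oplus(a, b):   # a ⊕ b = max(a, b)
    return max(a, b)

def otimes(a, b):  # a ⊗ b = a + b; 0 is the unit
    return a + b

a, b, c = 3.0, -1.0, 5.0
assert oplus(a, b) == oplus(b, a)                                   # commutativity
assert oplus(a, oplus(b, c)) == oplus(oplus(a, b), c)               # associativity
assert otimes(a, oplus(b, c)) == oplus(otimes(a, b), otimes(a, c))  # distributivity
assert otimes(a, NEG_INF) == NEG_INF                                # -inf absorbs under ⊗
assert otimes(a, 0) == a                                            # 0 is the ⊗ unit
assert otimes(a, -a) == 0                                           # a ⊗ a⁻¹ = 0 with a⁻¹ = -a
assert (a <= b) == (oplus(a, b) == b)                               # order via ⊕
print("max-plus identities verified")
```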

The pair of operations (⊕,⊗) is extended to matrices and vectors as in the conventional linear algebra as follows: For *A* = (*aij*), *B* = (*bij*) of compatible sizes and *α* ∈ **R** we have:

$$\begin{aligned} A \oplus B &= (a\_{ij} \oplus b\_{ij}) \\ A \otimes B &= \left(\sum\_{k}^{\oplus} a\_{ik} \otimes b\_{kj}\right) = \left(\max\_{k}(a\_{ik} + b\_{kj})\right) \\ \alpha \otimes A &= (\alpha \otimes a\_{ij}) \end{aligned}$$

**Example 1.**

$$\begin{pmatrix} 3 & 1 & 5 \\ 2 & 1 & 5 \end{pmatrix} \oplus \begin{pmatrix} -1 & 0 & 2 \\ 6 & -5 & 4 \end{pmatrix} = \begin{pmatrix} 3 & 1 & 5 \\ 6 & 1 & 5 \end{pmatrix}$$

**Example 2.**

$$\begin{pmatrix} -4 & 1 & -5 \\ 3 & 0 & 8 \end{pmatrix} \otimes \begin{pmatrix} -1 & 2 \\ 1 & 7 \\ 3 & 1 \end{pmatrix} = \begin{pmatrix} (-4 + (-1)) \oplus (1 + 1) \oplus (-5 + 3) & (-4 + 2) \oplus (1 + 7) \oplus (-5 + 1) \\ (3 + (-1)) \oplus (0 + 1) \oplus (8 + 3) & (3 + 2) \oplus (0 + 7) \oplus (8 + 1) \end{pmatrix} = \begin{pmatrix} 2 & 8 \\ 11 & 9 \end{pmatrix}$$

**Example 3.**

$$10 \otimes \begin{pmatrix} 7 & -3 & 2 \\ 6 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 17 & 7 & 12 \\ 16 & 11 & 10 \end{pmatrix}$$
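A sketch of the matrix extensions of (⊕, ⊗), reproducing Examples 1 and 2 above (the function names are my own shorthand; entries assumed finite):

```python
def mp_add(A, B):
    """A ⊕ B: entrywise maximum."""
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mp_mul(A, B):
    """A ⊗ B: (A ⊗ B)_ij = max_k (a_ik + b_kj)."""
    return [[max(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mp_scale(alpha, A):
    """α ⊗ A: add α to every entry."""
    return [[alpha + a for a in row] for row in A]

# Example 1:
print(mp_add([[3, 1, 5], [2, 1, 5]], [[-1, 0, 2], [6, -5, 4]]))  # [[3, 1, 5], [6, 1, 5]]
# Example 2:
print(mp_mul([[-4, 1, -5], [3, 0, 8]], [[-1, 2], [1, 7], [3, 1]]))  # [[2, 8], [11, 9]]
```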

#### **Proposition 3.**

For matrices *A*, *B*, *C* of compatible sizes, the following properties hold:

$$A \oplus B = B \oplus A$$

$$A \oplus (B \oplus \mathbb{C}) = (A \oplus B) \oplus \mathbb{C}$$

$$A \otimes (B \otimes \mathbb{C}) = (A \otimes B) \otimes \mathbb{C}$$

$$A \otimes (B \oplus \mathbb{C}) = A \otimes B \oplus A \otimes \mathbb{C}$$

$$(A \oplus B) \otimes \mathbb{C} = A \otimes \mathbb{C} \oplus B \otimes \mathbb{C}$$

*Proof.*

The statements follow from the definitions.

#### **Proposition 4.**

The following hold for *A*, *B*, *C*, *a*, *b*, *c*, *x*, *y* of compatible sizes and *α*, *β* ∈ **R**:

$$\begin{aligned} A \otimes (\alpha \otimes B) &= \alpha \otimes (A \otimes B) \\ \alpha \otimes (A \oplus B) &= \alpha \otimes A \oplus \alpha \otimes B \\ (\alpha \oplus \beta) \otimes A &= \alpha \otimes A \oplus \beta \otimes A \\ x^T \otimes \alpha \otimes y &= \alpha \otimes x^T \otimes y \\ a \le b &\implies c^T \otimes a \le c^T \otimes b \\ A \le B &\implies A \oplus C \le B \oplus C \\ A \le B &\implies A \otimes C \le B \otimes C \\ A \le B &\iff A \oplus B = B \end{aligned}$$

*Proof.* The statements follow from the definition of the pair of operations (⊕,⊗).

**Definition 3.** Given real numbers *a*, *b*, *c*, . . . , a max-algebraic *diagonal matrix* is defined as:

$$\text{diag}(a, b, c, \dots) = \begin{pmatrix} a & -\infty & -\infty & \cdots \\ -\infty & b & -\infty & \cdots \\ -\infty & -\infty & c & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

Given a vector *d* = (*d*1, *d*2,..., *dn*), the *diagonal of the vector d* is denoted as diag(*d*) = diag(*d*1, *d*2,..., *dn*).

**Definition 4.** Max-algebraic *identity matrix* is a diagonal matrix with all diagonal entries zero. We denote by *I* an identity matrix. Therefore, *identity matrix I* = diag(0, 0, 0, . . .).

It is obvious that *A* ⊗ *I* = *I* ⊗ *A* = *A* for any matrix *A* and identity matrices *I* of compatible sizes.

**Definition 5.** Any matrix that can be obtained from the identity matrix, *I*, by permuting its rows and/or columns is called a *permutation matrix*. A matrix arising as a product of a diagonal matrix and a permutation matrix is called a *generalised permutation matrix* [12].

**Definition 6.** A matrix *<sup>A</sup>* <sup>∈</sup> **<sup>R</sup>***n*×*<sup>n</sup>* is *invertible* if there exists a matrix *<sup>B</sup>* <sup>∈</sup> **<sup>R</sup>***n*×*<sup>n</sup>* , such that *A* ⊗ *B* = *B* ⊗ *A* = *I*. The matrix *B* is unique and will be called the *inverse* of *A*. We will henceforth denote *B* by *A*−1.

It has been shown in [1] that a matrix is *invertible* if and only if it is a generalised permutation matrix. If *x* = (*x*1,..., *xn*) we will denote by *x*−<sup>1</sup> the vector (−*x*1,..., −*xn*), that is, *x*−<sup>1</sup> = −*x* in conventional notation.

#### **Example 4.**


Consider the following matrices

$$A = \begin{pmatrix} -\infty & -\infty & 3\\ 5 & -\infty & -\infty \\ -\infty & 8 & -\infty \end{pmatrix} \text{ and } B = \begin{pmatrix} -\infty & -5 & -\infty \\ -\infty & -\infty & -8 \\ -3 & -\infty & -\infty \end{pmatrix}.$$

The matrix *B* is an inverse of *A* because,

$$A \otimes B = \begin{pmatrix} -\infty & -\infty & 3 \\ 5 & -\infty & -\infty \\ -\infty & 8 & -\infty \end{pmatrix} \otimes \begin{pmatrix} -\infty & -5 & -\infty \\ -\infty & -\infty & -8 \\ -3 & -\infty & -\infty \end{pmatrix} = \begin{pmatrix} 0 & -\infty & -\infty \\ -\infty & 0 & -\infty \\ -\infty & -\infty & 0 \end{pmatrix}$$
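The inverse in Example 4 can also be checked numerically. In the sketch below (names mine) `float('-inf')` stands for −∞, and both products A ⊗ B and B ⊗ A are compared with the max-plus identity matrix:

```python
NEG_INF = float("-inf")  # stands for the -infinity entries

def mp_mul(A, B):
    """Max-plus product: (A ⊗ B)_ij = max_k (a_ik + b_kj)."""
    return [[max(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[NEG_INF, NEG_INF, 3],
     [5, NEG_INF, NEG_INF],
     [NEG_INF, 8, NEG_INF]]
B = [[NEG_INF, -5, NEG_INF],
     [NEG_INF, NEG_INF, -8],
     [-3, NEG_INF, NEG_INF]]

# Max-plus identity: 0 on the diagonal, -inf elsewhere.
I = [[0 if i == j else NEG_INF for j in range(3)] for i in range(3)]
assert mp_mul(A, B) == I and mp_mul(B, A) == I
print("B is the max-plus inverse of A")
```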

Given a matrix *<sup>A</sup>* = (*aij*) <sup>∈</sup> **<sup>R</sup>**, the *transpose* of *<sup>A</sup>* will be denoted by *<sup>A</sup>T*, that is *<sup>A</sup><sup>T</sup>* = (*aji*). Structures of discrete-event dynamic systems may be represented by square matrices *A* over the semiring:

$$\overline{\mathbb{R}} = (\{ -\infty \} \cup \mathbb{R}, \oplus, \otimes) = (\{ -\infty \} \cup \mathbb{R}, \max, +)$$
The system $\overline{\mathbb{R}}$ is embedded in the self-dual system:

$$\overline{\overline{\mathbb{R}}} = (\{-\infty\} \cup \mathbb{R} \cup \{+\infty\}, \oplus, \otimes, \oplus^{'}, \otimes^{'}) = (\{-\infty\} \cup \mathbb{R} \cup \{+\infty\}, \max, +, \min, +)$$

Basic algebraic properties for ⊕′ and ⊗′ are similar to those of ⊕ and ⊗ described earlier; they are obtained by swapping ≤ and ≥. The extension of the pair (⊕′, ⊗′) to matrices and vectors is as follows:

Given *A*, *B* of compatible sizes and *α* ∈ **R**, we define the following:

$$\begin{aligned} A \oplus^{'} B &= (a\_{ij} \oplus^{'} b\_{ij}) \\ A \otimes^{'} B &= \left( \sum\_{k}^{\oplus^{'}} a\_{ik} \otimes^{'} b\_{kj} \right) = \min\_{k} (a\_{ik} + b\_{kj}) \\ \alpha \otimes^{'} A &= (\alpha \otimes^{'} a\_{ij}) \end{aligned}$$

Also, properties of matrices for the pair (⊕′, ⊗′) are similar to those of (⊕, ⊗); just swap ≤ and ≥. For any matrix *A* = [*aij*] over **R**, the *conjugate* matrix is *A*<sup>∗</sup> = [−*aji*], obtained by negation and transposition; that is, *A*<sup>∗</sup> = −*A<sup>T</sup>*.

**Proposition 5.** The following relations hold for any matrices *U*, *V*, *W* over **R** .

$$(\mathcal{U}\otimes^{'}V)\otimes\mathcal{W}\leq\mathcal{U}\otimes^{'}(V\otimes\mathcal{W})\tag{1}$$

$$U \otimes (U^\* \otimes^{'} W) \le W \tag{2}$$

$$U \otimes \left(U^\* \otimes^{'} (U \otimes W)\right) = U \otimes W \tag{3}$$

*Proof.* Follows from the definitions.
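The relations of Proposition 5 can be tested on a concrete example. A sketch assuming finite matrices (names mine): `mp_mul` is ⊗, `minp_mul` is ⊗′, and `conj` is the conjugate A∗ = −Aᵀ:

```python
def mp_mul(A, B):      # (A ⊗ B)_ij = max_k (a_ik + b_kj)
    return [[max(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def minp_mul(A, B):    # (A ⊗' B)_ij = min_k (a_ik + b_kj)
    return [[min(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj(A):           # A* = -Aᵀ: negation and transposition
    return [[-A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def leq(A, B):         # entrywise comparison A <= B
    return all(a <= b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

U = [[0, 2], [3, -1]]
W = [[1, 0], [-2, 4]]
UW = mp_mul(U, W)
assert leq(mp_mul(U, minp_mul(conj(U), W)), W)   # relation (2)
assert mp_mul(U, minp_mul(conj(U), UW)) == UW    # relation (3)
print("Proposition 5 relations hold on this example")
```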

#### **3. The Multiprocessor Interactive System (MPIS): A practical application**

Linear equations and inequalities in max-algebra have a considerable number of applications; the model we present here is called the *multiprocessor interactive system (MPIS)*, which is formulated as follows:

Products *P*1,..., *Pm* are prepared using *n* processors, every processor contributing to the completion of each product by producing a partial product. It is assumed that every processor can work on all products simultaneously and that all these actions on a processor start as soon as the processor is ready to work. Let *aij* be the duration of the work of the *j*th processor needed to complete the partial product for *Pi* (*i* = 1, . . . , *m*; *j* = 1, . . . , *n*). Let us denote by *xj* the starting time of the *j*th processor (*j* = 1, . . . , *n*). Then, all partial products for *Pi* (*i* = 1, . . . , *m*) will be ready at time max(*ai*<sup>1</sup> + *x*1,..., *ain* + *xn*). If the completion times *b*1,..., *bm* are given for each product, then the starting times have to satisfy the following system of equations:

$$\max(a\_{i1} + x\_1, \dots, a\_{in} + x\_n) = b\_i \text{ for all } i \in M$$

Using the notation *a* ⊕ *b* = *max*(*a*, *b*) and *a* ⊗ *b* = *a* + *b* for *a*, *b* ∈ **R** extended to matrices and vectors in the same way as in linear algebra, then this system can be written as

$$A \otimes x = b \tag{4}$$

Any system of the form (4) is called 'one-sided max-linear system'. Also, if the requirement is that each product is to be produced on or before the completion times *b*1,..., *bm*, then the starting times have to satisfy

$$\max(a\_{i1} + x\_1, \dots, a\_{in} + x\_n) \le b\_i \text{ for all } i \in M$$

which can also be written as

$$A \otimes x \le b \tag{5}$$

The system of inequalities (5) is called 'one-sided max-linear system of inequalities'.

#### **4. Linear equations and inequalities in max-algebra**

In this section we will present a system of linear equations and inequalities in max-algebra. Solvability conditions for linear systems and inequalities will each be presented. A system consisting of max-linear equations and inequalities will also be discussed, and necessary and sufficient conditions for the solvability of this system will be presented.

#### **4.1. System of equations**

In this section we present a solution method for the system *A* ⊗ *x* = *b* as given in [1, 3, 13] and also in the monograph [10]. Results concerning the existence and uniqueness of a solution to the system will also be presented.

$$\text{Given } A = (a\_{ij}) \in \overline{\mathbb{R}}^{m \times n} \text{ and } b = (b\_1, \dots, b\_m)^T \in \overline{\mathbb{R}}^m \text{, a system of the form}$$

$$A \otimes \mathbf{x} = b \tag{6}$$

is called a *one-sided max-linear system*; sometimes we may omit 'max-linear' and say one-sided system. This system can be written using conventional notation as follows:

$$\max\_{j=1,\dots,n} (a\_{ij} + x\_j) = b\_i, \; i \in M \tag{7}$$

The system in (7) can be written, after subtracting the right-hand side constants, as

$$\max\_{j=1,\dots,n} (a\_{ij}\otimes b\_i^{-1} + \mathfrak{x}\_j) = \mathbf{0}, \; i \in M$$

A one-sided max-linear system all of whose right-hand side constants are zero is called a *normalised max-linear system*, or just *normalised*, and the process of subtracting the right-hand side constants is called *normalisation*. Equivalently, *normalisation* is the process of multiplying the system (6) by the matrix *B*′ from the left. That is

$$B' \otimes A \otimes x = B' \otimes b = 0$$

where,


$$B' = \text{diag}(b\_1^{-1}, b\_2^{-1}, \dots, b\_m^{-1}) = \text{diag}(b^{-1})$$

For instance, consider the following one-sided system:

$$\begin{pmatrix} -2 & 1 & 3 \\ 3 & 0 & 2 \\ 1 & 2 & 1 \end{pmatrix} \otimes \begin{pmatrix} x\_1 \\ x\_2 \\ x\_3 \end{pmatrix} = \begin{pmatrix} 5 \\ 6 \\ 3 \end{pmatrix} \tag{8}$$

After normalisation, this system is equivalent to

$$\begin{pmatrix} -7 & -4 & -2 \\ -3 & -6 & -4 \\ -2 & -1 & -2 \end{pmatrix} \otimes \begin{pmatrix} x\_1 \\ x\_2 \\ x\_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

That is, after multiplying the system (8) from the left by

$$B' = \begin{pmatrix} -5 & -\infty & -\infty \\ -\infty & -6 & -\infty \\ -\infty & -\infty & -3 \end{pmatrix} = \text{diag}(-5, -6, -3).$$

Consider the first equation of the normalised system above, that is, *max*(*x*<sup>1</sup> − 7, *x*<sup>2</sup> − 4, *x*<sup>3</sup> − <sup>2</sup>) = 0. This means that if (*x*1, *<sup>x</sup>*2, *<sup>x</sup>*3)*<sup>T</sup>* is a solution to this system then *<sup>x</sup>*<sup>1</sup> <sup>≤</sup> 7, *x*<sup>2</sup> <sup>≤</sup> 4, *x*<sup>3</sup> ≤ 2, and at least one of these inequalities is satisfied with equality. From the other equations of the system we obtain *x*<sup>1</sup> ≤ 3 and *x*<sup>1</sup> ≤ 2, hence *x*<sup>1</sup> ≤ *min*(7, 3, 2) = −*max*(−7, −3, −2) = *x*¯1, where −*x*¯1 is the column 1 maximum. It is clear that *xj* ≤ *x*¯*j* for all *j*, where −*x*¯*<sup>j</sup>* is the column *j* maximum. At the same time, equality must be attained in some of these inequalities so that in every row there is at least one column maximum which is attained by *xj*. This observation was made in [3].
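The normalisation walk-through above can be sketched in code (names mine; finite entries assumed): multiplying by *B*′ = diag(*b*−<sup>1</sup>) subtracts *bi* from row *i*, and negating the column maxima of the normalised matrix yields *x*¯:

```python
def normalise(A, b):
    """Left-multiply A ⊗ x = b by diag(-b): entry (i, j) becomes a_ij - b_i."""
    return [[a - bi for a in row] for row, bi in zip(A, b)]

A = [[-2, 1, 3], [3, 0, 2], [1, 2, 1]]
b = [5, 6, 3]
N = normalise(A, b)
print(N)  # [[-7, -4, -2], [-3, -6, -4], [-2, -1, -2]]

# Greatest solution: x̄_j is the negated column-j maximum of the normalised matrix.
x_bar = [-max(N[i][j] for i in range(3)) for j in range(3)]
print(x_bar)  # [2, 1, 2]
```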

**Definition 7.** A matrix *A* is called *doubly* **R***-astic* [14, 15], if it has at least one finite element on each row and on each column.


We introduce the following notations

$$\begin{aligned} S(A, b) &= \{ x \in \overline{\mathbb{R}}^n ; A \otimes x = b \} \\ M\_j &= \{ k \in M ; b\_k \otimes a\_{kj}^{-1} = \min\_{i \in M} (b\_i \otimes a\_{ij}^{-1}) \} \text{ for all } j \in N \\ \bar{x}(A, b)\_j &= \min\_{i \in M} (b\_i \otimes a\_{ij}^{-1}) \text{ for all } j \in N \end{aligned}$$

We now consider the cases when $A = -\infty$ and/or $b = -\infty$. Suppose that $b = -\infty$. Then $S(A, b)$ can simply be written as

$$S(A, b) = \{x \in \overline{\mathbb{R}}^n ;\ x_j = -\infty \text{ if } A_j \neq -\infty,\ j \in N\}$$

where $A_j$ denotes the $j$-th column of $A$. Therefore if $A = -\infty$ we have $S(A, b) = \overline{\mathbb{R}}^n$. Now, if $A = -\infty$ and $b \neq -\infty$ then $S(A, b) = \emptyset$. Thus, we may assume in this section that $A \neq -\infty$ and $b \neq -\infty$. If $b_k = -\infty$ for some $k \in M$ then for any $x \in S(A, b)$ we have $x_j = -\infty$ whenever $a_{kj} \neq -\infty$, $j \in N$. As a result, the $k$-th equation can be removed from the system, together with every column $j$ of $A$ for which $a_{kj} \neq -\infty$ (if any), setting the corresponding $x_j = -\infty$. Consequently, we may assume without loss of generality that $b \in \mathbb{R}^m$.

Moreover, if $b \in \mathbb{R}^m$ and $A$ has a $-\infty$ row then $S(A, b) = \emptyset$. If there is a $-\infty$ column $j$ in $A$ then $x_j$ may take on any value in a solution $x$. Thus, in what follows we assume without loss of generality that $A$ is doubly **R**-astic and $b \in \mathbb{R}^m$.

**Theorem 1.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{m \times n}$ be doubly **R**-astic and $b \in \mathbb{R}^m$. Then $x \in S(A, b)$ if and only if

$$\begin{aligned} \text{i) } & x \le \bar{x}(A, b) \text{ and} \\ \text{ii) } & \bigcup_{j \in N_x} M_j = M \text{ where } N_x = \{j \in N ;\ x_j = \bar{x}(A, b)_j\} \end{aligned}$$

*Proof.* Suppose *x* ∈ *S*(*A*, *b*). Thus we have,

$$\begin{aligned} A \otimes x = b &\iff \max_j(a_{ij} + x_j) = b_i \text{ for all } i \in M \\ &\iff a_{ij} + x_j \le b_i \text{ for all } i \in M,\ j \in N, \text{ with equality for some } j \in N \\ &\implies x_j \le b_i \otimes a_{ij}^{-1} \text{ for all } i \in M,\ j \in N \\ &\iff x_j \le \min_{i \in M}(b_i \otimes a_{ij}^{-1}) \text{ for all } j \in N \end{aligned}$$

Hence, *x* ≤ *x*¯ .

Now let $x \in S(A, b)$. Since $M_j \subseteq M$ we only need to show that $M \subseteq \bigcup_{j \in N_x} M_j$. Let $k \in M$. Since $b_k = a_{kj} \otimes x_j > -\infty$ for some $j \in N$, and $x_j^{-1} \ge \bar{x}_j^{-1} \ge a_{ij} \otimes b_i^{-1}$ for every $i \in M$, we have $x_j^{-1} = a_{kj} \otimes b_k^{-1} = \max_{i \in M} a_{ij} \otimes b_i^{-1}$. Hence $k \in M_j$ and $x_j = \bar{x}_j$.

Suppose that $x \le \bar{x}$ and $\bigcup_{j \in N_x} M_j = M$. Let $k \in M$, $j \in N$. Then $a_{kj} \otimes x_j \le b_k$ if $a_{kj} = -\infty$. If $a_{kj} \neq -\infty$ then

$$a_{kj} \otimes x_j \le a_{kj} \otimes \bar{x}_j \le a_{kj} \otimes b_k \otimes a_{kj}^{-1} = b_k \tag{9}$$

Therefore $A \otimes x \le b$. At the same time, $k \in M_j$ for some $j \in N$ satisfying $x_j = \bar{x}_j$. For this $j$ both inequalities in (9) are equalities, and thus $A \otimes x = b$.

The following is a summary of prerequisites proved in [1] and [12]:

**Theorem 2.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{m \times n}$ be doubly **R**-astic and $b \in \mathbb{R}^m$. The system $A \otimes x = b$ has a solution if and only if $\bar{x}(A, b)$ is a solution.

*Proof.* Follows from Theorem 1.
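Theorem 2 gives an effective solvability test: compute the principal candidate $\bar{x}(A, b)$ and substitute it back into the system. A small Python sketch (the function names are my own, all entries assumed finite):

```python
def principal_solution(A, b):
    """x̄(A, b)_j = min_i (b_i - a_ij), assuming all entries of A are finite."""
    n = len(A[0])
    return [min(bi - row[j] for row, bi in zip(A, b)) for j in range(n)]

def max_prod(A, x):
    """Max-algebra product: (A ⊗ x)_i = max_j (a_ij + x_j)."""
    return [max(a + xj for a, xj in zip(row, x)) for row in A]

def solvable(A, b):
    """Theorem 2: A ⊗ x = b has a solution iff x̄(A, b) is one."""
    return max_prod(A, principal_solution(A, b)) == b

A = [[0, 3], [2, 1]]
x_bar = principal_solution(A, [4, 3])
```

For this small example $\bar{x} = (1, 1)^T$ is itself a solution, so the system is solvable; changing the right-hand side to $(4, 0)^T$ makes the same test fail.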

The vector $\bar{x}(A, b)$ has played an important role in the solution of $A \otimes x = b$. It is called the *principal solution* to $A \otimes x = b$ [1], and we will call it likewise. The principal solution will also be used when studying the system $A \otimes x \le b$ and when solving the one-sided system containing both equations and inequalities. The one-sided systems containing both equations and inequalities have been studied in [16], and the result will be presented later in this chapter.

Note that the principal solution may not be a solution to the system *A* ⊗ *x* = *b*. More precisely, the following are observed in [12]:

**Corollary 1.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{m \times n}$ be doubly **R**-astic and $b \in \mathbb{R}^m$. Then the following three statements are equivalent:

$$\begin{aligned} \text{i) } & S(A, b) \neq \emptyset \\ \text{ii) } & \bar{x}(A, b) \in S(A, b) \\ \text{iii) } & \bigcup_{j \in N} M_j = M \end{aligned}$$

*Proof.* The statements follow from Theorems 1 and 2.

For the existence of a unique solution to the max-linear system *A* ⊗ *x* = *b* we have the following corollary:

**Corollary 2.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{m \times n}$ be doubly **R**-astic and $b \in \mathbb{R}^m$. Then $S(A, b) = \{\bar{x}(A, b)\}$ if and only if

$$\begin{aligned} \text{i) } & \bigcup_{j \in N} M_j = M \text{ and} \\ \text{ii) } & \bigcup_{j \in N'} M_j \neq M \text{ for any } N' \subseteq N,\ N' \neq N \end{aligned}$$

*Proof.* Follows from Theorem 1.

The question of solvability and unique solvability of the system *A* ⊗ *x* = *b* was linked to the set covering and minimal set covering problem of combinatorics in [12].
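The set-covering view of Corollary 2 is straightforward to implement. The sketch below (helper names are hypothetical, entries assumed finite) builds the sets $M_j$ of rows where the column-$j$ minimum is attained and tests whether they form a cover that no proper subfamily achieves:

```python
def critical_sets(A, b):
    """M_j = {i : b_i - a_ij = x̄_j}, the rows where column j is 'active'."""
    m, n = len(A), len(A[0])
    x_bar = [min(b[i] - A[i][j] for i in range(m)) for j in range(n)]
    return [{i for i in range(m) if b[i] - A[i][j] == x_bar[j]}
            for j in range(n)]

def unique_solution(A, b):
    """Corollary 2: the M_j together cover M, but no proper subfamily does.

    Since unions are monotone, it suffices to check that dropping any
    single M_j breaks the cover.
    """
    M = set(range(len(A)))
    Ms = critical_sets(A, b)
    covers = set().union(*Ms) == M
    minimal = all(set().union(*(Ms[:j] + Ms[j + 1:])) != M
                  for j in range(len(Ms)))
    return covers and minimal
```

For `A = [[0, 3], [2, 1]]`, `b = [4, 3]` each row is covered by exactly one column, so the solution is unique; for the all-zero system both columns cover both rows and uniqueness fails.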

#### **4.2. System of inequalities**

In this section we show how a solution to the one-sided system of inequalities can be obtained. Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{m \times n}$ and $b = (b_1, \ldots, b_m)^T \in \mathbb{R}^m$. A system of the form:

$$A \otimes x \le b \tag{10}$$

is called a *one-sided max-linear system of inequalities*, or just a *one-sided system of inequalities*. The one-sided system of inequalities has received some attention in the past; see [1, 3] and [17] for more information. Here we will only present a result which shows that the principal solution $\bar{x}(A, b)$ is the greatest solution to (10). That is, if (10) has a solution then $\bar{x}(A, b)$ is the greatest of all the solutions. We denote the solution set of (10) by $S(A, b, \le)$. That is,

$$S(A, b, \le) = \{x \in \mathbb{R}^n ;\ A \otimes x \le b\}$$

**Theorem 3.** *x* ∈ *S*(*A*, *b*, ≤) if and only if *x* ≤ *x*¯(*A*, *b*).

*Proof.* Suppose *x* ∈ *S*(*A*, *b*, ≤). Then we have

$$\begin{aligned} A \otimes x \le b &\iff \max_j(a_{ij} + x_j) \le b_i \text{ for all } i \\ &\iff a_{ij} + x_j \le b_i \text{ for all } i, j \\ &\iff x_j \le b_i \otimes a_{ij}^{-1} \text{ for all } i, j \\ &\iff x_j \le \min_i(b_i \otimes a_{ij}^{-1}) \text{ for all } j \\ &\iff x \le \bar{x}(A, b) \end{aligned}$$

and the proof is now complete.

The system of inequalities

$$\begin{aligned} A \otimes x &\le b \\ C \otimes x &\ge d \end{aligned} \tag{11}$$

was discussed in [18] where the following result was presented.

**Lemma 1.** The system of inequalities (11) has a solution if and only if $C \otimes \bar{x}(A, b) \ge d$.
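Lemma 1 reduces feasibility of the pair (11) to a single evaluation at the principal solution: since $\bar{x}(A, b)$ is the greatest solution of $A \otimes x \le b$ and $C \otimes {}$ is isotone, it suffices to test it. A Python sketch (names hypothetical, all entries assumed finite):

```python
def principal_solution(A, b):
    # x̄(A, b)_j = min_i (b_i - a_ij): the greatest solution of A ⊗ x ≤ b.
    n = len(A[0])
    return [min(bi - row[j] for row, bi in zip(A, b)) for j in range(n)]

def max_prod(M, x):
    # (M ⊗ x)_i = max_j (m_ij + x_j)
    return [max(m + xj for m, xj in zip(row, x)) for row in M]

def pair_feasible(A, b, C, d):
    """Lemma 1: A ⊗ x ≤ b, C ⊗ x ≥ d is solvable iff C ⊗ x̄(A, b) ≥ d."""
    y = max_prod(C, principal_solution(A, b))
    return all(yi >= di for yi, di in zip(y, d))
```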

#### **4.3. A system containing both equations and inequalities**

In this section a system containing both equations and inequalities will be presented; the results are taken from [16]. Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{k \times n}$, $C = (c_{ij}) \in \overline{\mathbb{R}}^{r \times n}$, $b = (b_1, \ldots, b_k)^T \in \mathbb{R}^k$ and $d = (d_1, \ldots, d_r)^T \in \mathbb{R}^r$. A *one-sided max-linear system with both equations and inequalities* is of the form:

$$\begin{aligned} A \otimes x &= b \\ C \otimes x &\le d \end{aligned} \tag{12}$$

We shall use the following notation throughout this chapter:

$$\begin{aligned} K &= \{1, \ldots, k\} \\ R &= \{1, 2, \ldots, r\} \\ S(A, C, b, d) &= \{x \in \mathbb{R}^n ;\ A \otimes x = b \text{ and } C \otimes x \le d\} \\ S(C, d, \le) &= \{x \in \mathbb{R}^n ;\ C \otimes x \le d\} \\ \bar{x}_j(A, b) &= \min_{i \in K}(b_i \otimes a_{ij}^{-1}) \text{ for all } j \in N \\ \bar{x}_j(C, d) &= \min_{i \in R}(d_i \otimes c_{ij}^{-1}) \text{ for all } j \in N \\ K_j &= \{k \in K ;\ b_k \otimes a_{kj}^{-1} = \min_{i \in K}(b_i \otimes a_{ij}^{-1})\} \text{ for all } j \in N \\ \bar{x} &= (\bar{x}_1, \ldots, \bar{x}_n)^T \\ J &= \{j \in N ;\ \bar{x}_j(C, d) \ge \bar{x}_j(A, b)\} \\ L &= N \setminus J \end{aligned}$$

We also define the vector $\hat{x} = (\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n)^T$, where

$$\hat{x}_j(A, C, b, d) = \begin{cases} \bar{x}_j(A, b) & \text{if } j \in J \\ \bar{x}_j(C, d) & \text{if } j \in L \end{cases} \tag{13}$$

and $N_{\hat{x}} = \{j \in N ;\ \hat{x}_j = \bar{x}_j\}$.
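Definition (13) simply picks, component by component, the smaller of the two principal solutions $\bar{x}(A, b)$ and $\bar{x}(C, d)$. A sketch (names hypothetical, finite entries assumed):

```python
def principal_solution(M, v):
    # x̄_j = min_i (v_i - m_ij)
    n = len(M[0])
    return [min(vi - row[j] for row, vi in zip(M, v)) for j in range(n)]

def x_hat(A, b, C, d):
    """(13): x̂_j = x̄_j(A, b) on J (where x̄_j(C, d) ≥ x̄_j(A, b)),
    and x̂_j = x̄_j(C, d) on L = N \\ J, i.e. the componentwise minimum."""
    xa = principal_solution(A, b)
    xc = principal_solution(C, d)
    return [min(p, q) for p, q in zip(xa, xc)]
```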

**Theorem 4.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{k \times n}$, $C = (c_{ij}) \in \overline{\mathbb{R}}^{r \times n}$, $b = (b_1, \ldots, b_k)^T \in \mathbb{R}^k$ and $d = (d_1, \ldots, d_r)^T \in \mathbb{R}^r$. Then the following three statements are equivalent:

$$\begin{aligned} \text{(i) } & S(A, C, b, d) \neq \emptyset \\ \text{(ii) } & \hat{x}(A, C, b, d) \in S(A, C, b, d) \\ \text{(iii) } & \bigcup_{j \in J} K_j = K \end{aligned}$$

*Proof.* $(i) \Longrightarrow (ii)$. Let $x \in S(A, C, b, d)$; therefore $x \in S(A, b)$ and $x \in S(C, d, \le)$. Since $x \in S(C, d, \le)$, it follows from Theorem 3 that $x \le \bar{x}(C, d)$. Now that $x \in S(A, b)$ and also $x \in S(C, d, \le)$, we need to show that $\bar{x}_j(C, d) \ge \bar{x}_j(A, b)$ for all $j \in N_x$ (that is, $N_x \subseteq J$). Let $j \in N_x$; then $x_j = \bar{x}_j(A, b)$. Since $x \in S(C, d, \le)$ we have $x \le \bar{x}(C, d)$ and therefore $\bar{x}_j(A, b) \le \bar{x}_j(C, d)$, thus $j \in J$. Hence $N_x \subseteq J$, and by Theorem 1 $\bigcup_{j \in J} K_j = K$. This also proves $(i) \Longrightarrow (iii)$.

$(iii) \Longrightarrow (i)$. Suppose $\bigcup_{j \in J} K_j = K$. Since $\hat{x}(A, C, b, d) \le \bar{x}(C, d)$ we have $\hat{x}(A, C, b, d) \in S(C, d, \le)$. Also $\hat{x}(A, C, b, d) \le \bar{x}(A, b)$, and $N_{\hat{x}} \supseteq J$ gives $\bigcup_{j \in N_{\hat{x}}(A, C, b, d)} K_j \supseteq \bigcup_{j \in J} K_j = K$. Hence $\bigcup_{j \in N_{\hat{x}}(A, C, b, d)} K_j = K$, therefore $\hat{x}(A, C, b, d) \in S(A, b)$ and $\hat{x}(A, C, b, d) \in S(C, d, \le)$. Hence $\hat{x}(A, C, b, d) \in S(A, C, b, d)$ (that is, $S(A, C, b, d) \neq \emptyset$), and this also proves $(iii) \Longrightarrow (ii)$.

**Theorem 5.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{k \times n}$, $C = (c_{ij}) \in \overline{\mathbb{R}}^{r \times n}$, $b = (b_1, \ldots, b_k)^T \in \mathbb{R}^k$ and $d = (d_1, \ldots, d_r)^T \in \mathbb{R}^r$. Then $x \in S(A, C, b, d)$ if and only if

> (i) $x \le \hat{x}(A, C, b, d)$ and (ii) $\bigcup_{j \in N_x} K_j = K$, where $N_x = \{j \in N ;\ x_j = \bar{x}_j(A, b)\}$

*Proof.* $(\Longrightarrow)$ Let $x \in S(A, C, b, d)$; then $x \le \bar{x}(A, b)$ and $x \le \bar{x}(C, d)$. Since $\hat{x}(A, C, b, d) = \bar{x}(A, b) \oplus' \bar{x}(C, d)$ (the componentwise minimum), we have $x \le \hat{x}(A, C, b, d)$. Also, $x \in S(A, C, b, d)$ implies that $x \in S(A, b)$, and it follows from Theorem 1 that $\bigcup_{j \in N_x} K_j = K$.

$(\Longleftarrow)$ Suppose that $x \le \hat{x}(A, C, b, d) = \bar{x}(A, b) \oplus' \bar{x}(C, d)$ and $\bigcup_{j \in N_x} K_j = K$. It follows from Theorem 1 that $x \in S(A, b)$, and by Theorem 3 $x \in S(C, d, \le)$. Thus $x \in S(A, b) \cap S(C, d, \le) = S(A, C, b, d)$.

We introduce the symbol |*X*| which stands for the number of elements of the set *X*.

**Lemma 2.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{k \times n}$, $C = (c_{ij}) \in \overline{\mathbb{R}}^{r \times n}$, $b = (b_1, \ldots, b_k)^T \in \mathbb{R}^k$ and $d = (d_1, \ldots, d_r)^T \in \mathbb{R}^r$. If $|S(A, C, b, d)| = 1$ then $|S(A, b)| = 1$.

*Proof.* Suppose $|S(A, C, b, d)| = 1$, that is, $S(A, C, b, d) = \{x\}$ for some $x \in \mathbb{R}^n$. Since $S(A, C, b, d) = \{x\}$ we have $x \in S(A, b)$ and thus $S(A, b) \neq \emptyset$. For contradiction, suppose $|S(A, b)| > 1$. We need to check the following two cases: (i) $L \neq \emptyset$ and (ii) $L = \emptyset$, where $L = N \setminus J$, and show in each case that $|S(A, C, b, d)| > 1$.

**Proof of Case (i)**, that is, $L \neq \emptyset$: Suppose that $L$ contains only one element, say $n \in N$, i.e. $L = \{n\}$. Since $x \in S(A, C, b, d)$, it follows from Theorem 4 that $\hat{x}(A, C, b, d) \in S(A, C, b, d)$, that is, $x = \hat{x}(A, C, b, d) = (\bar{x}_1(A, b), \bar{x}_2(A, b), \ldots, \bar{x}_{n-1}(A, b), \bar{x}_n(C, d))^T \in S(A, C, b, d)$. It can also be seen that $\bar{x}_n(C, d) < \bar{x}_n(A, b)$, and any vector of the form $z = (\bar{x}_1(A, b), \bar{x}_2(A, b), \ldots, \bar{x}_{n-1}(A, b), \alpha)^T$ with $\alpha \le \bar{x}_n(C, d)$ belongs to $S(A, C, b, d)$. Hence $|S(A, C, b, d)| > 1$. If $L$ contains more than one element, the proof is done in a similar way.

**Proof of Case (ii)**, that is, $L = \emptyset$ ($J = N$): Suppose that $J = N$. Then we have $\hat{x}(A, C, b, d) = \bar{x}(A, b) \le \bar{x}(C, d)$. Since $|S(A, b)| > 1$, take $x, x' \in S(A, b)$ with $x \neq x'$. Then $x \le \bar{x}(A, b) \le \bar{x}(C, d)$ and also $x' \le \bar{x}(A, b) \le \bar{x}(C, d)$. Thus $x, x' \in S(C, d, \le)$. Consequently, $x, x' \in S(A, C, b, d)$ and $x \neq x'$. Hence $|S(A, C, b, d)| > 1$.

**Theorem 6.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{k \times n}$, $C = (c_{ij}) \in \overline{\mathbb{R}}^{r \times n}$, $b = (b_1, \ldots, b_k)^T \in \mathbb{R}^k$ and $d = (d_1, \ldots, d_r)^T \in \mathbb{R}^r$. If $|S(A, C, b, d)| = 1$ then $J = N$.

*Proof.* Suppose $|S(A, C, b, d)| = 1$. It follows from Theorem 4 that $\bigcup_{j \in J} K_j = K$. Also, $|S(A, C, b, d)| = 1$ implies that $|S(A, b)| = 1$ (Lemma 2). Moreover, $|S(A, b)| = 1$ implies that $\bigcup_{j \in N} K_j = K$ and $\bigcup_{j \in N'} K_j \neq K$ for $N' \subseteq N$, $N' \neq N$ (Corollary 2). Since $J \subseteq N$ and $\bigcup_{j \in J} K_j = K$, we have $J = N$.

**Corollary 3.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{k \times n}$, $C = (c_{ij}) \in \overline{\mathbb{R}}^{r \times n}$, $b = (b_1, \ldots, b_k)^T \in \mathbb{R}^k$ and $d = (d_1, \ldots, d_r)^T \in \mathbb{R}^r$. If $|S(A, C, b, d)| = 1$ then $S(A, C, b, d) = \{\bar{x}(A, b)\}$.

*Proof.* The statement follows from Theorem 6 and Lemma 2.
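Theorem 4 turns feasibility of the mixed system (12) into one substitution: build $\hat{x}$ as the componentwise minimum of the two principal solutions and check both parts of the system at $\hat{x}$. A sketch under the same assumptions as before (hypothetical names, finite data):

```python
def principal_solution(M, v):
    # x̄_j = min_i (v_i - m_ij)
    n = len(M[0])
    return [min(vi - row[j] for row, vi in zip(M, v)) for j in range(n)]

def max_prod(M, x):
    # (M ⊗ x)_i = max_j (m_ij + x_j)
    return [max(m + xj for m, xj in zip(row, x)) for row in M]

def mixed_solvable(A, b, C, d):
    """Theorem 4: A ⊗ x = b, C ⊗ x ≤ d is solvable iff x̂(A, C, b, d) solves it."""
    xh = [min(p, q) for p, q in zip(principal_solution(A, b),
                                    principal_solution(C, d))]
    return (max_prod(A, xh) == b
            and all(l <= di for l, di in zip(max_prod(C, xh), d)))
```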

**Corollary 4.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{k \times n}$, $C = (c_{ij}) \in \overline{\mathbb{R}}^{r \times n}$, $b = (b_1, \ldots, b_k)^T \in \mathbb{R}^k$ and $d = (d_1, \ldots, d_r)^T \in \mathbb{R}^r$. Then the following three statements are equivalent:

$$\begin{aligned} \text{(i) } & |S(A, C, b, d)| = 1 \\ \text{(ii) } & |S(A, b)| = 1 \text{ and } J = N \\ \text{(iii) } & \bigcup_{j \in J} K_j = K \text{ and } \bigcup_{j \in J'} K_j \neq K \text{ for every } J' \subseteq J,\ J' \neq J, \text{ and } J = N \end{aligned}$$

*Proof.* $(i) \Longrightarrow (ii)$ follows from Lemma 2 and Theorem 6. $(ii) \Longrightarrow (i)$: Let $J = N$; therefore $\bar{x}(A, b) \le \bar{x}(C, d)$ and thus $S(A, b) \subseteq S(C, d, \le)$. Therefore we have $S(A, C, b, d) = S(A, b) \cap S(C, d, \le) = S(A, b)$. Hence $|S(A, C, b, d)| = 1$. $(ii) \Longrightarrow (iii)$: Suppose that $S(A, b) = \{x\}$ and $J = N$. It follows from Corollary 2 that $\bigcup_{j \in N} K_j = K$ and $\bigcup_{j \in N'} K_j \neq K$ for $N' \subseteq N$, $N' \neq N$. Since $J = N$ the statement now follows.

$(iii) \Longrightarrow (ii)$: It is immediate that $J = N$, and the statement now follows from Corollary 2.

**Theorem 7.** Let $A = (a_{ij}) \in \overline{\mathbb{R}}^{k \times n}$, $C = (c_{ij}) \in \overline{\mathbb{R}}^{r \times n}$, $b = (b_1, \ldots, b_k)^T \in \mathbb{R}^k$ and $d = (d_1, \ldots, d_r)^T \in \mathbb{R}^r$. If $|S(A, C, b, d)| > 1$ then $|S(A, C, b, d)|$ is infinite.

*Proof.* Suppose |*S*(*A*, *C*, *b*, *d*)| > 1. By Corollary 4 we have ⋃<sub>*j*∈*J*</sub> *K<sub>j</sub>* = *K* for some *J* ⊆ *N*, *J* ≠ *N* (that is, there exists *j* ∈ *N* such that *x̄<sub>j</sub>*(*A*, *b*) > *x̄<sub>j</sub>*(*C*, *d*)). Since *J* ⊆ *N* and ⋃<sub>*j*∈*J*</sub> *K<sub>j</sub>* = *K*, Theorem 5 implies that any vector *x* = (*x*<sub>1</sub>, *x*<sub>2</sub>, ..., *x<sub>n</sub>*)<sup>T</sup> of the form

$$x\_j = \begin{cases} \bar{x}\_j(A, b) & \text{if } j \in J \\ y\_j \le \bar{x}\_j(C, d) & \text{if } j \in N \setminus J \end{cases}$$

is in *S*(*A*, *C*, *b*, *d*), and the statement follows.

**Remark 1.** From Theorem 7 we can say that the number of solutions to the one-sided system containing both equations and inequalities can only be 0, 1, or ∞.

The vector *x*ˆ(*A*, *C*, *b*, *d*) plays an important role in the solution of the one-sided system containing both equations and inequalities. This role is the same as that of the principal solution *x*¯(*A*, *b*) to the one-sided max-linear system *A* ⊗ *x* = *b*, see [19] for more details.
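The principal solution and the vector *x̂* can be computed directly. Below is a minimal NumPy sketch; the function names are ours, and taking *x̂*(*A*, *C*, *b*, *d*) as the componentwise minimum of the two principal solutions is an assumption inferred from the worked example in Section 5.1:

```python
import numpy as np

def principal_solution(A, b):
    """Principal solution x̄(A, b): x̄_j = min_i (b_i - a_ij), the greatest
    vector x satisfying A ⊗ x <= b in max-plus arithmetic."""
    return (b[:, None] - A).min(axis=0)

def x_hat(A, C, b, d):
    """Assumed definition: componentwise minimum of x̄(A, b) and x̄(C, d),
    i.e. the greatest candidate satisfying A ⊗ x <= b and C ⊗ x <= d."""
    return np.minimum(principal_solution(A, b), principal_solution(C, d))

def maxplus_prod(A, x):
    """(A ⊗ x)_i = max_j (a_ij + x_j)."""
    return (A + x[None, :]).max(axis=1)
```

With the data of the example in Section 5.1, `principal_solution(C, d)` returns `(2, 1, 7, 3, -1)`, matching the reported *x̄*(*C*, *d*).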

#### **5. Max-linear program with equation and inequality constraints**

Suppose that the vector *f* = (*f*<sub>1</sub>, *f*<sub>2</sub>, ..., *f<sub>n</sub>*)<sup>T</sup> ∈ **R**<sup>n</sup> is given. The task of minimizing [maximizing] the function *f*(*x*) = *f*<sup>T</sup> ⊗ *x* = max(*f*<sub>1</sub> + *x*<sub>1</sub>, *f*<sub>2</sub> + *x*<sub>2</sub>, ..., *f<sub>n</sub>* + *x<sub>n</sub>*) subject to (12) is called the max-linear program with one-sided equations and inequalities and will be denoted by MLP<sup>min</sup><sub>≤</sub> [MLP<sup>max</sup><sub>≤</sub>]. We denote the sets of optimal solutions by *S*<sup>min</sup>(*A*, *C*, *b*, *d*) and *S*<sup>max</sup>(*A*, *C*, *b*, *d*), respectively.

**Lemma 3.** Suppose *f* ∈ **R**<sup>n</sup> and let *f*(*x*) = *f*<sup>T</sup> ⊗ *x* be defined on **R**<sup>n</sup>. Then,
(i) *f*(*x*) is max-linear, i.e. *f*(*λ* ⊗ *x* ⊕ *μ* ⊗ *y*) = *λ* ⊗ *f*(*x*) ⊕ *μ* ⊗ *f*(*y*) for every *x*, *y* ∈ **R**<sup>n</sup>;
(ii) *f*(*x*) is isotone, i.e. *f*(*x*) ≤ *f*(*y*) for every *x*, *y* ∈ **R**<sup>n</sup> with *x* ≤ *y*.

*Proof.* (i) Let *λ*, *μ* ∈ **R**. Then we have

$$f(\lambda \otimes \mathbf{x} \oplus \mu \otimes \mathbf{y}) = f^T \otimes \lambda \otimes \mathbf{x} \oplus f^T \otimes \mu \otimes \mathbf{y}$$

$$= \lambda \otimes f^T \otimes \mathbf{x} \oplus \mu \otimes f^T \otimes \mathbf{y}$$

$$= \lambda \otimes f(\mathbf{x}) \oplus \mu \otimes f(\mathbf{y})$$

and the statement now follows.

(ii) Let *x*, *y* ∈ **R**<sup>n</sup> with *x* ≤ *y*. Then we have

$$\max(\mathbf{x}) \le \max(y)$$

$$\iff f^T \otimes \mathbf{x} \le f^T \otimes y, \text{ for any } f \in \mathbb{R}^n$$

$$\iff \quad f(\mathbf{x}) \le f(y).$$
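Lemma 3 can also be checked numerically. A small sketch in plain NumPy, with ⊕ realized as `max` and ⊗ as `+`; the sample data are arbitrary:

```python
import numpy as np

def f_of(f, x):
    """f(x) = f^T ⊗ x = max_j (f_j + x_j)."""
    return (f + x).max()

f = np.array([5.0, 6.0, 1.0])
x = np.array([0.0, -2.0, 3.0])
y = np.array([1.0, 0.0, -1.0])
lam, mu = 2.0, -1.0

# (i) max-linearity: f(λ ⊗ x ⊕ μ ⊗ y) = λ ⊗ f(x) ⊕ μ ⊗ f(y)
lhs = f_of(f, np.maximum(lam + x, mu + y))
rhs = max(lam + f_of(f, x), mu + f_of(f, y))
assert lhs == rhs

# (ii) isotonicity: x ≤ y componentwise implies f(x) ≤ f(y)
z = y + 1.0
assert f_of(f, y) <= f_of(f, z)
```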

Note that it would be possible to convert equations to inequalities and conversely but this would result in an increase of the number of constraints or variables and thus increasing the computational complexity. The method we present here does not require any new constraint or variable.

We denote by

$$(A \otimes x)\_i = \max\_{j \in N} (a\_{ij} + x\_j)$$

A variable *x<sub>j</sub>* will be called *active* if *f<sub>j</sub>* + *x<sub>j</sub>* = *f*(*x*). Also, a variable *x<sub>j</sub>* will be called *active* on the *i*-th constraint if the value (*A* ⊗ *x*)<sub>*i*</sub> is attained at the term *a<sub>ij</sub>* + *x<sub>j</sub>*. It follows from Theorem 5 and Lemma 3 that *x̂*(*A*, *C*, *b*, *d*) ∈ *S*<sup>max</sup>(*A*, *C*, *b*, *d*). We now present a polynomial algorithm which finds *x* ∈ *S*<sup>min</sup>(*A*, *C*, *b*, *d*) or recognizes that *S*<sup>min</sup>(*A*, *C*, *b*, *d*) = ∅. Due to Theorem 4, either *x̂*(*A*, *C*, *b*, *d*) ∈ *S*(*A*, *C*, *b*, *d*) or *S*(*A*, *C*, *b*, *d*) = ∅. Therefore, we assume in the following algorithm that *S*(*A*, *C*, *b*, *d*) ≠ ∅ and also *S*<sup>min</sup>(*A*, *C*, *b*, *d*) ≠ ∅.

**Theorem 8.** The algorithm ONEMLP-EI is correct and its computational complexity is *O*((*k* + *r*)*n*<sup>2</sup>).

#### **Algorithm 1** ONEMLP-EI(Max-linear program with one-sided equations and inequalities)

**Input:** *f* = (*f*<sub>1</sub>, ..., *f<sub>n</sub>*)<sup>T</sup> ∈ **R**<sup>n</sup>, *b* = (*b*<sub>1</sub>, ..., *b<sub>k</sub>*)<sup>T</sup> ∈ **R**<sup>k</sup>, *d* = (*d*<sub>1</sub>, ..., *d<sub>r</sub>*)<sup>T</sup> ∈ **R**<sup>r</sup>, *A* = (*a<sub>ij</sub>*) ∈ **R**<sup>k×n</sup> and *C* = (*c<sub>ij</sub>*) ∈ **R**<sup>r×n</sup>.
**Output:** *x* ∈ *S*<sup>min</sup>(*A*, *C*, *b*, *d*).
1. Find *x̄*(*A*, *b*), *x̄*(*C*, *d*), *x̂*(*A*, *C*, *b*, *d*) and *K<sub>j</sub>*, *j* ∈ *J*, where *J* = {*j* ∈ *N*; *x̄<sub>j</sub>*(*C*, *d*) ≥ *x̄<sub>j</sub>*(*A*, *b*)}
2. *x* := *x̂*(*A*, *C*, *b*, *d*)
3. *H*(*x*) := {*j* ∈ *N*; *f<sub>j</sub>* + *x<sub>j</sub>* = *f*(*x*)}
4. *J* := *J* \ *H*(*x*)
5. If ⋃<sub>*j*∈*J*</sub> *K<sub>j</sub>* ≠ *K* then stop (*x* ∈ *S*<sup>min</sup>(*A*, *C*, *b*, *d*))



*Proof.* The correctness follows from Theorem 5, and the computational complexity is computed as follows. In Step 1, *x̄*(*A*, *b*) can be found in *O*(*kn*) time, while *x̄*(*C*, *d*), *x̂*(*A*, *C*, *b*, *d*) and *K<sub>j</sub>* can be determined in *O*(*rn*), *O*((*k* + *r*)*n*) and *O*(*kn*) time, respectively. The loop 3–7 can be repeated at most *n* − 1 times, since the number of elements in *J* is at most *n* and in Step 4 at least one element is removed at a time. Step 3 is *O*(*n*), Step 6 is *O*(*kn*) and Step 7 is *O*(*n*). Hence the loop 3–7 is *O*(*kn*<sup>2</sup>).

#### **5.1. An example**

Consider the following max-linear program in which *f* = (5, 6, 1, 4, −1)<sup>T</sup>,

$$A = \begin{pmatrix} 3 & 8 & 4 & 0 & 1 \\ 0 & 6 & 2 & 2 & 1 \\ 0 & 1 & -2 & 4 & 8 \end{pmatrix}, \quad b = \begin{pmatrix} 7 \\ 5 \\ 7 \end{pmatrix}$$

$$C = \begin{pmatrix} -1 & 2 & -3 & 0 & 6 \\ 3 & 4 & -2 & 2 & 1 \\ 1 & 3 & -2 & 3 & 4 \end{pmatrix} \quad \text{and} \quad d = \begin{pmatrix} 5 \\ 5 \\ 6 \end{pmatrix}.$$

We now trace a run of Algorithm ONEMLP-EI. *x̄*(*A*, *b*) = (5, −1, 3, 3, −1)<sup>T</sup>, *x̄*(*C*, *d*) = (2, 1, 7, 3, −1)<sup>T</sup>, *x̂*(*A*, *C*, *b*, *d*) = (2, −1, 3, 3, −1)<sup>T</sup>, *J* = {2, 3, 4, 5} and *K*<sub>2</sub> = {1, 2}, *K*<sub>3</sub> = {1, 2}, *K*<sub>4</sub> = {2, 3} and *K*<sub>5</sub> = {3}. Set *x* := *x̂*(*A*, *C*, *b*, *d*) = (2, −1, 3, 3, −1)<sup>T</sup>; then *H*(*x*) = {1, 4} and *J* ⊈ *H*(*x*). We also have *J* := *J* \ *H*(*x*) = {2, 3, 5} and *K*<sub>2</sub> ∪ *K*<sub>3</sub> ∪ *K*<sub>5</sub> = *K*. Then set *x*<sub>1</sub> = *x*<sub>4</sub> = −10<sup>4</sup> (say), giving *x* = (−10<sup>4</sup>, −1, 3, −10<sup>4</sup>, −1)<sup>T</sup>. Now *H*(*x*) = {2} and *J* := *J* \ *H*(*x*) = {3, 5}. Since *K*<sub>3</sub> ∪ *K*<sub>5</sub> = *K*, set *x*<sub>2</sub> = −10<sup>4</sup> (say), giving *x* = (−10<sup>4</sup>, −10<sup>4</sup>, 3, −10<sup>4</sup>, −1)<sup>T</sup>. Now *H*(*x*) = {3} and *J* := *J* \ *H*(*x*) = {5}. Since *K*<sub>5</sub> ≠ *K* we stop; an optimal solution is *x* = (−10<sup>4</sup>, −10<sup>4</sup>, 3, −10<sup>4</sup>, −1)<sup>T</sup> and *f*<sup>min</sup> = 4.
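The reported optimum can be verified independently. In the check below we take −10⁴ as a concrete "sufficiently small" value for the lowered components (any small enough value works), and confirm feasibility and the objective value 4 with NumPy:

```python
import numpy as np

A = np.array([[3.0, 8, 4, 0, 1], [0, 6, 2, 2, 1], [0, 1, -2, 4, 8]])
b = np.array([7.0, 5, 7])
C = np.array([[-1.0, 2, -3, 0, 6], [3, 4, -2, 2, 1], [1, 3, -2, 3, 4]])
d = np.array([5.0, 5, 6])
f = np.array([5.0, 6, 1, 4, -1])

def otimes(M, x):
    """Max-plus matrix-vector product: (M ⊗ x)_i = max_j (m_ij + x_j)."""
    return (M + x).max(axis=1)

# Candidate optimum from the run; -1e4 stands for "sufficiently small".
x = np.array([-1e4, -1e4, 3.0, -1e4, -1.0])

assert np.allclose(otimes(A, x), b)   # A ⊗ x = b holds
assert (otimes(C, x) <= d).all()      # C ⊗ x ≤ d holds
assert (f + x).max() == 4.0           # objective value f(x) = 4
```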

#### **6. A special case of max-linear program with two-sided constraints**

Suppose *c* = (*c*<sub>1</sub>, ..., *c<sub>m</sub>*)<sup>T</sup>, *d* = (*d*<sub>1</sub>, ..., *d<sub>m</sub>*)<sup>T</sup> ∈ **R**<sup>m</sup>, *A* = (*a<sub>ij</sub>*) and *B* = (*b<sub>ij</sub>*) ∈ **R**<sup>m×n</sup> are given matrices and vectors. The system

$$A \otimes \mathfrak{x} \oplus \mathfrak{c} = \mathfrak{B} \otimes \mathfrak{x} \oplus \mathfrak{d} \tag{14}$$

is called a non-homogeneous two-sided max-linear system, and the set of solutions of this system will be denoted by *S*. Two-sided max-linear systems have been studied in [20], [21], [22] and [23].

Optimization problems whose objective function is max-linear and whose constraints are of the form (14) are called max-linear programs (MLP). Max-linear programs are studied in [24], where solution methods for both minimization and maximization problems were developed. The methods are proved to be pseudopolynomial if all entries are integer. Also, non-linear programs with max-linear constraints were dealt with in [25], where heuristic methods were developed and tested for a number of instances.

Consider max-linear programs with two-sided constraints (minimization), MLPmin

$$\begin{aligned} f(\mathbf{x}) &= f^T \otimes \mathbf{x} \longrightarrow \min\\ \text{subject to} \\ A \otimes \mathbf{x} \oplus \mathbf{c} &= B \otimes \mathbf{x} \oplus d \end{aligned} \tag{15}$$

where *<sup>f</sup>* = (*f*1,..., *fn*)*<sup>T</sup>* <sup>∈</sup> **<sup>R</sup>***n*, *<sup>c</sup>* = (*c*1,..., *cm*)*T*, *<sup>d</sup>* = (*d*1,..., *dm*)*<sup>T</sup>* <sup>∈</sup> **<sup>R</sup>***m*, *<sup>A</sup>* = (*aij*) and *<sup>B</sup>* = (*bij*) <sup>∈</sup> **<sup>R</sup>***m*×*<sup>n</sup>* are given matrices and vectors. We introduce the following:

$$\begin{aligned} y &= (f\_1 \otimes \mathfrak{x}\_1, f\_2 \otimes \mathfrak{x}\_2, \dots, f\_n \otimes \mathfrak{x}\_n) \\ &= \text{diag}(f) \otimes \mathfrak{x} \end{aligned} \tag{16}$$

Here diag(*f*) denotes a diagonal matrix whose diagonal elements are *f*<sub>1</sub>, *f*<sub>2</sub>, ..., *f<sub>n</sub>* and whose off-diagonal elements are −∞. It therefore follows from (16) that

$$\begin{aligned} f^T \otimes \mathbf{x} &= \mathbf{0}^T \otimes \mathbf{y} \\ \iff \mathbf{x} &= (f\_1^{-1} \otimes y\_1, f\_2^{-1} \otimes y\_2, \dots, f\_n^{-1} \otimes y\_n) \\ &= (\text{diag}(f))^{-1} \otimes \mathbf{y} \end{aligned} \tag{17}$$

Hence, by substituting (16) and (17) into (15) we have

$$\begin{aligned} \mathbf{0}^T \otimes \mathbf{y} &\longrightarrow \min\\ \text{subject to} \\ \mathbf{A}^\prime \otimes \mathbf{y} \oplus \mathbf{c} &= \mathbf{B}^\prime \otimes \mathbf{y} \oplus \mathbf{d}\_\prime \end{aligned} \tag{18}$$

where 0<sup>T</sup> is the transpose of the zero vector, *A*′ = *A* ⊗ (diag(*f*))<sup>−1</sup> and *B*′ = *B* ⊗ (diag(*f*))<sup>−1</sup>. Therefore we assume without loss of generality that *f* = 0, and hence (15) is equivalent to

$$\begin{aligned} f(x) = \sum\_{j=1,\dots,n}{}^{\oplus} x\_j &\longrightarrow \min\\ \text{subject to} \quad A \otimes x \oplus c &= B \otimes x \oplus d \end{aligned} \tag{19}$$

The set of feasible solutions for (19) will be denoted by *S* and the set of optimal solutions by *S*<sup>min</sup>. A vector is called *constant* if all its components are equal, that is, a vector *x* ∈ **R**<sup>n</sup> is constant if *x*<sub>1</sub> = *x*<sub>2</sub> = ··· = *x<sub>n</sub>*. For any *x* ∈ *S* we define the set *Q*(*x*) = {*i* ∈ *M*; (*A* ⊗ *x*)<sub>*i*</sub> > *c<sub>i</sub>*}. We introduce the following notation for matrices. Let *A* = (*a<sub>ij</sub>*) ∈ **R**<sup>m×n</sup>, 1 ≤ *i*<sub>1</sub> < *i*<sub>2</sub> < ··· < *i<sub>q</sub>* ≤ *m* and 1 ≤ *j*<sub>1</sub> < *j*<sub>2</sub> < ··· < *j<sub>r</sub>* ≤ *n*. Then,

$$A\begin{pmatrix} i\_1, i\_2, \dots, i\_q \\ j\_1, j\_2, \dots, j\_r \end{pmatrix} = \begin{pmatrix} a\_{i\_1 j\_1} & a\_{i\_1 j\_2} & \dots & a\_{i\_1 j\_r} \\ a\_{i\_2 j\_1} & a\_{i\_2 j\_2} & \dots & a\_{i\_2 j\_r} \\ \vdots & \vdots & & \vdots \\ a\_{i\_q j\_1} & a\_{i\_q j\_2} & \dots & a\_{i\_q j\_r} \end{pmatrix} = A(Q, R)$$

where *Q* = {*i*<sub>1</sub>, ..., *i<sub>q</sub>*} and *R* = {*j*<sub>1</sub>, ..., *j<sub>r</sub>*}. Similar notation is used for vectors: *c*(*i*<sub>1</sub>, ..., *i<sub>r</sub>*) = (*c*<sub>*i*1</sub>, ..., *c*<sub>*ir*</sub>)<sup>T</sup> = *c*(*R*). Given MLP<sup>min</sup> with *c* ≥ *d*, we define the following sets:

$$\begin{aligned} M^=&=\{i \in M; c\_i = d\_i\} \text{ and} \\ M^>&=\{i \in M; c\_i > d\_i\} \end{aligned}$$

We also define the following matrices:

$$\begin{aligned} A\_= &= A(M^=, N), & A\_> &= A(M^>, N) \\ B\_= &= B(M^=, N), & B\_> &= B(M^>, N) \\ c\_= &= c(M^=), & c\_> &= c(M^>) \end{aligned} \tag{20}$$

An easily solvable case arises when there is a constant vector *x* ∈ *S* such that the set *Q*(*x*) = ∅. This constant vector *x* satisfies the following equations and inequalities

$$\begin{aligned} A\_= \otimes x &\le c\_= \\ A\_> \otimes x &\le c\_> \\ B\_= \otimes x &\le c\_= \\ B\_> \otimes x &= c\_> \end{aligned} \tag{21}$$

where *A*=, *A*>, *B*=, *B*>, *c*= and *c*> are defined in (20). The one-sided system of equation and inequalities (21) can be written as

$$\begin{cases} G \otimes \mathfrak{x} = p \\ H \otimes \mathfrak{x} \le q \end{cases} \tag{22}$$

where,


$$G = B\_>, \quad H = \begin{pmatrix} A\_= \\ A\_> \\ B\_= \end{pmatrix}, \quad p = c\_> \quad \text{and} \quad q = \begin{pmatrix} c\_= \\ c\_> \\ c\_= \end{pmatrix} \tag{23}$$

Recall that *S*(*G*, *H*, *p*, *q*) is the set of solutions for (22).
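The construction (20)–(23) of the one-sided system (22) from the data of MLP<sup>min</sup> can be sketched as follows, assuming *c* ≥ *d* (the function name is illustrative):

```python
import numpy as np

def build_GHpq(A, B, c, d):
    """Assemble G, H, p, q of (22)-(23) from A, B, c, d with c >= d,
    using the index sets M^= = {i; c_i = d_i} and M^> = {i; c_i > d_i}."""
    eq = c == d     # boolean mask for M^=
    gt = c > d      # boolean mask for M^>
    G = B[gt]                                # G = B_>
    H = np.vstack([A[eq], A[gt], B[eq]])     # H = (A_=; A_>; B_=)
    p = c[gt]                                # p = c_>
    q = np.concatenate([c[eq], c[gt], c[eq]])
    return G, H, p, q
```

For example, with *c* = (4, 5)<sup>T</sup> and *d* = (4, 3)<sup>T</sup>, only the second row contributes to *G* and *p*, while *H* stacks the remaining rows in the order *A*<sub>=</sub>, *A*<sub>></sub>, *B*<sub>=</sub>.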

**Theorem 9.** Let *<sup>Q</sup>*(*x*) = <sup>∅</sup> for some constant vector *<sup>x</sup>* = (*α*,..., *<sup>α</sup>*)*<sup>T</sup>* <sup>∈</sup> *<sup>S</sup>*. If *<sup>z</sup>* <sup>∈</sup> *<sup>S</sup>*min then *z* ∈ *S*(*G*, *H*, *p*, *q*).

*Proof.* Let *x* = (*α*, ..., *α*)<sup>T</sup> ∈ *S* with *Q*(*x*) = ∅ and let *z* ∈ *S*<sup>min</sup>. This implies that *f*(*z*) ≤ *f*(*x*) = *α*. Therefore we have *z<sub>j</sub>* ≤ *α* for all *j* ∈ *N*. Consequently, *z* ≤ *x* and (*A* ⊗ *z*)<sub>*i*</sub> ≤ (*A* ⊗ *x*)<sub>*i*</sub> for all *i* ∈ *M*. Hence *Q*(*z*) = ∅ and therefore *z* ∈ *S*(*G*, *H*, *p*, *q*).

**Corollary 5.** If *<sup>Q</sup>*(*x*) = <sup>∅</sup> for some constant vector *<sup>x</sup>* <sup>∈</sup> *<sup>S</sup>* then *<sup>S</sup>*min <sup>⊆</sup> *<sup>S</sup>*min(*G*, *<sup>H</sup>*, *<sup>p</sup>*, *<sup>q</sup>*).

*Proof.* The statement follows from Theorem 9.

## **7. Some solvability concepts of a linear system containing both equations and inequalities**

Systems of max-separable linear equations and inequalities arise frequently in several branches of applied mathematics: for instance, in the description of discrete-event dynamic systems [1, 4] and in machine scheduling [10]. However, choosing unsuitable values for the matrix entries and right-hand-side vectors may lead to unsolvable systems. Therefore, methods for restoring solvability suggested in the literature could be employed. These methods include modifying the input data [11, 26] or dropping some equations [11]. Another possibility is to replace each entry by an interval of possible values. In doing so, our question shifts to asking about weak solvability, strong solvability and control solvability.

Interval mathematics was championed by Moore [27] as a tool for bounding errors in computer programs. The area has since developed into a general methodology for investigating numerical uncertainty in several problems. Systems of interval equations and inequalities in max-algebra have each been studied in the literature. In [26] weak and strong solvability of interval equations were discussed; control solvability, weak control solvability and universal solvability were dealt with in [28]. In [29] a system of linear inequalities with interval coefficients was discussed. In this section we consider a system consisting of interval linear equations and inequalities and present solvability concepts for such a system.

An algebraic structure (*B*, ⊕, ⊗) with two binary operations ⊕ and ⊗ is called max-plus algebra if

$$B = \mathbb{R} \cup \{ -\infty \}, \ a \oplus b = \max\{a, b\}, \ a \otimes b = a + b$$

for any *a*, *b* ∈ *B*.

Let *m*, *n*, *r* be given positive integers and *a* ∈ **R**. We use throughout the paper the notation *M* = {1, 2, ..., *m*}, *N* = {1, 2, ..., *n*}, *R* = {1, 2, ..., *r*} and *a*<sup>−1</sup> = −*a*. The sets of all *m* × *n* and *r* × *n* matrices over *B* are denoted by *B*(*m*, *n*) and *B*(*r*, *n*), respectively. The set of all *n*-dimensional vectors is denoted by *B*(*n*). Then for each matrix *A* ∈ *B*(*m*, *n*) and vector *x* ∈ *B*(*n*) the product *A* ⊗ *x* is defined as

$$(A \otimes x)\_i = \max\_{j \in N} \left( a\_{ij} + x\_j \right).$$
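The operations of the max-plus algebra and this product admit a direct implementation in which −∞ plays the role of the neutral element for ⊕ (a small sketch; the names are ours):

```python
import numpy as np

NEG_INF = -np.inf   # the element -∞ adjoined to R in B

def oplus(a, b):
    """a ⊕ b = max{a, b}."""
    return max(a, b)

def otimes_vec(A, x):
    """(A ⊗ x)_i = max_j (a_ij + x_j); -inf is neutral for ⊕ and
    absorbing for ⊗, which NumPy's arithmetic already respects."""
    return (A + x[None, :]).max(axis=1)

A = np.array([[0.0, NEG_INF], [2.0, 1.0]])
x = np.array([3.0, 5.0])
assert oplus(2, 5) == 5 and 2 + 5 == 7        # scalar ⊕ and ⊗
assert otimes_vec(A, x).tolist() == [3.0, 6.0]
```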

For a given matrix interval ***A*** = [A̲, Ā] with A̲, Ā ∈ *B*(*k*, *n*), A̲ ≤ Ā, and a given vector interval ***b*** = [b̲, b̄] with b̲, b̄ ∈ *B*(*k*), b̲ ≤ b̄, the notation

$$\mathbf{A} \otimes x = \mathbf{b} \tag{24}$$

represents an interval system of linear max-separable equations of the form

$$A \otimes x = b \tag{25}$$

Similarly, for a given matrix interval ***C*** = [C̲, C̄] with C̲, C̄ ∈ *B*(*r*, *n*), C̲ ≤ C̄, and a given vector interval ***d*** = [d̲, d̄] with d̲, d̄ ∈ *B*(*r*), d̲ ≤ d̄, the notation

$$\mathbf{C} \otimes x \le \mathbf{d} \tag{26}$$

represents an interval system of linear max-separable inequalities of the form

$$C \otimes x \le d \tag{27}$$

Interval systems of linear max-separable equations and inequalities have each been studied in the literature. The following notation

$$\begin{aligned} \mathbf{A} \otimes \mathbf{x} &= \mathbf{b} \\ \mathbf{C} \otimes \mathbf{x} &\le \mathbf{d} \end{aligned} \tag{28}$$

represents an interval system of linear max-separable equations and inequalities of the form

$$\begin{aligned} A \otimes \mathfrak{x} &= b \\ \mathfrak{C} \otimes \mathfrak{x} &\le d \end{aligned} \tag{29}$$

where *A* ∈ *A*, *C* ∈ *C*, *b* ∈ *b* and *d* ∈ *d*.

The aim of this section is to consider a system consisting of max-separable linear equations and inequalities and to present some solvability conditions for such a system. Note that it is possible to convert equations to inequalities and conversely, but this would result in an increase in the number of equations and inequalities or in the number of unknowns, thus increasing the computational complexity when testing the solvability conditions. Each system of the form (29) is said to be a subsystem of (28). An interval system (29) has constant matrices if A̲ = Ā and C̲ = C̄. Similarly, an interval system has a constant right-hand side if b̲ = b̄ and d̲ = d̄. In what follows we will consider *A* ∈ **R**(*k*, *n*) and *C* ∈ **R**(*r*, *n*).

#### **7.1. Weak solvability**

**Definition 8.** A vector *y* is a weak solution to an interval system (29) if there exist *A* ∈ ***A***, *C* ∈ ***C***, *b* ∈ ***b*** and *d* ∈ ***d*** such that

$$\begin{aligned} A \otimes y &= b \\ \mathbb{C} \otimes y &\le d \end{aligned} \tag{30}$$

**Theorem 10.** A vector *x* ∈ **R**<sup>n</sup> is a weak solution of (29) if and only if

$$x \le \bar{x}\begin{pmatrix} \underline{A} & \overline{b} \\ \underline{C} & \overline{d} \end{pmatrix}$$

and

$$\overline{A} \otimes \bar{x}\begin{pmatrix} \underline{A} & \overline{b} \\ \underline{C} & \overline{d} \end{pmatrix} \ge \underline{b}$$

*Proof.* Let *i* ∈ {1, ..., *m*} be an arbitrarily chosen index and let *x* = (*x*<sub>1</sub>, ..., *x<sub>n</sub>*)<sup>T</sup> ∈ **R**<sup>n</sup> be fixed. If *A* ∈ ***A*** then (*A* ⊗ *x*)<sub>*i*</sub> is isotone and we have

$$[(A \otimes \mathfrak{x})\_{\mathfrak{i}} \in [(\underline{A} \otimes \mathfrak{x})\_{\mathfrak{i}\mathfrak{i}} (\overline{A} \otimes \mathfrak{x})\_{\mathfrak{i}}] \subseteq \mathbb{R}$$

Hence, *x* is a weak solution if and only if

$$[(\underline{A}\otimes\mathfrak{x})\_{i\prime}(\overline{A}\otimes\mathfrak{x})\_{i}]\cap[\underline{b}\_{i\prime}\overline{b}\_{i}]\tag{31}$$

Similarly, if $C \otimes x \leq d$ then *x* is obviously a weak solution to

$$\underline{A} \otimes x \leq \overline{b}$$

$$\underline{C} \otimes x \leq \overline{d}$$

That is,

$$x \leq \bar{x}\begin{pmatrix} \underline{A} & \overline{b} \\ \underline{C} & \overline{d} \end{pmatrix}.$$

Also from (31), *x* is a weak solution if and only if

$$[(\underline{A} \otimes x)_i, (\overline{A} \otimes x)_i] \cap [\underline{b}_i, \overline{b}_i] \neq \emptyset, \quad \forall i = 1, 2, \dots, m$$

That is,

$$\overline{A} \otimes \bar{x}\begin{pmatrix} \underline{A} & \overline{b} \\ \underline{C} & \overline{d} \end{pmatrix} \geq \underline{b}$$

**Definition 9.** An interval system (29) is weakly solvable if there exist $A \in \mathbf{A}$, $C \in \mathbf{C}$, $b \in \mathbf{b}$ and $d \in \mathbf{d}$ such that (29) is solvable.

**Theorem 11.** An interval system (29) with constant matrices $A = \underline{A} = \overline{A}$, $C = \underline{C} = \overline{C}$ is weakly solvable if and only if

$$A \otimes \bar{x}\begin{pmatrix} A & \overline{b} \\ C & \overline{d} \end{pmatrix} \geq \underline{b}$$

*Proof.* The "if" part follows from the definition. Conversely, let

$$A \otimes \bar{x}\begin{pmatrix} A & b \\ C & d \end{pmatrix} = b$$

be a solvable subsystem for some $b \in [\underline{b}, \overline{b}]$. Then we have

$$A \otimes \bar{x}\begin{pmatrix} A & \overline{b} \\ C & \overline{d} \end{pmatrix} \geq A \otimes \bar{x}\begin{pmatrix} A & b \\ C & d \end{pmatrix} = b \geq \underline{b}$$
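Theorem 11 lends itself to a direct computational test. The sketch below is our own illustration (not code from the chapter): it forms the principal solution of the stacked system $A \otimes x \leq \overline{b}$, $C \otimes x \leq \overline{d}$ componentwise as $\bar{x}_j = \min_i(r_i - m_{ij})$, then checks $A \otimes \bar{x} \geq \underline{b}$:

```python
def principal_solution(M, r):
    """Greatest x with M (x) x <= r in max-plus: x_j = min_i (r_i - m_ij)."""
    n = len(M[0])
    return [min(r_i - row[j] for row, r_i in zip(M, r)) for j in range(n)]

def otimes_mv(M, x):
    """Max-plus matrix-vector product."""
    return [max(m + xj for m, xj in zip(row, x)) for row in M]

def weakly_solvable(A, C, b_lo, b_hi, d_hi):
    """Constant-matrix test of Theorem 11: A (x) xbar >= b_lo,
    with xbar the principal solution of the stacked system [A; C]."""
    xbar = principal_solution(A + C, b_hi + d_hi)
    return all(lhs >= lo for lhs, lo in zip(otimes_mv(A, xbar), b_lo))

A = [[0, 1]]; C = [[0, 0]]
print(weakly_solvable(A, C, b_lo=[1], b_hi=[2], d_hi=[3]))  # True
```

The principal solution is the largest vector satisfying both inequality blocks, so checking it alone suffices, exactly as in the proof above.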

#### **7.2. Strong solvability**

**Definition 10.** A vector $x \in \mathbb{R}^n$ is a strong solution to an interval system (29) if (29) holds for each $A \in \mathbf{A}$, $C \in \mathbf{C}$, $b \in \mathbf{b}$ and $d \in \mathbf{d}$.

**Theorem 12.** A vector *x* is a strong solution to (29) if and only if it is a solution to

$$E \otimes x = f$$

$$\overline{C} \otimes x \leq \underline{d}$$

where


$$E = \begin{pmatrix} \overline{A} \\ \underline{A} \end{pmatrix}, \qquad f = \begin{pmatrix} \underline{b} \\ \overline{b} \end{pmatrix} \tag{33}$$

*Proof.* If *x* is a strong solution of (29), it obviously satisfies (33). Conversely, suppose *x* satisfies (33) and let $\tilde{A} \in \mathbf{A}$, $\tilde{C} \in \mathbf{C}$, $\tilde{b} \in \mathbf{b}$, $\tilde{d} \in \mathbf{d}$ be such that $\tilde{A} \otimes x \neq \tilde{b}$ and $\tilde{C} \otimes x > \tilde{d}$. Then there exists $i \in \{1, 2, \dots, m\}$ such that either $(\tilde{A} \otimes x)_i < \tilde{b}_i$ or $(\tilde{A} \otimes x)_i > \tilde{b}_i$, and $(\tilde{C} \otimes x)_i > \tilde{d}_i$. Therefore, either $(\underline{A} \otimes x)_i \leq (\tilde{A} \otimes x)_i < \tilde{b}_i$ or $(\overline{A} \otimes x)_i \geq (\tilde{A} \otimes x)_i > \tilde{b}_i$, and $(\overline{C} \otimes x)_i \geq (\tilde{C} \otimes x)_i > \tilde{d}_i$, which contradicts (33), and the theorem statement follows.
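Assuming the stacked form of (33), the criterion of Theorem 12 can be checked mechanically. The sketch below (our own naming, purely illustrative) tests $\overline{A} \otimes x = \underline{b}$, $\underline{A} \otimes x = \overline{b}$ and $\overline{C} \otimes x \leq \underline{d}$:

```python
def otimes_mv(M, x):
    """Max-plus matrix-vector product: (M (x) x)_i = max_j (m_ij + x_j)."""
    return [max(m + xj for m, xj in zip(row, x)) for row in M]

def is_strong_solution(A_lo, A_hi, C_hi, b_lo, b_hi, d_lo, x):
    """Theorem 12: x solves E (x) x = f with E = [A_hi; A_lo], f = [b_lo; b_hi],
    together with C_hi (x) x <= d_lo."""
    eq = otimes_mv(A_hi, x) == b_lo and otimes_mv(A_lo, x) == b_hi
    ineq = all(lhs <= d for lhs, d in zip(otimes_mv(C_hi, x), d_lo))
    return eq and ineq

# A degenerate interval (b_lo = b_hi) is essentially forced on the equation
# part, as the proof above suggests.
print(is_strong_solution([[0]], [[0]], [[0]], [1], [1], [2], [1]))  # True
```

Note how restrictive strong solvability is: since $\underline{A} \otimes x \leq \overline{A} \otimes x$ always holds, (33) can only be met when the equation bounds essentially coincide.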

### **Acknowledgement**

The author is grateful to the Kano University of Science and Technology, Wudil for paying the publication fee.

#### **Author details**

Abdulhadi Aminu

*Department of Mathematics, Kano University of Science and Technology, Wudil, P.M.B 3244, Kano, Nigeria*

#### **8. References**

[9] M. Akian, R. Bapat, S. Gaubert, Max-plus algebra, in: *L. Hogben (Ed.), Handbook of Linear Algebra: Discrete Mathematics and its Applications*, Chapman & Hall/CRC, Baton Rouge, L.A (2007).

[10] P. Butkovič, *Max-linear Systems: Theory and Algorithms*, Springer Monographs in Mathematics, Springer-Verlag (2010).

[11] K. Cechlárová, P. Diko, Resolving infeasibility in extremal algebras, *Linear Algebra & Appl.* 290 (1999) 267-273.

[12] P. Butkovič, Max-algebra: the linear algebra of combinatorics?, *Linear Algebra & Appl.* 367 (2003) 313-335.

[13] R. A. Cuninghame-Green, Minimax algebra and applications, in: *Advances in Imaging and Electron Physics*, vol. 90, Academic Press, New York (1995) 1-121.

[14] R. A. Cuninghame-Green, K. Zimmermann, Equation with residual functions, *Comment. Math. Univ. Carolinae* 42 (2001) 729-740.

[15] P. D. Moral, G. Salut, Random particle methods in (max,+) optimisation problems, in: *Gunawardena (Ed.), Idempotency*, Cambridge (1988) 383-391.

[16] A. Aminu, Simultaneous solution of linear equations and inequalities in max-algebra, *Kybernetika* 47, 2 (2011) 241-250.

[17] P. Butkovič, Necessary solvability conditions of linear extremal equations, *Discrete Applied Mathematics* 10 (1985) 19-26, North-Holland.

[18] K. Cechlárová, Solution of interval systems in max-algebra, in: *V. Rupnik, L. Zadnik-stirn, S. Drobne (Eds.)*, Proc. SOR (2001) Preddvor, Slovenia, 321-326.

[19] A. Aminu, Max-algebraic linear systems and programs, *PhD Thesis, University of Birmingham, UK* (2009).

[20] P. Butkovič, G. Hegedüs, An elimination method for finding all solutions of the system of linear equations over an extremal algebra, *Ekonom. mat. Obzor* 20 (1984) 203-215.

[21] R. A. Cuninghame-Green, P. Butkovič, The equation *A* ⊗ *x* = *B* ⊗ *y* over (max,+), *Theoret. Comput. Sci.* 293 (1991) 3-12.

[22] B. Heidergott, G. J. Olsder, J. van der Woude, *Max Plus at Work, Modelling and Analysis of Synchronized Systems: A Course on Max-Plus Algebra and Its Applications*, Princeton University Press, New Jersey (2006).

[23] E. A. Walkup, G. Boriello, A general linear max-plus solution technique, in: *Gunawardena (Ed.), Idempotency*, Cambridge (1988) 406-415.

[24] P. Butkovič, A. Aminu, Introduction to max-linear programming, *IMA Journal of Management Mathematics* 20, 3 (2009) 233-249.

[25] A. Aminu, P. Butkovič, Non-linear programs with max-linear constraints: a heuristic approach, *IMA Journal of Management Mathematics* 23, 1 (2012) 41-66.

[26] R. A. Cuninghame-Green, K. Cechlárová, Residuation in fuzzy algebra and some applications, *Fuzzy Sets and Systems* 71 (1995) 227-239.

[27] R. E. Moore, *Methods and Applications of Interval Analysis*, SIAM (1979).

[28] H. Myšková, Control solvability of interval systems of max-separable linear equations, *Linear Algebra and its Applications* 416 (2006) 215-223.

[29] M. Fiedler, J. Nedoma, J. Ramík, J. Rohn, K. Zimmermann, *Linear Optimization Problems with Inexact Data*, Springer, Berlin (2006).

## **Efficient Model Transition in Adaptive Multi-Resolution Modeling of Biopolymers**

Mohammad Poursina, Imad M. Khan and Kurt S. Anderson

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48196

## **1. Introduction**

Multibody dynamics methods are being used extensively to model biomolecular systems in order to study important physical phenomena occurring at different spatial and temporal scales [10, 13]. These systems may contain thousands or even millions of degrees of freedom, whereas the size of the time step involved in the simulation is on the order of femtoseconds. Examples of such problems include proteins, DNAs, and RNAs. These highly complex physical systems are often studied at resolutions ranging from a fully atomistic model to coarse-grained molecules, up to a continuum-level system [4, 19, 20]. In studying these problems, it is often desirable to change the system definition during the course of the simulation in order to achieve an optimal combination of accuracy and speed. For example, in order to study the overall conformational motion of a biomolecular system, a model based on super-atoms (beads) [18, 22] or articulated multi-rigid and/or flexible bodies [21, 23] can be used, whereas localized behavior has to be studied using fully atomistic models. In such cases, the need for the transition from a fine-scale to a coarse model, and vice versa, arises. A fully atomistic model of a molecule and its coarse-grained model are illustrated in Fig. (1-a) and Fig. (1-b).

Given the complexity and nonlinearity of challenging biomolecular systems, it is expected that different physical parameters such as dynamic boundary conditions and applied forces will have a significant effect on the behavior of the system. It is shown in [16] that time-invariant coarse models may provide inadequate or poor results and, as such, an adaptive framework to model these systems should be considered [14]. Transitions between different system models can be achieved by intelligently removing or adding certain degrees of freedom (*dof*). This change occurs instantaneously and may therefore be viewed as a model change resulting from impulsively applied constraints. For multi-rigid and flexible body systems, the transition from a higher fidelity (fine-scale) model to a lower fidelity (coarse-scale) model using the divide-and-conquer algorithm (DCA) has been studied previously in [8, 12]. DCA efficiently provides the unique states of the system after this transition. In this chapter, we focus on the transition from a coarse model to a fine-scale model. When the system is modeled in an articulated multi-flexible-body framework, such transitions may be achieved by two

©2012 Poursina et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


**Figure 1.** Illustration of a biomolecular system. a) Fully atomistic model. b) Coarse grain model with different rigid and flexible sub-domains connected to each other via kinematic joints.

different means. In the first, a fine-scale model is generated by adding flexible *dof*. This type of fine scaling may be necessary in order to capture higher frequency modes. For instance, when two molecules bind together, due to the impact, the higher frequency modes of the system are excited. The second type of fine scaling transition may be achieved through releasing the connecting joints in the multi-flexible-body system. In other words, certain constraints on joints are removed to introduce new *dof* in the model.

In contrast to those types of dynamic systems in which the model definition is persistent, and the total energy of the system is conserved, the class of problems discussed here experiences discontinuous changes in the model definition and hence, the energy of the system must also change (nominally increase) in a discontinuous fashion. During the coarse graining process, based on a predefined metric, one may conclude that naturally existing higher modes are less relevant and can be ignored. As such, the kinetic energy associated with those modes must be estimated and properly accounted for, when transitioning back to the fine-scale model. Moreover, any change in the system model definition is assumed to occur as a result of impulsively applied constraints, without the influence of external loads. As such, the generalized momentum of the system must also be conserved [6]. In other words, the momentum of each differential element projected onto the space of admissible motions permitted by the more restrictive model (whether pre- or post-transition) when integrated over the entire system must be conserved across the model transition. If the generalized momentum is not conserved during the transition, the results are non-physical, and the new initial conditions for the rest of the simulation of the system are invalid.

In the next section, a brief overview of the DCA and the analytical preliminaries necessary for the algorithm development are presented. The optimization problem associated with the coarse to fine scale transitioning is discussed next. Then the impulse-momentum formulation for transitioning from coarse models to fine-scale models in the articulated flexible-body scheme is presented. Finally, conclusions are drawn.

### **2. Theoretical background**


In this section, a brief introduction to the basic divide-and-conquer algorithm is presented. The DCA scheme has been developed for the simulation of general multi-rigid and multi-flexible-body systems [5, 8, 9], and systems with discontinuous changes [11, 12]. The basic algorithm described here is independent of the type of problem and is presented so that the chapter is more self-contained. In other words, it can be used to study the behavior of any rigid- and flexible-body system, even if the system undergoes a discontinuous change. Some mathematical preliminaries which are important to the development of the algorithm are also presented in this section.

#### **2.1. Basic divide-and-conquer algorithm**

The basic DCA scheme presented in this chapter works in a manner similar to that described in detail in [5, 9]. Consider two representative flexible bodies *k* and *k* + 1 connected to each other by a joint $J^k$ as shown in Fig. (2-a). The two points of interest, $H_1^k$ and $H_2^k$, on body *k* are termed *handles*. A handle is any selected point through which a body interacts with the environment. In this chapter, we will limit our attention to each body having two handles, with each handle coinciding with a joint location on the body, i.e. joint locations $J^{k-1}$ and $J^k$ in the case of body *k*. Similarly, for body *k* + 1, the points $H_1^{k+1}$ and $H_2^{k+1}$ are located at the joint locations $J^k$ and $J^{k+1}$, respectively. Furthermore, large rotations and translations of the flexible bodies are modeled as rigid body *dof*. Elastic deformations of the flexible bodies are modeled through the use of modal coordinates and admissible shape functions.

DCA is implemented using two main processes, hierarchic assembly and disassembly. The goal of the assembly process is to find the equations describing the dynamics of each body in the hierarchy at its two handles. This process begins at the level of individual bodies and adjacent bodies are assembled in a binary tree configuration. Using recursive formulations, this process couples the two-handle equations of successive bodies to find the two-handle equations of the resulting assembly. For example, body *k* and body *k* + 1 are coupled together to form the assembly shown in Fig. (2-b). At the end of the assembly process, the two-handle equations of the entire system are obtained.

The hierarchic disassembly process begins with the solution of the two-handle equations associated with the primary node of the binary tree. The process works from this node to the individual sub-domain nodes of the binary tree to solve for the two-handle equations of the constituent subassemblies. This process is repeated until all unknowns (e.g., spatial constraint forces, spatial constraint impulses, spatial accelerations, jumps in the spatial velocities) of the bodies at the individual sub-domain level of the binary tree are known. The assembly and disassembly processes are illustrated in Fig. (3).
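The hierarchic assembly order described above can be pictured with a toy sketch (ours, not part of the DCA implementation) that only tracks which bodies are paired at each level of the binary tree, ignoring the actual two-handle equations:

```python
def assembly_levels(bodies):
    """Pair adjacent bodies/subassemblies level by level, as in the DCA
    hierarchic assembly; an odd leftover is carried up unchanged."""
    levels = [bodies]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        nxt = [prev[i] + ":" + prev[i + 1] if i + 1 < len(prev) else prev[i]
               for i in range(0, len(prev), 2)]
        levels.append(nxt)
    return levels

# Four bodies assemble in two passes; disassembly visits the same
# tree in reverse, from the root back down to the leaves.
for level in assembly_levels(["b1", "b2", "b3", "b4"]):
    print(level)
```

Because each pass halves the number of subassemblies, the tree has O(log n) levels, which is what makes the recursive two-handle formulation attractive for parallel implementation.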


**Figure 2.** Assembling of the two bodies to form a subassembly. a) Consecutive bodies *k* and *k* + 1. b) A fictitious subassembly formed by coupling bodies *k* and *k* + 1.

**Figure 3.** The hierarchic assembly-disassembly process in DCA.

#### **2.2. Analytical preliminaries**

For convenience, the superscript *c* shows that a quantity of interest is associated with the coarse model, while *f* denotes that it is associated with the fine model. For example, the column matrix $\begin{bmatrix} v_1 \\ \dot{q}_1 \end{bmatrix}^c$ represents the velocity of handle-1 in the coarse model, and $\begin{bmatrix} v_1 \\ \dot{q}_1 \end{bmatrix}^f$ represents the velocity of the same handle in the fine-scale model. In these matrices, $v_1$ and $\dot{q}_1$ are the spatial velocity vector of handle-1 and the associated generalized modal speeds, respectively.

As discussed previously, the change in the system model definition may occur by changing the number of flexible modes used to describe the behavior of flexible bodies, and/or the number of *dof* of the connecting joints. To implement these changes in the system model mathematically, the joint free-motion map is defined as follows.

The joint free-motion map $P_R^{J^k}$ can be interpreted as the $6 \times \nu^k$ matrix of the free modes of motion permitted by the $\nu^k$ degree-of-freedom joint $J^k$. In other words, $P_R^{J^k}$ maps the $\nu^k \times 1$ generalized speeds associated with the relative free motion permitted by the joint into the $6 \times 1$ spatial relative velocity vector which may occur across the joint $J^k$ [5]. For instance, consider a transition in which a spherical joint in the system is altered, where only one *dof* is locked about the first axis. The joint free-motion maps of the fine and coarse models in this case are shown in the following:

$$P_R^{J^k f} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad P_R^{J^k c} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \tag{1}$$

We define the orthogonal complement of the joint free-motion map, $D_R^k$. As such, by definition one arrives at the following

$$(D_R^k)^T P_R^{J^k} = 0 \tag{2}$$
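As a concrete check of Eqs. (1) and (2), the sketch below builds the two joint free-motion maps and one possible orthogonal complement $D_R^k$ for the coarse joint (the particular $D_R^k$ shown is our own illustrative choice; any basis of the complementary space would do):

```python
# Joint free-motion maps from Eq. (1): spherical joint (fine, 3 dof) and the
# same joint with rotation about the first axis locked (coarse, 2 dof).
P_fine = [[1, 0, 0], [0, 1, 0], [0, 0, 1],
          [0, 0, 0], [0, 0, 0], [0, 0, 0]]
P_coarse = [[0, 0], [1, 0], [0, 1],
            [0, 0], [0, 0], [0, 0]]

def matmul_T(D, P):
    """Compute D^T P for 6 x p matrices stored row-wise."""
    rows, cols = len(D[0]), len(P[0])
    return [[sum(D[k][i] * P[k][j] for k in range(len(D)))
             for j in range(cols)] for i in range(rows)]

# An orthogonal complement for the coarse joint: it spans the locked
# rotational axis and the three translational directions, so (D_R)^T P = 0.
D_coarse = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0],
            [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(matmul_T(D_coarse, P_coarse))  # all zeros: Eq. (2) holds
```

Locking one *dof* simply deletes the corresponding column of $P_R^{J^k}$ and moves that direction into the complement $D_R^k$.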

#### **3. Optimization problem**


Any violation of the conservation of the generalized momentum of the system in the transition between different models leads to non-physical results, since the instantaneous switch in the system model definition is incurred without the influence of any external load. In other words, the momentum of each differential element, projected onto the space of the admissible motions permitted by the more restrictive model (whether pre- or post-transition) and integrated over the entire system, must be conserved across the model transition [6]. Jumps in the system partial velocities due to the sudden change in the model resolution result in jumps in the generalized speeds corresponding to the new set of degrees of freedom. Since the model is instantaneously swapped, the position of the system does not change. Hence, the position-dependent forces acting on the system do not change, and do not affect the generalized speeds. Any change in the applied loads (e.g., damping terms) which might occur due to the change in the model definition and the associated velocity jumps does not contribute to the impulse-momentum equations which describe the model transition. This is because these changes in the applied loads are bounded, and are integrated over the infinitesimally short transition time.

Consider a fine-scale model with *n* *dof*. Let the *dof* of the model reduce to *n* − *m* after the imposition of certain instantaneous constraints. In this case, the conservation of the


generalized momentum of the system is expressed as

$$L^{c/c} = L^{f/c} \tag{3}$$

In the above equation, $L^{c/c}$ and $L^{f/c}$ represent the momenta of the coarse and fine models, respectively, projected onto the space of partial velocity vectors of the coarse model. Equation (3) provides a set of *n* − *m* equations which are linear in the generalized speeds of the coarse model and solvable for the unique and physically meaningful states of the system after the transition to the coarser model.

Now consider the case in which, the coarse model is transitioned back to the fine-scale model. Equation (3) is still valid, and provides *n* − *m* equations with *n* unknown generalized speeds of the finer model. Furthermore, during the coarsening process, the level of the kinetic energy also drops because we chose to ignore certain modes of the system. However, in actual biomolecular systems such a decrease in energy does not happen. Consequently, it is important to realize the proper kinetic energy when transitioning back to the fine-scale model. Therefore, the following equation must be satisfied

$$KE^f = \frac{1}{2} (u^f)^T \mathcal{M} u^f \tag{4}$$

In the above equation, $u^f$ is the *n* × 1 column matrix containing the generalized speeds of the fine model, and $\mathcal{M}$ represents the generalized inertia matrix of the fine model. It is clear that Eqs. (3) and (4) provide *n* − *m* + 1 equations with *n* unknowns. This indicates that the problem is under-determined when multiple *dof* of the system are released. We may arrive at a unique solution, or a finite number of solutions, by solving the following optimization problem

$$\text{Optimize} \qquad J(u^f, t) \tag{5}$$

$$\text{Subject to} \quad \Theta_i(u^f, t) = 0, \; i = 1, \dots, k \tag{6}$$

In the above equation, *J* is the physics- or knowledge- or mathematics-based objective function to be optimized (nominally minimized) subjected to the constraint equations Θ*i*. In [1, 15], different objective functions are proposed for coarse to fine-scale transition problems. For instance, in order to prevent the generalized speeds of the new fine-scale model from deviating greatly from those of the coarse scale model, we may minimize the *L*<sup>2</sup> norm of the difference between the generalized speeds of the coarse and fine scale models as follows

$$J = (u^f - u^c)^T (u^f - u^c) \tag{7}$$

As indicated previously, (*n* − *m*) constraint equations governing the optimization problem are obtained from the conservation of the generalized momentum of the system within the transition. The rest of the constraint equations are obtained from other information about the system, such as the specific value of kinetic energy or the temperature of the system.

The generalized momenta balance equations from Eq. (3) are expressed as

$$A u^f = b \tag{8}$$

where $A$ and $b$ are known matrices of dimensions $(n - m) \times n$ and $(n - m) \times 1$, respectively, and $u^f$ is the $n \times 1$ column matrix of the generalized speeds of the fine-scale system model. As a part of the optimization problem, one must solve this linear system for the *n* − *m* dependent generalized speeds in terms of the *m* independent generalized speeds. Therefore, the optimization is performed on a much smaller number (*m*) of variables, with a cost of $O(m^3)$. For a complex molecular system, *n* could be very large, and $n \gg m$; hence a significant reduction is achieved in the overall cost of optimization as compared to other traditional techniques, such as Lagrange multipliers [3]. However, the computations required to find the relations between dependent and independent generalized speeds can impose a significant burden on these simulations. It is shown in [2] that if traditional methods such as Gaussian elimination or LU factorization are used to find these relations, this cost tends to be $O(n(n-m)^2)$. The DCA scheme provided here finds these relations at the end of the hierarchic disassembly process with a computational complexity of almost $O(n)$ in serial implementation. In other words, in this strategy, DCA formulates the impulse-momentum equations of the system and then provides the relations between dependent and independent generalized speeds of the system in a timely manner. As such, this significantly reduces the costs associated with forming and solving the optimization problem in the transitions to the finer models.
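When only the momentum constraints (8) are retained and the energy constraint (4) is dropped, minimizing (7) has the standard closed form $u^f = u^c + A^T(AA^T)^{-1}(b - Au^c)$. The sketch below (our own simplification, not the chapter's DCA-based solver) works this out for a single momentum constraint, i.e. $n - m = 1$:

```python
def min_norm_update(a, b, u_c):
    """Minimize ||u_f - u_c||^2 subject to a . u_f = b (a single momentum
    constraint): u_f = u_c + a * (b - a . u_c) / (a . a)."""
    dot = lambda p, q: sum(pi * qi for pi, qi in zip(p, q))
    lam = (b - dot(a, u_c)) / dot(a, a)
    return [uc + lam * ai for uc, ai in zip(u_c, a)]

a = [1.0, 2.0]      # row of A: generalized momentum coefficients
u_c = [1.0, 1.0]    # coarse-model generalized speeds
u_f = min_norm_update(a, 5.0, u_c)
print(u_f)          # satisfies a . u_f = 5 exactly
```

With the energy constraint (4) included, the problem is no longer a plain least-squares projection, and the constraints $\Theta_i$ must be handled by a general constrained optimizer.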

#### **4. DCA-based momenta balance for multi-flexible bodies**

6 Will-be-set-by-IN-TECH

In the above equation, *Lc*/*<sup>c</sup>* and *L<sup>f</sup>* /*<sup>c</sup>* represent the momenta of the coarse and fine models, respectively, projected on to the space of partial velocity vectors of the coarse model. Equation (3) provides a set of *n* − *m* equations which are linear in the generalized speeds of the coarse model and solvable for the unique and physically meaningful states of the system after the

Now consider the case in which, the coarse model is transitioned back to the fine-scale model. Equation (3) is still valid, and provides *n* − *m* equations with *n* unknown generalized speeds of the finer model. Furthermore, during the coarsening process, the level of the kinetic energy also drops because we chose to ignore certain modes of the system. However, in actual biomolecular systems such a decrease in energy does not happen. Consequently, it is important to realize the proper kinetic energy when transitioning back to the fine-scale model.

Since the generalized momentum of the system is conserved within the transition to the coarser model, the following equation must be satisfied

$$
L^{c/c} = L^{f/c} \tag{3}
$$

Moreover, the kinetic energy of the fine-scale model is expressed as

$$
KE^f = \frac{1}{2}\,(u^f)^T \mathcal{M}\, u^f \tag{4}
$$

In the above equation, $u^f$ is the $n \times 1$ column matrix containing the generalized speeds of the fine model, and $\mathcal{M}$ represents the generalized inertia matrix of the fine model. It is clear that Eqs. (3) and (4) provide $n - m + 1$ equations with $n$ unknowns. This indicates that the problem is under-determined when multiple *dof* of the system are released. We may arrive at a unique or finite number of solutions by solving the following optimization problem

$$
\text{Optimize} \quad J(u^f, t) \tag{5}
$$

$$
\text{Subjected to} \quad \Theta_i(u^f, t) = 0, \quad i = 1, \cdots, k \tag{6}
$$

In the above, $J$ is the physics-, knowledge-, or mathematics-based objective function to be optimized (nominally minimized) subject to the constraint equations $\Theta_i$. In [1, 15], different objective functions are proposed for coarse-to-fine-scale transition problems. For instance, in order to prevent the generalized speeds of the new fine-scale model from deviating greatly from those of the coarse-scale model, we may minimize the $L^2$ norm of the difference between the generalized speeds of the coarse- and fine-scale models as follows

$$
J = (u^f - u^c)^T (u^f - u^c) \tag{7}
$$

As indicated previously, $(n - m)$ constraint equations governing the optimization problem are obtained from the conservation of the generalized momentum of the system within the transition. The rest of the constraint equations are obtained from other information about the system, such as a specific value of the kinetic energy or the temperature of the system. The generalized momenta balance equations from Eq. (3) are expressed as

$$
A u^f = b \tag{8}
$$

where $A$ and $b$ are $(n - m) \times n$ and $(n - m) \times 1$ known matrices, respectively, and $u^f$ is the $n \times 1$ column matrix of the generalized speeds of the fine-scale system model. As a part of the optimization problem, Eq. (8) supplies the momentum-conservation constraints among the $\Theta_i$.
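With the quadratic objective of Eq. (7) and the linear constraints of Eq. (8), the transition reduces to an equality-constrained least-squares problem, whose solution is the minimum-norm correction $u^f = u^c + A^T(AA^T)^{-1}(b - Au^c)$. A minimal numerical sketch of this idea, with randomly generated stand-ins for $A$, $b$, and $u^c$ (sizes and values are illustrative, not from the chapter):

```python
import numpy as np

# Sketch: solve  minimize (u_f - u_c)^T (u_f - u_c)  subject to  A u_f = b
# (Eqs. (5)-(8) with the objective of Eq. (7)) via the KKT system.
rng = np.random.default_rng(0)
n, m = 6, 2                            # n fine-model speeds, n - m momentum constraints
A = rng.standard_normal((n - m, n))    # (n-m) x n constraint matrix (full row rank a.s.)
u_c = rng.standard_normal(n)           # coarse-model generalized speeds
b = A @ u_c + 0.1                      # some consistent right-hand side

# KKT system:  [2I  A^T; A  0] [u_f; mu] = [2 u_c; b]
K = np.block([[2 * np.eye(n), A.T],
              [A, np.zeros((n - m, n - m))]])
rhs = np.concatenate([2 * u_c, b])
u_f = np.linalg.solve(K, rhs)[:n]

assert np.allclose(A @ u_f, b)         # momentum balance holds after the transition
# matches the closed-form minimum-norm correction
assert np.allclose(u_f, u_c + A.T @ np.linalg.solve(A @ A.T, b - A @ u_c))
```

The KKT route generalizes directly to objectives other than Eq. (7), whereas the closed form in the final assertion is specific to the $L^2$ objective.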

In this section, two-handle impulse-momentum equations of flexible bodies are derived. Mathematical modeling of the transition from a coarse model to a fine-scale model is discussed. For the fine-scale to coarse-scale model transition in multi-flexible-body systems, the reader is referred to [7, 17]. We first derive the two-handle impulse-momentum equations when flexible degrees of freedom of a flexible body or joints in the system are released. Then, the assembly of two consecutive bodies for which the connecting joint is unlocked is discussed. Finally, the hierarchic assembly-disassembly process for the multi-flexible-body system is presented.

#### **4.1. Two-handle impulse-momentum equations in coarse to fine transitions**

Now, we develop the two-handle impulse-momentum equations for consecutive flexible bodies in the transition from a coarse model to a fine-scale model. It is desired to develop the handle equations which express the spatial velocity vectors of the handles after the transition to the finer model as explicit functions of only newly introduced modal generalized speeds of the fine model. For this purpose, we start from the impulse-momentum equation of the flexible body as

$$
\Gamma^f v_1^f - \Gamma^c v_1^c = \begin{bmatrix} \gamma_R \\ \gamma_F \end{bmatrix}_1^c \int_{t_c}^{t_f} F_{1c}\, dt + \begin{bmatrix} \gamma_R \\ \gamma_F \end{bmatrix}_2^c \int_{t_c}^{t_f} F_{2c}\, dt \tag{9}
$$

where $\Gamma^f$ and $\Gamma^c$ are the inertia matrices associated with the fine-scale and coarse models, respectively. Also, $t_c$ and $t_f$ represent the times right before and right after the transition. The quantities $\int_{t_c}^{t_f} F_{1c}\, dt$ and $\int_{t_c}^{t_f} F_{2c}\, dt$ are the spatial impulsive constraint forces on handle-1 and handle-2 of the flexible body. The matrices $\begin{bmatrix} \gamma_R \\ \gamma_F \end{bmatrix}_1^c$ and $\begin{bmatrix} \gamma_R \\ \gamma_F \end{bmatrix}_2^c$ are the coefficients resulting from the generalized constraint force contribution at handle-1 and handle-2, respectively. Moreover, in Eq. (9), the impulses due to the applied loads are not considered, since they represent bounded loads integrated over an infinitesimal time interval. For a detailed derivation of these quantities the reader is referred to [8]. It is desired to develop the handle equations which provide the spatial velocity vectors of the handles right after the transition to the fine-scale model in terms of the newly added modal generalized speeds. Therefore, in Eq. (9), the inertia matrix of the flexible body is represented by its components corresponding to the rigid and flexible modes, as well as the coupling terms

$$
\begin{aligned}
\begin{bmatrix} \Gamma_{RR} & \Gamma_{RF} \\ \Gamma_{FR} & \Gamma_{FF} \end{bmatrix}^{f}
\begin{bmatrix} v_1 \\ \dot{q} \end{bmatrix}^{f}
&= \begin{bmatrix} \gamma_R \\ \gamma_F \end{bmatrix}_1^{c} \int_{t_c}^{t_f} F_{1c}\, dt
 + \begin{bmatrix} \gamma_R \\ \gamma_F \end{bmatrix}_2^{c} \int_{t_c}^{t_f} F_{2c}\, dt \\
&\quad + \begin{bmatrix} \Gamma_{RR} & \Gamma_{RF} \\ \Gamma_{FR} & \Gamma_{FF} \end{bmatrix}^{c}
\begin{bmatrix} v_1 \\ \dot{q} \end{bmatrix}^{c}
\end{aligned}
\tag{10}
$$

which is decomposed to the following relations

$$
\Gamma_{FF}^{f}\, \dot{q}^{f} = \gamma_{F1}^{c} \int_{t_c}^{t_f} F_{1c}\, dt + \gamma_{F2}^{c} \int_{t_c}^{t_f} F_{2c}\, dt + \Gamma_{FR}^{c}\, v_1^{c} + \Gamma_{FF}^{c}\, \dot{q}^{c} - \Gamma_{FR}^{f}\, v_1^{f} \tag{11}
$$

$$
\Gamma_{RR}^{f}\, v_1^{f} = \gamma_{R1}^{c} \int_{t_c}^{t_f} F_{1c}\, dt + \gamma_{R2}^{c} \int_{t_c}^{t_f} F_{2c}\, dt + \Gamma_{RR}^{c}\, v_1^{c} + \Gamma_{RF}^{c}\, \dot{q}^{c} - \Gamma_{RF}^{f}\, \dot{q}^{f} \tag{12}
$$

Since the generalized momentum equations are calculated based on the projection onto the space of the coarser model, the matrix $\Gamma^f$ is not a square matrix and thus $\Gamma_{FF}^{f}$ is not invertible. However, we can partition Eq. (11) in terms of dependent (those associated with the coarser model) and independent (newly introduced) generalized speeds as

$$
\begin{aligned}
\begin{bmatrix} \Gamma_{FF}^{f_d} \; \vdots \; \Gamma_{FF}^{f_i} \end{bmatrix}
\begin{bmatrix} \dot{q}^{f_d} \\ \cdots \\ \dot{q}^{f_i} \end{bmatrix}
&= \gamma_{F1}^{c} \int_{t_c}^{t_f} F_{1c}\, dt + \gamma_{F2}^{c} \int_{t_c}^{t_f} F_{2c}\, dt \\
&\quad + \Gamma_{FR}^{c}\, v_1^{c} + \Gamma_{FF}^{c}\, \dot{q}^{c} - \Gamma_{FR}^{f}\, v_1^{f}
\end{aligned}
\tag{13}
$$

Using the above relation, the expression for the dependent generalized modal speeds is written as

$$
\begin{split}
\dot{q}^{f_d} &= (\Gamma_{FF}^{f_d})^{-1} \Big[ \gamma_{F1}^{c} \int_{t_c}^{t_f} F_{1c}\, dt + \gamma_{F2}^{c} \int_{t_c}^{t_f} F_{2c}\, dt + \Gamma_{FR}^{c}\, v_1^{c} \\
&\quad + \Gamma_{FF}^{c}\, \dot{q}^{c} - \Gamma_{FR}^{f}\, v_1^{f} - \Gamma_{FF}^{f_i}\, \dot{q}^{f_i} \Big]
\end{split} \tag{14}
$$

Defining

$$
\Gamma_{RF}^{f} = \begin{bmatrix} \Gamma_{RF}^{f_d} \; \vdots \; \Gamma_{RF}^{f_i} \end{bmatrix} \tag{15}
$$

$$\Lambda = \left[\Gamma\_{RR}^f - \Gamma\_{RF}^{f\_d} (\Gamma\_{FF}^{f\_d})^{-1} \Gamma\_{FR}^f\right]^{-1} \tag{16}$$

$$\zeta_1 = \left[ \gamma_{R1}^{c} - \Gamma_{RF}^{f_d} (\Gamma_{FF}^{f_d})^{-1} \gamma_{F1}^{c} \right] \tag{17}$$

$$\zeta_2 = \left[ \gamma_{R2}^{c} - \Gamma_{RF}^{f_d} (\Gamma_{FF}^{f_d})^{-1} \gamma_{F2}^{c} \right] \tag{18}$$

$$\zeta_3 = \left[ \Gamma_{RR}^{c} - \Gamma_{RF}^{f_d} (\Gamma_{FF}^{f_d})^{-1} \Gamma_{FR}^{c} \right] v_1^{c} + \left[ \Gamma_{RF}^{c} - \Gamma_{RF}^{f_d} (\Gamma_{FF}^{f_d})^{-1} \Gamma_{FF}^{c} \right] \dot{q}^{c} \tag{19}$$

$$\zeta_4 = \left[ \Gamma_{RF}^{f_d} (\Gamma_{FF}^{f_d})^{-1} \Gamma_{FF}^{f_i} - \Gamma_{RF}^{f_i} \right] \tag{20}$$

$$
\lambda_{1i} = \Lambda\, \zeta_i, \quad (i = 1, 2, 3, 4) \tag{21}
$$

and using Eqs. (12) and (14), the spatial velocity vector of handle-1 can be written in terms of the independent modal speeds

$$v_1^f = \lambda_{11} \int_{t_c}^{t_f} F_{1c}\, dt + \lambda_{12} \int_{t_c}^{t_f} F_{2c}\, dt + \lambda_{13} + \lambda_{14}\, \dot{q}^{f_i} \tag{22}$$
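The elimination in Eqs. (14)–(16) is a block (Schur-complement) solve: substituting the dependent modal speeds of Eq. (14) into Eq. (12) leaves exactly the operator $\Lambda$ of Eq. (16) acting on the handle unknowns. A numerical sanity check of that structure, with randomly generated stand-ins for the fine-model inertia blocks (sizes and values are illustrative, not from the chapter):

```python
import numpy as np

# Sketch of the Schur-complement structure behind Eqs. (14)-(16):
# Lambda = [G_RR - G_RF (G_FF)^-1 G_FR]^-1 equals the upper-left block of
# the inverse of the full partitioned inertia matrix.
rng = np.random.default_rng(1)
r, d = 6, 4                               # rigid (handle) and modal block sizes
M = rng.standard_normal((r + d, r + d))
M = M @ M.T + (r + d) * np.eye(r + d)     # SPD stand-in, so every block inverts safely
G_RR, G_RF = M[:r, :r], M[:r, r:]
G_FR, G_FF = M[r:, :r], M[r:, r:]

Lam = np.linalg.inv(G_RR - G_RF @ np.linalg.inv(G_FF) @ G_FR)

# Schur-complement identity: Lam is the upper-left r x r block of M^-1
assert np.allclose(Lam, np.linalg.inv(M)[:r, :r])
```

This is why the reduction is cheap: only the small dependent-mode block $\Gamma_{FF}^{f_d}$ is factored, never the full fine-model inertia matrix.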

As such, the spatial velocity vector of handle-2 becomes

$$v\_2^f = (S^{k1k2})^T v\_1^f + \phi\_2^f \dot{q}^f \tag{23}$$

Employing the same partitioning technique, Eq. (23) can be written as

$$
v_2^{f} = (S^{k1k2})^{T} v_1^{f} + \begin{bmatrix} \phi_2^{f_d} \; \vdots \; \phi_2^{f_i} \end{bmatrix} \begin{bmatrix} \dot{q}^{f_d} \\ \cdots \\ \dot{q}^{f_i} \end{bmatrix} \tag{24}
$$

$$\Rightarrow v_2^f = (S^{k1k2})^T v_1^f + \phi_2^{f_d}\, \dot{q}^{f_d} + \phi_2^{f_i}\, \dot{q}^{f_i} \tag{25}$$

Using


$$
\lambda_{21} = \left[ (S^{k1k2})^T \lambda_{11} + \phi_2^{f_d} (\Gamma_{FF}^{f_d})^{-1} \gamma_{F1}^{c} - \phi_2^{f_d} (\Gamma_{FF}^{f_d})^{-1} \Gamma_{FR}^{f}\, \lambda_{11} \right] \tag{26}
$$

$$\lambda\_{22} = \left[ (\mathbf{S}^{\rm k1k2})^T \lambda\_{12} + \phi\_2^{f\_d} (\Gamma\_{FF}^{f\_d})^{-1} \gamma\_{F2}^c - \phi\_2^{f\_d} (\Gamma\_{FF}^{f\_d})^{-1} \Gamma\_{FR}^f \lambda\_{12} \right] \tag{27}$$

$$
\begin{split}
\lambda_{23} &= \Big[ (S^{k1k2})^T \lambda_{13} + \phi_2^{f_d} (\Gamma_{FF}^{f_d})^{-1} \Gamma_{FR}^{c}\, v_1^{c} + \phi_2^{f_d} (\Gamma_{FF}^{f_d})^{-1} \Gamma_{FF}^{c}\, \dot{q}^{c} \\
&\quad - \phi_2^{f_d} (\Gamma_{FF}^{f_d})^{-1} \Gamma_{FR}^{f}\, \lambda_{13} \Big]
\end{split} \tag{28}
$$

$$
\lambda_{24} = (S^{k1k2})^T \lambda_{14} + \phi_2^{f_i} - \phi_2^{f_d} (\Gamma_{FF}^{f_d})^{-1} \Gamma_{FF}^{f_i} - \phi_2^{f_d} (\Gamma_{FF}^{f_d})^{-1} \Gamma_{FR}^{f}\, \lambda_{14} \tag{29}
$$

and Eq. (25), the spatial velocity vector of handle-2 can be written as

$$
v_2^{f} = \lambda_{21} \int_{t_c}^{t_f} F_{1c}\, dt + \lambda_{22} \int_{t_c}^{t_f} F_{2c}\, dt + \lambda_{23} + \lambda_{24}\, \dot{q}^{f_i} \tag{30}
$$

Equations (22) and (30) are now in two-handle impulse-momentum form and, along with Eq. (14), give the new velocities associated with each handle after the transition. These equations express the spatial velocity vectors of the handles of the body, as well as the modal generalized speeds which have not changed within the transition, in terms of the newly added modal generalized speeds. This important property will be used in the optimization problem to provide the states of the system after the transition to the finer models.

As such, the two-handle equations describing the impulse-momentum of two consecutive bodies, body *k* and body *k* + 1, are expressed as

$$v_1^{(k)f} = \lambda_{11}^{(k)} \int_{t_c}^{t_f} F_{1c}^{(k)}\, dt + \lambda_{12}^{(k)} \int_{t_c}^{t_f} F_{2c}^{(k)}\, dt + \lambda_{13}^{(k)} + \lambda_{14}^{(k)}\, \dot{q}^{(k)f_i} \tag{31}$$

$$v_2^{(k)f} = \lambda_{21}^{(k)} \int_{t_c}^{t_f} F_{1c}^{(k)}\, dt + \lambda_{22}^{(k)} \int_{t_c}^{t_f} F_{2c}^{(k)}\, dt + \lambda_{23}^{(k)} + \lambda_{24}^{(k)}\, \dot{q}^{(k)f_i} \tag{32}$$

$$v_1^{(k+1)f} = \lambda_{11}^{(k+1)} \int_{t_c}^{t_f} F_{1c}^{(k+1)}\, dt + \lambda_{12}^{(k+1)} \int_{t_c}^{t_f} F_{2c}^{(k+1)}\, dt + \lambda_{13}^{(k+1)} + \lambda_{14}^{(k+1)}\, \dot{q}^{(k+1)f_i} \tag{33}$$

$$v_2^{(k+1)f} = \lambda_{21}^{(k+1)} \int_{t_c}^{t_f} F_{1c}^{(k+1)}\, dt + \lambda_{22}^{(k+1)} \int_{t_c}^{t_f} F_{2c}^{(k+1)}\, dt + \lambda_{23}^{(k+1)} + \lambda_{24}^{(k+1)}\, \dot{q}^{(k+1)f_i} \tag{34}$$

#### **4.2. Assembly process and releasing the joint between two consecutive bodies**

In this section, a method to combine the two-handle equations of individual flexible bodies to form the equations of the resulting assembly is presented. Herein, the assembly process of the consecutive bodies is discussed only within the transition from a coarse model to a finer model. This transition is achieved by releasing the joint between two consecutive bodies. Clearly, this entails a change in the joint free-motion map $P_R^{J^k}$ and its orthogonal complement $D_R^{J^k}$. It will become evident that the assembly process of the consecutive bodies for the fine-to-coarse transition is similar, and the associated equations can be easily derived by following the given procedure.
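The joint free-motion map and its orthogonal complement partition the 6-dimensional spatial-velocity space between free and constrained relative motion. A minimal sketch for a single revolute joint (the z-axis choice, the 6-dim spatial-vector ordering, and the matrix shapes are illustrative assumptions, not the chapter's definitions):

```python
import numpy as np

# Sketch: for a revolute joint about z, the free-motion map P_R spans the one
# allowed relative dof, and its orthogonal complement D_R spans the five
# constrained directions in which the impulsive constraint loads act.
P_R = np.zeros((6, 1))
P_R[2, 0] = 1.0                          # free: relative rotation about z
D_R = np.delete(np.eye(6), 2, axis=1)    # constrained: the remaining 5 dof

assert np.allclose(P_R.T @ D_R, 0)                           # orthogonal complement
assert np.linalg.matrix_rank(np.hstack([P_R, D_R])) == 6     # together they span R^6
```

Releasing the joint during a coarse-to-fine transition enlarges the column space of $P_R$ and correspondingly shrinks that of $D_R$.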

From the definition of joint free-motion map, the relative spatial velocity vector at the joint between two consecutive bodies is expressed by the following kinematic constraint

$$
v_1^{(k+1)f} - v_2^{(k)f} = P_R^{J^k f}\, u^{(k/k+1)f} \tag{35}
$$

In the above equation, $u^{(k/k+1)f}$ is the relative generalized speed defined at the joint of the fine model. From Newton's third law of motion, the impulses at the intermediate joint are related by

$$\int_{t_c}^{t_f} F_{2c}^{(k)}\, dt = -\int_{t_c}^{t_f} F_{1c}^{(k+1)}\, dt \tag{36}$$

Substituting Eqs. (32), (33) and (36) into Eq. (35) results in

$$
\begin{split}
\left( \lambda_{11}^{(k+1)} + \lambda_{22}^{(k)} \right) \int_{t_c}^{t_f} F_{1c}^{(k+1)}\, dt &= \lambda_{21}^{(k)} \int_{t_c}^{t_f} F_{1c}^{(k)}\, dt - \lambda_{12}^{(k+1)} \int_{t_c}^{t_f} F_{2c}^{(k+1)}\, dt \\
&\quad + \lambda_{23}^{(k)} - \lambda_{13}^{(k+1)} + \lambda_{24}^{(k)}\, \dot{q}^{(k)f_i} - \lambda_{14}^{(k+1)}\, \dot{q}^{(k+1)f_i} + P_R^{J^k f}\, u^{(k/k+1)f}
\end{split} \tag{37}
$$

By the definition of the joint free-motion map, the spatial constraint impulses lie exactly in the space spanned by the orthogonal complement of the joint free-motion map of the *coarser* model. These constraint impulses can be expressed as

$$\int_{t_c}^{t_f} F_{1c}^{(k+1)}\, dt = D_R^{J^k c} \int_{t_c}^{t_f} \mathbf{F}_{1c}^{(k+1)}\, dt \tag{38}$$

In the above equation, $\int_{t_c}^{t_f} \mathbf{F}_{1c}^{(k+1)}\, dt$ is an ordered measure number of the impulsive constraint torques and forces. Pre-multiplying Eq. (37) by $(D_R^{J^k c})^T$, one arrives at the expression for $\int_{t_c}^{t_f} F_{1c}^{(k+1)}\, dt$ as

$$
\begin{split}
\int_{t_c}^{t_f} F_{1c}^{(k+1)}\, dt &= X \lambda_{21}^{(k)} \int_{t_c}^{t_f} F_{1c}^{(k)}\, dt - X \lambda_{12}^{(k+1)} \int_{t_c}^{t_f} F_{2c}^{(k+1)}\, dt \\
&\quad + X Y + X \lambda_{24}^{(k)}\, \dot{q}^{(k)f_i} - X \lambda_{14}^{(k+1)}\, \dot{q}^{(k+1)f_i} + X P_R^{J^k f}\, u^{(k/k+1)f}
\end{split} \tag{39}
$$

where

$$X = D_R^{J^k c} \left[ (D_R^{J^k c})^T \left( \lambda_{11}^{(k+1)} + \lambda_{22}^{(k)} \right) D_R^{J^k c} \right]^{-1} (D_R^{J^k c})^T \tag{40}$$

$$Y = \lambda\_{23}^{(k)} - \lambda\_{13}^{(k+1)} \tag{41}$$
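The operator $X$ of Eq. (40) acts as an oblique projection weighted by $S = \lambda_{11}^{(k+1)} + \lambda_{22}^{(k)}$: it satisfies $XSX = X$, so constraint impulses already lying in the span of $D_R^{J^k c}$ are reproduced consistently. A small numerical check under assumed data (a random SPD stand-in for $S$ and a full-column-rank stand-in for $D_R^{J^k c}$):

```python
import numpy as np

# Sketch of the projection-like identity X S X = X for X of Eq. (40),
# with X = D (D^T S D)^-1 D^T. S and D are illustrative stand-ins.
rng = np.random.default_rng(2)
S = rng.standard_normal((6, 6))
S = S @ S.T + 6 * np.eye(6)        # SPD stand-in for lambda_11 + lambda_22
D = rng.standard_normal((6, 3))    # 3 constrained dof, full column rank a.s.

X = D @ np.linalg.inv(D.T @ S @ D) @ D.T

assert np.allclose(X @ S @ X, X)   # oblique-projection identity
```

The small $3 \times 3$ factorization inside $X$ is what keeps the per-assembly cost of the DCA sweep constant.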

Using Eqs. (31), (34), and (39), we write the two-handle equations for the assembly *k* : *k* + 1 as

$$
\begin{split}
v_1^{(k:k+1)f} &= \Psi_{11}^{(k:k+1)} \int_{t_c}^{t_f} F_{1c}^{(k)}\, dt + \Psi_{12}^{(k:k+1)} \int_{t_c}^{t_f} F_{2c}^{(k+1)}\, dt \\
&\quad + \Psi_{13}^{(k:k+1)} + \Psi_{14}^{(k:k+1)}\, \dot{q}^{(k)f_i} + \Psi_{15}^{(k:k+1)}\, \dot{q}^{(k+1)f_i} + \Psi_{16}^{(k:k+1)}\, u^{(k/k+1)f}
\end{split} \tag{42}
$$

$$
\begin{split}
v_2^{(k:k+1)f} &= \Psi_{21}^{(k:k+1)} \int_{t_c}^{t_f} F_{1c}^{(k)}\, dt + \Psi_{22}^{(k:k+1)} \int_{t_c}^{t_f} F_{2c}^{(k+1)}\, dt \\
&\quad + \Psi_{23}^{(k:k+1)} + \Psi_{24}^{(k:k+1)}\, \dot{q}^{(k)f_i} + \Psi_{25}^{(k:k+1)}\, \dot{q}^{(k+1)f_i} + \Psi_{26}^{(k:k+1)}\, u^{(k/k+1)f}
\end{split} \tag{43}
$$

where:


$$\Psi\_{11}^{(k:k+1)} = \lambda\_{11}^{(k)} - \lambda\_{12}^{(k)} X \lambda\_{21}^{(k)} \tag{44}$$

$$\Psi\_{12}^{(k:k+1)} = \lambda\_{12}^{(k)} X \lambda\_{12}^{(k+1)} \tag{45}$$

$$\Psi\_{13}^{(k:k+1)} = \lambda\_{13}^{(k)} - \lambda\_{12}^{(k)} XY \tag{46}$$

$$\Psi_{14}^{(k:k+1)} = \lambda_{14}^{(k)} - \lambda_{12}^{(k)} X \lambda_{24}^{(k)} \tag{47}$$

$$\Psi\_{15}^{(k:k+1)} = \lambda\_{12}^{(k)} X \lambda\_{14}^{(k+1)} \tag{48}$$

$$\Psi_{16}^{(k:k+1)} = -\lambda_{12}^{(k)} X P_R^{J^k f} \tag{49}$$

$$\Psi\_{21}^{(k:k+1)} = \lambda\_{21}^{(k+1)} X \lambda\_{21}^{(k)} \tag{50}$$

$$\Psi\_{22}^{(k:k+1)} = \lambda\_{22}^{(k+1)} - \lambda\_{21}^{(k+1)} X \lambda\_{12}^{(k+1)} \tag{51}$$

$$
\Psi\_{23}^{(k:k+1)} = \lambda\_{21}^{(k+1)} XY + \lambda\_{23}^{(k+1)} \tag{52}
$$

$$\Psi\_{24}^{(k:k+1)} = \lambda\_{21}^{(k+1)} X \lambda\_{24}^{(k)} \tag{53}$$

$$\Psi_{25}^{(k:k+1)} = \lambda_{24}^{(k+1)} - \lambda_{21}^{(k+1)} X \lambda_{14}^{(k+1)} \tag{54}$$

$$\Psi_{26}^{(k:k+1)} = \lambda_{21}^{(k+1)} X P_R^{J^k f} \tag{55}$$

The two-handle equations of the resultant assembly express the spatial velocity vectors of the terminal handles of the assembly in terms of the spatial constraint impulses on the same handles, as well as the newly added modal generalized speeds of each constituent flexible body and the newly introduced *dof* at the connecting joint. These are the equations which address the dynamics of the assembly when both types of transitions occur simultaneously. In other words, they are applicable when new flexible modes are added to the flexible constituent subassemblies and new degrees of freedom are released at the connecting joint. If there is no change in the joint free-motion map, the spatial partial velocity vector associated with $u^{(k/k+1)f}$ does not appear in the handle equations of the resulting assembly.

#### **5. Hierarchic assembly-disassembly**

The DCA is implemented in two main passes: assembly and disassembly [8, 9]. As mentioned previously, two consecutive bodies can be combined to recursively form the handle equations of the resulting assembly. As such, the assembly process starts at the individual sub-domain level of the binary tree, combining adjacent bodies to form the equations of motion of each resulting assembly. This process is applied recursively up the binary tree to find the impulse-momentum equations of the new assemblies. In this process, the spatial velocity vector (after transition) and the impulsive load of the handles at the common joint of the consecutive bodies are eliminated. The handle equations of the resulting assembly are expressed in terms of the constraint impulses and spatial velocities of the terminal handles, as well as the newly introduced modal generalized speeds and the generalized speeds associated with the newly added degrees of freedom at the connecting joints. This process stops at the top level of the binary tree, at which the impulse-momentum equations of the entire system are expressed by the following two-handle equations

$$
\begin{split}
v_1^{1f} &= \Psi_{11}^{(1:n)} \int_{t_c}^{t_f} F_{1c}^{1}\, dt + \Psi_{12}^{(1:n)} \int_{t_c}^{t_f} F_{2c}^{n}\, dt + \Psi_{13}^{(1:n)} \\
&\quad + \Psi_{14}^{(1:n)}\, \dot{q}^{(1:n)f_i} + \Psi_{15}^{(1:n)}\, u^{(1:n)f}
\end{split} \tag{56}
$$

$$
\begin{split}
v_2^{nf} &= \Psi_{21}^{(1:n)} \int_{t_c}^{t_f} F_{1c}^{1}\, dt + \Psi_{22}^{(1:n)} \int_{t_c}^{t_f} F_{2c}^{n}\, dt + \Psi_{23}^{(1:n)} \\
&\quad + \Psi_{24}^{(1:n)}\, \dot{q}^{(1:n)f_i} + \Psi_{25}^{(1:n)}\, u^{(1:n)f}
\end{split} \tag{57}
$$

Note that through the partial velocity vectors $\Psi_{ij}^{(1:n)}$, $(i = 1, 2$ and $j = 4, 5)$, these equations are linear in terms of the newly added generalized modal speeds as well as the generalized speeds associated with the released *dof* at the joints of the system.

The two-handle equations for the assembly at the primary node are solvable by imposing appropriate boundary conditions. Solving for the unknowns of the terminal handles initiates the disassembly process [1, 11]. In this process, the known quantities of the terminal handles of each assembly are used to solve for the spatial velocities and the impulsive loads at the common joint of the constituent subassemblies, using the handle equations of each individual subassembly. This process is repeated in a hierarchic disassembly of the binary tree, where the known boundary conditions are used to solve the impulse-momentum equations of the subassemblies, until the spatial velocities of the fine model and the impulses on all bodies in the system are determined as known linear functions of the newly introduced generalized speeds of the fine model.
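The assembly-disassembly sweeps above can be sketched on a toy series-spring analogy. This is purely illustrative: real DCA assemblies carry two-handle equations like Eqs. (42)–(43), not scalar compliances, but the pattern is the same: pairwise assembly up a binary tree, boundary conditions resolved at the root, then a downward sweep recovering each constituent's state.

```python
# Toy sketch of the hierarchic assembly-disassembly pattern (Section 5).
# "Bodies" are springs in series; assembling two bodies adds their compliances,
# and the root boundary condition resolves the force carried by every member.

def assemble(compliances):
    """Bottom-up pass: pairwise combine until one root assembly remains."""
    if len(compliances) == 1:
        return compliances[0]
    nxt = [compliances[i] + compliances[i + 1]
           for i in range(0, len(compliances) - 1, 2)]
    if len(compliances) % 2:              # odd leftover rides up a level
        nxt.append(compliances[-1])
    return assemble(nxt)

def solve(compliances, stretch):
    """Assemble to the root, apply the boundary condition, then disassemble."""
    root = assemble(compliances)          # assembly sweep to the primary node
    force = stretch / root                # root boundary condition
    # disassembly sweep: in a series chain the root force determines each
    # constituent's stretch directly
    return [force * c for c in compliances]

parts = solve([1.0, 2.0, 3.0, 2.0], stretch=8.0)
assert abs(sum(parts) - 8.0) < 1e-12      # member solutions match the boundary data
```

The pairwise combinations at each level are independent, which is where the logarithmic parallel complexity noted in the conclusion comes from.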

#### **6. Conclusion**

The method presented in this chapter is able to efficiently simulate discontinuous changes in the model definitions for articulated multi-flexible-body systems. The impulse-momentum equations govern the dynamics of the transitions when the number of deformable modes changes and the joints in the system are locked or released. The method is implemented in a divide-and-conquer scheme which provides linear and logarithmic complexity when implemented in serial and parallel, respectively. Moreover, the transition from a coarse-scale to a fine-scale model is treated as an optimization problem to arrive at a finite number of solutions or even a unique one. The divide-and-conquer algorithm is able to efficiently produce equations to express the generalized speeds of the system after the transition to the finer models in terms of the newly added generalized speeds. This allows the reduction in computational expenses associated with forming and solving the optimization problem.

### **Acknowledgment**

12 Will-be-set-by-IN-TECH

sub-domain level of the binary tree to combine the adjacent bodies and form the equations of motion of the resulting assembly. This process is recursively implemented as that of the binary tree to find the impulse-momentum equations of the new assemblies. In this process, the spatial velocity vector (after transition) and impulsive load of the handles at the common joint of the consecutive bodies are eliminated. The handle equations of the resulting assembly are expressed in terms of the constraint impulses and spatial velocities of the terminal handles, as well as the newly introduce modal generalized speeds and generalized speeds associated with the newly added degrees of freedom at the connecting joints. This process stops at the top level of the binary tree in which the impulse-momentum equations of the entire system

The spatial velocities of the two terminal handles of the fine model, $v^{1}_{f}$ and $v^{n}_{f}$, are expressed by the following two-handle equations:

$$v^{1}_{f} = \Psi^{(1:n)}_{11}\int_{t_c}^{t_f} F^{1}_{1c}\,dt + \Psi^{(1:n)}_{12}\int_{t_c}^{t_f} F^{n}_{2c}\,dt + \Psi^{(1:n)}_{13}\,u^{(1:n)}_{fi} + \Psi^{(1:n)}_{14}\,\dot{q}^{(1:n)}_{f} + \Psi^{(1:n)}_{15}\,u^{(1:n)}_{f} \tag{56}$$

$$v^{n}_{f} = \Psi^{(1:n)}_{21}\int_{t_c}^{t_f} F^{1}_{1c}\,dt + \Psi^{(1:n)}_{22}\int_{t_c}^{t_f} F^{n}_{2c}\,dt + \Psi^{(1:n)}_{23}\,u^{(1:n)}_{fi} + \Psi^{(1:n)}_{24}\,\dot{q}^{(1:n)}_{f} + \Psi^{(1:n)}_{25}\,u^{(1:n)}_{f} \tag{57}$$

Note that, through the partial velocity vectors $\Psi^{(1:n)}_{ij}$ $(i = 1, 2$ and $j = 4, 5)$, these equations are linear in terms of the newly added generalized modal speeds as well as the generalized speeds associated with the released *dof* at the joints of the system.

The two-handle equations for the assembly at the primary node are solvable by imposing appropriate boundary conditions. Solving for the unknowns of the terminal handles initiates the disassembly process [1, 11]. In this process, the known quantities of the terminal handles of each assembly are used to solve for the spatial velocities and the impulsive loads at the common joint of the constituent subassemblies, using the handle equations of each individual subassembly. This process is repeated in a hierarchic disassembly of the binary tree, in which the known boundary conditions are used to solve the impulse-momentum equations of the subassemblies, until the spatial velocities of the fine model and the impulses on all bodies in the system are determined as known linear functions of the newly introduced generalized speeds of the fine model.

**6. Conclusion**

The method presented in this chapter is able to efficiently simulate discontinuous changes in the model definition for articulated multi-flexible-body systems. The impulse-momentum equations govern the dynamics of the transitions when the number of deformable modes changes or when joints in the system are locked or released. The method is implemented in a divide-and-conquer scheme, which provides linear and logarithmic computational complexity in serial and parallel implementations, respectively. Moreover, the transition from a coarse-scale to a fine-scale model is treated as an optimization problem, so that a finite number of solutions, or even a unique one, is obtained. The divide-and-conquer algorithm efficiently produces the equations that express the generalized speeds of the system after the transition to the finer model in terms of the newly added generalized speeds. This reduces the computational expense of forming and solving the optimization problem.
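The linear-algebraic core of the transition step described above is an underdetermined linear system in the newly introduced generalized speeds, from which the optimization problem selects a unique solution. As a minimal illustration of that idea only (not the authors' implementation; the matrix `A` and vector `b` below are randomly generated stand-ins for the impulse-momentum constraint data), the minimum-norm solution of such a system can be computed with NumPy:

```python
import numpy as np

# Illustrative sketch: momentum-type conditions give fewer equations than
# unknowns, A @ x = b with A of size m x n, m < n, where x collects the
# newly added generalized speeds. A unique x is selected by minimizing
# its norm. All quantities here are random stand-ins, not chapter data.
rng = np.random.default_rng(0)
m, n = 4, 7                       # 4 constraints, 7 new generalized speeds
A = rng.standard_normal((m, n))   # stand-in constraint matrix
b = rng.standard_normal(m)        # stand-in right-hand side

# lstsq returns the unique minimum-norm solution of the
# underdetermined system (equivalently, pinv(A) @ b).
x, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(A @ x, b)                   # constraints satisfied exactly
assert np.allclose(x, np.linalg.pinv(A) @ b)   # matches the pseudoinverse solution
```

In the chapter's setting a physically motivated quadratic objective (for instance, one weighted by the system mass matrix) would replace the plain Euclidean norm, but the structure of the problem, a least-norm solution of a linear system, is the same.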

Support for this work, received from the National Science Foundation under award No. 0757936, is gratefully acknowledged.

### **Author details**

Mohammad Poursina, Imad M. Khan and Kurt S. Anderson *Department of Mechanical, Aeronautics, and Nuclear Engineering, Rensselaer Polytechnic Institute*

#### **7. References**


[14] Poursina, M. [2011]. *Robust Framework for the Adaptive Multiscale Modeling of Biopolymers*, PhD thesis, Rensselaer Polytechnic Institute, Troy.

[15] Poursina, M., Bhalerao, K. D. & Anderson, K. S. [2009]. Energy concern in biomolecular simulations with discontinuous changes in system definition, *Proceedings of the ECCOMAS Thematic Conference - Multibody Systems Dynamics*, Warsaw, Poland.

[16] Poursina, M., Bhalerao, K. D., Flores, S., Anderson, K. S. & Laederach, A. [2011]. Strategies for articulated multibody-based adaptive coarse grain simulation of RNA, *Methods in Enzymology* 487: 73–98.

[17] Poursina, M., Khan, I. & Anderson, K. S. [2011]. Model transitions and optimization problem in multi-flexible-body modeling of biopolymers, *Proceedings of the Eighth International Conference on Multibody Systems, Nonlinear Dynamics and Control, ASME Design Engineering Technical Conference 2011 (IDETC11)*, number DETC2011-48383, Washington, DC.

[18] Praprotnik, M., Site, L. & Kremer, K. [2005]. Adaptive resolution molecular-dynamics simulation: Changing the degrees of freedom on the fly, *J. Chem. Phys.* 123(22): 224106–224114.

[19] Scheraga, H. A., Khalili, M. & Liwo, A. [2007]. Protein-folding dynamics: Overview of molecular simulation techniques, *Annu. Rev. Phys. Chem.* 58(1): 57–83.

[20] Shahbazi, Z., Ilies, H. & Kazerounian, K. [2010]. Hydrogen bonds and kinematic mobility of protein molecules, *Journal of Mechanisms and Robotics* 2(2): 021009–9.

[21] Turner, J. D., Weiner, P., Robson, B., Venugopal, R., III, H. S. & Singh, R. [1995]. Reduced variable molecular dynamics, *Journal of Computational Chemistry* 16: 1271–1290.

[22] Voltz, K., Trylska, J., Tozzini, V., Kurkal-Siebert, V., Langowski, J. & Smith, J. [2008]. Coarse-grained force field for the nucleosome from self-consistent multiscaling, *Journal of Computational Chemistry* 29(9): 1429–1439.

[23] Wu, X. W. & Sung, S. S. [1998]. Constraint dynamics algorithm for simulation of semiflexible macromolecules, *Journal of Computational Chemistry* 19(14): 1555–1566.

## *Edited by Hassan Abid Yasser*

Linear algebra occupies a central place in modern mathematics. It is a beautiful and mature field, and mathematicians have developed highly effective methods for solving its problems; it is a subject well worth studying for its own sake. This book contains selected topics in linear algebra that represent recent contributions to its most famous and widely studied problems. It includes a wide range of theorems and applications in different branches of linear algebra, such as linear systems, matrices, operators, and inequalities, and it continues to be a definitive resource for researchers, scientists, and graduate students.

Photo by Selim Dönmez / iStock

Linear Algebra - Theorems and Applications
