**Method of Synthesized Phase Objects in the Optical Pattern Recognition Problem**

Pavel V. Yezhov, Alexander P. Ostroukh, Jin-Tae Kim and Alexander V. Kuzmenko

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/65370

#### **Abstract**

To solve the pattern recognition problem, a method of synthesized phase objects (SPO-method) is suggested. The essence of the method is that synthesized phase objects are used instead of real amplitude objects. The former are object-dependent phase distributions calculated using the iterative Fourier transform algorithm. The method is studied experimentally with optical-digital Vanderlugt and joint Fourier transform 4F-correlators. The development of the SPO-method for rotation-invariant pattern recognition is considered as well. We present a comparative analysis of recognition results obtained with the conventional and proposed methods, estimate the sensitivity of the latter to distortions of the structure of objects, and determine its applicability limits. It is demonstrated that the SPO-method makes it possible (a) to simplify the choice of recognition signs (criteria); (b) to obtain one-type δ-like recognition signals irrespective of the type of objects; and (c) to improve the signal-to-noise ratio for correlation signals by 20–30 dB on average. To introduce recognition objects into a correlator, we use SLM LC-R 2500 and SLM HEO 1080 Pluto devices.

**Keywords:** pattern recognition, method of synthesized phase objects, iterative Fourier transform algorithm, rotation invariant pattern recognition, optical-digital recognition systems, spatial light modulators

#### **1. Introduction**

Studies in Fourier optics, holography, and digital and correlation optics aimed at solving the pattern recognition problem have long remained topical. The reason is that the recognition problem is object-dependent, i.e., a change in the conditions of recognition or in the type of an object requires, as a rule, the optimization of available methods of solution or the development of new ones [1, 2]. Among the known methods, the following are worth noting: the digital synthesis of Fourier filters [3–6], the method of discriminant curve [7, 8], the method of stabilizing functional [9], the method of projections onto convex sets [10], etc. We emphasize that these and other available methods lead to a significant number of dedicated solutions, for which the choice of characteristic signs of the object and the subsequent analysis of correlation signals are separate problems. It is therefore topical to search for more general solutions of the pattern recognition problem by means of matched filtering [11] or joint correlation [12], aimed at simplifying the analysis of input data and of the main signs of recognition, determining their connection with the parameters of correlation signals, etc.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Here, we develop a new approach to the solution of the recognition problem. Its novelty is that we recognize not the object itself but a certain object-dependent synthesized phase object (SP-object). The latter (its distribution of phases) is calculated with the help of the known iterative Fourier transform (IFT) algorithm [13]. In this case, the problem of recognition of amplitude objects, which belong to arbitrary classes, is reduced to the problem of recognition of phase objects of only one type [14–16].

We also present a development of the SPO-method for rotation-invariant pattern recognition [17]. For the conventional method and the SPO-method, we compare the parameters of correlation signals for a number of amplitude objects as the objects are rotated in an optical-digital joint Fourier transform (JT) correlator. It is shown that the joint correlation of SP-objects attains not only invariance to rotation but also the main advantage of the SPO-method over the reference one: a unified δ-like recognition signal with the largest possible signal-to-noise ratio (SNR), independent of the type of an object.

The work is organized as follows. In Section 2, a new approach to pattern recognition on the basis of SP-objects is presented, and the basic results of computational and optical experiments are given. The behavior of cross-correlation signals is studied under the addition of a controlled amount of noise to the structure of objects. In Section 3, a development of the SPO-method for rotation-invariant pattern recognition with an optical-digital JT-correlator is presented.

#### **2. SPO-method: definition, substantiation**

We now define an approach in which not the object itself is recognized but an object-dependent SP-object, which is calculated with the help of the known IFT algorithm [13]. In this case, as mentioned above, the problem of recognition of amplitude objects of various classes can be reduced to the problem of recognition of phase objects that belong to the same class. Below, we present the experimental results of recognition of amplitude objects with the use of the conventional and proposed methods, estimate the sensitivity of the latter to changes in the structure of objects, and determine the boundaries of its application.

Let us consider the operation of the algorithm (**Figure 1**). For the calculation of SP-objects, we apply the IFT algorithm in its kinoform version [18]. In the process of iterations, the phase structure of a kinoform *ψ*(*u*, *υ*) is formed in the spectral plane. Simultaneously, one more phase structure, namely *ϕ*(*x*, *y*), appears in the object plane. The function *φ*(*x*, *y*) = *exp*(*iϕ*(*x*, *y*)) plays the role of a diffusive scatterer, which is optimized for the object *f*(*x*, *y*) and is necessary for the leveling of the field amplitude in the Fourier plane, i.e., in the plane of the kinoform. However, in the context of the correlation methods of recognition, the phase structures *ϕ*(*x*, *y*) with a random distribution of the phase can also be of independent interest, unrelated to the problem of calculating the kinoform. The matter is as follows. Since the form of *ϕ*(*x*, *y*) for the given number of iterations and the given initial diffuser *ϕ*0(*x*, *y*) is determined uniquely by the form of the function *f*(*x*, *y*), it is logical to put two questions:


**Figure 1.** IFT algorithm (a), illustrative scheme of the IFT algorithm (b).


36 Pattern Recognition - Analysis and Applications


The computer-based and optical experiments that we carried out give a positive answer to both questions. The method of recognition in which the SP-object *φ*(*x*, *y*) is recognized instead of a real amplitude object *f*(*x*, *y*) is called the method of synthesized phase objects. We now consider the advantages and limitations of this method in more detail. To study its basic characteristics in model and optical experiments, we need to determine a collection of recognition objects, to calculate an SP-object for each of them, and to carry out the recognition.

In view of the circumstance that the iteration method of synthesis of the functions *ϕ*(*x*, *y*) for *f*(*x*, *y*) gives no possibility to get a solution in the analytic form, we study the SPO-method for a finite collection of recognition objects. In order to most completely show the potentialities of the method, we choose objects with essentially different types of Fourier spectra.
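The iterative synthesis described above can be illustrated numerically. The following is a minimal sketch, assuming a Gerchberg-Saxton-type loop in which the spectral amplitude is replaced by unity (the kinoform constraint) and the object-plane amplitude is replaced by *f*(*x*, *y*) on each pass; the function name `sp_object`, the use of a discrete FFT, and the random binary test object are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def sp_object(f, n_iter=1, phi0=None):
    """Return a phase-only object exp(i*phi) synthesized for the amplitude object f.

    Assumed scheme (a sketch, not the authors' code):
      1. object plane:   g = f * exp(i*phi)
      2. spectral plane: keep only the phase psi of FFT(g) (kinoform constraint)
      3. back transform: phi = phase of IFFT(exp(i*psi))
    """
    f = np.asarray(f, dtype=float)
    # phi0 = const (here zero) unless an initial diffuser is supplied
    phi = np.zeros_like(f) if phi0 is None else np.asarray(phi0, dtype=float)
    for _ in range(n_iter):
        psi = np.angle(np.fft.fft2(f * np.exp(1j * phi)))  # kinoform phase
        phi = np.angle(np.fft.ifft2(np.exp(1j * psi)))     # object-plane phase
    return np.exp(1j * phi)

# Hypothetical test object: a random binary amplitude pattern
rng = np.random.default_rng(0)
f_demo = (rng.random((64, 64)) > 0.5).astype(float)
phi_1 = sp_object(f_demo, n_iter=1)    # SP-object after the first iteration
phi_13 = sp_object(f_demo, n_iter=13)  # SP-object after 13 iterations
```

For a real-valued object and a constant initial phase, the first pass of this sketch yields a phase that takes only the values 0 or *π*, because the phase-only spectrum of a real object stays Hermitian and its inverse transform is real.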

To compare the conventional and SPO methods, we need to compare their sensitivities to changes in the structure of the objects under recognition. As the parameter for estimating the sensitivity, we chose controlled changes introduced into the structure of the recognition objects. These changes are made by pairwise rearrangements of points of the object taken in an arbitrary order. The number of such rearrangements *k* ranged from zero to several hundred.
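The controlled distortion can be sketched in a few lines. This is an illustration under the assumption that each rearrangement swaps the values of two pixels chosen uniformly at random; the function name `rearrange_points` and the `seed` parameter are hypothetical.

```python
import numpy as np

def rearrange_points(f, k, seed=0):
    """Return a copy of f in which k randomly chosen pairs of points are swapped."""
    rng = np.random.default_rng(seed)
    out = np.array(f, order="C")   # independent, C-contiguous copy
    flat = out.reshape(-1)         # 1-D view onto the copy
    for _ in range(k):
        i, j = rng.integers(0, flat.size, size=2)
        flat[i], flat[j] = flat[j], flat[i]  # swapping equal values changes nothing
    return out

# Hypothetical binary object and a distorted version with k = 100
f_demo = (np.random.default_rng(1).random((64, 64)) > 0.5).astype(float)
f_k100 = rearrange_points(f_demo, k=100)
```

Swapping preserves the total number of bright points, so only the structure of the object changes, not its energy.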

#### **2.1. SP-objects and their basic properties: model experiments**

For model experiments, we chose ten amplitude objects of the binary type 300 × 300 points in size. In **Figure 2**, the reference objects *fn* (*n* = 1, 2, 3, 4) are presented.

**Figure 2.** Objects: (a) *f*1, (b) *f*2, (c) *f*3, (d) *f*4.

For all of them, we calculated the autocorrelation functions *fn* ⊗ *fn*. The SP-objects *φn* were calculated by the iteration scheme (**Figure 1a**) with the initial distribution of phases *ϕ*0 = *const*. To find the degree of connection of *φn* with *fn*, which determines how suitable *φn* is as a substitute for *fn*, we obtained *φn* for different numbers of iterations *N*, gradually increasing *N*. At a fixed *N*, we calculated the correlation functions *φn*,*N* ⊗ *φn*,*N* for the entire set {*φn*,*N*}. In **Figure 3**, we present object *f*4 (1(a)), the central fragment of its Fourier spectrum (2(a)), and its autocorrelation signal (3(a)). On the right, we show, respectively, a fragment of the phase distribution *ϕ*4,1 of the SP-object (1(b)), the shape of its spectrum (2(b)), and a fragment of its autocorrelation signal (3(b)). Analogous results were also obtained for objects *f*1 − *f*3. The presented result is typical and demonstrates the main advantages of SP-objects: the uniform distribution of the amplitude in the spectral plane and the δ-like autocorrelation signal, which are practically independent of the shapes of the Fourier spectra and the type of the autocorrelation signals of the real objects for which they were calculated.

As a result of model experiments, for each *fn*, we determined the criterion for choosing *φn* from the set {*φn*,*N*} at varying *N*. The results are illustrated by the example of object *f*4 (**Figure 4a**). Curve (A) shows the behavior of the variance *σ*<sup>2</sup> of the amplitude of the retrieved image of object *f*4, relative to the amplitude of the reference object, during the calculation of its SP-object, and curve (B) presents the change in the maximum value of the modulus of the Fourier spectrum amplitude of *φ*4,*N* as *N* increases. In **Figure 4**(**b**–**d**), we observe the redistribution of phases of the SP-object in the interval [0 − 2*π*] for various numbers of iterations.


**Figure 3.** Left (a): distributions for the real object *f*4; right (b): for the SP-object *φ*4 = exp(*iϕ*4): (1) object; (2) Fourier spectrum amplitude modulus; (3) autocorrelation signal.

**Figure 4.** (a) Dependence of *σ*<sup>2</sup> (A) and |ℑ<sup>+1</sup>(*φ*)|*max* (B) on the number of iterations *N*; histograms for: (b) *ϕ*4,1; (c) *ϕ*4,13; (d) *ϕ*4,45 calculated for the 1st, 13th, and 45th iterations.

On the basis of the results of numerical experiments with the whole collection of objects (**Figures 3** and **4**, **Table 1**), we can conclude the following:

**•** The distribution of phases in the plane of an SP-object is random.

**•** The binary distribution (0 or *π*) of a phase in the plane of an SP-object obtained on the first iteration is transformed into a continuous one in the interval [0 − 2*π*], as the iteration number increases.

**•** The modulus of the amplitude of the Fourier spectrum of an SP-object has a uniform distribution in all cases.

| Number of an object | Objects: frequency<sup>a</sup> 2*ξmax*, rel. un. | Objects: <SNR><sup>b</sup>, dB | SP-objects: frequency 2*ξmax*, rel. un. | SP-objects: <SNR>, dB |
|---|---|---|---|---|
| 1 | 0.30 | 5.2 | 0.50 | 26.2 |
| 2 | 0.25 | 16.3 | 0.50 | 26.2 |
| 3 | 0.20 | 7.7 | 0.50 | 26.2 |
| 4 | 0.37 | 6.8 | 0.50 | 26.2 |

<sup>a</sup> 2*ξmax*, effective band of frequencies.

<sup>b</sup> <SNR>, ratio of the peak value of amplitude of a correlation signal to the mean noise amplitude.

**Table 1.** Results of recognition of objects and SP-objects in model experiments.

The autocorrelation functions of SP-objects have the δ-like shape and ensure:

**1.** Maximally possible value of SNR characteristic as for the binary phase masks with a random distribution of elements [19].

**2.** Possibility to apply a simple threshold criterion to the analysis of the results of recognition.

This is true for both *ϕ*0 = *const* as well as for arbitrary *ϕ*0. We have also established that the SP-objects calculated on the first and all subsequent *N*-iterations satisfy the following conditions:

**1.** If there is no correlation between the objects *fn* and *fm* (*fn* ⊗ *fm* = 0), then the correlation is also absent for SP-objects (*φn*,*N* ⊗ *φm*,*N* = 0).

**2.** If the signal of cross-correlation between the objects *fn* and *fm* exists (*fn* ⊗ *fm* ≠ 0), then it exists also for the SP-objects (*φn*,*N* ⊗ *φm*,*N* ≠ 0).

The first item indicates that the SP-objects obtained for the uncorrelated real objects are statistically independent of one another. The second shows the possibility to obtain a bijective interrelation between cross-correlation curves for the real and SP-objects.

Thus, we have established that, for SP-objects, the highest degree of uniformity of the amplitudes of their Fourier spectra is ensured already after the first iteration, conditions (1, 2) are satisfied, and the properties of real objects *f*(*x*, *y*) (their significant signs) are integrally represented in the distribution of phase elements in the coordinate plane. Any changes in the structure of *f*(*x*, *y*) affect directly the distributions of the phase in a plane of the SP-object. This allows one to quantitatively evaluate the indicated variation in the object by a change in the level of a cross-correlation signal from SP-objects calculated for the reference and modified objects, respectively. The following step is the evaluation of the practical value of the proposed method. With this purpose, we will analyze the results of the recognition by the conventional method and the SPO-method executed in a Vanderlugt (VL) correlator.
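The correlation signals and the <SNR> figure of merit discussed in this subsection can be evaluated digitally. The sketch below assumes an FFT-based circular correlation, takes <SNR> as the peak amplitude divided by the mean amplitude outside a small guard window around the peak, and converts this amplitude ratio to decibels as 20·log10; the guard window, the function names, and the random phase masks used as stand-ins for SP-objects are assumptions made for illustration.

```python
import numpy as np

def cross_correlation(a, b):
    """Circular cross-correlation of a and b via FFT, zero shift moved to the centre."""
    spectrum_product = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    return np.fft.fftshift(np.fft.ifft2(spectrum_product))

def snr_db(corr, guard=2):
    """Peak amplitude over mean noise amplitude, in dB (assumed 20*log10 convention)."""
    amp = np.abs(corr)
    py, px = np.unravel_index(np.argmax(amp), amp.shape)
    noise = np.ones(amp.shape, dtype=bool)
    # exclude a small window around the peak from the noise estimate
    noise[max(py - guard, 0):py + guard + 1, max(px - guard, 0):px + guard + 1] = False
    return 20.0 * np.log10(amp[py, px] / amp[noise].mean())

# Two independent random phase masks as hypothetical stand-ins for SP-objects
rng = np.random.default_rng(0)
phase_a = np.exp(1j * 2 * np.pi * rng.random((64, 64)))
phase_b = np.exp(1j * 2 * np.pi * rng.random((64, 64)))

snr_auto = snr_db(cross_correlation(phase_a, phase_a))   # delta-like peak
snr_cross = snr_db(cross_correlation(phase_a, phase_b))  # no distinct peak
```

With such a definition, a simple threshold on the SNR separates the autocorrelation of a random phase mask from the cross-correlation of two independent masks.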

#### **2.2. Comparison of the SPO and conventional methods of recognition: optical experiment**

We studied the matched filtering of amplitude objects. **Figure 5** presents the scheme (a) and a photo (b) of the optical-digital VL-correlator. To introduce images into the object plane of the correlator, we used the spatial light modulator (SLM) LC-R2500, operated in the mode of phase modulation of the wave front. The amplitude objects were transformed into phase ones [20] and supplied to the SLM as standard graphic files, taking into account the characteristic curve of the SLM. Let us consider the operation of the correlator in the mode of recording of matched filters and in the mode of matched filtering.

**Figure 5.** Optical-digital VL-correlator: (a) scheme; (b) photo: *CCD*1, *PC*1, laser, *Fr*, *k*, *P*1, *Bs*, *SLM*, *P*2, *Mr*, *Sh*, *A*, *L*1, *MF*, *Pmf*, *L*2, *CCD*2, *PC*2 are, respectively, a camera and a computer in the object plane, He-Cd laser (441.6 nm), Fresnel rhomb; collimator, polarizer, splitting cube, spatial light modulator LC-R2500, polarizer, mirror, gate, analyzer, Fourier lens, matched holographic filter, Fourier plane, lens, CCD camera COHU-4800, controlling computer.

*Recording of a matched filter*. The beam of a He-Cd laser is split into the reference and object beams, by passing through collimator *k* and splitter *Bs*. Fresnel rhomb *Fr* and analyzer *A* set the necessary polarization of the object beam, by ensuring the phase mode of operation of SLM. Polarizer *P*1 and gate *Sh* are not used, and polarizer *P*2 controls a level of the intensity of the reference beam. With the help of *CCD*1 and computer *PC*1, the graphic file with the image of the reference object in the grayscale format (1–255) is supplied onto SLM with regard to the characteristic curve of the device. The object beam and the collimated reference beam form a matched filter on a photopolymeric composition [15] in the Fourier plane *Pmf* of the correlator. We optimized the conditions of recording of matched filters in order to get the maximum diffraction efficiency at a minimum level of intrinsic noises and at a maximum SNR.

*Matched filtering*. The operation of the correlator in the mode of matched filtering consists in the following. At closed gate *Sh*, the collimated laser beam with the necessary direction of polarization is reflected from the mirror of SLM, to which the image of a recognition object is supplied. After the Fourier transformation executed by lens *L*1, the beam enters plane *Pmf*, where a matched filter *MF* for the reference object is placed. Then, camera *CCD*2 in the correlation plane registers the signal of mutual correlation, which is obtained as a result of the inverse Fourier transformation of the product of the Fourier transforms of the input and reference images of objects executed by lens *L*2.
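The matched-filtering step can be written compactly. The following is a sketch under the usual assumption, not stated explicitly in the text, that the holographic filter contributes the complex conjugate of the reference spectrum; the symbols $F_{in}$, $F_{ref}$, and $c$ are introduced here only for illustration.

```latex
\begin{aligned}
F_{in}(u,\upsilon) &= \mathcal{F}\{f_{in}(x,y)\}, \qquad
F_{ref}(u,\upsilon) = \mathcal{F}\{f_{ref}(x,y)\},\\
c(x,y) &= \mathcal{F}^{-1}\{F_{in}(u,\upsilon)\,F_{ref}^{*}(u,\upsilon)\}
        = (f_{in}\otimes f_{ref})(x,y).
\end{aligned}
```

In this reading, lens *L*1 produces $F_{in}$, the filter *MF* supplies $F_{ref}^{*}$, lens *L*2 performs the final transform, and camera *CCD*2 registers the resulting correlation signal.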

Define the procedure of recognition by the SPO-method:

**•** For the reference object *fref*, the SP-object *φref* is calculated with the help of the IFT algorithm. Into the object plane of the correlator, *φref* is introduced instead of *fref*, and the recording of the matched filter is realized. For the comparison object *fin*, the SP-object *φin* is calculated in the same way.

**•** Into the object plane of the correlator, *φin* is introduced instead of *fin*, and the matched filtering is realized. In the correlation plane, the signal of mutual correlation *Icorr* = |*φref* ⊗ *φin*| is registered.

To obtain the cross-correlation dependences, the same collection of objects *f*1 − *f*4 (**Figure 2**), as in computer experiments, was used. For each of the recognition objects, we calculated the series of *fn*(*k*), *k* ∈ [1 − 800] objects obtained by means of the introduction of changes into their structure. As indicated above, the changes are the pairwise permutation of points (pixels) of the object taken in an arbitrary order, *k* being the number of such rearrangements. In **Figure 6**, we present the view of a fragment of the object *f*1 for various numbers of rearrangements.

**Figure 6.** Fragments of object *f*1 for: (a) *k* = 0; (b) *k* = 200; (c) *k* = 400; (d) *k* = 800.

For all objects *fn* and series *fn*(*k*), we calculated the corresponding *φn*,1 and series *φn*,1(*k*). Then, we recorded matched filters and carried out the recognition by the conventional and SPO methods. The cross-correlation signals were registered by camera *CCD*2, and their SNRs were calculated. We obtained the dependences of the intensities of correlation signals *Icorr* on the level of changes in the structure of compared objects. We also estimated the degree of homogeneity of the intensities of the Fourier spectra of objects and SP-objects. The registration of the corresponding spectra was executed by camera *CCD*2 mounted in the plane *Pmf* of the correlator (**Figure 5a**). In **Figure 7**, we show the typical results by the example of object *f*1.
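The recognition procedure by the SPO-method can be imitated digitally. The sketch below is only a numerical stand-in for the optical correlator: it assumes a one-iteration kinoform-type IFT loop for the SP-objects, random pixel swaps for the distortion, and an FFT-based correlation in place of the holographic matched filter. All function names and the random binary reference object are hypothetical.

```python
import numpy as np

def sp_object(f, n_iter=1):
    """Phase-only SP-object for amplitude object f (assumed kinoform-type IFT loop)."""
    phi = np.zeros_like(f, dtype=float)
    for _ in range(n_iter):
        psi = np.angle(np.fft.fft2(f * np.exp(1j * phi)))  # spectral phase only
        phi = np.angle(np.fft.ifft2(np.exp(1j * psi)))     # object-plane phase
    return np.exp(1j * phi)

def swap_points(f, k, rng):
    """Copy of f with k random pairwise rearrangements of points."""
    out = np.array(f, dtype=float, order="C")
    flat = out.reshape(-1)  # view onto the copy
    for _ in range(k):
        i, j = rng.integers(0, flat.size, size=2)
        flat[i], flat[j] = flat[j], flat[i]
    return out

def corr_peak(phi_in, phi_ref):
    """Peak of |phi_ref (x) phi_in|, normalised to 1 for identical phase objects."""
    c = np.fft.ifft2(np.fft.fft2(phi_in) * np.conj(np.fft.fft2(phi_ref)))
    return np.abs(c).max() / phi_in.size

rng = np.random.default_rng(1)
f_ref = (rng.random((64, 64)) > 0.5).astype(float)  # hypothetical reference object
phi_ref = sp_object(f_ref)                           # "recording of the matched filter"

curve = {}
for k in (0, 50, 200, 800):
    f_in = swap_points(f_ref, k, rng)                # distorted comparison object
    curve[k] = corr_peak(sp_object(f_in), phi_ref)   # "matched filtering"
```

The dictionary `curve` plays the role of a cross-correlation dependence on *k*; its values need not reproduce the optical measurements, since noise, the SLM zero order, and the filter response are not modeled.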

**Figure 7.** Experimental results for object *f*1 (left) and the SP-object *φ*1,1 (right): (a) dependence of the intensity of a cross-correlation signal on *k*; (b and c) form of correlation signals at points A, B, respectively; (d and e) the shapes of Fourier spectra.

In **Figure 7a**, curves (A, B), we show changes in the correlation signal *Icorr* for *f*1 and *ϕ*1,1, respectively, as the parameter *k* increases. The autocorrelation signals for *f*1(*x*, *y*) with SNR of 2.1 dB and *φ*1,1(*x*, *y*) with SNR of 24.8 dB are shown in **Figure 7b** and **c**, respectively. Fourier spectra of the object and the SP-object are presented in **Figure 7d** and **e**. In the photo of the SP-object Fourier spectrum, we indicate the zero and ± 1 orders of the SLM. In the Fourier spectrum of object *f*1, the zero order of the SLM distorts the real view of the object Fourier spectrum in the zero-frequency region. The character of the cross-correlation dependences (A, B) (**Figure 7a**) is conserved for the whole collection of objects, which allows us to conclude that the SPO-method has a higher sensitivity to changes in the structure of an object. This can play both positive and negative roles, depending on the character of the recognition problem. On the basis of the results of matched filtering obtained for the whole collection of objects *f*1 − *f*10, we can conclude that the characteristic peculiarities and distinctions of the compared methods observed in model experiments are also conserved in optical experiments.

We have established that, in the applied scheme of a VL-correlator (it is true for the schemes with SLM) in the process of recording of matched filters, a part of light that does not diffract on SLM falls in the domain of zero frequencies of the Fourier spectrum of the object. These intense peaks are well noticeable in **Figure 7d** and **e**. The presence of such peaks is the reason for the appearance of a superfluous component in the recognition signals, which masks the real course of a curve in the domain of strong changes in the structure of an object. For example, it is seen in **Figure 7a** (curve A) that, for *k* > 400 where the structure of the object changes quite strongly, the intensity of the correlation signal is practically constant. This effect is observed for both the conventional and SPO methods.

**Off-axis matched filtering**. To remove this drawback of a VL-correlator with an SLM, namely the masking intensity peak at zero frequencies, we introduce a phase grating into the structures of the input and reference objects. This allows us to spatially separate the Fourier spectra of the objects from the zero order of the SLM. For functions of the type *φ*(*x*, *y*) = *exp*(*iϕ*(*x*, *y*)) that are introduced into the objective plane of a correlator with the help of an SLM, such a grating is formed by adding a linear phase 2*π*(*xu*0 + *yϑ*0) to the phase *ϕ*(*x*, *y*). The spatial separation of the recognition signal and noise components in the correlation plane by covering a synthesized filter in the Fourier plane with a phase grating was demonstrated in [16]. To our knowledge, however, the increase in the SNR of the recognition signal obtained by covering the recognition objects in the objective plane of a VL-correlator with a phase grating is demonstrated here for the first time.
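The effect of the added linear phase can be checked numerically with the shift theorem of the Fourier transform. The following is a minimal one-dimensional sketch and not the authors' implementation: the phase profile, the array size `N`, the grating frequency `U0` (counted in spectrum samples), and all function names are assumptions introduced only for illustration.

```python
import cmath
import math

def dft(signal):
    """Naive O(N^2) discrete Fourier transform, sufficient for a small demo."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * math.pi * u * x / n) for x in range(n))
            for u in range(n)]

def add_phase_grating(phase, u0):
    """Add the linear phase 2*pi*u0*x/N (a phase grating) to a phase profile."""
    n = len(phase)
    return [phase[x] + 2 * math.pi * u0 * x / n for x in range(n)]

N, U0 = 32, 8  # assumed sizes: 32 samples, grating shifts the spectrum by 8 samples

# Hypothetical smooth phase profile standing in for the phase of a phase object.
phase = [0.8 * math.sin(2 * math.pi * x / N) for x in range(N)]

# Spectra of exp(i*phase) without and with the grating.
on_axis = dft([cmath.exp(1j * p) for p in phase])
off_axis = dft([cmath.exp(1j * p) for p in add_phase_grating(phase, U0)])

# Shift theorem: the grating moves the whole spectrum by U0 samples (cyclically).
max_shift_error = max(abs(off_axis[(u + U0) % N] - on_axis[u]) for u in range(N))

# Position of the strongest spectral component before and after adding the grating.
peak_on = max(range(N), key=lambda u: abs(on_axis[u]))
peak_off = max(range(N), key=lambda u: abs(off_axis[u]))
```

In this sketch the strongest component of the object spectrum moves from sample 0 to sample `U0`, while light that bypasses the modulation (the zero order of the SLM) would remain at sample 0, which is the spatial separation described above.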

**Figure 8.** (a and b) Fragment of the phase encoded objects [20] *f*2, *f*4 with added gratings; (c and d) on-axis Fourier spectra; (e and f) off-axis Fourier spectra.

**Figure 9.** Intensities of cross-correlation signals versus the parameter *k* for object *f*1 (a) and SP-object *φ*1,1 (b) for the onaxis (1) and off-axis (2) matched filtering.


<sup>a</sup> SNR: correlation peak intensity relative to the maximal intensity of the correlation noise.

**Table 2.** Results of matched filtering of objects and SP-objects.

The axis relative to which the spectrum is shifted passes through the centers of the objective and Fourier planes. We regard the recording of a filter for the reference object with the added phase grating, followed by the matched filtering of recognition objects with the added phase grating, as off-axis matched filtering relative to this axis (**Figure 8**). In contrast to on-axis matched filtering, such filtering within the conventional and SPO methods for all objects *fn* and series *fn*(*k*), as well as for *φn*,1 and series *φn*,1(*k*), gives the proper behavior of the cross-correlation curves over the whole range of the parameter *k*, including *k* > 400. In **Figure 9a** and **b**, we present the correlation curves for the on-axis (1) and off-axis (2) matched filtering for the object *f*1 and SP-object *φ*1,1, respectively. It is seen that curves (2) are more suitable for a proper comparison of the sensitivities of the methods in a wide range of *k*. Hence, the results of model and optical experiments on the SPO-method of recognition of amplitude objects show that the method makes it possible to simplify the choice of recognition criteria (signs); to obtain one-type δ-like signals irrespective of the class to which the recognition object belongs; and to increase the signal-to-noise ratio of correlation signals by 2–3 orders of magnitude. The off-axis matched filtering realized in the experiment additionally increases the SNR of correlation signals by one order of magnitude (**Figures 8** and **9**, **Table 2**).
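The SNR figures quoted in this section follow the definition given with Table 2: the correlation peak intensity relative to the maximal intensity of the correlation noise. A minimal sketch of that computation is given below. It assumes that the ratio of intensities is converted to decibels as 10·log10, so that 2–3 orders of magnitude correspond to 20–30 dB, and that the noise is searched outside a small guard window around the peak; the function name `snr_db`, the `guard` parameter, and the test plane are hypothetical.

```python
import math

def snr_db(intensity, guard=1):
    """Peak-to-maximum-noise ratio of a 2D correlation plane, in dB.

    `guard` is an assumed half-width (in pixels) of the window around the
    peak that is excluded when searching for the strongest noise sample.
    """
    cells = [(value, r, c)
             for r, row in enumerate(intensity)
             for c, value in enumerate(row)]
    peak, pr, pc = max(cells)
    noise = max(value for value, r, c in cells
                if abs(r - pr) > guard or abs(c - pc) > guard)
    return 10.0 * math.log10(peak / noise)

# Hypothetical 7x7 correlation plane: flat noise floor of 1 with one sharp peak.
plane = [[1.0] * 7 for _ in range(7)]
plane[3][3] = 300.0      # delta-like recognition signal
snr = snr_db(plane)      # 10*log10(300), i.e. between 24.7 and 24.8 dB
```

With this convention, a peak 100 times stronger than the strongest noise sample gives 20 dB, and a peak 1000 times stronger gives 30 dB.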

#### **3. SPO-method for pattern recognition with rotation invariance**

As is known, one of the basic problems hampering the practical application of optical recognition methods is the fast degradation of the correlation signal when the scale of the object changes or when the object is rotated around the coordinate origin. This problem is solved by using the integral Fourier-Mellin transformation instead of the pure Fourier transformation for the recognition. The possibility of realizing the Fourier-Mellin transformation in a hybrid electron-optical or optical-digital Fourier system was first indicated by Kuzmenko [21]. Casasent successfully used this idea for the recognition of objects invariant to scaling, rotation, and shift in a hybrid optical-digital 4F-system [22, 23]. In subsequent years, many works [24–35] were devoted to invariant methods of recognition. In what follows, we demonstrate the possibility of using the SPO-method for pattern recognition with rotation invariance.

#### **3.1. Computational experiment**

Consider the rotation invariant recognition of objects realized by the conventional method and the SPO-method. Of interest is the comparison of the cross-correlation curves obtained with both methods for recognition objects rotated by various angles.

**Figure 10.** Reference objects: *f*1(*x*, *y*) (a); *g*1(*exp*(*ρ*), *θ*) (b); *φ*1(*x*, *y*) (c); *χ*1(*exp*(*ρ*), *θ*) (d).

As the reference objects for numerical experiments, we took a set of objects of the amplitude and half-tone types $f_i(x, y)$, $i = 1, 2, \ldots, 10$. For each of them, we define the set of comparison objects $f_{i,\alpha_j}(x, y)$, $j = 1, 2, \ldots, 41$, which are obtained by rotating the corresponding reference object around the optical axis by an angle $\alpha_j$ with the step $\Delta\alpha = 0.5^\circ$ within the limits [0°, 20°]. In addition, for all reference objects and comparison objects, we define the sets $g_i(\exp(\rho), \theta)$, $i = 1, 2, \ldots, 10$, of reference objects and $g_{i,\alpha_j}(\exp(\rho), \theta)$, $j = 1, 2, \ldots, 41$, of comparison objects after a logarithmic polar transformation of coordinates [26]. For the comparison of cross-correlation dependences, we define the correlation functions for the conventional recognition, $f_i \otimes f_{i,\alpha_j}$, and for the Fourier–Mellin rotation invariant recognition, $g_i \otimes g_{i,\alpha_j}$. For the SPO-method, using the iteration scheme (**Figure 1**) with the initial phase $\phi_0 = 0$, we calculate for the above-defined sets the SP-objects $\varphi_i$, $i = 1, 2, \ldots, 10$, for each reference object and $\varphi_{i,\alpha_j}$, $j = 1, 2, \ldots, 41$, for each comparison object, and likewise the SP-objects $\chi_i$ and $\chi_{i,\alpha_j}$ for the objects in logarithmic polar coordinates. All SP-objects were taken on the first iteration. Analogously, we define the correlation functions for the SPO conventional recognition, $\varphi_i \otimes \varphi_{i,\alpha_j}$, and for the SPO Fourier–Mellin rotation invariant recognition, $\chi_i \otimes \chi_{i,\alpha_j}$.
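The reason the logarithmic polar transformation helps can be illustrated numerically: a rotation of the object around the origin becomes a cyclic shift along the angular coordinate, and a correlation tolerates shifts. The sketch below is only an illustration under stated assumptions and is not the procedure of [26]: the test object (two Gaussian blobs), the grid sizes, and the function names `log_polar`, `rotate`, and `theta_shift` are all hypothetical, and the object is given analytically to avoid interpolation.

```python
import math

def log_polar(f, n_rho=16, n_theta=72, r_min=1.0, r_max=32.0):
    """Sample f(x, y) on a log-polar grid; rows follow rho = ln r, columns theta."""
    step = math.log(r_max / r_min) / (n_rho - 1)
    grid = []
    for i in range(n_rho):
        r = r_min * math.exp(i * step)
        grid.append([f(r * math.cos(2 * math.pi * j / n_theta),
                       r * math.sin(2 * math.pi * j / n_theta))
                     for j in range(n_theta)])
    return grid

def rotate(f, alpha_deg):
    """Return the object f rotated counter-clockwise by alpha_deg about the origin."""
    a = math.radians(alpha_deg)
    return lambda x, y: f(x * math.cos(a) + y * math.sin(a),
                          -x * math.sin(a) + y * math.cos(a))

def theta_shift(g_ref, g_cmp):
    """Column shift that maximizes the circular correlation along theta."""
    n = len(g_ref[0])
    def score(s):
        return sum(a[j] * b[(j + s) % n]
                   for a, b in zip(g_ref, g_cmp) for j in range(n))
    return max(range(n), key=score)

# Hypothetical smooth test object (two Gaussian blobs), not one of the f_i.
def obj(x, y):
    return (math.exp(-((x - 10) ** 2 + (y - 4) ** 2) / 20.0)
            + 0.5 * math.exp(-((x + 6) ** 2 + (y - 12) ** 2) / 30.0))

g_ref = log_polar(obj)                 # reference object in log-polar coordinates
g_rot = log_polar(rotate(obj, 5.0))    # comparison object rotated by 5 degrees
shift = theta_shift(g_ref, g_rot)      # shift of the correlation peak, in columns
angle = shift * 360.0 / len(g_ref[0])  # the same shift expressed in degrees
```

Here the angular step is 5° (72 columns), so the 5° rotation appears as a shift of one column while the correlation peak keeps its height. In the Cartesian representation the same rotation changes the overlap of the two objects and lowers the peak.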

In **Figure 10**, we present an amplitude reference object of the binary type *f*1(*x*, *y*) (a), the object obtained from it by the logarithmic polar transformation of coordinates *g*1(*exp*(*ρ*), *θ*) (b), and the SP-objects *φ*1,0(*x*, *y*) (c) and *χ*1,0(*exp*(*ρ*), *θ*) (d) calculated for them.

To increase the peak values of the correlation signal for recognition objects and the SNR, all amplitude objects *fi*, *gi* were converted into phase objects by the Kallman method [20]. The results were obtained for ten objects with a dimension of 512 × 512 elements by the conventional and SPO methods. We analyzed the parameters of the correlation signals and compared the correlation curves defined above. **Figures 11**–**13** show typical results of the numerical experiments.

In **Figure 11a**, the curve formed by white squares shows the dependence of the SNR of the recognition signal on the rotation angle of the object *f*1,*α* when its correlation with the reference object *f*1 (**Figure 10a**) is calculated.

**Figure 11.** Dependence of the SNR for cross-correlation signals *Icorr* on the rotation angle *α* for: *f*1 ⊗ *f*1,*α* (a), *φ*1 ⊗ *φ*1,*α* (b) the conventional recognition; *g*1 ⊗ *g*1,*α* (c), *χ*1 ⊗ *χ*1,*α* (d) the recognition with the use of the Fourier-Mellin transformation.

**Figure 12.** Autocorrelation signal: |*f*1 ⊗ *f*1| (a), |*φ*1 ⊗ *φ*1| (b) the conventional recognition; |*g*1 ⊗ *g*1| (c), |*χ*1 ⊗ *χ*1| (d) the recognition with the use of the Fourier-Mellin transformation; *α* = 0°.

The curve demonstrates the typical behavior [22, 23], namely the strong degradation of the correlation function when the comparison object is rotated around the optical axis relative to the reference object. The correlation signal with *SNR* = 3.4 dB (**Figure 12a**) obtained for the reference object degrades, already at angles *α* > 5*°*, into noise components of the cross-correlation signal, whose shape changes insignificantly under further rotation (**Figure 13a**). The curves in **Figure 11a** show this behavior for both the conventional and SPO methods. As shown in [16], the SPO-method demonstrates a faster decay of the curve with increasing distortions (in the given case, with increasing rotation angle), a δ-like shape of the recognition signal, and a higher *SNR* of about 20.2 dB for the autocorrelation (**Figure 12b**); for the angle *α* = 5*°*, the signal is absent (**Figure 13b**). The curves in **Figure 11b** show the variation in the SNR of the correlation functions with increasing rotation angle of the comparison objects for the conventional (light circles) and SPO (dark circles) methods in the Fourier-Mellin rotation invariant recognition. For *g*-objects, the SNR of the signal is about 5 dB over the whole interval of angles. For the SPO-method, the SNR of the δ-like signal changes from 20 dB for the autocorrelation (**Figure 12d**) to 11 dB; beyond that, the *SNR* of the cross-correlation signal is likewise independent of the rotation angle of the recognition objects (**Figure 13d**).

**Figure 13.** Cross-correlation signal: |*f*1 ⊗ *f*1,*α*| (a), |*φ*1 ⊗ *φ*1,*α*| (b) the conventional recognition; |*g*1 ⊗ *g*1,*α*| (c), |*χ*1 ⊗ *χ*1,*α*| (d) the recognition with the use of the Fourier-Mellin transformation; *α* = 20°.

**Figure 14.** (a) Scheme and (b) photo of a digital-optical JT-correlator: Laser beam, P, D1, RD, Bs, SLM, *PC*1, L, A, *CCD*1, *CCD*2, *PC*2—He-Ne (543 nm) laser beam, polarizer, circle and rectangle diaphragms, beam splitter, spatial light modulator HEO 1080 Pluto, Fourier lens, analyzer, a 12-bit SPU620 CCD with the BeamGage software and PC.

In view of the similar results obtained for the whole set of objects, we may say that the SPO-method is applicable to rotation invariant recognition and retains the same advantages as in the conventional recognition.

#### **3.2. Optical experiment**

To corroborate the results obtained in numerical experiments, we carried out experiments with an optical-digital recognition system (see **Figure 14a** and **b**) based on a JT-correlator. For this purpose, we obtained the autocorrelation signals for the objects *f*1(*x*, *y*), *g*1,0(*exp*(*ρ*), *θ*), *φ*1(*x*, *y*), and *χ*1,0(*x*, *y*). The cross-correlation signals were obtained for the indicated objects rotated by *α* = 5*°*.

**Figure 15.** Pattern recognition results with Fourier-Mellin transformation. Object plane: reference *g*1 (a) and recognition *g*1,5° (b) objects; calculation: autocorrelation (c) and cross-correlation (d) signals; experimental peaks: autocorrelation (e) and cross-correlation (f) in the correlation plane of the JT-correlator. The size of objects and JF-spectra is 64 × 64 and 512 × 512 elements, respectively.

The recognition in the JT-correlator includes two steps:

*Formation of the joint Fourier spectrum* (*JF*) *of compared objects*. With the help of a camera (not shown in the scheme), the object of recognition is entered into computer *PC*1. We calculate the modulus of the joint Fourier (JF) spectrum of the given object and the reference object. The JF-spectrum modulus is then supplied to the SLM.

*Production of a correlation signal*. The collimated light beam of a He-Ne laser (543 nm), after masking by a working aperture, is reflected from the SLM. In the correlation plane, a *CCD* camera registers the correlation signal, which results from the inverse Fourier transform, performed by the lens *L*, of the optical signal reflected from the SLM and redirected by the beam-splitting cube *Bs*. The result is passed to computer *PC*2 for processing.
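The two steps above can be sketched numerically in one dimension. This is a simplified model under explicit assumptions, not the experimental procedure: the second Fourier transform done optically by the lens is replaced by a digital inverse transform, the squared modulus of the joint spectrum (the classical joint power spectrum) is used, and the binary patterns, the plane size, the object positions, and the function names are hypothetical.

```python
import cmath
import math

def dft(signal, inverse=False):
    """Naive O(N^2) discrete Fourier transform (forward or inverse)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = [sum(signal[x] * cmath.exp(sign * 2j * math.pi * u * x / n)
               for x in range(n))
           for u in range(n)]
    return [v / n for v in out] if inverse else out

def jtc_output(reference, scene, n=64, ref_pos=4, scene_pos=28):
    """Correlation-plane magnitude of a 1D joint transform correlator."""
    plane = [0.0] * n
    for k, v in enumerate(reference):
        plane[ref_pos + k] = v            # reference object in the input plane
    for k, v in enumerate(scene):
        plane[scene_pos + k] = v          # recognition object placed beside it
    # Step 1: joint power spectrum of the two objects.
    joint_power = [abs(F) ** 2 for F in dft(plane)]
    # Step 2: a second transform turns it into correlation terms.
    return [abs(c) for c in dft(joint_power, inverse=True)]

SEPARATION = 28 - 4                       # distance between the two objects
reference = [1, 0, 1, 1, 0, 0, 1, 0]      # hypothetical binary pattern
other = [0, 1, 0, 0, 1, 1, 0, 1]          # a different pattern (its complement)

matched = jtc_output(reference, reference)
mismatched = jtc_output(reference, other)

# Cross-correlation terms appear around lag SEPARATION, away from the central term.
matched_peak = matched[SEPARATION]
mismatched_peak = max(mismatched[SEPARATION - 7:SEPARATION + 8])
```

For identical objects the cross-correlation term at lag `SEPARATION` equals the energy of the pattern (4 for the pattern above), while for the different pattern the strongest value in the same region is lower. The central term around lag 0 is the sum of the autocorrelations of both objects and carries no recognition information.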

**Figure 16.** Pattern recognition results with the SPO-method and the Fourier-Mellin transformation. Object plane: reference *χ*1 (a) and recognition *χ*1,5° (b) SP-objects; calculation results: autocorrelation (c) and cross-correlation (d) signals; experimental peaks: autocorrelation (e) and cross-correlation (f) in the correlation plane of the JT-correlator. The size of objects and JF-spectra is 64 × 64 and 512 × 512 elements, respectively.

#### **3.3. Results and discussion**

The results of optical experiments and their comparison (at least a qualitative one) with the results of numerical experiments allow us to evaluate the degree of applicability of the SPO-method to rotation invariant recognition. **Figure 15** shows the objects (**Figure 15a** and **b**), the calculated autocorrelation (**Figure 15c**) and cross-correlation (**Figure 15d**) signals, and the autocorrelation (**Figure 15e**) and cross-correlation (**Figure 15f**) signals registered by a camera (**Figure 14a**).

A similar experimental result was obtained with the SPO-method. The presence of cross-correlation signals is clearly seen for the conventional and studied methods (**Figures 15** (**e**, **f**) and **16** (**e**, **f**)) in the case of the Fourier-Mellin rotation invariant recognition.

The SNR of the recognition signals lies within 23–25 dB. Thus, the results (the presence of a recognition signal when the recognition object is rotated) qualitatively confirm the applicability of the SPO-method to rotation invariant correlation.

Thus, the numerical and optical experiments show the applicability of the SPO-method to Fourier-Mellin rotation invariant recognition for amplitude and half-tone objects of the binary type. The estimates of the correlation signals and the obtained dependences *SNR*(*α*) indicate that the SPO-method gives signals of δ-like shape irrespective of the type of objects. For the rotation invariant recognition, this yields a constant *SNR* that exceeds the *SNR* of the conventional method by 6 dB on average over the whole interval of rotation angles of the comparison objects. These results are typical of the whole set of reference objects.

#### **4. Conclusion**

The hypothesis that the problem of optical recognition can be solved by replacing the recognized objects with the corresponding object-dependent SP-objects has been verified in model and optical experiments. The advantages and drawbacks of this approach are determined. The conditions for recording matched filters on original photopolymeric compositions, which ensure the optimum parameters of correlation signals in the recognition of amplitude objects, are determined. Auto- and cross-correlation signals for amplitude objects of various classes and for the corresponding SP-objects are obtained by computer simulation and experimentally, and are compared for recognition with a hybrid optical-digital VL-correlator. The influence of controlled changes in the structure of objects on correlation signals in the conventional and proposed approaches is experimentally studied in optical-digital recognition systems based on the VL and JT correlators. The development of the SPO-method for rotation invariant pattern recognition with an optical-digital JT-correlator is presented.

#### **Acknowledgements**

This research was supported by the research fund from Chosun University, 2015, South Korea.

#### **Author details**

Pavel V. Yezhov1\*, Alexander P. Ostroukh2, Jin-Tae Kim3\* and Alexander V. Kuzmenko2

\*Address all correspondence to: yezhov@iop.kiev.ua and kimjt@chosun.ac.kr

1 Institute of Physics of the NAS of Ukraine, Kyiv, Ukraine

2 IC "Institute of Applied Optics" of the NAS of Ukraine, Kyiv, Ukraine

3 Department of Photonic Engineering, College of Engineering, Chosun University, Gwangju, South Korea

#### **References**


[9] Refregier Ph. Application of the stabilizing functional approach to pattern recognition filter. J. Opt. Soc. Am. A. 1994; 11(4):1243–1252. doi:10.1364/JOSAA.11.001243

[10] Rosen J, Shamir J. Application of the projection-onto-constraint-sets algorithm for optical pattern recognition. Opt. Lett. 1991; 16(10):752–754. doi:10.1364/OL.16.000752

[11] Vanderlugt A. Practical considerations for the use of spatial carrier-frequency filters. Appl. Opt. 1966; 5(11):1760–1765

[12] Weaver C, Goodman J. Technique for optically convolving two functions. Appl. Opt. 1966; 5:1248–1249. doi:10.1364/AO.5.001248

[13] Gerchberg R, Saxton W. A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik. 1972; 35:237–246

[14] Yezhov P, Kuzmenko A. Synthesized phase objects instead of real ones for optical-digital recognition systems. In: Proceedings of the SPIE Sixth International Conference on Correlation Optics (CorrOpt 2004); 16–19 September 2003; Ukraine. Chernivtsi: SPIE 5477; 2003. p. 412–421; doi:10.1117/12.559771

[15] Yezhov P, Kuzmenko A, Smirnova A, Ivanovskiy A. Synthesized phase objects used instead of real ones for optical-digital recognition systems: experiment. In: Proceedings of the SPIE Seventh International Conference on Correlation Optics 625419 (CorrOpt 2006); 14 June 2006; Ukraine. Chernivtsi: SPIE 6254; 2006. p. 349–361; doi:10.1117/12.679944

[16] Yezhov P, Kuzmenko A, Kim J, Smirnova T. Method of synthesized phase objects for pattern recognition: matched filtering. Opt. Exp. 2012; 20(28):29854–29866. doi:10.1364/OE.20.029854

[17] Ostroukh A, Butok A, Shvets R, Yezhov P, Kim J and Kuzmenko A. Method of synthesized phase objects for pattern recognition with rotation invariance. In: Proceedings of the SPIE Twelfth International Conference on Correlation Optics 98090B (CorrOpt 2015); 14 September 2015; Ukraine. Chernivtsi: SPIE 9809; 2006. p. 98090B; doi:10.1117/12.2219848

[18] Gallaher N. Method for computing kinoform that reduces image reconstruction error. Appl. Opt. 1973; 12(10):2328–2335. doi:10.1364/AO.12.002328

[19] Fitio L, Muravsky V, Stefansky A. Using phase masks for image recognition in optical correlators. In: Proceedings of the SPIE Seventh International Conference on Holography and Correlation Optics, 224 (CorrOpt 1995); 10 November 1995; Ukraine. Chernivtsi: SPIE 2647; 1995. p. 224–234; doi:10.1117/12.226700

[20] Kallman R, Goldstein D. Phase-encoding input images for optical pattern recognition. Opt. Eng. 1994; 33:1806–1811. doi:10.1117/12.171322

[21] Kuzmenko A. Laplace transform in coherent optics and its application for the realization of the Mellin transform. Autometriya. 1975; 5:22–26.

[22] Casasent D, Psaltis D. Position, rotation, and scale invariant optical correlation. Appl. Opt. 1976; 15:1795–1799. doi:10.1364/AO.15.001795


**Pattern Recognition: Applications**

#### **Automated Face Recognition: Challenges and Solutions**

Joanna Isabelle Olszewska

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/66013

#### **Abstract**

Automated face recognition (AFR) aims to identify people in images or videos using pattern recognition techniques. Automated face recognition is widely used in applications ranging from social media to advanced authentication systems. Whilst techniques for face recognition are well established, the automatic recognition of faces captured by digital cameras in unconstrained, real-world environments is still very challenging, since it involves important variations in acquisition conditions as well as in facial expressions and pose. Thus, this chapter introduces the topic of computer automated face recognition in light of the main challenges in that research field and the developed solutions and applications based on image processing and artificial intelligence methods.

**Keywords:** face recognition, face identification, face verification, face authentication, face labelling in the wild, computational face

#### **1. Introduction**

Automated face recognition (AFR) has received a lot of attention from both research and industry communities for three decades [1] due to its fascinating range of scientific challenges as well as rich possibilities of commercial applications [2], particularly in the context of biometrics/forensics/security [3] and, more recently, in the areas of multimedia and social media [4, 5].

Face recognition is the field that tries to answer the question: *'Whose face is it?'* For this purpose, people have natural abilities through their human perceptive and cognitive systems [6], whereas machines need complex systems involving multiple, advanced algorithms and/or large, adequate face databases. Studying, designing and developing such methods and technologies are the domain of automated face recognition (AFR).

AFR can be further divided into computer automated face identification and computer automated face verification. On the one hand, automated face identification consists of a one-to-many (1:N) search of a face image among a database containing many different face images in order to answer questions such as '*Is it a known face?*' [7]. On the other hand, automated face verification is a one-to-one (1:1) search that addresses the question '*Is it the face of …?*' [8].

Moreover, AFR could be the basis for solving the '*Who is in the picture?*' problem, leading to computer automated face labelling/face naming [9].

The general AFR process is illustrated in **Figure 1**. Usually, it first applies techniques addressing questions such as '*Is there a face in the image?*' (face detection) and '*Where is the face in the image?*' (face location) and next, it handles the computer‐automated recognition mechanism itself [10].

**Figure 1.** Overview of the face detection and recognition processes.

In particular, this chapter is dedicated to the 'why' and 'how' of computer‐automated face recognition in constrained and unconstrained environments. The remaining parts of this chapter are structured as follows: in Section 2, we describe AFR's current challenges, while the corresponding scientific solutions and industrial applications are presented in Sections 3 and 4, respectively. Section 5 outlines new trends and future directions for improving and evolving automated face recognition.

#### **2. Challenges**

The study and analysis of faces captured by digital cameras address a wide range of challenges, as detailed in Sections 2.1–2.7, which all have a direct impact on the computer automated face detection and recognition.

#### **2.1. Pose variations**

Head movements, which can be described by the egocentric rotation angles, i.e. pitch, roll and yaw [11], or changes in the camera's point of view [12] could lead to substantial changes in face appearance and/or shape and generate intra‐subject face variations as illustrated in **Figure 2**, making automated face recognition across pose a difficult task [13].

**Figure 2.** Illustration of pose variations around egocentric rotation angles, namely (a) pitch, (b) roll and (c) yaw.

Since AFR is highly sensitive to pose variations, pose correction is essential and could be achieved by means of efficient techniques aiming to rotate the face and/or to align it to the image's axis as detailed in reference [13].
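As an illustration of aligning a face to the image axis, the following sketch estimates the in‐plane (roll) angle from two eye centres and rotates points about the eye midpoint so that the eyes become level. It is a simplified example under the assumption that the eye coordinates are already known; it is not the specific technique of reference [13], and the function names and coordinates are hypothetical.

```python
import math


def roll_angle_degrees(left_eye, right_eye):
    """Angle of the line joining the two eye centres, relative to the image x-axis."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, centre, angle_degrees):
    """Rotate a point about a centre by the given angle."""
    a = math.radians(angle_degrees)
    x, y = point[0] - centre[0], point[1] - centre[1]
    return (centre[0] + x * math.cos(a) - y * math.sin(a),
            centre[1] + x * math.sin(a) + y * math.cos(a))


# Hypothetical eye centres (x, y) in pixel coordinates; "left" means leftmost in the image.
left_eye, right_eye = (100.0, 120.0), (160.0, 150.0)
angle = roll_angle_degrees(left_eye, right_eye)
midpoint = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)

# Rotating by the opposite angle brings both eyes onto the same horizontal line.
corrected_left = rotate_point(left_eye, midpoint, -angle)
corrected_right = rotate_point(right_eye, midpoint, -angle)
```

In practice the same rotation would be applied to every pixel of the face image; only the landmark arithmetic is shown here.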

#### **2.2. Presence/absence of structuring elements/occlusions**

The diversity in intra‐subject face images could also be due to the absence of structuring elements (see **Figure 3a**) or the presence of components such as a beard and/or moustache (see **Figure 3b**), a cap (see **Figure 3c**), sunglasses (see **Figure 3d**), etc., or to occlusions of the face (see **Figure 3e**) by background or foreground objects [14].

Thus, face images taken in an unconstrained environment often require effective recognition of faces with disguise or faces altered by accessories and/or occlusions, as handled by appropriate approaches such as texture‐based algorithms [15].

#### **2.3. Facial expression changes**

60 Pattern Recognition - Analysis and Applications

Further variability in face appearance could be caused by changes in facial expression induced by a person's varying emotional states [16], as displayed in **Figure 4**.

Hence, efficiently and automatically recognizing the different facial expressions is important for both the evaluation of emotional states and automated face recognition. In particular, human expressions are composed of macro‐expressions, which could express, e.g., anger, disgust, fear, happiness, sadness or surprise, and of other involuntary, rapid facial patterns, i.e. micro‐expressions; all these expressions generate non‐rigid motion of the face. Such facial dynamics can be computed, e.g., by means of the dense optical flow field [17].

**Figure 4.** Illustration of varying facial expressions that reflect emotions such as (a) anger, (b) disgust, (c) sadness or (d) happiness.

#### **2.4. Ageing of the face**

Changes in face appearance could also be caused by the ageing of the human face, which could affect the entire AFR process if the time between image captures is significant [18], as illustrated in **Figure 5**.

**Figure 5.** Illustration of the human ageing process, where the same person has been photographed (a) at a younger age and (b) at an older age, respectively.

To overcome the face ageing issue in AFR, methods need to properly take into account facial ageing patterns [18]. Indeed, over time, not only do face characteristics such as shape or lines change [19], but other aspects change as well, e.g. hairstyle [20].

#### **2.5. Varying illumination conditions**

Large variations in illumination could degrade the performance of AFR systems. Indeed, at low lighting levels of the background or foreground, face detection and recognition are much harder to perform [21], since shadows could appear on the face and/or facial patterns could be (partially) indiscernible. On the other hand, excessive lighting could lead to over‐exposure of the face and (partially) indiscernible facial patterns (see **Figure 6**).

Robust automated face detection and recognition under (close‐to‐) extreme or widely varying lighting levels rely on image‐processing techniques such as illumination normalization, e.g. through histogram equalization [22], or on machine‐learning methods involving the global average intensity of the image [21].

**Figure 6.** Illustration of camera lighting variations, leading to (a) over‐exposure of the face, (b) deep shadows on the face or (c) partial backlight.
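The histogram equalization mentioned above as an illustration of illumination normalization can be sketched as follows. This is a textbook formulation for 8‐bit greyscale images, written with NumPy; it is not necessarily the variant used in reference [22], and the function name and toy image are assumptions made for illustration.

```python
import numpy as np


def equalize_histogram(image):
    """Histogram equalization for an 8-bit greyscale image (2D uint8 array)."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = np.cumsum(hist)
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest occupied level
    total = image.size
    if total == cdf_min:               # constant image: nothing to stretch
        return image.copy()
    # Map each grey level so that the output CDF is approximately linear.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]


# A dark, low-contrast toy image: all values lie between 50 and 80.
dark = np.array([[50, 50, 60],
                 [60, 70, 80]], dtype=np.uint8)
equalized = equalize_histogram(dark)   # values now span the full 0..255 range
```

The mapping is monotonic, so the ordering of grey levels is preserved while the contrast is stretched over the full range.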

#### **2.6. Image resolution and modality**

Other usual factors influencing AFR performance are related to the quality and resolution of the face image and/or to the set‐up and modalities of the digital equipment capturing the face [23]. To this end, the ISO/IEC 19794‐5 standard [24] has been developed to specify scene and photographic requirements as well as the face image format for AFR, especially in the context of biometrics. However, real‐world situations of face image acquisition imply the use of different photographic hardware, including one or several cameras which could be omnidirectional or pan‐tilt‐zoom [25], and which could include, e.g. wide‐field sensors [25], photometric stereo [26], etc. Cameras could work in the range of visible light or use infra‐red sensors, leading to multiple modalities for AFR [6]. Hence, faces acquired in real‐world conditions lead to further AFR challenges.

**Figure 7.** Illustration of variations of the image scale and resolution, with (a) a large‐scale picture, (b) a small‐scale picture and (c) a low‐resolution picture.

For example, as shown in **Figure 7**, in some situations, a face could be captured at a distance, resulting in a smaller face region compared to the one in a large‐scale picture. On the other hand, some digital cameras could have a low resolution [27] or even a very low resolution [28], i.e. below 10 × 10, leading to poor‐quality face images on which AFR is very difficult to perform. To deal with this limitation, solutions have been proposed to reconstruct a high‐resolution image from the low‐resolution one [28] using the super‐resolution method [29, 30].

#### **2.7. Availability and quality of face datasets**

Each AFR technology requires an available, reliable and realistic face database in order to perform the 1:N or 1:1 face search within it (see **Figure 1**). Hence, the quality of a face dataset, such as its completeness (e.g. including variations in facial expressions, in facial details, in illuminations, etc.) and accuracy (e.g. containing ageing patterns, etc.), as well as its characteristics (e.g. varying image file format and colour/grey level, face resolution, constrained/unconstrained environment, etc.), are crucial to the AFR process [31]. Moreover, when dealing with face data, people's consent and privacy should be respected as AFR systems should comply with the Data Protection Act 2010 [32].

For research purposes, several face databases have been developed and are publicly available. Well‐established, online face databases are as follows:

**•** ORL [33] is a 400‐picture dataset of 40 distinct subjects, in portable grey map (*pgm*) format and with a 92 × 112 pixel resolution, 8‐bit grey level. Men and women's faces are taken against a dark homogeneous background, under varying illumination conditions. The subjects are in up‐right, frontal position, with variations in face expressions, facial details and poses within ±20% in yaw and roll.

**•** Caltech Faces [34] dataset consists of 450 *jpeg* images with a resolution of 896 × 592 pixels. Each image shows the frontal view of a face (single pose) of one out of 27 unique persons, under different lighting, expressions and backgrounds.

**•** The Face Recognition Technology (FERET) [35] database has been built with 14,126 face images from 1199 individuals, defining sets of 5–11 greyscale images per person. Each set contains mugshots with different facial expressions and facial details, acquired using various cameras and varying lighting.

**•** BioID Face database [36] has 1521 frontal face images of 23 people. Images of 384 × 286 pixel resolution are in *pgm* format and have been captured in real‐world conditions, i.e. with a large variety of illumination, background and face size.

**•** Yale face database [37] has 165 greyscale, *gif* images of 15 individuals. There are 11 images per subject, one per different facial expression or configuration, i.e. left/centre/right‐light, with or without glasses and with different expressions.

**•** Caltech 10,000 web faces [38] has collected 10,524 human faces of various resolutions and in different settings (e.g. portrait images, group of people, etc.) from *Google Image*. Coordinates of eyes, nose and the centre of the mouth for each frontal face are provided in order to be used as ground truth for face detection algorithms, or to align and/or crop the human faces for AFR.

Some databases contain both 2D and 3D face data, e.g. the Face Recognition Grand Challenge (FRGC) dataset [39], which recorded some 50,000 controlled and uncontrolled images from 4003 subject sessions.

Other datasets have multiple modalities, such as the XM2VTSDB multi‐modal face database [40], which is the Extended M2VTS database. It is a large, multi‐modal database captured onto high‐quality, digital video. It contains four recordings, each with a speaking head shot and a rotating head shot, of 295 subjects taken over a period of 4 months. This database includes high‐quality colour images, 32 kHz 16‐bit sound files, video sequences and also a 3D model.

Another multi‐modal database is the Surveillance Cameras Face (SCFace) [41] dataset. It has recorded 4160 static human faces of 130 subjects, in the visible and infrared spectrum, in an unconstrained indoor environment, using a multi‐camera set‐up consisting of five video‐surveillance cameras whose various qualities mimic real‐world conditions.

Recent developments of face databases focus on capturing faces in the wild, i.e. in unconstrained environments. For example, Face Detection Data Set and Benchmark (FDDB) [42] is a dataset of 2845 images, both greyscale and colour ones, with 5171 faces in the wild, which could include occlusions, pose variations, low resolution and out‐of‐focus faces.

Labelled Faces in the Wild (LFW) [43] database is a popular dataset for studying multi‐view faces in an unconstrained environment. It has recorded 13,233 foreground face images, other faces in the images being treated as background. It covers 5749 different individuals, who could have one or more images in the database, and presents variations in pose, lighting, expression, background, race, ethnicity, age, gender, clothing, hairstyles, camera quality, colour saturation, focus, etc. Images have a 250 × 250 pixel resolution and are in *jpeg* format; they are mostly in colour, although a few are greyscale only.

Some other available face datasets have been designed for specific purposes. Hence, the Spontaneous MICro‐expression database (SMIC) [44] is used for facial micro‐expression recognition, while the Acted Facial Expression in the Wild (AFEW) database [45], which has semi‐automatically collected face images with acted emotions from movies, is dedicated to macro‐expression recognition in close‐to‐real conditions. On the other hand, the FG‐NET Ageing database (FG‐NET) [46] could be applied for age estimation, age‐invariant face recognition and age progression.

#### **3. Solutions**


Major pattern recognition techniques as well as main machine‐learning methods used for AFR systems are presented in Section 3.1, while classic approaches for AFR in still images or video databases/live video streams are mentioned in Section 3.2.

#### **3.1. Face recognition systems**

Most AFR systems consist of a two‐step process (see **Figure 8**) based first on facial feature extraction, as explained in Section 3.1.1, and second on facial feature classification/matching against an available face database, as mentioned in Section 3.1.2.

#### *3.1.1. Feature extraction*

Facial features represent the face in a codified way that is computationally efficient for further processes such as matching, classification or other machine‐learning techniques, in order to perform AFR. In addition, computing facial features in an image could serve to detect a face and to locate it within the image, as illustrated in **Figure 9**.

**Figure 8.** Schematic representation of the automated face recognition system.

**Figure 9.** Face location via (a) a bounding box and (b) an ellipse.

Facial feature representations could be of different nature from sparse to dense ones, and could be focused on face appearance, face texture or face geometry [15].

**Figure 10.** Results of facial feature modelling using different approaches, e.g. (a‐b) Haar‐like features; (c) Local Binary Patterns (LBP); (d) Edge map; (e) Active shape; (f) SIFT points.

Commonly computed facial features are Haar‐like features [47] (**Figure 10(a, b)**); local binary patterns (LBP) [48] (**Figure 10(c)**), which have been extended to local directional patterns (LDP) [49] for micro‐expression recognition in particular; edge maps (**Figure 10(d)**) and their extension to line edge maps (LEM) [50]; active shape or active contours [51] (**Figure 10(e)**); SIFT points [52] (**Figure 10(f)**), etc.
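As an example of how one of these features can be computed, the sketch below evaluates a two‐rectangle Haar‐like feature using an integral image (summed‐area table), which is the usual way of making rectangle sums cheap. The function names and the toy image are illustrative assumptions; the sketch is not taken from reference [47].

```python
import numpy as np


def integral_image(image):
    """Summed-area table with an extra zero row and column at the top-left."""
    ii = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = image.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, using four look-ups in the integral image."""
    bottom, right = top + height, left + width
    return int(ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left])


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Two-rectangle feature: sum of the left half minus sum of the right half."""
    half = width // 2
    return box_sum(ii, top, left, height, half) - box_sum(ii, top, left + half, height, half)


# Toy 4 x 4 image: bright left half (10) next to a dark right half (2).
image = np.array([[10, 10, 2, 2]] * 4, dtype=np.uint8)
ii = integral_image(image)
feature = haar_two_rect_horizontal(ii, top=0, left=0, height=4, width=4)
```

A large absolute response indicates a strong vertical edge inside the window; once the integral image is built, every rectangle sum costs four look‐ups regardless of its size.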

The detected facial features, e.g. SIFT points, usually correspond to some or all elements of the set of facial anthropometric landmarks, i.e. facial fiducial points (FPs) (see **Figure 11**), which are defined as follows: FP1: top of the head; FP2: right eyebrow right corner; FP3: right eyebrow left corner; FP4: left eyebrow right corner; FP5: left eyebrow left corner; FP6: right eye right corner; FP7: right eye centre of pupil; FP8: right eye left corner; FP9: left eye right corner; FP10: left eye centre of pupil; FP11: left eye left corner; FP12: nose right corner; FP13: nose centre bottom; FP14: nose left corner; FP15: mouth right corner; FP16: mouth left corner; FP17: chin corner; FP18: right ear top corner; FP19: right ear bottom corner; FP20: left ear top corner; and FP21: left ear bottom corner [53].

**Figure 11.** Illustration of the 21 facial landmarks.

Computer automated face recognition relies on facial features, in the same way that forensic examiners focus their attention not only on the overall similarity of two faces regarding their shape, size, etc. [54], but also on morphological comparisons region by region, e.g. nose, mouth, eyebrows, etc. [53]. Some AFR methods also evaluate discriminative characteristics such as the distance from the mouth to the nose, nose to eyes, mouth to eyes, etc. [55]. This adds robustness to AFR systems when some facial patterns are modified over time or occluded.

Once the face is detected/located and the facial features are extracted, actions to crop the face, to correct its alignment by rotating it, etc., could be performed to address the challenges mentioned in Section 2, before passing the facial features into the next stage described in Section 3.1.2.

#### *3.1.2. Feature classification/matching*

For the recognition stage of the face recognition process, classification is often used, as shown in **Figure 12**. Classification is a machine‐learning technique [56] whose task is first to learn and then to apply a function that maps the facial features of an individual to one of the predefined class labels, i.e. class 1 (face of the individual) or class 2 (not the face of the individual), leading in this case to a binary classifier. Classifiers could be applied to the entire set of the extracted facial features or to some specific face attributes, e.g. gender, age, race, etc. [57]. More recently, methods like neural networks have been used as classifiers [58].

**Figure 12.** Overview of the model computation.

On the other hand, some AFR systems use the matching technique that could be applied on facial geometric features or templates [59]. This approach is also useful for multimodal face data [60].

#### **3.2. Examples of methods**

Among hundreds of techniques developed in this field [1–10], Sections 3.2.1–3.2.4 explain briefly some well‐established methods for automated face recognition.

#### *3.2.1. Eigenfaces*

The eigenface approach [61] is a very successful AFR method. It involves pixel intensity features and applies principal component analysis (PCA) to the distribution of face images. The resulting *eigenvectors* form a set of features characterizing the variations between faces, with each face image contributing more or less to each eigenvector. Thus, an eigenvector can be seen as a ghostly face, or *eigenface*. A test face is recognized by applying the nearest‐neighbour technique to the projection of the probe face in the face space [13]. Fisherfaces extend the eigenface approach by using linear discriminant analysis (LDA) instead of PCA [62, 63].
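The eigenface pipeline described above (PCA on flattened face images, followed by nearest‐neighbour matching in the face space) can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the faces are already cropped, aligned and flattened; the random arrays merely stand in for real images, and all names are hypothetical rather than taken from reference [61].

```python
import numpy as np


def train_eigenfaces(faces, n_components):
    """faces: (n_samples, n_pixels) array of flattened greyscale training images."""
    mean_face = faces.mean(axis=0)
    centred = faces - mean_face
    # Rows of vt are the principal directions (eigenfaces), sorted by variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    eigenfaces = vt[:n_components]
    weights = centred @ eigenfaces.T      # training faces projected into face space
    return mean_face, eigenfaces, weights


def recognise(probe, mean_face, eigenfaces, weights, labels):
    """Nearest-neighbour match of the probe's projection in face space."""
    probe_weights = (probe - mean_face) @ eigenfaces.T
    distances = np.linalg.norm(weights - probe_weights, axis=1)
    return labels[int(np.argmin(distances))]


# Random 8 x 8 "images" stand in for real face data in this toy example.
rng = np.random.default_rng(0)
faces = rng.random((8, 64))
labels = [f"subject_{i}" for i in range(8)]

mean_face, eigenfaces, weights = train_eigenfaces(faces, n_components=4)
probe = faces[3] + 0.01 * rng.standard_normal(64)   # slightly perturbed known face
predicted = recognise(probe, mean_face, eigenfaces, weights, labels)
```

Using the singular value decomposition of the centred data avoids forming the pixel‐by‐pixel covariance matrix explicitly, which matters when images have many more pixels than there are training faces.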

#### *3.2.2. Active appearance models*

The active appearance model (AAM) [64] combines shape and texture features; thus it is slower but more robust for AFR than active shape models (ASM). AAM is built as a multi‐resolution model based on a Gaussian‐image pyramid. For each level of the pyramid, a separate texture model is computed using 400 face images. Each face is labelled with 68 points around the main features, and the facial region is sampled by c. 10,000 intensity values. AFR is performed by matching the test face with the AAM, following a multi‐resolution approach that improves speed and robustness of this method [64].

#### *3.2.3. Local binary patterns*

In reference [48], local binary patterns (LBP), which are texture features, were introduced for AFR. The face image is divided into independent regions, where the LBP operator codifies every pixel of each region by thresholding the 3 × 3 neighbourhood of the pixel against the centre pixel value and reading the result as a binary number; a local texture descriptor is then built from the histogram of the codes for each face region. A global description of the face is formed by concatenating the local descriptors. Next, the nearest‐neighbour classifier is used [48]. The LBP approach has been widely adopted for AFR, and several enhancements have been proposed, e.g. the local directional patterns (LDP) [49].
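The LBP steps described above can be sketched as follows. The example implements the basic 3 × 3 operator and a regional histogram descriptor; the neighbour ordering, the use of "greater than or equal" for thresholding, the 2 × 2 grid and the function names are assumptions made for illustration and may differ from the exact variant in reference [48].

```python
import numpy as np


def lbp_codes(image):
    """Basic 3 x 3 LBP code for every interior pixel of a 2D greyscale array."""
    img = image.astype(np.int32)
    centre = img[1:-1, 1:-1]
    # Eight neighbours, visited clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(centre)
    rows, cols = img.shape
    for bit, (dr, dc) in enumerate(offsets):
        neighbour = img[1 + dr:rows - 1 + dr, 1 + dc:cols - 1 + dc]
        codes += (neighbour >= centre).astype(np.int32) << bit
    return codes


def lbp_descriptor(image, grid=(2, 2)):
    """Concatenate per-region 256-bin histograms of LBP codes."""
    codes = lbp_codes(image)
    histograms = []
    for band in np.array_split(codes, grid[0], axis=0):
        for region in np.array_split(band, grid[1], axis=1):
            histograms.append(np.bincount(region.ravel(), minlength=256))
    return np.concatenate(histograms)


# Worked example on a single 3 x 3 patch with centre value 5.
patch = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
code = int(lbp_codes(patch)[0, 0])      # neighbours 6, 9, 8, 7 are >= 5: bits 3 to 6

# Descriptor of a toy 10 x 10 image: 4 regions x 256 bins.
toy = np.arange(100).reshape(10, 10)
descriptor = lbp_descriptor(toy)
```

Two faces can then be compared by applying a nearest‐neighbour rule to their concatenated descriptors.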

#### *3.2.4. SIFT*

The discriminative deep metric‐learning (DDML) [52] approach for AFR in unconstrained environments uses facial features such as SIFT descriptors and trains a deep neural network as a classifier to learn a Mahalanobis distance metric in order to simultaneously maximize inter‐class face variations and minimize intra‐class face variations [52].
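For readers unfamiliar with Mahalanobis‐type metrics, the sketch below shows the linear special case: a distance of the form d(x, y) = sqrt((x − y)ᵀ M (x − y)) with M = WᵀW is the Euclidean distance after mapping both vectors by W. The matrix `w` here is fixed by hand purely for illustration; it is an assumption of this example, and the learned deep mapping of reference [52] is not reproduced.

```python
import numpy as np


def mahalanobis_distance(x, y, w):
    """Distance under M = w.T @ w, i.e. the Euclidean distance after mapping by w."""
    diff = w @ (x - y)
    return float(np.sqrt(diff @ diff))


x = np.array([1.0, 0.0])
y = np.array([0.0, 0.0])

# With the identity mapping the metric reduces to the ordinary Euclidean distance.
d_euclidean = mahalanobis_distance(x, y, np.eye(2))

# A hand-picked mapping that stretches the first feature dimension by a factor of 2.
w = np.array([[2.0, 0.0],
              [0.0, 1.0]])
d_stretched = mahalanobis_distance(x, y, w)
```

Metric learning chooses the mapping so that faces of the same person end up close together and faces of different people end up far apart.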

#### **4. Applications**

Nowadays, industry integrates cutting‐edge face recognition research into the development of the latest technologies for commercial applications such as those mentioned in Sections 4.1–4.2.

#### **4.1. Security**

Face recognition is one of the most powerful processes in biometric systems [8] and is extensively used for security purposes in tracking and surveillance [65, 66], attendance monitoring, passenger management at airports, passport de‐duplication, border control and high security access control as developed by companies like *Aurora* [67].

AFR is applied in forensics for face identification [68], face retrieval in still image databases or CCTV sequences [69], or for facial sketch recognition [70]. It could also help law enforcement through behaviour and facial expression observation [71], lie detection [72], lip tracking and reading [73].

Moreover, AFR is now used in the context of 'Biometrics as a Service' [74], within cloud‐based, online technologies requiring face authentication for trustworthy transactions. For example, *MasterCard* developed an app which uses selfies to secure payments via mobile phones [75]. In this *MasterCard* app, AFR is enhanced by facial expression recognition, as the application requires the consumer to blink to prove that s/he is human.

#### **4.2. Multimedia**

In everyday life, AFR engines are embedded in a number of multi‐modal applications such as aids for buying glasses or for digital make‐up and other face sculpting or skin smoothing technologies, e.g. designed by *Anthropics* [76].

In social media, many collaborative applications within *Facebook* [77], *Google* [78] or *Yahoo!* [79] call upon AFR. Applications such as *Snapchat* require AFR on mobile [80]. With 200 million users, half of whom engage on a daily basis [81], *Snapchat* is a popular image messaging and multimedia mobile application, where 'snaps', i.e. a photo or a short video, can be edited to include filters and effects, text captions and drawings. *Snapchat* has features such as the 'Lens', which allows users to add real‐time effects to their snaps by using AFR technologies, and 'Memories', which searches content by date or using local recognition systems [82].

Other multimedia applications use AFR, e.g. in face naming to generate automated headlines in *Video Google* [83], in face expression tracking for animations and human‐computer interfaces (HCI) [84], or in face animation for socially aware robotics [85]. Companies such as *Double Negative Visual Effects* [86] or *Disney Research* [87] also propose AFR solutions for face synthesis and face morphing for film and game visual effects.

#### **5. Conclusions**

Since constraints shape the path for innovative solutions, we focused this chapter on scientific and technical challenges brought by computer automated face recognition, and we explained current solutions as well as potential applications. Moreover, there are a number of challenges ahead and plenty of room for innovations in this field of automated face recognition. In particular, three emerging directions are discussed in Sections 5.1–5.3.

#### **5.1. Deep face**

On the one hand, the proliferation of mobile devices such as smartphones and tablets, which are available to consumers worldwide and which allow users to easily record digital pictures, and on the other hand, the rapid growth of mobile and web applications, which manipulate and store thousands of pictures, have paved the way to Big Data and, among others, to the necessity of analysing large‐scale face databases. This phenomenon has raised questions about AFR technology scalability and computational power, and it has led to the development of a new AFR approach called deep face recognition [88], which involves deep‐learning techniques using convolutional neural networks [89], well suited to big datasets [90]. Indeed, deep face methods use large databases for training their models since, by biomimetics, they rely on the familiarity concept [91], which is based on the fact that the more familiar people are with a person's face, the more easily they recognize it, even in complex situations like occlusions or low resolution.

Moreover, the recent development of the deep face approach has benefited from progress in parallel computing tools for acceleration and from the enhancement of distributed computing techniques for scalability. In particular, for deep face recognition, graphics processing units (GPUs), which are specialized processors for real‐time, high‐resolution 3D graphics, are used as highly parallel multi‐core systems for big data [92], together with the Compute Unified Device Architecture (CUDA), which provides a simple and powerful platform [93], making it easier for specialists in parallel programming to utilize GPU resources without advanced skills in graphics programming. Since the iterative computation involved consists of local parallel processing, a CUDA implementation is employed to reduce the computation time of the AFR system [93]. However, deep face‐based methods themselves generate further challenges, e.g. face frontalization [94], that is, the process of synthesizing frontal facing views of faces appearing in single unconstrained photos, in order to boost AFR performance within intelligent systems.

#### **5.2. Wild face**

Another challenge, which has emerged with the large amount of visual data captured 'in the wild', i.e. in an unconstrained environment, by commercial cameras, is the automated recognition of faces in the wild. It involves enhancing AFR methods [95] so that they deal efficiently with complex, real‐world backgrounds [96], multiple‐face scenes [51], skin‐colour variations [97] and gender variety [98], as well as with inherent challenges such as image quality, resolution, illumination or facial pose correction [23, 27, 99].

#### **5.3. Dynamic face**

In recent years, handling facial dynamics efficiently has become crucial for AFR systems, because people have recorded a large number of faces as still digital images, e.g. selfies, or as video streams, e.g. CCTV sequences or online movies. Indeed, on the one hand, the variations in facial micro/macro expressions [100], which generate fast facial dynamics, and processes such as ageing, which is an extremely slow dynamic problem since the face evolves over long periods of time [18], all have an impact on AFR techniques. On the other hand, face acquisition in videos intrinsically creates facial dynamics due to camera motion, changes in point of view, and head movements or pose variations. Such situations require AFR engines to perform in real time [84], to apply image/frame pre‐processing such as face alignment [101], to cope with intra‐class variations and inter‐class similarities [102], and to process single or multiple camera views [41] or synthesize a 3D face model from a single camera [103], leading to the wider study of the computational face.

#### **Author details**

Joanna Isabelle Olszewska

Address all correspondence to: joanna.olszewska@ieee.org

School of Computing and Technology, University of Gloucestershire, Cheltenham, UK

#### **References**


[1] Sirovich L., Kirby M. Low‐dimensional procedure for the characterization of human faces. *Journal of the Optical Society of America A ‐ Optics, Image Science and Vision.* 1987. 4(3):519–524.

[2] Zhao W., Chellappa R., Rosenfeld A., Phillips P.J. Face recognition: A literature survey. *ACM Computing Surveys.* 2003. 35(4):399–458.

[3] Kisku D. R., Gupta P., Sing J. K., editors. *Advances in Biometrics for Secure Human Authentication and Recognition*. CRC Press, Taylor and Francis. 2013. 450 p.

[4] Face Detection & Recognition Online Resources. Maintained by Dr. R. Frischholz [Internet]. 2016. Available from: https://facedetection.com/ [Accessed: 2016‐07‐07]

[5] Face Recognition Online Resources. Information Pool for the Face Recognition Community. Maintained by Dr. M. Grgic [Internet]. 2016. Available from: http://www.face‐rec.org/ [Accessed: 2016‐07‐07]

[6] Chellappa R., Wilson C.L., Sirohey S. Human and machine recognition of faces: A survey. *Proceedings of the IEEE.* 1995. 83(5):705–740.

[7] Iosifidis A., Gabbouj M. Scaling‐up class‐specific kernel discriminant analysis for large‐scale face verification. *IEEE Transactions on Information Forensics and Security*. 2016. 11(11):2453–2465.

[8] Almudhahka N., Nixon M., Hare J. Human face identification via comparative soft biometrics. In: *Proceedings of the IEEE International Conference on Identity, Security and Behavior Analysis (ISBA).* Sendai, JP, 29 Feb–02 Mar 2016. pp. 1–6.

[9] Berg T.I., Berg A.C., Edwards J., Forsyth D.A. Who's in the Picture? In: *Proceedings of the Neural Information Processing Systems Conference (NIPS).* Springer. 2004. pp. 137–144.

[10] Torres L. Is there any hope for face recognition? In: *Proceedings of the IEEE International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS)*. 2004.

[11] Murphy‐Chutorian E., Trivedi M.M. Head pose estimation in computer vision: A survey. *IEEE Transactions on Pattern Analysis and Machine Intelligence*. 2009. 31(4):607–626.

[12] Zhang X., Gao Y. Face recognition across pose: A review. *Pattern Recognition*. 2009. 42(11):2876–2896.


[26] Kautkar S.N., Atkinson G.A., Smith M.L. Face recognition in 2D and 2.5 D using ridgelets and photometric stereo. *Pattern Recognition.* 2012. 45(9):3317–3327.

[27] Mudunuri S.P., Biswas S. Low resolution face recognition across variations in pose and illumination. *IEEE Transactions on Pattern Analysis and Machine Intelligence.* 2016. 38(5):1034–1040.

[28] Zou W.W.W., Yuen P.C. Very low resolution face recognition problem. *IEEE Transaction on Image Processing*. 2012. 21(1):327–340.

[29] Huang H., He H. Super‐resolution method for face recognition using non‐linear mappings on coherent features. *IEEE Transaction on Neural Networks*. 2011. 22(1):121–130.

[30] Li H., Lam K.M. Guided iterative back‐projection scheme for single image super‐resolution. In: *Proceedings of the IEEE Global High Tech Congress on Electronics (GHTCE).* 2013. pp. 175–180.

[31] Gross R. Face databases. In: Li S.Z., Jain A.K., editors. *Handbook of Face Recognition.* Springer‐Verlag. 2005. pp. 301–327.

[32] Senior A.W., Pankanti S. Privacy protection and face recognition. In: Li S.Z., Jain A.K., editors. *Handbook of Face Recognition*. 2nd ed. Springer. 2011. pp. II.5–II.21.

[33] ORL. Face Database. AT&T Laboratories Cambridge [Internet]. 1994. Available from: www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html [Accessed: 2016‐07‐07]

[34] Caltech Faces. A Public Dataset for Face Recognition. CalTech University. USA [Internet]. 1999. Available from: http://www.vision.caltech.edu/html‐files/archive.html [Accessed: 2016‐07‐07]

[35] Phillips P.J., Moon H., Rizvi S.A., Rauss P.J. The FERET evaluation methodology for face‐recognition algorithms. *IEEE Transactions on Pattern Analysis and Machine Intelligence*. 2000. 22(10):1090–1104.

[36] Jesorsky O., Kirchberg K., Frischholz R. Face detection using the Hausdorff distance. In: Bigun J., Smeraldi F., editors. *Audio and Video based Person Authentication*. LNCS Springer. 2001. pp. 90–95.

[37] Yale Face Database. Yale University. Connecticut, USA [Internet]. 2001. Available from: http://vision.ucsd.edu/yale\_face\_dataset\_original/yalefaces.zip [Accessed: 2016‐07‐07]

[38] Caltech 10000 Web Faces. Human Faces Collected from Google Image Search. CalTech University. Pasadena, California, USA [Internet]. 2005. Available from: www.vision.caltech.edu/Image\_Datasets/Caltect\_10K\_WebFaces [Accessed: 2016‐07‐07]

[39] Phillips P.J., Flynn P.J., Scruggs T., Bowyer K.W., Chang J., Hoffman K., Marques J., Min J., Worek W. Overview of the face recognition grand challenge. In: *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).* 2005. pp. I.947–I.954.

[40] Matas J., Hamouz M., Jonsson K., Kittler J., Li Y., Kotroupolous C., Tefas A., Pitas I., Tan T., Yan H., Smeraldi F., Bigun J., Capdevielle N., Gerstner W., Ben‐Yacoub S., Abduljaoued Y., Mayoraz E. Comparison of face verification results on the XM2VTS database. In: *Proceedings of the IEEE International Conference on Pattern Recognition (ICPR).* 2000. pp. 858–863.

[41] Grgic M., Delac K., Grgic S., Klimpak B. SCface ‐ Surveillance cameras face database. *Multimedia Tools Applications.* 2011. 51:863–879.


[53] Tome P., Fierrez J., Vera‐Rodriguez R., Ramos D. Identification using face regions: Application and assessment in forensic scenarios. *Forensic Science International.* 2013. 233:75–83.

[54] Tanaka J.W., Farah M.J. Parts and wholes in face recognition. *Quarterly Journal of Experimental Psychology, Section A: Human Experimental Psychology.* 1993. 46(2):225–245.

[55] Guo J.M., Tseng S.H., Wong K.S. Accurate facial landmark extraction. *IEEE Signal Processing Letters*. 2016. 23(5):605–609.

[56] Mitchell T. *Machine Learning.* McGraw Hill. 1997.

[57] Kumar N., Berg A., Belhumeur P.N., Nayar S.K. Attribute and smile classifiers for face verification. In: *Proceedings of the IEEE International Conference on Computer Vision (ICCV)*. 2009. pp. 365–372.

[58] Gallego‐Jutgla E., de Ipin K. L., Marti‐Puig P., Sole‐Casals J. Empirical mode decomposition‐based face recognition system. In: *Proceedings of the International Conference on Bio‐Inspired Systems and Signal Processing*. 2013. pp. 445–450.

[59] Brunelli R., Poggio T. Face recognition: Features versus Templates. *IEEE Transactions on Pattern Analysis and Machine Intelligence*. 1993. 15(10):1042–1052.

[60] Sun Y., Nasrollahi K., Sun Z., Tan T. Complementary cohort strategy for multimodal face pair matching. *IEEE Transactions on Information Forensics and Security*. 2016. 11(5):937–950.

[61] Turk J., Pentland A. Face recognition using eigenfaces. In: *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)*. 1991. pp. 586–591.

[62] Ruiz‐del‐Solar J., Navarrete P. Eigenspace‐based face recognition: A comparative study of different approaches. *IEEE Transactions on Systems, Man and Cybernetics, Part C*. 2005. 35(3):315–325.

[63] Belhumeur P.N., Hespanha J.P., Kriegman D.J. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. *IEEE Transactions on Pattern Analysis and Machine Intelligence*. 1997. 19(7):711–720.

[64] Cootes T.F., Edwards G.J., Taylor C.J. Active appearance model. *IEEE Transactions on Pattern Analysis and Machine Intelligence*. 2001. 23(6):681–685.

[65] Olszewska J. I., De Vleeschouwer C., Macq B. Multi‐Feature Vector Flow for Active Contour Tracking. In: *Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. 2008. pp. 721–724.

[66] Uiboupin T., Rasti P., Anbarjafari G., Demirel H. Facial image super resolution using sparse representation for improving face recognition in surveillance monitoring. In: *Proceedings of the IEEE Signal Processing and Communication Application Conference (SIU).* 2016. pp. 437–440.

[67] Aurora Computer Services. Provider of Biometric Solutions [Internet]. 2016. Available from: http://auroracs.co.uk/ [Accessed: 2016‐07‐07]


[81] Temelkov I. Why Facial Recognition will Change Marketing Forever [Internet]. 2016. Available from: http://curatti.com/facial‐recognition/ [Accessed: 2016‐07‐07]

[82] Snapchat. Image Messaging & Multimedia Mobile Application [Internet]. 2016. Available from: https://www.snapchat.com [Accessed: 2016‐07‐07]

[83] Everingham M., Sivic J., Zisserman A. "Hello! My name is … Buffy ‐ Automatic naming of characters in TV video". In: *Proceedings of the British Machine Vision Conference (BMVC).* 2006. pp. 898–908.

[84] Tasli H.E., den Uyl T.M., Boujut H., Zaharia T. Real‐time facial character animation. In: *Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition (AFGR).* 2015.

[85] Han M.J., Lin C.H., Song K.T. Robotic emotional expression generation based on mood transition and personality model. *IEEE Transactions on Cybernetics.* 2013. 43(4):1290–1303.

[86] Double Negative Visual Effects. Provider of Visual Effects for Films [Internet]. 2016. Available from: http://www.dneg.com/ [Accessed: 2016‐07‐07]

[87] Disney Research. Provider of Advanced Visual Technologies [Internet]. 2016. Available from: https://www.disneyresearch.com/ [Accessed: 2016‐07‐07]

[88] Parkhi O.M., Vedaldi A., Zisserman A. Deep Face recognition. In: *Proceedings of the British Machine Vision Conference (BMVC).* 2015. pp. 41.1–41.12.

[89] Hu G., Yang Y., Yi D., Kittler J., Christmas W., Li S.Z., Hospedales T. When face recognition meets with deep learning: An evaluation of convolutional networks for face recognition. In: *Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCV).* 2015.

[90] Sun Y., Wang X., Tang X. Deep learning face representation from predicting 10,000 classes. In: *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).* 2014. pp. 1891–1898.

[91] Sinha P., Balas B., Ostrovsky Y., Russell R. Face recognition by humans: 19 results all computer vision researchers should know about. *Proceedings of the IEEE*. 2006. 94(11):1948–1962.

[92] Fung J., Mann S. Using graphics devices in reverse: GPU‐based image processing and compute revision. In: *Proceedings of the IEEE International Conference on Multimedia & Expo (ICME)*. 2008. pp. 9–12.

[93] Owens J. D., Luebke D., Govindaraju N., Harris M., Kruger J., Lefohn A. E., Purcell T. J. A survey of general‐purpose computation on graphics hardware. *Computer Graphics Forum*. 2007. 26(1):80–113.

[94] Sagonas C., Panagakis Y., Zafeiriou S., Pantic M. Robust statistical face frontalization. In: *Proceedings of the IEEE International Conference on Computer Vision (ICCV).* 2015. pp. 3871–3879.


#### **Histogram-Based Texture Characterization and Classification of Brain Tissues in Non-Contrast CT Images of Stroke Patients**

Kenneth K. Agwu and Christopher C. Ohagwu

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/65349

#### **Abstract**

This chapter describes histogram-based texture characterization and classification of brain tissue in CT images of stroke patients using a case study, and explores texture analysis in medical imaging. In the case study, two radiologists independently inspected non-contrast CT images of 164 stroke patients to identify and categorize brain tissue as normal tissue, ischaemic stroke or haemorrhagic stroke. Four regions of interest (ROIs) were selected for analysis in each CT slice with a lesion: two represented the lesion and two represented normal tissue. Histogram texture parameters were calculated for each ROI. Raw data analysis identified the parameters that discriminated between normal brain tissue, ischaemic stroke lesions and haemorrhagic stroke lesions. The artificial neural network (ANN) and k-nearest neighbour (k-NN) algorithms were used to classify the ROIs into normal tissue, ischaemic lesions and haemorrhagic lesions, using the radiologists' categorization as the gold standard, and the results were further analysed using the ROC curve. Three parameters, namely the mean and the 90th and 99th percentiles, discriminated between normal brain tissue, ischaemic stroke lesions and haemorrhagic stroke lesions. With ANN and k-NN, the weighted sensitivity and specificity were above 0.9, while the false positive and false negative rates were negligible. The characterization and classification of brain tissue using histogram parameters were satisfactory and may be suitable for automated diagnosis of stroke.

**Keywords:** histogram texture parameters, texture analysis, characterization, classification, brain tissues, stroke, computed tomography

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **1. Introduction**

Medical imaging is a rapidly developing branch of modern medicine. Over the past few decades, it has evolved into a highly sophisticated diagnostic tool. It has improved the study of human internal anatomy and, to an extent, physiology, and has made possible the detection of pathologies that previously could not be detected. At this stage of its development, the detection and interpretation of lesions is becoming an automated, computer-aided process. It can safely be said that machine vision has become an emerging part of radiology and imaging in medicine. This is a result of advances in medical imaging technology and computer science [1], which have greatly enhanced the interpretation of medical images and contributed to early diagnosis. The bases for computer-aided diagnosis (CAD) in radiology are medical image processing and artificial intelligence.

Stroke accounts for a significant proportion of neurological disorders seen in Nigerian hospitals [2]. It carries high morbidity and mortality in industrialized countries [3–6], and in Africa, it is reported to be the leading neurological cause of death [7]. The World Health Organization (WHO) defined stroke as a rapidly developing clinical syndrome of focal or global disturbance of cerebral function, presumably of vascular origin, lasting longer than 24 hours unless interrupted by surgery or death [8]. A stroke occurs when the blood supply to the brain is disturbed, which starves brain cells of oxygen; consequently, some cells die while others are left damaged. Brain cells, being permanent in nature, achieve only very limited recovery, and thus the patient may be left with a permanent disability. Clinical diagnosis of stroke and its subtyping is sometimes inaccurate [9–12]. Neuroimaging is, therefore, essential for accurate diagnosis. Stroke remains one of the most important clinical diagnoses for which patients are referred to the radiology department for emergency imaging, because a timely and accurate diagnosis helps in the management of the patients [13]. Previous studies have highlighted the time-critical nature of ischaemic stroke diagnosis. Ischaemic stroke has a narrow therapeutic window in the first few hours following stroke ictus and a dramatic rise in haemorrhage complications thereafter [14–20].

Non-contrast head computed tomography (NCCT) has been suggested as the mainstay for early stroke diagnosis because computed tomography (CT) scanners are more widely available in the communities and may be accessed much more easily [13]. Computed tomography examinations are not only cheaper than magnetic resonance imaging (MRI) but also faster to perform. Thus, taking the time-critical nature of early stroke diagnosis into consideration, NCCT is the preferred first-line imaging tool. Computed tomography and other neuroimaging procedures will, however, not benefit the patient until the images have been accurately interpreted. For visual analysis and interpretation of stroke CT images, the radiologist seeks to identify affected areas of the brain by examining the dissimilarity between the left and right cerebral hemispheres. The challenges associated with the visual interpretation of stroke CT images are the dearth of neuroradiologists [21] and human errors of interpretation and diagnosis. Errors in visual interpretation result from poor technique, failures of perception, lack of knowledge and misjudgements [22]. Visual interpretation can be improved upon by texture analysis, which makes it possible for an automated computer-aided approach to be used as a second opinion for clinicians, especially in equivocal cases. Automatic methods of stroke detection follow the same pattern as the visual analysis and interpretation used by radiologists [23].

Computer-aided diagnosis (CAD) in medical imaging is an application of artificial intelligence in medicine. Artificial intelligence (AI) simulates the human brain or recreates it electronically. It is defined as the study and design of intelligent agents [24], where an intelligent agent is a system that perceives its environment and takes actions that maximize its chances of success [24–26]. The simplest intelligent agents are programs written to solve specific problems. More complicated intelligent agents include human beings and organizations of human beings such as a firm or a team. Artificial intelligence is based on the central characteristic of human beings: intelligence, the sapience of *Homo sapiens*. The underlying premise is that intelligence can be described so precisely that it can be simulated by a machine.

One very important stage in medical image processing leading to CAD is image texture analysis. Texture analysis of a medical image is the measurement of the quantitative parameters that constitute the image of a supposed lesion or normal tissue. It has the advantage of helping clinicians make accurate diagnoses and monitor disease processes under treatment. The analysis of texture parameters is a useful way of increasing the information obtainable from medical images [27].

#### **2. The concept of texture and analysis of texture**

Texture is a term that is very difficult to define precisely. There is no unified definition of texture, and every definition that has been used has aimed at relating it to the area of its application. The non-existence of a universally agreed-upon definition of texture is an acknowledged fact [28, 29]. In general, texture can be defined as a descriptor that provides measures of properties such as smoothness, coarseness and regularity [28]. For medical images, image texture is defined as the appearance, structure and arrangement of the parts of an object within the image [27]. The concept of texture as a quantitative measure is applied only to digital images, which are made up of numerous rectangular picture elements (pixels), as illustrated in **Figure 1**.

In consideration of this technicality, the texture concept in a digital image is regarded as the distribution of grey-level values among the pixels of a given region of interest in the image [27]. This definition is in agreement with a recent one which referred to texture as the spatial variation of pixel intensities in an image [29]. In order to understand texture better, it is useful to draw an analogy with the way the human visual system perceives scenes. The human eye perceives scenes as sets of objects that are related to each other over various surfaces despite varying ambient illumination [30]. Texture has components called texels, which are notional uniform micro-objects placed in an appropriate way to form any particular texture. The placing may be random, regular, directional and so on, and there may be a degree of overlap in some cases [30]. From the foregoing, texture in very simple physical terms is composed of the randomness, periodicity, directionality and orientation of the composite elements making up an object's structure.
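The definition of texture as a distribution of grey-level values within a region of interest can be illustrated with a minimal sketch. The function name `roi_histogram`, the rectangular ROI convention and the toy 4 × 4 image below are assumptions made for illustration only; they are not taken from the case study.

```python
from collections import Counter


def roi_histogram(image, top, left, height, width):
    """Return the normalized grey-level histogram of a rectangular ROI.

    `image` is a list of rows of integer grey levels (hypothetical toy data).
    The result maps each grey level to the fraction of ROI pixels having it.
    """
    counts = Counter(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )
    n_pixels = height * width
    return {level: count / n_pixels for level, count in sorted(counts.items())}


# Toy image: each number is the grey level of one pixel.
image = [
    [10, 10, 40, 40],
    [10, 10, 40, 40],
    [10, 20, 40, 60],
    [20, 20, 60, 60],
]

# Histogram of the ROI covering the two upper rows.
hist = roi_histogram(image, top=0, left=0, height=2, width=4)
print(hist)
```

The histogram records how often each grey level occurs in the ROI but not where it occurs, which is why it captures only part of what the text calls texture.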

Texture analysis is an aspect of imaging science which analyses pixel intensity variations or their spatial distribution on a pixel-by-pixel scale to unravel patterns which may not be perceptible to the human visual system. The technique evaluates the location and signal intensity of the image represented by the pixel and contrast index for digital images [27]. Texture features represent the mathematical parameters obtained from the distribution of pixels which characterize the texture type and hence the structural components of an object [27]. Texture analysis is employed in image classification, segmentation and synthesis. It also plays a vital role in computer-aided detection or diagnosis or, more broadly, machine vision.

**Figure 1.** An illustration of the pixel concept of digital medical images using a cranial CT.

#### **3. Methods of texture analysis**

There are four major issues in texture analysis, namely feature extraction, texture discrimination, texture classification and shape from texture [31]. The purpose of feature extraction is to compute a characteristic of a digital image that numerically describes its texture properties, while texture discrimination partitions a textured image into regions, each corresponding to a perceptually homogeneous texture, which leads to image segmentation. In texture classification, the goal is to determine to which of a finite number of physically defined classes, such as normal or abnormal tissue, a homogeneous texture region belongs, while shape from texture reconstructs the three-dimensional surface geometry from texture information.

The first stage in texture analysis is the extraction of texture parameters, and the results obtained during this process are used for the remaining stages in texture analysis. The approaches to texture analysis are categorized into structural, statistical, model-based and transform methods [31]. These approaches are herewith described briefly.

#### **3.1. Structural methods**

In this method, texture is represented by well-defined primitives. In other words, a square object is represented in terms of the straight lines or the primitives that form its border [27]. To describe texture using the structural approach, one must first define the primitives (microtexture) and then the placement rules. Primitives are the parts from which texture is composed. Primitives may also be tonal, that is, defined by grey levels: tonal primitives are regions of an image with tonal properties [32]. The advantage of structural methods is that they provide a good symbolic description of the image [31]; the disadvantage is that they are not a very powerful way of describing texture.

#### **3.2. Statistical methods**

The statistical approach to texture analysis uses the grey-level distribution within an image to describe texture. This approach provides better discrimination between classes than structural or transform methods, and it is the most widely used method in medical applications. Statistical methods analyse the spatial distribution of pixel grey values in an image by computing local features at each point in the image and then deriving a set of statistics from the distributions of the local features [33]. Statistical methods are classified as first-order, second-order and higher-order statistics based on the number of pixels that define the local feature: first-order statistics involve only one pixel, second-order statistics a pair of pixels, and higher-order statistics three or more pixels [33]. In first-order statistics, properties such as the average and variance of individual pixel values are estimated, but the spatial interaction between the image pixels is not taken into consideration. More specifically, first-order statistics measure the frequency of a particular grey level at a random image position without taking into account the correlations or co-occurrences between the pixels. Thus, information on texture is derived from the histogram of image pixel grey values [29]. Second-order and higher-order statistics estimate properties of two or more pixel values occurring at specific locations relative to each other, and thus pixel-pixel interaction is a feature of these two methods [33]. Specifically, second-order statistical texture analysis derives texture information from the probability of finding a pair of pixels with the same grey level at random distances and orientations over an entire image, while higher-order statistics increase the number of variables studied [29].

#### *3.2.1. The co-occurrence matrix (COM)*

The co-occurrence matrix is a second-order histogram that analyses the grey-level distribution of pairs of pixels [27]. In the grey-level co-occurrence matrix method, the probability of finding a pixel with a defined grey level (*i*) at a defined distance (*d*) and a defined angle (*α*) from another pixel with a defined grey level (*j*) is calculated. The co-occurrences of pixel pairs are calculated in the vertical, horizontal and two diagonal directions, and for distances of up to five pixels. An essential feature of this arrangement is that each pixel has eight nearest neighbours connected to it, except when the pixel is located at the periphery. A very simple illustration of the grey-level co-occurrence matrix as relative positions of pixels of the same grey-level intensity is shown in **Figure 2**. In this illustration, the reference pixel (X) has the same grey-level value as the pixels X1 in the horizontal direction for an inter-pixel distance of 1, X2 in the vertical direction for an inter-pixel distance of 2, X3 in the 45° diagonal direction for an inter-pixel distance of 3 and X4 in the 135° diagonal direction for an inter-pixel distance of 3.

**Figure 2.** An illustration of the grey-level co-occurrence matrix concept of texture computation.

A co-occurrence matrix is produced in each direction (*α*), for each inter-pixel distance (*d*), with the matrix dimension being equal to the number of intensity levels. It, therefore, means that the process becomes computationally intense and the number of grey levels in an image would undergo a rescaling and re-binning procedure to reduce the range of pixel values contained within an image [34]. The implication of rescaling and re-binning of the grey levels in the image is loss of texture information.

The co-occurrence matrix parameters include the angular second moment, contrast, correlation, sum of squares, inverse difference moment, sum average, sum variance, sum entropy, entropy, difference variance and difference entropy. The construction of the co-occurrence matrix and mathematical derivation of the formulae for calculating the parameters are both tedious processes and further reading is necessary for better understanding [28, 31].
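As a rough illustration of the construction described above, the following sketch counts co-occurring grey-level pairs for a single direction and distance on a tiny image, then derives two of the listed parameters. The function names, the 4 × 4 test image and the use of the common Haralick-style definitions of angular second moment and contrast are assumptions made for this example; they are not taken from MaZda® or from the cited references.

```python
def cooccurrence_matrix(image, levels, dx, dy):
    """Count pairs: a pixel of level i with a pixel of level j at offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Pixels at the periphery have no neighbour at this offset.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def normalise(counts):
    """Turn pair counts into pair probabilities."""
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def angular_second_moment(p):
    """Sum of squared probabilities (high for uniform textures)."""
    return sum(v * v for row in p for v in row)


def contrast(p):
    """Probability-weighted squared grey-level difference of each pair."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


# Hypothetical image already re-binned to 4 grey levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

counts = cooccurrence_matrix(image, levels=4, dx=1, dy=0)  # horizontal, d = 1
p = normalise(counts)
asm = angular_second_moment(p)
con = contrast(p)
```

This sketch counts each pair in one direction only; many implementations also add the transposed counts to make the matrix symmetric. Repeating the call for the other three directions and for several distances shows why the number of matrices, and hence the computational cost, grows quickly.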

#### *3.2.2. The run-length matrix (RLM)*


The grey-level run-length matrix is a higher-order statistical method of texture feature extraction. The run-length matrix counts the consecutive pixels in a given direction that have the same grey-level intensity. A run is a set of consecutive pixels in a particular direction with the same grey-level intensity value [29]. A coarse texture will, therefore, be dominated by relatively long runs, whereas a fine texture will be populated by much shorter runs [29]. The parameters derivable from the run-length matrix are usually computed in four different directions: horizontal, vertical and two diagonals. The grey-level run-length matrix is illustrated in **Figure 3**, which shows a run length of 4 pixels in a 45° diagonal direction [34].

**Figure 3.** An illustration of the grey-level run-length matrix concept of texture computation.

The run-length emphasis describes a number of consecutive pixels with the same grey-level value. It could be suitably termed long- or short-run emphasis depending on the number of consecutive pixels in the chosen direction with the same grey-level value [35]. The run-length and grey-level non-uniformity describe the disorderliness in pixel and pixel grey-level runs. The fraction of the image in runs simply refers to run percentages. That is, the ratio of the total number of runs in the image to the total number of pixels in the image expressed as a percentage [35].

The run-length method of texture analysis was first introduced by Galloway [36], but it has not gained the desired general acceptance as an efficient way of calculating texture [35]. It is therefore not popular among researchers working to develop diagnostic tools for medical applications.

The calculation of the run-length matrix parameters using MaZda® can be illustrated as follows. If *p(i, j)* is the frequency of a run of length *j* with grey-level intensity *i*, *Ng* is the number of grey-level intensities and *Nr* is the number of runs, then the parameters of the run-length matrix *p(i, j)* can be calculated using the following equations:

$$\text{Short Run Emphasis} = \left(\sum_{i=1}^{N_g} \sum_{j=1}^{N_r} \frac{p(i,j)}{j^2}\right) \Big/ C \tag{1}$$

$$\text{Long Run Emphasis} = \left(\sum_{i=1}^{N_g} \sum_{j=1}^{N_r} j^2\, p(i,j)\right) \Big/ C \tag{2}$$

$$\text{Grey Level Nonuniformity} = \left(\sum_{i=1}^{N_g} \sum_{j=1}^{N_r} p(i,j)^2\right) \Big/ C \tag{3}$$

$$\text{Run Length Nonuniformity} = \left(\sum_{j=1}^{N_r} \sum_{i=1}^{N_g} p(i,j)^2\right) \Big/ C \tag{4}$$

$$\text{Fraction of Image in Runs} = \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} p(i,j) \Big/ \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} j\, p(i,j) \tag{5}$$

The coefficient C in Eqs. (1)–(4) above is defined as:

$$C = \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} p(i,j) \tag{6}$$
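To make the run-length definitions concrete, the sketch below builds a horizontal run-length matrix for a small image and evaluates Eqs. (1), (2), (5) and (6). It is a minimal illustration rather than MaZda®'s implementation: the function names and the test image are hypothetical, only the horizontal direction is handled, and the run length *j* is stored at column index *j* − 1.

```python
def run_length_matrix(image, levels):
    """p[i][j - 1] = number of horizontal runs of grey level i with length j."""
    max_run = len(image[0])  # a run cannot be longer than one row
    p = [[0] * max_run for _ in range(levels)]
    for row in image:
        run_value, run_len = row[0], 1
        for value in row[1:]:
            if value == run_value:
                run_len += 1
            else:
                p[run_value][run_len - 1] += 1
                run_value, run_len = value, 1
        p[run_value][run_len - 1] += 1  # close the last run of the row
    return p


def run_length_features(p):
    """Short/long run emphasis and fraction of image in runs."""
    c = sum(sum(row) for row in p)  # Eq. (6): total number of runs
    sre = sum(v / (j + 1) ** 2 for row in p for j, v in enumerate(row)) / c
    lre = sum((j + 1) ** 2 * v for row in p for j, v in enumerate(row)) / c
    pixels = sum((j + 1) * v for row in p for j, v in enumerate(row))
    return {
        "short_run_emphasis": sre,      # Eq. (1)
        "long_run_emphasis": lre,       # Eq. (2)
        "fraction_in_runs": c / pixels, # Eq. (5)
    }


# Hypothetical 4 x 4 image with grey levels 0..3.
image = [
    [0, 0, 0, 0],
    [1, 1, 2, 2],
    [3, 3, 3, 0],
    [1, 2, 3, 0],
]

p = run_length_matrix(image, levels=4)
features = run_length_features(p)
```

In this example the image contains nine runs covering sixteen pixels, so the fraction of image in runs is 9/16. A coarser image with longer runs would lower this fraction and raise the long run emphasis.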

#### *3.2.3. The absolute gradient (Gr)*

The gradient of an image measures the spatial variation in grey-level values across the image [27]. This method evaluates the variations in grey-level intensity values across neighbouring pixels, as shown in **Figure 4** according to the illustration by Waugh [34]. A high gradient is produced when there is an abrupt change from one extreme grey-level intensity value to another. Conversely, a low gradient is produced where pixel grey-level values change gradually. The five parameters derived from the absolute gradient are the gradient mean, gradient variance, gradient skewness, gradient kurtosis and gradient non-zeros. Conventionally, only the magnitude of the gradient is taken into consideration [27]. The direction of variation, whether positive or negative, is irrelevant, hence the term "absolute gradient".

**Figure 4.** An illustration of the gradient concept of texture computation.

The gradient non-zero is the number of pixels in an image with a grey-level value greater than zero, gradient variance is the deviation of absolute pixel grey-level value from the mean, and gradient mean is the average variation in pixel grey-level value across the image [31]. The absolute gradient as a method of texture analysis finds application in accentuating the boundaries of an image [27] and is therefore useful in edge enhancement.
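The idea of a direction-independent gradient magnitude can be sketched as follows. This example assumes a simple central-difference approximation over each interior pixel's horizontal and vertical neighbours, and treats the gradient mean and variance as the mean and variance of those magnitudes; the neighbourhood actually used by a given package may differ, and the function names and image are hypothetical.

```python
import math


def absolute_gradient(image):
    """Gradient magnitude at every interior pixel (central differences)."""
    rows, cols = len(image), len(image[0])
    grads = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # horizontal variation
            gy = image[r + 1][c] - image[r - 1][c]  # vertical variation
            grads.append(math.hypot(gx, gy))        # sign is discarded
    return grads


def gradient_mean_and_variance(grads):
    """Mean and (population) variance of the gradient magnitudes."""
    n = len(grads)
    mean = sum(grads) / n
    variance = sum((g - mean) ** 2 for g in grads) / n
    return mean, variance


# Hypothetical image: a flat dark region with a bright block in one corner.
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 50, 50],
    [10, 10, 50, 50],
]

grads = absolute_gradient(image)
mean, variance = gradient_mean_and_variance(grads)
```

The interior pixel in the flat region yields a zero gradient, while the pixels next to the bright block yield large magnitudes, which is why gradient maps accentuate boundaries.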

#### *3.2.4. The histogram*


This is a first-order statistical analysis and uses pixel occurrence probability to calculate texture. To illustrate the histogram approach to texture analysis, assume that the grey levels *i* in an image are in the range 0 ≤ *i* ≤ Ng − 1, where Ng is the total number of distinct grey levels. If N(i) is the total number of pixels with intensity i and M is the total number of pixels in the image, then the pixel occurrence probability P(i) is given by [29]

$$P\left(i\right) = N\left(i\right) \div M \tag{7}$$

The probability of occurrence of a pixel of particular grey level (intensity) is called the histogram. It does not consider the spatial relationships, and correlations, between pixels [29]. The main advantage of the histogram is its simplicity by the use of standard descriptors such as mean and variance to characterize texture data. The features derivable from the histogram are mean, variance, skewness, kurtosis, percentile 01, percentile 10, percentile 50, percentile 90 and percentile 99. Some of the features from the histogram used to characterize texture are represented by the equations below:

$$\text{Mean}\ (\mu) = \sum_{i=0}^{N_g-1} i\, P(i) \tag{8}$$

$$\text{Variance}\ (\sigma^2) = \sum_{i=0}^{N_g-1} (i - \mu)^2\, P(i) \tag{9}$$

$$\text{Skewness}\ (\mu_3) = \sigma^{-3} \sum_{i=0}^{N_g-1} (i - \mu)^3\, P(i) \tag{10}$$

$$\text{Kurtosis}\ (\mu_4) = \sigma^{-4} \sum_{i=0}^{N_g-1} (i - \mu)^4\, P(i) - 3 \tag{11}$$
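The histogram features in Eqs. (7)–(11) can be evaluated directly from the pixel counts. The sketch below is one way to do so; the function name and the 2 × 2 test image are hypothetical, and the code assumes the image contains at least two different grey levels so that the standard deviation is non-zero.

```python
def histogram_features(image, levels):
    """First-order texture features from the grey-level histogram."""
    pixels = [v for row in image for v in row]
    m = len(pixels)                       # M: total number of pixels
    counts = [0] * levels                 # N(i): pixels with intensity i
    for v in pixels:
        counts[v] += 1
    prob = [n / m for n in counts]        # Eq. (7): P(i) = N(i) / M

    mean = sum(i * p for i, p in enumerate(prob))                    # Eq. (8)
    variance = sum((i - mean) ** 2 * p for i, p in enumerate(prob))  # Eq. (9)
    sigma = variance ** 0.5
    skewness = sum((i - mean) ** 3 * p for i, p in enumerate(prob)) / sigma ** 3      # Eq. (10)
    kurtosis = sum((i - mean) ** 4 * p for i, p in enumerate(prob)) / sigma ** 4 - 3  # Eq. (11)
    return {"mean": mean, "variance": variance, "skewness": skewness, "kurtosis": kurtosis}


# Hypothetical image with three grey levels (0, 1, 2).
image = [
    [0, 1],
    [1, 2],
]

features = histogram_features(image, levels=3)
```

Because the features depend only on how often each grey level occurs, shuffling the pixel positions leaves every value unchanged, which illustrates why the histogram carries no spatial information.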

#### **3.3. The model-based methods**

In model-based texture analysis, an attempt is made to fit an image texture to a computational (mathematical) model. In the MaZda**®** texture analysis software, the model used is the auto-regressive model (ARM). This model assumes that, knowing the grey-level intensity value of one pixel, the grey-level intensity values of the neighbouring pixels can be deduced. More formally, the ARM assumes a local interaction between image pixels in that a pixel's grey-level value is a weighted sum of the grey-level values of the neighbouring pixels [27]. The main disadvantage of the model-based approach to texture analysis is the complexity of the computations needed to estimate the model parameters. Other texture models besides the ARM are the Markov random field (MRF) and fractal models [31].

#### **3.4. The transform methods**

In the transform methods, the texture of an image is analysed in the frequency or scale space. These methods can employ the Fourier [37], Gabor [38] or wavelet transform [39]. The wavelet transform is the most popular because it can easily be adjusted to suit the problem at hand [27]. Wavelet analysis is a technique that analyses the frequency content of an image at different scales of that image. It yields a set of numbers called the wavelet coefficients, which correspond to different scales and frequency directions [27]. Each pixel of an image analysed by the wavelet transform is associated with a set of wavelet coefficients which describe the frequency content of the image at that point over a set of scales.

#### **4. Texture analysis of medical images**

Texture analysis of medical images remained without much clinical interest until 1998, when it took a giant leap with the development of MaZda**®**, a computer program for calculating texture parameters (features) in digitized images. The software has been under development since 1998 to satisfy the needs of the participants of the COST B11 European Project "Quantitative Analysis of Magnetic Resonance Image Texture" and the subsequent COST B21 "Physiological Modelling of Magnetic Resonance Image Formation" [31]. MaZda**®** is a very versatile software package capable of 2D and 3D image texture analysis. It can be used for quantitative analysis of image texture, computation of texture features, and feature selection and extraction. The software also includes algorithms for data classification, data visualization and image segmentation [40]. It was originally developed in 1996 at the Institute of Electronics, Technical University of Lodz (TUL), Poland, for texture analysis of mammograms [41]. It has since been further developed and made versatile enough for the analysis of other textured images, and it has been found to be efficient and reliable for quantitative image analysis in support of more accurate and objective medical diagnosis. There has also been a non-medical application in the food industry to assess food product quality [40]. Other software packages used for texture analysis of digital images are MATLAB**®** and Scilab**®** [42, 43]. Scilab**®** is freely available to users, while MATLAB**®** is a commercial product.


The medical importance of texture analysis cannot be over-emphasized. Analysis of medical image texture helps to increase the information obtained from medical images [27], which may improve diagnosis. It is an emerging aspect of medical imaging and finds applications in segmentation of specific anatomical structures and detection of lesions. The detection of lesions implies differentiating between unhealthy and healthy tissues in the different organs of the body. The differentiation between unhealthy and healthy tissues implies that texture parameters obtained from medical images form the basis for computer-aided diagnosis. Just recently, it was demonstrated that texture analysis can be used in patients undergoing neoadjuvant chemotherapy treatment of breast cancer to indicate whether the patient will respond well or not. The results of that study appeared to correlate well with the final pathological outcome [34].

#### **5. Role of texture analysis in computer-aided diagnosis**

Many researchers have shown interest in texture analysis of medical images, and research in this area has been targeted at developing computer-aided diagnosis (CAD) systems. CAD systems are gaining popularity because of their ability to improve the precision and accuracy of the characterization of lesions beyond what radiologists achieve by visual inspection [44]. The main objectives of a CAD system in the diagnostic process are to accurately detect and precisely characterize potential abnormalities [45]. This is a very important step towards the effective treatment of diagnosed abnormalities. The radiologist detects and characterizes abnormalities by visual interpretation. To do this, the radiologist must successfully integrate two distinct processes, namely image perception to recognize unique image patterns and reasoning to identify the relationships between perceived patterns and possible diagnoses. The two processes are heavily dependent on the empirical knowledge, memory, intuition and diligence of the radiologist. The approach of the radiologist is not always error-free, as there are well-documented errors and variations in the human interpretation of clinical images [46]. In summary, CAD aims to provide a computer output as a second opinion in order to assist physicians in the detection of abnormalities, quantification of disease progress and differential diagnosis of lesions [1]. One important step in the generic architecture of a CAD system is feature extraction (texture analysis), and thus, texture analysis is the fundamental basis of CAD at its present stage of development [1].

The human visual system can discriminate between different morphologic information such as shape and size, but there is evidence that it has difficulty discriminating textural information that is related to higher-order statistics or spectral properties of an image [47, 48]. Unaided, the human visual system can tell apart only a limited number of grey levels. Thus, texture analysis can potentially augment the visual skills of the radiologist by extracting image features that may be relevant to the diagnostic problem but that are not necessarily visually extractable [45]. When image texture analysis is used as a preprocessing step in CAD schemes, the input generation process is automated and is, therefore, reproducible and robust. Although useful to the diagnostic process, texture analysis is not a panacea for the diagnostic interpretation of radiologic images [45]. The pursuit of texture analysis is based on the hypothesis that the texture signature of an image is relevant to the diagnostic problem at hand. A major drawback is that the effectiveness of texture analysis is bound by the type of algorithm that is used to extract meaningful textural features.

#### **6. Decision making in computer-aided diagnosis**

Texture analysis is the fundamental basis of computer-aided diagnosis in radiology and is, therefore, indispensable to the process. The main problem with calculated texture is that it produces an avalanche of outputs, especially from the co-occurrence matrix. The outputs need to be reduced to a manageable level so that information useful for decision making can be obtained from further analysis. In the MaZda**®** software, feature reduction is achieved by using the Fisher coefficient, classification error combined with the correlation coefficient, mutual information [41, 49] and a selection of optimal feature subsets with minimal classification error of the 1-nearest neighbour (1-NN) classifier [50, 51]. The Fisher coefficient selects features by reducing intra-group variance and maximizing inter-group difference [52]. If these methods do not reduce the features sufficiently, further reduction is carried out by transforming the original features into a new feature space with lower dimensionality [40]. This method is called feature extraction or projection [53] and can be achieved in MaZda**®** using principal component analysis (PCA), linear discriminant analysis (LDA), nonlinear discriminant analysis (NDA) [50, 54–57] and raw data analysis (RDA). Artificial intelligence tools are used for automated decision making in computer-aided diagnosis. Such tools include different algorithms, which are provided by different software packages. The Waikato Environment for Knowledge Analysis (WEKA) version 3.6.11 data mining software is equipped with many classification algorithms. It is a landmark system in data mining and machine learning [58]. The software came about through the perceived need for a unified workbench that would allow researchers easy access to state-of-the-art techniques in machine learning [59].

The two tools for decision making or classification in computer-aided diagnosis popular with researchers are the artificial neural networks (ANN) and k-nearest neighbour (k-NN). The ANN and k-NN algorithms are part of the resources provided in the WEKA software. Both algorithms perform supervised classifications implying that the classification is under the guidance of a human being. In supervised classification, the user selects sample pixels in an image that he considers representative of specific classes and then initiates the software to use these training sites as references for the classification of other pixels in the image.

#### **6.1. The artificial neural network**


Artificial neural networks are regarded as relatively crude electronic networks of "neurons" which simulate the neural structure of the human brain and imitate its decision-making process. Like the brain, the networks are trainable for improved performance. They process records one at a time as the records are fed into them and "learn" from "experience" by comparing their classification of each record with the known actual classification of the record. Subsequent classifications are made more accurate by feeding the errors from the classification of previous records back into the network to modify the network's algorithm.

A multilayer feed-forward neural network is one that has one or more hidden layers. The neurons in the hidden layer mediate between the input and the output of the network. The source nodes in the input layer of the neural network receive the input feature vector, and the outputs of the input layer form the input signals applied to the neurons in the hidden layer. The output signals of the hidden layer are in turn used as inputs to the next hidden layer or to the output layer, and this process continues until the output layer produces the final result [60].
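The layer-by-layer flow of signals described above can be sketched as a forward pass. This is only an illustration under stated assumptions: the logistic activation, the 3-2-1 layout, the weights and the input feature values are all hypothetical, and the training step that feeds errors back to adjust the weights is not shown.

```python
import math


def sigmoid(x):
    """Logistic activation, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def forward(features, layers):
    """Propagate a feature vector through the layers, one after another."""
    signal = features
    for weights, biases in layers:
        signal = layer_output(signal, weights, biases)
    return signal


# Hypothetical 3-2-1 network: 3 input features, 2 hidden neurons, 1 output.
hidden = ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1])
output = ([[1.2, -0.7]], [0.05])

features = [0.63, 4.22, 0.56]  # e.g. three texture parameters (made-up values)
score = forward(features, [hidden, output])[0]
```

With a sigmoid output neuron, `score` lies between 0 and 1 and could be thresholded to assign one of two classes; how that threshold is chosen is outside the scope of this sketch.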

#### **6.2. The k-nearest neighbour**

The k-nearest neighbour algorithm is a non-parametric method used for classification and regression [61]. The training data set is stored, and a previously unclassified (new) record is classified by comparing it with the most similar records in the training data set. Simply put, a database in which data points are separated into several classes is used to predict the class of a new data point. The data points are treated as points in a feature space, and the new point is assigned to the class most common among its k closest neighbours; with k = 1, it simply takes the class of its single closest neighbour. It is a rather simple and versatile concept.
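A minimal sketch of the idea follows, assuming Euclidean distance and a simple majority vote. The feature values and class labels in the example are hypothetical and are not taken from the study data.

```python
import math
from collections import Counter

def knn_classify(training, query, k=1):
    """Classify `query` by majority vote among its k nearest training records.

    training: list of (feature_vector, label) pairs.
    """
    by_distance = sorted(training, key=lambda rec: math.dist(rec[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature records (values are illustrative only).
training = [
    ((30.0, 40.0), "normal"),
    ((32.0, 43.0), "normal"),
    ((20.0, 28.0), "ischaemic"),
    ((22.0, 30.0), "ischaemic"),
    ((60.0, 75.0), "haemorrhagic"),
    ((63.0, 78.0), "haemorrhagic"),
]

label_1nn = knn_classify(training, (21.0, 29.0), k=1)
label_3nn = knn_classify(training, (61.0, 76.0), k=3)
```

With k = 1 the vote reduces to copying the label of the single closest record; larger values of k trade sensitivity to individual records for smoother decision boundaries.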

#### **7. The case study**

#### **7.1. Research design and location**

A prospective cross-sectional design that targeted patients clinically diagnosed with stroke and who underwent non-contrast CT (NCCT) investigation of the brain was adopted for the study. The research design and protocol were approved by the Research Ethics Committee of Nnamdi Azikiwe University Teaching Hospital, Nnewi, Anambra State, Nigeria. The study was carried out in two locations, namely Onitsha, Anambra State in south-eastern Nigeria, and Ibadan, Oyo State in south-western Nigeria. Two privately owned radiodiagnostic centres were selected. The centres were chosen to ensure an adequate number of patients, because a high number of stroke patients are referred to them for brain CT examination.

#### **7.2. Sample size determination**

The minimum sample size required for this study was determined using Taro Yamane's formula for a finite population [62]:

$$n = \frac{N}{1 + N e^2} \tag{12}$$

where n = sample size; N = number of patients clinically diagnosed with stroke who underwent NCCT study of the brain in the two radiodiagnostic centres in the previous one year (May 2012 to April 2013); e = the level of precision (margin of error) required, here 0.05.

So,

$$n = \frac{208}{1 + 208(0.05)^2} = \frac{208}{1.52} \approx 137 \tag{13}$$

Between May 2012 and April 2013, a total of 208 patients with clinically diagnosed stroke underwent non-contrast CT of the brain in the two centres, and thus a minimum sample size of approximately 137 was calculated as shown above.
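The arithmetic of the sample size calculation can be checked with a short sketch. The function name is hypothetical, and rounding up to the next whole patient is an assumption consistent with the reported minimum of 137.

```python
import math

def yamane_sample_size(population, precision):
    """Minimum sample size n = N / (1 + N * e**2), rounded up."""
    return math.ceil(population / (1 + population * precision ** 2))

# N = 208 patients in the preceding year, e = 0.05.
n = yamane_sample_size(208, 0.05)  # 208 / 1.52 is about 136.8, rounded up
```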

#### **7.3. Patient selection**

A total of 164 clinically diagnosed stroke patients who were referred to the two radiodiagnostic centres for CT scan and who met the inclusion criteria for the study were enlisted in the study to improve its precision. The inclusion criteria were:

**1.** Patients clinically diagnosed with stroke at the Nnamdi Azikiwe University Teaching Hospital (NAUTH), Nnewi, Anambra State, and University College Hospital (UCH) Ibadan, Oyo State, and peripheral private and public hospitals in these two states.

**2.** Patients clinically diagnosed with stroke who underwent non-contrast CT of the brain at the two selected private radiodiagnostic centres.

**3.** Patients in whose CT images stroke lesions were identified by the radiologist.

**4.** Patients who met criteria 1–3 and consented to participate in the study.

All the participating patients, directly or indirectly through their relatives, expressed willingness to participate in the study by signing an informed consent form before enlistment in the study.

#### **7.4. Equipment and software used**

The equipment and computer software used include the following:

#### **7.5. Patient data and image acquisition**

The enlistment of patients in the study, collection of data and acquisition of CT images commenced in May 2013 and ended in April 2014. After being clinically diagnosed with stroke in the hospitals, the patients were referred for NCCT of the head to confirm or rule out the disease as the cause of their signs and symptoms. On arrival at the radiodiagnostic centre, the patient or his/her relatives were approached and the study was explained to them. The researcher identified, from the request form, the provisional diagnosis necessitating the scan. If it was stroke, an appeal was made to the patient or his/her relatives to enlist in the study. If the response was affirmative, an informed consent form was signed by the patient or his/her relatives. There was no financial reward for participating in the study. Demographic data of the patient such as age and gender were thereafter obtained and documented. The approximate time interval between the onset of symptoms and the head CT examination was ascertained and documented. Non-contrast CT images of the brain were obtained using a *Toshiba Asteion*™ CT machine in one centre and a *Philips MX8000 Dual™* CT scanner in the other. Scans were obtained as 0.5–1 mm contiguous sections from the base of the skull to the vertex. The scan parameters were chosen exclusively by the attending radiographer in each centre. The images were transferred from the CT archive to a DVD and then loaded into an *HP 2000*™ laptop for viewing with either *Medysynapse*™ or *Microdom*™, both of which are DICOM viewers.

#### **7.6. Radiological reporting of the images**

The CT images obtained were visually inspected and reported by a team of two radiologists experienced in CT diagnosis of stroke. The first radiologist had five years of post-qualification experience as a consultant radiologist, while the second had seven years. Both radiologists reported on the images independently and were blinded to each other's reports. The reports included in the study were those in which the two radiologists agreed on the presence of stroke, its subtype and the anatomical location of the lesions. Reports that indicated no radiological signs of abnormality and those that indicated neurological abnormalities mimicking stroke were excluded from the study.

**Figure 5.** A non-contrast CT image showing left cerebral ischaemia (arrows). Note there is a small area of ischaemia on the right parietal lobe.

**Figure 6.** A non-contrast CT image showing left cerebral haemorrhage (arrows). Note the marked compression of the right and left ventricles.

The anatomical locations of the lesions were identified and the lesions categorized as ischaemic or haemorrhagic by the two radiologists, as shown in **Figures 5** and **6**. The radiologist's reports contained the patient's name, identification number, age, sex, provisional diagnosis and radiological diagnosis, which contained details such as the type of stroke lesions identified, their number, anatomical locations of the lesions and geographic extent in the brain.

#### **7.7. Texture analysis of stroke CT images**

Texture analyses of stroke CT images were done using the *MaZda*® texture analysis software. The procedure for the texture analysis of the CT images is represented in the block diagram shown in **Figure 7** below.

**Figure 7.** Block diagram illustrating the analytical procedure.

All the images in which a lesion appeared were loaded into the computer program and analysed. Four regions of interest (ROIs) were selected for analysis in each CT image that demonstrated lesions: two ROIs represented the lesion and two represented normal brain tissue. The lesioned brain tissue contained ROI 1 and ROI 2, while the adjacent normal brain tissue contained ROI 3 and ROI 4, as shown in **Figure 8**.

**Figure 8.** Illustration of the method of selection of the regions of interest (ROIs). Note that ROI 1 (red) and ROI 2 (green) are on ischaemic tissues on the left cerebral hemisphere, while ROIs 3 and 4 (blue and sky blue) are on normal tissues on the right cerebral hemisphere.

Precautions were taken to ensure that machine settings, which differed between cases, did not affect texture analysis. This was achieved by normalizing the images. Normalization changes the range of pixel grey-level values of different images so that they appear to have been obtained with the same machine settings; this is called image consistency. The method of normalization used prior to texture analysis was the ±3 sigma method selected from the program functions. Histogram texture parameters for the four ROIs were computed using the *MaZda*® version 4.7 program. The output of the parameters computed for each CT image was saved as a comma separated value (CSV) file in *Microsoft Excel* for further analysis.
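As an illustration only, the sketch below shows one common reading of ±3 sigma normalization (grey levels in the range from mean − 3σ to mean + 3σ are rescaled to a fixed number of levels) together with three histogram parameters. It does not reproduce MaZda's internal implementation; the function names, the 256-level output range and the nearest-rank percentile definition are assumptions.

```python
import statistics

def normalise_3sigma(pixels, levels=256):
    """Rescale grey levels so [mean - 3*sd, mean + 3*sd] maps onto 0..levels-1."""
    mean = statistics.fmean(pixels)
    sd = statistics.pstdev(pixels)
    if sd == 0:
        # A perfectly flat ROI has no spread to rescale; use the mid level.
        return [(levels - 1) // 2] * len(pixels)
    low, high = mean - 3 * sd, mean + 3 * sd
    normalised = []
    for p in pixels:
        clipped = min(max(p, low), high)  # values beyond 3 sigma are clipped
        normalised.append(round((clipped - low) / (high - low) * (levels - 1)))
    return normalised

def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]

def histogram_features(pixels):
    """Mean, 90th and 99th percentile of the grey levels in an ROI."""
    return {
        "mean": statistics.fmean(pixels),
        "perc90": percentile(pixels, 90),
        "perc99": percentile(pixels, 99),
    }

# Two hypothetical ROIs that differ only by a change of brightness and
# contrast (roi_b = 2 * roi_a + 100) normalise to the same grey levels.
roi_a = [10, 20, 40]
roi_b = [120, 140, 180]
same_after_normalisation = normalise_3sigma(roi_a) == normalise_3sigma(roi_b)
features = histogram_features(normalise_3sigma(roi_a))
```

The example makes the purpose of normalization concrete: a uniform shift or stretch of grey levels, such as might result from different scanner settings, leaves the normalized values unchanged.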

#### **7.8. Statistical analyses**

Statistical analyses were carried out in two stages. In the first stage, the lesioned brain tissues for which texture parameters were calculated were divided according to lesion types. The discriminating histogram texture parameters were obtained by raw data analysis (RDA). In the second stage, the normal brain tissues and lesions from which the histogram texture parameters were computed were then classified by the artificial neural network and k-nearest neighbour algorithms as normal tissue, haemorrhagic or ischaemic tissues. The classifications were then cross-validated with the radiologist's report as gold standard using the receiver operating characteristic (ROC) curve analysis. Raw data analysis of computed histogram texture parameters was performed with *MaZda*® and classification of brain tissues with *WEKA* 3.6.11.

#### *7.8.1. Feature reduction*

In order to reduce the computed histogram texture parameters to only the ones useful for further analyses and eliminate redundant data, the Fisher coefficient was used. The Fisher coefficient favours features with low intra-group variance and high inter-group difference. It is a feature of the *MaZda*® texture analysis software.
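The selection principle can be sketched as a ratio of between-class variance to within-class variance computed for a single feature. This is a simplified, assumed form of the coefficient rather than the exact expression implemented in MaZda, and the feature values below are invented for illustration.

```python
import statistics

def fisher_coefficient(groups):
    """Ratio of between-class to within-class variance for one feature.

    groups: dict mapping class label -> list of feature values.
    """
    total = sum(len(v) for v in groups.values())
    grand_mean = sum(sum(v) for v in groups.values()) / total
    between = sum(
        len(v) / total * (statistics.fmean(v) - grand_mean) ** 2
        for v in groups.values()
    )
    within = sum(len(v) / total * statistics.pvariance(v) for v in groups.values())
    return between / within if within > 0 else float("inf")

# Hypothetical feature that separates the classes well...
separating = {
    "normal": [30, 31, 32],
    "ischaemic": [20, 21, 22],
    "haemorrhagic": [60, 61, 62],
}
# ...and one whose class distributions overlap almost completely.
overlapping = {
    "normal": [20, 40, 60],
    "ischaemic": [21, 41, 61],
    "haemorrhagic": [22, 42, 62],
}

f_separating = fisher_coefficient(separating)
f_overlapping = fisher_coefficient(overlapping)
```

Ranking features by such a coefficient and keeping the highest-scoring ones is the sense in which the coefficient "reduces" the feature set.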

#### *7.8.2. Feature extraction*

The histogram texture parameter computation reports on the selected ROIs, saved in *Microsoft Excel* files, were loaded into *MaZda*®, first according to lesion type and then in combined lesion form, and raw data analysis was performed on them. The best discriminating texture parameters were extracted through the raw data analysis and displayed in a three-dimensional (3D) feature space. The process also classified the ROIs as normal tissue, ischaemic lesion or haemorrhagic lesion using the best discriminating texture parameters. In this process, the ROIs in the feature space were picked one at a time and each was assigned to the class to which it belonged, with the radiologist's interpretation taken as the expected ideal outcome.

#### *7.8.2.1. Artificial neural network and k-nearest neighbour classifications*

A multilayer feed-forward neural network and the k-nearest neighbour algorithm were used to classify brain tissues as normal tissue or as lesions according to lesion type. For the purpose of classifying ROIs into normal brain tissue, ischaemic and haemorrhagic lesions using the k-nearest neighbour algorithm, a value of 1 was chosen for *k*. The *Waikato Environment for Knowledge Analysis (WEKA)* version 3.6.11 data mining software was used to perform these classifications. Both algorithms were trained by creating a model on retrospective data before applying them to test data.

The performance of the neural network and k-nearest neighbour algorithms in classifying the ROIs as normal or lesioned brain tissue, and according to lesion type, was cross-validated with the radiologist's report using ROC curve analysis. The accuracy, sensitivity, specificity, positive predictive value and negative predictive value were determined from the ROC curves plotted.
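The evaluation measures named above follow from the counts of true and false positives and negatives when one class is treated as "positive" and the rest as "negative". The sketch below assumes that one-vs-rest reading; the label sequences are hypothetical and the function names are illustrative.

```python
def one_vs_rest_counts(actual, predicted, positive):
    """Count TP, FP, TN, FN treating `positive` as the class of interest."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, tn, fn

def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures (assumes no zero denominators)."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical reference labels (radiologist) and classifier outputs.
actual = ["n", "n", "i", "i", "h", "h", "i", "n"]
predicted = ["n", "i", "i", "i", "h", "n", "i", "n"]

metrics = diagnostic_metrics(*one_vs_rest_counts(actual, predicted, positive="i"))
```

The false positive and false negative rates are the complements of specificity and sensitivity, respectively.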

#### **8. Results**


Raw data analysis was used to analyse the data from the histogram texture parameters. It discriminated between the various ROIs as normal brain tissue, ischaemic stroke lesion or haemorrhagic stroke lesion. The classifications of the ROIs obtained in the discrimination are shown in the 3D feature space diagram (**Figure 9**). In the figure, the ischaemic lesion is represented by 1, haemorrhage by 2 and normal brain tissue by 3. The discriminating histogram parameters were the mean, 90th percentile and 99th percentile, as shown in **Figure 9**. The result of the raw data analysis shows that histogram texture parameters were very accurate in discriminating between normal brain tissues, ischaemic lesions and haemorrhagic lesions, as shown in **Table 1** and illustrated in **Figure 9**.


**Table 1.** Classification accuracy of the ROIs by raw data analysis.

**Figure 9.** The distribution of ROIs in 3D feature space using data obtained from the histogram.


| Evaluation parameter | Normal | Haemorrhage | Ischaemia | Weighted average |
|---|---|---|---|---|
| Sensitivity or true positive rate (TPR) | 0.971 | 0.949 | 0.888 | 0.947 |
| Specificity or true negative rate (TNR) | 0.937 | 0.989 | 0.983 | 0.962 |
| False positive rate (FPR) | 0.063 | 0.011 | 0.017 | 0.038 |
| False negative rate (FNR) | 0.029 | 0.051 | 0.112 | 0.053 |
| Positive predictive value (PPV) | 0.938 | 0.971 | 0.936 | 0.947 |
| Negative predictive value (NPV) | 0.953 | 0.857 | 0.050 | 0.693 |
| Area under ROC curve | 0.979 | 0.986 | 0.977 | 0.980 |

**Table 2.** Receiver operating characteristic analysis of artificial neural network classification of brain tissues.

| Evaluation parameter | Normal | Haemorrhage | Ischaemia | Weighted average |
|---|---|---|---|---|
| Sensitivity or true positive rate (TPR) | 0.954 | 0.944 | 0.853 | 0.929 |
| Specificity or true negative rate (TNR) | 0.934 | 0.983 | 0.966 | 0.955 |
| False positive rate (FPR) | 0.066 | 0.017 | 0.034 | 0.045 |
| False negative rate (FNR) | 0.046 | 0.056 | 0.147 | 0.071 |
| Positive predictive value (PPV) | 0.934 | 0.957 | 0.878 | 0.928 |
| Negative predictive value (NPV) | 0.949 | 0.273 | 0.029 | 0.693 |
| Area under ROC curve | 0.944 | 0.963 | 0.909 | 0.942 |

**Table 3.** Receiver operating characteristic analysis of k-nearest neighbour classification of brain tissues.

| Classification algorithm | Sensitivity | Specificity | FPR | AUROCC |
|---|---|---|---|---|
| ANN | 0.947 | 0.962 | 0.038 | 0.980 |
| k-NN | 0.929 | 0.955 | 0.045 | 0.942 |
| Remark | p = 0.061 | p = 0.378 | p = 0.378 | p = 0.373 |

**Table 4.** Comparison of artificial neural network and k-nearest neighbour in classification of brain tissues.

The statistics in **Tables 2** and **3** show the performance of histogram-based texture parameters in the classification of brain tissues as normal, ischaemic or haemorrhagic using the artificial neural network and k-nearest neighbour algorithms. There was no statistically significant difference in sensitivity, specificity, false positive rate or area under ROC curve between the artificial neural network and k-nearest neighbour classifications (p > 0.05), as shown in **Table 4**.

#### **9. Discussion**

Medical image analysis techniques play very important roles in several radiological interpretations. In general, the applications involve the automatic extraction of texture features from images which are then used for a variety of classification tasks, such as distinguishing normal tissue from abnormal tissue [33].

In this study, histogram parameters were computed for the selected ROIs chosen from stroke lesions and adjacent normal brain tissues using *MaZda*®. The whole process involved computation of histogram texture parameters, feature selection or reduction, and raw data analysis to extract the discriminating parameters. The mean, 90th percentile and 99th percentile were the best discriminators. They achieved very high accuracy in discriminating between normal brain tissues, ischaemic and haemorrhagic stroke lesions. According to the result of a previous study, histogram features used with a radial basis function neural network (RBFNN) achieved accuracies of over 80% in the classification of brain tissues [63]. The histogram measures the frequency of occurrence of the different grey levels throughout the image by moving across the image in steps of one pixel. This approach is attractive for its conceptual simplicity, and most people are at ease with it. The result of this study shows that the histogram is highly accurate in discriminating between normal brain tissues and lesions, and between ischaemic and haemorrhagic stroke lesions. In another similar study, grey-level co-occurrence matrix features were used in automatic detection of ischaemic stroke [64]. Four different algorithms were used, namely decision tree, artificial neural network, k-nearest neighbour and support vector machine (SVM), and the results were quite similar to ours. The sensitivity was 93% for decision tree, 98% for artificial neural network, 96% for k-nearest neighbour and 98% for SVM, while specificity was 90% for decision tree and artificial neural network and 100% for k-nearest neighbour and SVM. The accuracy of detection was 92% for decision tree, 96% for artificial neural network, 97% for k-nearest neighbour and 98% for SVM [64].

The results of ROC curve analysis of the performance of the artificial neural network and k-nearest neighbour classifications of brain tissues based on data obtained from the histogram show that histogram-based texture parameters are highly accurate. A classification accuracy of over 90% was achieved, and weighted average sensitivity, specificity and area under ROC curve of almost unity were recorded for both the artificial neural network and k-nearest neighbour. Correspondingly, the false positive rate (referred to as fall-out in machine learning) and false negative rate in both methods were very low. Sensitivity and specificity are important measures of the diagnostic accuracy of a test [65]. A diagnostic test with high sensitivity is useful in ruling out a disease condition when the test result is negative. Correspondingly, a diagnostic test with high specificity is useful in ruling in a disease condition when the test result is positive. The foregoing explanation of the importance of sensitivity and specificity in diagnostic test performance applies to the present study, whose method is intended for automatic detection of stroke lesions.

Studies similar to ours have been carried out in the past with quite good outcomes. In one such study, classification of stroke lesions into acute infarct, chronic infarct and haemorrhage on non-contrast brain CT was carried out [23]. The researchers used histogram-based comparison and wavelet energy-based texture information to classify stroke lesions. In a study proposing a method for automatic diagnosis of abnormal tumour regions in CT images using wavelet-based statistical texture features and a support vector machine (SVM) for classification of brain tissues, the researchers obtained a very high classification accuracy [66]. In another study, using texture features extracted from CT images with inductive learning techniques and a radial basis function neural network, brain tissues were classified as normal and abnormal with very high accuracy [63].

In this study, comparison of artificial neural network and k-nearest neighbour classifications of brain tissues showed that histogram-derived data achieved statistically similar classification performance with both algorithms. This implies that either of the two algorithms can be used for classification and may therefore be used in real clinical situations. The histogram method of texture analysis is a rather simple concept and may be found attractive by researchers who wish to develop computer-aided diagnostic software. The present database could be used in building a computer-aided diagnosis tool for stroke based on content-based image retrieval similar to that proposed by Yuan et al. [67].

The computer-aided diagnostic tool tries to emulate the radiologist's visual inspection and interpretation of brain CT images, or of any other image presented to it, depending on the case under investigation. Classification is typically accomplished by using a decision or discriminant function [68]. In this study, supervised classification was carried out using the artificial neural network [69, 70] and k-nearest neighbour [71], two algorithms popular with researchers in artificial intelligence in medicine. The performance of the artificial neural network and k-nearest neighbour algorithms in classifying brain tissues in non-contrast brain CT into normal, ischaemic and haemorrhagic lesions was evaluated using ROC curves. In the ROC curve analysis, the classification of data points as belonging to normal brain tissue, ischaemic stroke or haemorrhagic stroke was cross-validated with the radiologist's identification of stroke lesions and normal brain tissues. Receiver operating characteristic curves are used to compare the diagnostic performance of two or more diagnostic tests [72–74] and also to discriminate between diseased and normal cases. With data from the histogram texture parameters obtained in this study, there was no significant difference in the results of ROC analysis of the classifications using the artificial neural network and k-nearest neighbour. This implies that both algorithms can be used with histogram-derived data to build automatic diagnostic tools for stroke.
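For readers unfamiliar with the area under the ROC curve, the following sketch computes it directly from classifier scores using its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The scores are hypothetical, and this is not the procedure used by WEKA.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the empirical ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical scores for lesion (positive) and normal (negative) ROIs.
auc_good = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
auc_perfect = roc_auc([0.9, 0.8], [0.2, 0.1])
auc_chance = roc_auc([0.5, 0.5], [0.5, 0.5])
```

An area of 1 indicates perfect separation of the two groups and 0.5 indicates performance no better than chance, which is why values close to unity are read as high diagnostic accuracy.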

The following factors may affect generalization of the results of this study, which should therefore be used with these points in mind:

**1.** This study was not hospital-based. It was conducted in two radiodiagnostic centres, and the patients were carefully selected. The research conditions may therefore not reflect the actual clinical situation.

**2.** Sensitivity and specificity levels in this study were high but not 100%, implying that a computer-aided scheme can make mistakes. This study recognizes this fact, but it did not consider how the mistaken cases may be identified. Sensitivity is rarely 100%, especially because of the wide variability in lesion and background appearance [75]. It may be the case that the majority of computer-aided detection schemes will never be trained with enough cases to "see" all possible variations in a given target lesion. Even for a scheme that uses artificial neural networks and continues to learn with each successive case it analyses, a sensitivity of 100% may not be achieved [75]. Thus, computer-aided detection systems should be used with caution and ideally should not completely replace visual inspection and interpretation. Such systems are meant to complement visual inspection and interpretation. Heavy reliance on a computer-aided detection system to detect and classify lesions may alter the normal search and decision-making processes [76].

In view of the findings of this study, a larger-scale study in an actual clinical environment is recommended. Such a study would evaluate the performance of the proposed automatic method of detecting and classifying stroke lesions and compare it with the radiologist's visual interpretation. It would also include the changes in CT appearance of stroke lesions with the passage of time. The chronological sub-typing will be crucial to identifying hyperacute, acute and chronic stroke lesions on CT. This will help neurologists to estimate the post-stroke neurological deficit that should be expected in any individual case. In conclusion, this study has established that histogram-derived texture parameters are accurate in classifying brain tissues in NCCT images and are therefore suitable for automatic detection and classification of stroke lesions using the artificial neural network and k-nearest neighbour classifiers. The results obtained in this study suggest that a computer-aided diagnostic tool for stroke diagnosis utilizing histogram-derived texture parameters may be ideal.

#### **Author details**

Kenneth K. Agwu1 and Christopher C. Ohagwu2\*


## **References**

[1] Stoitsis J, Ioannis V, Stavroula GM, Spyretta G, Alexandra N, Konstantina SN. Computer aided diagnosis based on medical image processing and artificial intelligence methods. Nucl Instrum Methods Phys Res A 2006; 569 (2): 591–595. Doi: 10.1016/j.nima.2006.08.134

[2] Ojini FI, Danesi MA. The pattern of neurological admissions at the Lagos University Teaching Hospital. Niger J Clin Pract 2003; 5(1): 38–41.

[3] Warlow CP. Epidemiology of stroke. Lancet 1998; 352 (Suppl 3): S1111-S1114.

[4] Sudlow CLM, Warlow CP. Comparing stroke incidence worldwide: what makes studies comparable? Stroke 1996; 27(3): 550–558. Doi: 10.1161/01.STR.27.3.550

[5] Gorelick PB. Stroke prevention. Arch Neurol 1995; 52(4): 347–355. Doi: 10.1001/archneur.1995.00540280029015.

[6] Wolf PA, Kannel WB, Dawber TR. Prospective investigations: The Framingham study and the epidemiology of stroke. Adv Neurol. 1978; 19: 107–120.

[7] Howlett WP. Neurology in Africa: clinical skills and neurological disorders. Kilimanjaro Christian Medical Centre, Tanzania and Centre for International Health, University of Bergen, Norway. 2012. Pp. 101–115. Accessed on 12/09/2012 at https://bora.uib.no

[8] The WHO MONICA project principal investigators. The World Health Organization MONICA Project (monitoring trends and determinants in cardiovascular disease): a major international collaboration. J Clin Epidemiol 1988; 41(2): 105–114. Doi: 10.1016/0895-4356(88)90084-4.

[9] Chukwuonye II, Ohagwu KA, Uche EO, Chuku A, Nwanke RI, Ohagwu CC, Ezeani IU, Nwabuko CO, Nnoli MA, Oviasu E, Ogah SO. Validation of Siriraj stroke score in southeast Nigeria. Int J Gen Med. 2015; 8: 349–353. Doi: 10.2147/IJGM.S87293.

[10] Sheta YS, Al-Gohary AA, El-Mahdy M. Accuracy of clinical sub-typing of stroke in comparison to radiological evidence. Br J Sci 2012; 6(2): 1–7.

[11] Imarhiagbe FA, Ogbeide E. Clinical-imaging dissociation in strokes in a southern Nigerian tertiary hospital: review of 123 cases. Niger J Hosp Pract. 2011; 8(1–2): 3–7.

[12] Khan J, Rehman A. Comparison of clinical diagnosis with computed tomography in ascertaining type of stroke. J Ayub Med Coll Abbottabad 2005; 17(3): 65–67.

[13] Mullins ME. Modern emergent stroke imaging: pearls, protocols and pitfalls. Radiol Clin N Am 2006; 44(1): 41–62. Doi: http://dx.doi.org/10.1016/j.rcl.2005.08.002.

[14] Keris V, Rudnicka S, Vorona V, Enina G, Tilgale B, Fricbergs J. Combined intraarterial/intravenous thrombolysis for acute ischaemic stroke. Am J Neuroradiol 2001; 22(2): 352–358.

[15] Burnette WC, Nesbit GM. Intra-arterial thrombolysis for acute ischaemic stroke. Eur Radiol 2001; 11: 626–634.


[31] Materka A, Strzelecki M. Texture analysis methods - a review. Technical University of Lodz, Institute of Electronics, COST B11 report, Brussels. 1998. Available online at: http://www.eletelop.lodz.pl/program/cost/pdf\_1.pdf

[32] Haralick RM. Statistical and structural approaches to texture. Proc IEEE 1979; 67 (5): 786–804.

[33] Pham TA. Optimization of texture feature extraction algorithm. A Master of Science thesis submitted to Delft University of Technology, Netherlands. 2010. P. 8.

[34] Waugh SA (2014). The use of texture analysis in breast magnetic resonance imaging. Doctoral thesis submitted to University of Dundee, Scotland. Available online at: http://discovery.dundee.ac.uk/portal/en/theses/.

[35] Tang X. Texture information in run-length matrices. IEEE Trans Image Process. 1998; 7 (11): 1602–1609.

[36] Galloway MM. Texture analysis using grey level run lengths. Comput Graph Image Process 1975; 4: 172–179.

[37] Bracewell R. The Fourier transform and its applications (3rd edition). New York: McGraw-Hill. 1999.

[38] Qian S, Chen D. Discrete Gabor transform. IEEE Trans Signal Process 1993; 41: 2429–2438.

[39] Walnut DF. An introduction to wavelet analysis. Boston, Massachusetts: Birkhauser. 2001.

[40] Szcypiński PM, Strzelescki M, Materka A, Klepaczko A. MaZda – a software package for image texture analysis. Comput Methods Progr Biomed 2009; 94 (1): 66–76.

[41] Materka A, Strzelecki M, Szcypinski P. MaZda manual. 2006. Available at: http://www.eletel.p.lodz.pl/mazda/download/mazda\_manual.pdf

[42] Gonzalez RC, Woods RE, Eddins SL. Digital image processing using MATLAB (2nd edition). USA: Gatesmark Publishing. 2009. Pp. 1–12.

[43] Galda HJ. Image processing with scilab and image processing design toolbox. 2011. Available at: www.d.umn.edu/˜snorr/ee5351s14/ImProcTut.pdf

[44] Jiang Y, Nishikawa RM, Schmidt RA, Metz CE, Giger ML, Doi K. Improving breast cancer diagnosis with computer-aided diagnosis. Acad Radiol 1999; 6: 22–33.

[45] Tourassi GD. Journey toward computer-aided diagnosis: role of image texture analysis. Radiology 1999; 213: 317–320.

[46] Robinson PJA. Radiology's Achilles' heel: error and variation in the interpretation of the Roentgen image. Br J Radiol 1997; 70: 1085–1098.

[47] Julesz B, Gilbert EN, Shepp LA, Frish HL. Inability of humans to discriminate between visual features that agree in second-order statistics-revisited. Perception 1973; 2: 391–405.


[64] Rajini NH, Bhavani R. Computer aided detection of ischaemic stroke using segmentation and texture features. Measurement 2013; 46 (6): 1865–1874.

[65] Akobeng AK. Understanding diagnostic tests 1: sensitivity, specificity and predictive values. Acta Paediatr 2006; 96: 338–341.

[66] Padma A, Sukanesh R. Automatic diagnosis of abnormal tumor region from brain computed tomography images using wavelet based statistical texture features. Int J Comput Sci Eng Inform Technol 2011; 1 (3) 1–3. Doi: 10.5121/ijcseit.2011.1303.

[67] Yuan K, Tian Z, Zou J, Bai Y, You Q. Brain CT image database building for computer-aided diagnosis using content-based image retrieval. Inform Process Manage 2011; 47 (2): 176–185.

[68] Kassner A, Thronhill RE. Texture analysis: a review of neurologic MR imaging applications. Am J Neuroradiol 2010; 31: 809–816.

[69] Tzacheva AA, Najarian K, Brockway JP. Breast cancer detection in gadolinium-enhanced MR images by static region descriptors and neural networks. J Magn Resonan Imag 2003; 17: 337–342.

[70] Georgiadis P, Cavouras D, Kalatzis I, Daskalakis A, Kagadis GC, Sifaki K, Malamas M, Nikiforidis G, Solomou E. Improving brain tumor characterization on MRI by probabilistic neural networks and non-linear transformation of textural features. Comput Methods Progr Biomed 2008; 89: 24–32.

[71] Cover TM, Hart PE. Nearest neighbour pattern classification. IEEE Trans Inform Theor 1967; 13: 21–27.

[72] Griner PF, Mayewski RJ, Mushlin AI, Greenland P. Selection and interpretation of diagnostic tests and procedures. Ann Intern Med 1981; 94: 555–600.

[73] Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Med 1993; 39: 561–577.

[74] Metz CE. Basic principles of ROC analysis. Semin Nucl Med 1978; 8: 283–298.

[75] Krupinski EA. Computer-aided detection in clinical environment: benefits and challenges for radiologists. Radiology 2004; 231 (1): 7–9.

[76] Krupinski EA. An eye-movement study on the use of CAD information during mammographic search. Presented at the Seventh Far West Perception Conference, Tucson, Arizona October 16–17, 1997.

**Provisional chapter**

#### **Data-Driven Methodologies for Structural Damage Detection Based on Machine Learning Applications**

Jaime Vitola, Maribel Anaya Vejar, Diego Alexander Tibaduiza Burgos and Francesc Pozo

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/65867

#### **Abstract**

Structural health monitoring (SHM) is an important research area whose main interest is the damage identification process. Different kinds of information about the state of the structure can be obtained in this process; among them, detection, localization and classification of damage are mainly studied in order to avoid unnecessary maintenance procedures in civilian and military structures in several applications. To carry out SHM in practice, two different approaches are used: the first is based on modelling, which requires building a very detailed model of the structure, while the second relies on data-driven approaches, which use information collected from the structure under different structural states and examine it by means of data analysis. For the latter, statistical analysis and pattern recognition have demonstrated their effectiveness in the damage identification process because real information is obtained from the structure through sensors permanently installed on the observed object, allowing real-time monitoring. This chapter describes a damage detection and classification methodology that combines a piezoelectric active system, which works in several actuation phases and is attached to the structure under evaluation, with principal component analysis and machine learning algorithms working as a pattern recognition methodology. The chapter also includes a description of the developed approach and the results obtained when it is tested on an aluminum plate.

**Keywords:** SHM, PCA, machine learning, structural health monitoring

#### **1. Introduction**

Structural health monitoring (SHM) is a very interesting area whose main objective is damage identification using sensors permanently installed on the structure. In general, one of the aims is to monitor a structure in real time in order to know its current state, starting from damage detection. From this point of view, damage detection is extremely important, first of all for safety, because it helps manage the downside risk, which in turn reduces cost by improving the visual inspection and maintenance processes [1, 2]. Currently, new developments in several areas include the use of more complex structures. In many cases, the relation between the structure and the rest of the elements introduces interdependences that can be non-linear, which increases the difficulty of the damage detection process. In these cases, a multicomponent and systemic approach can be incorporated to obtain a safe and optimal maintenance model [3]. It is also important to note that some infrastructure has been in use for several years; examples can be found in historical buildings, bridges, and aeronautical and aerospace structures, among others. This aging process brings new challenges [4] for SHM systems.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

It is also worth highlighting the wide range of opportunities offered by the automation of the structural health monitoring process, which can be used in conjunction with other automation systems such as intelligent transportation systems (ITS) and automated guided vehicles, among others. This symbiosis can offer benefits and give new perspectives on the use of structures by providing additional information that SHM systems can leverage to increase reliability, robustness and efficiency, reduce the probability of error, and provide tools for better decision-making [5]. Structural health systems have wide application in civilian infrastructure such as bridges [24] and buildings [6]. Similarly, SHM systems have been applied to monitor mechanical components such as helicopter fuselages [7], wind turbines installed on land [8, 9] and at sea (offshore) [10], aerospace equipment [11], aircraft [12], high-speed trains [13], aircraft turbines [14] and boats [15]. In the same way, SHM systems have been applied to marine renewable energy equipment [16]. It is noteworthy that environmental conditions need to be considered to ensure robust damage detection; in this sense, some works have been introduced to compensate for the effects of temperature changes [17, 18].

Regardless of the infrastructure design or the technology used in maintenance decision-making, several factors need to be considered, such as information about the physical infrastructure, administrative information, use, reliability, maintainability, operability, bearing capacity, and the adopted maintenance policy [19]. In addition, the drift probability must be kept in mind [20]. The theories behind the best inspection process, and its definition, are complex. For instance, for machines that work continuously it is necessary to develop maintenance methodologies that avoid failure or breakdown maintenance; in this sense, preventive maintenance and reliability-centered maintenance, among others, need to be included [21]. This chapter describes a methodology for damage detection and classification and its experimental validation with data from an aluminum plate instrumented with piezoelectric transducers permanently attached to its surface. The chapter is organized as follows: Section 2 presents general concepts about the methods used in the methodology, Section 3 explains the methodology, Section 4 describes the experimental setup, Section 5 presents the results, and finally the conclusions are included.

## **2. General concepts**


The methodology described in this work uses some well-known data-driven methods; the main concepts are introduced in this section.

#### **2.1. Principal components analysis**

One of the greatest difficulties in data analysis arises when the amount of data is very large and the relationship between the pieces of information is not apparent or is very difficult to find. Principal component analysis (PCA) emerged as a very useful tool to reduce and analyze large quantities of information. The technique was described by Pearson in 1901 as a mechanism of multivariate analysis and was also used by Hotelling in 1933 [22]. The method finds the principal components, which are a reduced version of the original dataset and retain the relevant information that explains the variation in the data. To find these variables, the analysis transforms the current coordinate space into a new space in order to re-express the original data while trying to filter out noise and redundancies. These redundancies are measured by means of the correlation between the variables [23].

There are two ways to implement principal component analysis: the first is based on correlations and the second on covariance. It should be highlighted that PCA is not invariant to scale, so the data under study must be normalized. Many methods can be used to do this, as shown in [23, 24]. In many applications, PCA is used as a tool to reduce the dimensionality of the data so that a subsequent process can work with a reduced amount of data. Currently, there are many useful toolboxes to apply PCA and analyze the reduced data provided by the technique [25], which is one of the reasons why PCA is still widely used. More information about PCA and the normalization process can be found in Refs. [24, 26–28].
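The normalization and covariance-based decomposition described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the toolbox implementation cited in the text: the function names (`autoscale`, `covariance`, `principal_components`, `project`) and the toy matrix `X` are hypothetical, column-wise autoscaling is just one of the possible normalizations, and power iteration with deflation is used here only to avoid external dependencies.

```python
import math
import random


def autoscale(X):
    """Column-wise normalization: zero mean and unit standard deviation."""
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    stds = []
    for j in range(m):
        var = sum((row[j] - means[j]) ** 2 for row in X) / (n - 1)
        stds.append(math.sqrt(var) or 1.0)  # guard against constant columns
    Z = [[(row[j] - means[j]) / stds[j] for j in range(m)] for row in X]
    return Z, means, stds


def covariance(Z):
    """Covariance matrix of already centered data (rows are experiments)."""
    n, m = len(Z), len(Z[0])
    return [[sum(r[i] * r[j] for r in Z) / (n - 1) for j in range(m)]
            for i in range(m)]


def principal_components(C, k, iters=500, seed=0):
    """First k eigenpairs of a symmetric matrix C (power iteration + deflation)."""
    rng = random.Random(seed)
    m = len(C)
    C = [row[:] for row in C]  # work on a copy
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(m)]
        for _ in range(iters):
            w = [sum(C[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(C[i][j] * v[j] for j in range(m)) for i in range(m))
        pairs.append((lam, v))
        # Deflation: remove the component that was just found
        for i in range(m):
            for j in range(m):
                C[i][j] -= lam * v[i] * v[j]
    return pairs


def project(Z, pairs):
    """Scores: projection of each normalized row onto the retained loadings."""
    return [[sum(z * p for z, p in zip(row, v)) for _, v in pairs] for row in Z]


# Hypothetical data: 4 experiments (rows) described by 3 variables (columns)
X = [[2.0, 4.1, 1.0], [3.0, 6.2, 0.5], [4.0, 7.9, 1.5], [5.0, 10.1, 0.8]]
Z, means, stds = autoscale(X)
C = covariance(Z)
pairs = principal_components(C, k=2)
scores = project(Z, pairs)
```

Because the columns are autoscaled first, the covariance matrix computed here coincides with the correlation matrix of the raw data, which links the two variants mentioned in the text.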

#### **2.2. Machine learning**

Since Alan Turing showed interest in learning by machines, this area has remained at the forefront of research, increasing its popularity and expanding its field of application [29]. It has revolutionized the way in which complex problems are tackled. In the relentless pursuit of better tools for data analysis, machine learning stands out because it provides a set of strategies for pattern recognition that are able to find relationships between data that at first glance have no correlation and for which a deterministic mathematical model is very difficult to define. Machine learning strategies and bio-inspired algorithms avoid this difficulty through mechanisms designed to find the answer by themselves. In SHM and related areas, machine learning has been used to detect problems such as breaks, corrosion, cracks, impact damage, delamination, disunity, and fiber breakage (some pertinent to metals and others to composite materials [30]). In addition, it has been used to provide information about the future behavior of a structure under extreme events such as earthquakes [31].

Depending on how the algorithms work, machine learning can be classified into two main approaches: unsupervised and supervised learning. In the former, the information is grouped and interpreted using only the input data, whereas the latter requires information about the output data to perform the learning task. **Figure 1** shows this classification and indicates the tasks for which each kind of learning can be used.

Since this work aims to classify damage, supervised learning is used. In practice, this task is performed with the classification learner toolbox of MATLAB®, and **Table 1** lists the methods used in the development of this work.
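The supervised setting can be illustrated with a nearest-neighbour classifier, one of the families of methods used in this work. The sketch below is written in plain Python and is not the MATLAB® toolbox implementation; the function name `knn_predict`, the Euclidean distance, and the toy training points are assumptions made only for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=1):
    """Label x by majority vote among its k nearest training points."""
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional features (e.g. two principal component scores)
train_X = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [3.2, 2.9], [-3.0, 2.8], [-2.9, 3.1]]
train_y = ["healthy", "healthy", "damage 1", "damage 1", "damage 2", "damage 2"]

print(knn_predict(train_X, train_y, [2.8, 3.0], k=1))
print(knn_predict(train_X, train_y, [0.1, 0.2], k=3))
```

The labels in `train_y` are the "output data" that distinguish supervised from unsupervised learning: without them, the points could only be grouped, not assigned to a named structural state.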

**Figure 1.** Machine learning approaches according to the learning.


**Table 1.** Methods included in the classification learner toolbox of MATLAB®.

| Decision trees | Nearest neighbor classifiers | Support vector machines | Ensemble classifiers |
|---|---|---|---|
| Simple tree | Fine KNN | Linear SVM | Boosted trees |
| Medium tree | Cubic KNN | Fine Gaussian SVM | Bagged trees |
| Complex tree | Medium KNN | Medium Gaussian SVM | Subspace KNN |
| | Coarse KNN | Coarse Gaussian SVM | Subspace discriminant |
| | Cosine KNN | Quadratic SVM | RUSBoosted trees |
| | Weighted KNN | Cubic SVM | |

## **3. Damage classification methodology**

The methodology used in this work is aimed at damage detection and classification. To perform this task, a pattern recognition point of view is adopted. In this sense, the methodology works first with the definition of a healthy pattern, which is obtained from different states of the structure. In this work, data from the healthy state and from different damages are used as inputs to the machines. This stage is defined as training and is carried out as shown in **Figure 2**.

In general terms, the process includes a pre-processing step, in which all the experiments are organized in one matrix per actuation phase, as in **Figure 3**, and normalization is applied before the PCA models are created.

**Figure 2.** Training process.

**Figure 3.** Organization and normalization data.
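The organization of the experiments into one matrix per actuation phase can be sketched as follows. The exact layout of Figure 3 is not reproduced here, so the arrangement is an assumption: each row is taken to be one experiment and the columns are taken to be the concatenated signals of all sensors in that phase. The function name `unfold_by_phase` and the toy data are hypothetical.

```python
def unfold_by_phase(experiments):
    """Build one matrix per actuation phase.

    experiments[e][p][s] is the sampled signal (list of floats) recorded
    in experiment e, actuation phase p, by sensor s. Row e of the matrix
    for phase p is the concatenation of all sensor signals of that phase.
    """
    n_phases = len(experiments[0])
    matrices = []
    for p in range(n_phases):
        rows = []
        for exp in experiments:
            row = []
            for signal in exp[p]:
                row.extend(signal)
            rows.append(row)
        matrices.append(rows)
    return matrices


# Toy data: 3 experiments, 2 actuation phases, 2 sensors, 3 samples each
experiments = [
    [[[e + p + s + 0.1 * t for t in range(3)] for s in range(2)] for p in range(2)]
    for e in range(3)
]
matrices = unfold_by_phase(experiments)
print(len(matrices), len(matrices[0]), len(matrices[0][0]))
```

Each resulting matrix would then be normalized column by column before its own PCA model is built, so that every actuation phase has an independent model.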

**Figure 4.** Test process.

After the training step, the same experiments are applied to the structure in unknown scenarios. These data are pre-processed, projected onto the principal components, and fed to the trained machine to determine the state to which they correspond. **Figure 4** presents a description of the steps used in that process.
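The projection step of the test stage can be sketched as follows. The point illustrated is that a new measurement is normalized with the statistics stored during training and projected onto the loadings of the training PCA model, rather than being normalized on its own. The function name `project_new` and all numerical values are hypothetical.

```python
def project_new(x, means, stds, loadings):
    """Normalize a new measurement with the TRAINING statistics and project it."""
    z = [(xi - m) / s for xi, m, s in zip(x, means, stds)]
    return [sum(zi * pi for zi, pi in zip(z, p)) for p in loadings]


# Hypothetical quantities stored at the end of the training stage
train_means = [1.0, 2.0]
train_stds = [0.5, 2.0]
loadings = [[0.6, 0.8], [-0.8, 0.6]]  # one row per retained principal component

scores = project_new([2.0, 4.0], train_means, train_stds, loadings)
print(scores)
```

The resulting scores are what would be passed to the trained machine to obtain the predicted structural state.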

#### **4. Experimental setup**

**Figure 5** shows a scheme of the SHM system. It is composed of a four-channel oscilloscope with a USB interface, an arbitrary waveform generator, and a CPU as the processing unit. Additionally, there is a switching device, which is implemented to automate the measurements, as shown in **Figure 5**.

The inspection process can be summarized in the following steps:

• A burst signal is applied to one PZT and the rest of the transducers are used as sensors.

• A multiplexing system changes the actuator and collects the information from the rest of the sensors. This process is applied as many times as there are piezoelectric sensors attached to the structure.

• A digitizer is finally used to capture the information collected by the sensors via an oscilloscope with a USB interface.

The system collects the information in several files, in this case four since there are four transducers, and pre-processes it as explained in the previous section. To validate the methodology, four structural states, comprising the healthy state and three simulated damages, were used, as in **Figure 6**. These kinds of damage are used to produce changes in the wave propagation [27] and to provide different scenarios for validating the methodology.

**Figure 5.** Experimental setup.

**Figure 6.** Structural states used in the damage classification validation.
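The actuation schedule implied by the inspection steps of this section can be sketched as follows: each transducer acts once as the actuator while the remaining ones act as sensors. The function name `actuation_phases` and the transducer numbering are hypothetical; only the number of transducers (four) comes from the text.

```python
def actuation_phases(n_transducers):
    """Return (actuator, sensors) pairs, one per actuation phase."""
    phases = []
    for actuator in range(1, n_transducers + 1):
        sensors = [t for t in range(1, n_transducers + 1) if t != actuator]
        phases.append((actuator, sensors))
    return phases


for actuator, sensors in actuation_phases(4):
    print(f"phase {actuator}: excite PZT{actuator}, record PZT{sensors}")
```

With four transducers this yields four actuation phases, each with three sensing channels, which is consistent with the four files mentioned above.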

#### **5. Experimental results**

In order to validate the methodology with several machine learning methods, three experiments were carried out. The objective is to determine the behavior of the different machine learning methods described in Section 2 and their performance under different scenarios, which are obtained through changes in the input data and in the pre-processing step. In most cases, these kinds of changes are responsible for producing false alarms in the damage identification process. Accordingly, the acquisition process examined the effect of attenuation with long cables (2.5 m) and short cables (0.5 m), the addition of Gaussian noise to the acquired signals, and the use of a Golay filter in the pre-processing step. These experiments are explained below.

First experiment: acquisition performed with a short cable (0.5 m) from the digitizer to the sensors, with the acquired signals filtered with a Golay filter algorithm; white Gaussian noise is added afterwards.

Second experiment: acquisition performed with a long cable (2.5 m) to the sensors, with the signals filtered with the Golay algorithm.

Third experiment: acquisition performed with a short cable (0.5 m) from the digitizer to the sensors, with the signals processed without the Golay filter algorithm.

As previously introduced, the first experiment explores the influence of noise added to the data in order to determine how it affects the principal components. For this purpose, the Golay filter is applied to reduce the influence of random signals, and afterwards white Gaussian noise is added to the signals. The methodology was then applied to the signals with and without noise to determine the influence of the white noise on the detection process. An example of the signals used by the algorithms in actuation phase 2 can be seen in **Figure 7**; similar results are obtained with all the signals.
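The pre-processing of the first experiment can be sketched as follows, under several assumptions that are not confirmed by the text: the "Golay filter" is taken to be a Savitzky-Golay smoothing filter, here with a 5-point window and a quadratic polynomial; the noise level is interpreted as a signal-to-noise ratio in dB; and the test signal is synthetic. The function names `savgol5` and `add_awgn` are hypothetical.

```python
import math
import random


def savgol5(signal):
    """Savitzky-Golay smoothing, 5-point window, quadratic polynomial.

    Uses the classical convolution weights (-3, 12, 17, 12, -3) / 35.
    The two samples at each end are left unfiltered for simplicity.
    """
    w = (-3.0, 12.0, 17.0, 12.0, -3.0)
    out = list(signal)
    for i in range(2, len(signal) - 2):
        out[i] = sum(wk * signal[i - 2 + k] for k, wk in enumerate(w)) / 35.0
    return out


def add_awgn(signal, snr_db, rng=None):
    """Add zero-mean white Gaussian noise at the requested SNR (in dB)."""
    rng = rng or random.Random()
    power = sum(x * x for x in signal) / len(signal)
    noise_power = power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Synthetic stand-in for a received signal (not measured data)
clean = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
smoothed = savgol5(clean)
noisy = add_awgn(smoothed, snr_db=25, rng=random.Random(1))
```

The two functions are independent, so the order of filtering and noise addition can be swapped to study either variant of the experiment.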

**Figure 7.** Signal received by sensors in the first experiment, without damage: (a) with Golay filter applied, without white Gaussian noise; (b) with Golay filter applied, with white Gaussian noise.

**Figure 8.** First two principal components for experiment 1: (a) without added noise; (b) with 25 dB of white Gaussian noise.


**Figure 9.** The bad case confusion matrix for experiment 1.

**5. Experimental results**

In order to validate the methodology with several machine learning methods, three experiments were implemented. The objective is to determine the behavior of the machine learning methods described in Section 2 and their performance under different scenarios, which are obtained by changing the input data and the pre-processing step. In most cases, changes of this kind are responsible for producing false alarms in the damage identification process. Accordingly, the acquisition process was designed to examine the effect of attenuation with long cables (2.5 m) and short cables (0.5 m), the addition of Gaussian noise to the acquired signals, and the use of a Golay filter in the pre-processing step. These experiments are explained below.

First experiment: acquisition performed with a short cable (0.5 m) from the digitizer to the sensors, with the acquired signals filtered with a Golay filter algorithm; in this experiment, white Gaussian noise is also added to the signals.

Second experiment: acquisition performed with a long cable (2.5 m) to the sensors, with the signals filtered with the Golay algorithm.

Third experiment: acquisition performed with a short cable (0.5 m) from the digitizer to the sensors, with the signals not filtered with a Golay filter algorithm.

As previously introduced, the first group of experiments explores the influence of noise added to the data in order to determine how it affects the results in the principal components. For this, the Golay filter is applied to reduce the influence of random signals, and white Gaussian noise is then added to the signals. The methodology was subsequently applied to the signals with and without noise to determine the influence of the white noise on the detection process. An example of the signals used by the algorithms in actuation phase 2 can be seen in **Figure 7**; similar results are obtained with all the signals.

**Figure 7.** Signal received by sensors in the first experiment, without damage: (a) with Golay filter applied, without white Gaussian noise; (b) with Golay filter applied, with white Gaussian noise.
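The pre-processing order described for the first experiment (filter first, then add white Gaussian noise) can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that "Golay filter" refers to Savitzky-Golay smoothing, it uses the textbook 5-point quadratic kernel, and the synthetic signal, the 30 dB and 20 dB noise levels, and all function names are hypothetical.

```python
import math
import random

# 5-point quadratic Savitzky-Golay smoothing weights (standard textbook set).
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(signal):
    """Smooth a sequence with the 5-point quadratic Savitzky-Golay kernel.

    The two samples at each end are left unfiltered to keep the sketch short.
    """
    out = list(signal)
    for i in range(2, len(signal) - 2):
        out[i] = sum(w * signal[i + j - 2] for j, w in enumerate(SG5))
    return out


def add_white_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise scaled to a target SNR given in dB."""
    power = sum(x * x for x in signal) / len(signal)
    sigma = math.sqrt(power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, sigma) for x in signal]


rng = random.Random(0)
# Synthetic stand-in for a sensor record: a decaying tone burst (assumption).
clean = [math.exp(-n / 200) * math.sin(2 * math.pi * n / 40) for n in range(400)]
raw = add_white_gaussian_noise(clean, snr_db=30, rng=rng)        # assumed acquisition noise
filtered = savgol5(raw)                                          # pre-processing step
noisy = add_white_gaussian_noise(filtered, snr_db=20, rng=rng)   # noise added after filtering
```

Here `filtered` plays the role of the signal without added noise and `noisy` the signal with added noise, so the same downstream analysis can be run on both.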

**Figure 10.** The good case confusion matrix for experiment 1.

**Figure 11.** Signal received by sensors in experiment 2.

**Figure 12.** PCA components for experiment 2.


**Figure 8a** shows the first two principal components of the signal for actuation phase 1, which are subsequently used to train the machines. Training was carried out with the methods included in the classification learner toolbox of MATLAB®, shown in **Table 1**. This behavior is the same in all the actuation phases.

As seen in **Figure 8a** and **8b**, in experiment 1 the first principal components are able to eliminate the noise, which shows that they are a good tool for defining the elements to include in the machines.

After computing the principal components, the machines are trained with these data. Although all the machine learning methods were explored, only the worst and best results are shown for clarity. **Figure 9** shows the confusion matrix for the Coarse KNN machine on the test data; the result was very poor in all cases, and most machines behaved this way.
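For readers unfamiliar with confusion matrices, the following minimal sketch shows how one is built from true and predicted labels, with rows as true classes and columns as predicted classes. The class names and label lists are hypothetical and are not the chapter's data.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples that fall on the main diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical labels: a healthy state plus two damage states.
classes = ["healthy", "damage1", "damage2"]
y_true = ["healthy", "healthy", "damage1", "damage1", "damage2", "damage2"]
y_pred = ["healthy", "damage1", "damage1", "damage1", "damage2", "healthy"]

cm = confusion_matrix(y_true, y_pred, classes)
```

A good classifier concentrates the counts on the diagonal, while a poor one spreads them over the off-diagonal cells.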

**Figure 10** shows the confusion matrix for the Bagged Trees machine on the test data; the result was good in all cases. Good behavior was also obtained with Fine KNN, Weighted KNN, and subspace KNN, but only these few machines gave a good response.

**Figure 13.** The best confusion matrix for experiment 2.


**Figure 14.** The bad case confusion matrix for experiment 2.

**Figure 15.** Signal received by sensors in experiment 3.


**Figure 16.** PCA components for experiment 3.

**Figure 17.** The worst case confusion matrix for processing with other training.


**Figure 18.** The bad case confusion matrix for processing with other training.

In general, the response of these machine learning algorithms was good with or without added noise because PCA has shown great ability to reject the noise.

The second case considers the acquisition system connected with long cables, with the Golay filter used for pre-processing. In this case, some of the signals were poorly digitized because of the cable impedance, the noise, the low voltage of the stimulus, and other experimental factors. An example of the captured signals is shown in **Figure 11**.

**Figure 12** shows the first two principal components obtained from the signal, which were used to train the machines.

As in the previous experiment, all the methods were explored, and the best and worst results are included in this work. **Figure 13** shows the confusion matrix for Weighted KNN; the behavior was similar to that of the first experiment. Similar results are obtained with Fine KNN, Bagged Tree, and subspace KNN.
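The KNN variants named above differ mainly in how many neighbors vote and whether the votes are weighted by distance. The sketch below illustrates that difference on hypothetical 2-D scores; the values of `k`, the data, and the function name are assumptions chosen for illustration, and the example is not a claim about why particular machines performed well or poorly in the chapter.

```python
import math
from collections import defaultdict


def knn_predict(train_x, train_y, query, k, weighted=False):
    """Classify `query` by a vote among its k nearest training points.

    With weighted=True each neighbor votes with weight 1/d^2, so closer
    neighbors count more than distant ones.
    """
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = defaultdict(float)
    for d, label in neighbors:
        votes[label] += 1.0 / (d * d + 1e-12) if weighted else 1.0
    return max(votes, key=votes.get)


# Hypothetical scores: a small "damage" cluster and a larger "healthy" cluster.
train_x = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4), (0.4, 0.1),
           (2.0, 2.0), (2.1, 1.9), (1.9, 2.1)]
train_y = ["healthy"] * 6 + ["damage"] * 3
query = (1.9, 1.9)

fine = knn_predict(train_x, train_y, query, k=1)                    # few neighbors
coarse = knn_predict(train_x, train_y, query, k=9)                  # many neighbors
weighted = knn_predict(train_x, train_y, query, k=9, weighted=True)
```

In this toy case the single nearest neighbor and the distance-weighted vote both follow the nearby cluster, whereas the unweighted vote over many neighbors is dominated by the larger class.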

Bad results were obtained with the other methods, for example Coarse KNN. **Figure 14** shows this behavior, which is similar to that of experiment 1.

Similar results were obtained with the third experiment; in this case, a short cable was used and unfiltered signals were used to calculate the scores. **Figure 15** shows the acquired signal in the actuation phase 1.

**Figure 16** shows the first two principal components of the signal. In this experiment, however, these data were not used to train the machines; instead, the principal components are projected into the machines trained in the first experiment to determine the influence of these changes on the results.
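Projecting new data onto a model built from earlier data can be sketched as follows. This is a minimal illustration that assumes plain mean-centered PCA computed by power iteration; the chapter's exact normalization is not reproduced here, and the data, the 0.7 attenuation factor, and the function names are hypothetical.

```python
import math


def fit_pca(rows, n_components=2, iters=200):
    """Fit a PCA model (mean and leading directions) by power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for k in range(n_components):
        v = [1.0 / (j + 1 + k) for j in range(d)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
        components.append(v)
        # Deflate: remove the direction just found before the next search.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return mean, components


def project(rows, mean, components):
    """Scores of `rows` on a previously fitted model (its mean is reused)."""
    return [[sum((r[j] - mean[j]) * comp[j] for j in range(len(mean)))
             for comp in components] for r in rows]


# Hypothetical baseline features and an attenuated copy of the same records.
baseline = [[1.0, 2.0, 0.5], [2.0, 1.0, 0.4], [3.0, 4.0, 0.6],
            [4.0, 3.0, 0.5], [5.0, 6.0, 0.4], [6.0, 5.0, 0.6]]
attenuated = [[0.7 * x for x in r] for r in baseline]

mean, components = fit_pca(baseline)
baseline_scores = project(baseline, mean, components)
attenuated_scores = project(attenuated, mean, components)
```

Because the baseline mean and directions are reused, the attenuated records land in a shifted region of the score space. A classifier trained only on the baseline scores may therefore mislabel them.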

**Figure 17** shows the response of the Coarse KNN machine; in this last case, the training obtained in the first experiment is not successful with this data series.

**Figure 18** shows the response of the Fine KNN machine. Results similar to the previous case are obtained; that is, the machine provides a bad classification.

### **6. Conclusions and future work**


The piezoelectric transducers working as an active inspection system provide a good means of producing mechanical waves in the materials under evaluation. In all cases, the information obtained from the healthy state and the different damage scenarios, applied to the methodology, showed that the algorithm is able to detect real and simulated damage in both structures, regardless of the shapes and differences of the elements under inspection.

For all the experiments, the results showed very similar behavior: only a few machine architectures gave good results, namely Fine KNN, Weighted KNN, Bagged Tree, and subspace KNN. Other types of machines did not work well in the experiments.

In all cases, it is necessary to train the machines with data pre-processed in the same way as in the definition of the healthy state. Changes in elements such as the cable length and the use of the Golay filter are enough to change the resulting PCA model, so that the machines no longer work correctly.

PCA is a robust mechanism for characterizing data, since it was shown to eliminate the noise. However, more experiments that include environmental and operational noise are needed to determine the effectiveness of the algorithm.

#### **Acknowledgement**

This work is supported by Universidad Santo Tomas through Grant FODEIN 2016, project code FODEIN 1608303-017.

#### **Author details**

Jaime Vitola<sup>1,2</sup>, Maribel Anaya Vejar<sup>2</sup>, Diego Alexander Tibaduiza Burgos<sup>3,\*</sup> and Francesc Pozo<sup>4</sup>

\*Address all correspondence to: dtibaduiza@gmail.com

1 Universitat Politecnica de Catalunya, CoDAlab, Department of Mathematics, Escola d'Enginyeria de Barcelona, Barcelona, Spain

2 Faculty of Electronics Engineering, Research Group MEM - Modeling, Electronics and Monitoring, Universidad Santo Tomas, Bogotá, Colombia

3 Faculty of Engineering, Fundación Universitaria Los Libertadores, Bogotá, Colombia

4 CoDAlab, Department of Mathematics, Escola d'Enginyeria de Barcelona Est (EEBE), Universitat Politècnica de Catalunya, Sant Adrià de Besòs (Barcelona), Spain


## *Edited by S. Ramakrishnan*

Pattern recognition continues to be one of the important research fields in computer science and electrical engineering. Many new applications are emerging, and hence pattern analysis and synthesis have become significant subfields of pattern recognition. This book is an edited volume and has six chapters arranged into two sections, namely, pattern recognition analysis and pattern recognition applications. This book will be useful for graduate students, researchers, and practicing engineers working in the field of machine vision and computer science and engineering.

Photo by Samuel Zeller / unsplash
