**Part 3**

**Image Processing** 


## **Invariant Nonlinear Correlations via Fourier Transform**

Ángel Coronel-Beltrán¹ and Josué Álvarez-Borrego²

*¹Universidad de Sonora, Departamento de Investigación en Física, ²CICESE, División de Física Aplicada, Departamento de Óptica, México*

### **1. Introduction**

Great advances have been made in optical and digital pattern recognition since the introduction of the well-known classical matched filter (CMF) (Vander Lugt, 1964), which was the cornerstone for the development of new and more effective filters. However, the CMF is inefficient in practice because its output correlation peak degrades drastically under geometrical distortions of the target, such as scale and rotation changes. Many attempts have been made to achieve distortion-invariant pattern recognition. To overcome these problems, the Fourier-Mellin transform was introduced (Casasent, 1976a, 1976b, 1976c) for scale and rotation invariance. The Mellin transform is realized by a logarithmic polar mapping of the input image followed by a Fourier transform.

Some studies have confirmed invariance to position, scale and rotation in the visual cortex of primates and humans (Schwartz, 1977, 1980, 1984), realized through a logarithmic polar mapping from the retina to the visual cortex, analogous to the mapping used in optical and digital pattern recognition. In most digital systems, however, the process is very different from the sensorial visual system because of the latter's natural complexity. Digital images are acquired under varying environmental conditions by an optical-digital sensor, which causes several problems for the identification and characterization of objects with computerized vision systems; moreover, the study here is limited to static objects, and only bidimensional images are considered.

The scale transform has been proposed (Cohen, 1993, 1995) as an alternative better suited to scale changes than the well-known Mellin-Fourier transform. It has been applied in several areas: image filtering and denoising (Cristóbal et al., 1998), the identification and registration of alphabetic letters and diatoms by computing the power cepstrum of the log-polar scale mapping (Pech-Pacheco et al., 2000), the automatic identification of phytoplanktonic algae (Pech-Pacheco et al., 2001), and the analysis of scaled and rotated letters (Pech-Pacheco et al., 2003).

In this work, we present results obtained with a new computational algorithm for the recognition of several objects, independently of their size, angular orientation, displacement and noise. We use the scale transform and the *k*-th law nonlinear filter (Vijaya Kumar & Hassebrook, 1990) with a nonlinearity strength factor of *k*=0.3 (Coronel-Beltrán & Álvarez-Borrego, 2008). A *k*-th law nonlinear filter is defined as

$$H_{NLF}(u,v) = |S(u,v)|^{k} \exp[-j\phi(u,v)], \quad 0 < k < 1,\tag{1}$$

where $|S(u,v)|$ and $\phi(u,v)$ are the modulus and phase of the Fourier transform of the target.

When *k*=0, Eq. (1) reduces to the phase-only filter (POF),

$$POF = \exp[-j\phi(u,v)].\tag{2}$$
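As a concrete illustration of Eqs. (1) and (2), the *k*-th law filter can be built directly from the Fourier transform of a reference image. The sketch below uses NumPy; the random reference image is an illustrative stand-in, not one of the chapter's images.

```python
import numpy as np

def kth_law_filter(reference, k=0.3):
    """k-th law nonlinear filter of Eq. (1): H = |S|^k exp(-j*phi).

    k = 0 gives the phase-only filter (POF) of Eq. (2); k = 1 gives
    the classical matched filter (up to a conjugation convention).
    """
    S = np.fft.fft2(reference)
    return np.abs(S)**k * np.exp(-1j * np.angle(S))

# Illustrative reference image (random stand-in, not from the chapter).
rng = np.random.default_rng(0)
ref = rng.random((64, 64))

H_nlf = kth_law_filter(ref, k=0.3)   # nonlinear filter, 0 < k < 1
H_pof = kth_law_filter(ref, k=0.0)   # phase-only filter: |H| = 1 everywhere
```

Note that the POF has unit modulus at every frequency, so it passes only phase information, while $0<k<1$ keeps an attenuated version of the modulus.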

The Mellin transform of a function $f(x)$ is

$$M_f(p) = \int_0^\infty f(x)\, x^{p-1}\, dx,\tag{3}$$

and the scale transform (Cohen, 1993, 1995) is the Mellin transform restricted to the line $p = -jc + 1/2$,

$$D_f(c) = (2\pi)^{-1/2} \int_0^\infty f(x) \exp[(-jc - 1/2)\ln x]\, dx,\tag{4}$$

with inverse

$$f(x) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} D_f(c) \exp[(jc - 1/2)\ln x]\, dc.\tag{5}$$

In two dimensions, with log-polar coordinates $(\lambda, \theta)$, $\lambda = \ln r$, the scale transform takes the form

$$D(c_{\lambda}, c_{\theta}) = (2\pi)^{-1/2} \int_0^\infty \int_0^{2\pi} \exp(\lambda/2)\, f(\lambda, \theta) \exp[-j(\lambda c_{\lambda} + \theta c_{\theta})]\, d\lambda\, d\theta.\tag{6}$$
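The invariance behind Eqs. (4)–(6) can be checked numerically: under the substitution $\lambda = \ln x$, Eq. (4) becomes an ordinary Fourier transform of $g(\lambda) = f(e^{\lambda})e^{\lambda/2}$, so the energy-preserving scaling $\sqrt{a}\,f(ax)$ becomes a pure shift in $\lambda$ and leaves $|D_f(c)|$ untouched. A minimal sketch, assuming a Gaussian-type test signal (an illustration, not the chapter's data):

```python
import numpy as np

# lambda = ln(x) sampled uniformly; Eq. (4) is then the Fourier transform
# of g(lambda) = f(exp(lambda)) * exp(lambda/2).
N = 4096
lam = np.linspace(-10.0, 10.0, N, endpoint=False)
dlam = lam[1] - lam[0]

def f(x):
    return np.exp(-np.log(x)**2)              # test signal defined on x > 0

def scale_transform_mag(func):
    g = func(np.exp(lam)) * np.exp(lam / 2.0)
    return np.abs(np.fft.fft(g)) * dlam / np.sqrt(2.0 * np.pi)

a = np.exp(100 * dlam)                        # scaling = integer shift in lambda

def f_scaled(x):
    return np.sqrt(a) * f(a * x)              # energy-preserving scaled copy

D1 = scale_transform_mag(f)
D2 = scale_transform_mag(f_scaled)
print(np.max(np.abs(D1 - D2)))                # ~0: |D_f(c)| is scale invariant
```

The scale factor `a` is chosen so that the shift in $\lambda$ is an integer number of samples; the residual difference is then at the level of floating-point noise.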


### **3. Metrics used in performance evaluation**

In this section, we present two well-known metrics used to evaluate the performance of the filters in the correlation output plane: the peak-to-correlation energy and the discrimination capability. The first is used for objects free of noise and the second for objects immersed in noise.

### **3.1 The peak-to-correlation energy (PCE)**

The peak-to-correlation energy (PCE) performance metric is defined as (Javidi & Horner, 1994)

$$PCE = \frac{|E\{c(0,0)\}|^2}{E\{\overline{|c(x,y)|^2}\}},\tag{7}$$

where the numerator is the correlation peak intensity expected value and the denominator is the mean energy expected value in the correlation plane.
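In discrete form, a common single-realization estimate of Eq. (7) replaces the expected values with the computed correlation-peak intensity and the mean intensity of the plane. A minimal sketch:

```python
import numpy as np

def pce(corr_plane):
    """Peak-to-correlation energy, Eq. (7): correlation-peak intensity
    divided by the mean intensity of the correlation plane."""
    intensity = np.abs(corr_plane)**2
    return np.max(intensity) / np.mean(intensity)

# A delta-like plane (the ideal sharp peak) scores the plane size,
# while a flat plane scores 1:
plane = np.zeros((8, 8))
plane[0, 0] = 1.0
print(pce(plane))   # 64.0
```

Sharper, more concentrated peaks therefore yield larger PCE values, which is the property used throughout this chapter to compare filters.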

### **3.2 The discrimination capability (DC)**

If an object is embedded in a noise background, the discrimination capability, a modified version of the discrimination ratio (Vijaya Kumar & Hassebrook, 1990), is given by

$$DC = 1 - \frac{|c^{N}(0,0)|^{2}}{|c^{OBJ}(0,0)|^{2}},\tag{8}$$

where $c^{OBJ}(0,0)$ is the correlation peak produced by the object and $c^{N}(0,0)$ is the highest peak of the noise background alone.
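Eq. (8) only needs two numbers from the correlation plane: the target's peak intensity and the highest background peak. The sketch below takes a boolean mask marking the target's peak region; the mask (and the toy plane) are assumptions of this sketch, since Eq. (8) itself does not prescribe how the two peaks are located.

```python
import numpy as np

def discrimination_capability(corr_plane, target_mask):
    """Discrimination capability, Eq. (8): DC = 1 - |c_N|^2 / |c_OBJ|^2,
    where c_OBJ is the peak inside the target region and c_N is the
    highest peak of the background outside it."""
    intensity = np.abs(corr_plane)**2
    c_obj = np.max(intensity[target_mask])
    c_noise = np.max(intensity[~target_mask])
    return 1.0 - c_noise / c_obj

# Toy plane: target peak of amplitude 2, background peak of amplitude 1.
plane = np.zeros((4, 4))
plane[1, 1] = 2.0
plane[3, 3] = 1.0
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
print(discrimination_capability(plane, mask))   # 1 - 1/4 = 0.75
```

DC near 1 means the target stands well above the background; values near 0 or negative mean the object cannot be discriminated from the noise.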

### **4. The nonlinear invariant correlation method**

In this section we present the methodology of the digital invariant correlation system with a nonlinear filter. First, we describe step by step the process to obtain the nonlinear filter of the target. Second, the invariant correlation system with the nonlinear filter is described. In both cases, the algorithms are represented with simplified block diagrams.

### **4.1 Obtaining the digital invariant system with a nonlinear filter (DISNF)**

The steps to obtain the DISNF are shown in Fig. 1. In step (1), we have the original image $s(x,y)$ containing the target. The fast Fourier transform (FFT) is computed in step (2). The modulus of the Fourier transform, denoted $|S(u,v)|$, is obtained in step (3); in this way a displacement of the input image has no effect in the Fourier plane, according to the well-known shift theorem (Goodman, 2005). Next, we apply a parabolic filter (step 4) (Pech-Pacheco et al., 2003). This type of filter attenuates low frequencies and passes high frequencies, which sharpens the details of the object.

The next step is to introduce the scale factor $e^{\lambda/2}$, with $\lambda = \ln r$ (step 5), where *r* is the radial spatial frequency, the origin of which lies at zero frequency in the optical representation of the Fourier spectrum. This process is what differentiates the scale transform from the Mellin transform.

Fig. 1. Simplified block diagram for obtaining the nonlinear filter

Fig. 2. Simplified block diagram representing the invariant correlation system with a nonlinear filter


Cartesian coordinates are mapped to polar coordinates to obtain rotation invariance (step 6). In this step we introduce a bilinear interpolation of the data in the coordinate conversion, to avoid the aliasing caused by the log-polar sampling. A logarithmic scaling is applied to the radial coordinate (step 7). Taking the FFT (step 8), we obtain a filter invariant to position, rotation and scale (step 9). In this step the filter is written in the form of equation (1), where $|S|$ and $\phi$ are the modulus and phase of the Fourier transform of the object to be recognized, after the invariance processing.
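The steps of Fig. 1 can be sketched end to end. In this sketch the parabolic filter of step (4) is modeled as an $r^2$ weighting and the scale factor of step (5) as $e^{\lambda/2}$; these, and the grid sizes, are assumptions for illustration, and the bilinear interpolation of step (6) is written out explicitly.

```python
import numpy as np

def bilinear(im, y, x):
    """Bilinear interpolation of `im` at float coordinates (y, x), step (6)."""
    y0 = np.clip(np.floor(y).astype(int), 0, im.shape[0] - 2)
    x0 = np.clip(np.floor(x).astype(int), 0, im.shape[1] - 2)
    wy, wx = y - y0, x - x0
    return (im[y0, x0] * (1 - wy) * (1 - wx)
            + im[y0 + 1, x0] * wy * (1 - wx)
            + im[y0, x0 + 1] * (1 - wy) * wx
            + im[y0 + 1, x0 + 1] * wy * wx)

def invariant_signature(img, n_r=128, n_theta=128):
    """Steps (2)-(8) of Fig. 1: FFT, modulus, parabolic filter, scale
    factor, log-polar mapping, logarithmic radial scaling, final FFT."""
    # Steps (2)-(3): modulus of the centered spectrum; by the shift
    # theorem a displacement of img does not change it.
    S = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    cy, cx = S.shape[0] / 2.0, S.shape[1] / 2.0

    # Steps (6)-(7): log-polar grid, lambda = ln r.
    lam = np.linspace(0.0, np.log(min(cy, cx) - 1.0), n_r)[:, None]
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)[None, :]
    r = np.exp(lam)
    samples = bilinear(S, cy + r * np.sin(theta), cx + r * np.cos(theta))

    # Step (4) as an r^2 high-frequency emphasis and step (5)'s scale
    # factor exp(lambda/2) (modeling assumptions of this sketch).
    samples *= r**2 * np.exp(lam / 2.0)

    # Step (8): final FFT; a rotation of the input is a circular shift in
    # theta and a scale change is a shift in lambda.
    return np.fft.fft2(samples)

# Displacing the input does not change the signature (shift theorem):
rng = np.random.default_rng(1)
img = rng.random((64, 64))
same = np.allclose(invariant_signature(img),
                   invariant_signature(np.roll(img, (7, 3), axis=(0, 1))))
print(same)   # True
```

The check at the end exercises only the position invariance, which is exact for circular shifts; rotation and scale invariance hold up to the interpolation and sampling errors of the log-polar grid.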

### **4.2 The nonlinear invariant correlation**

The steps to realize the invariant correlation with a nonlinear filter are shown in Fig. 2. The problem image, which may or may not contain the target, is the input (step 1). From steps (1) to (9), the procedure is the same as in Fig. 1. Step (9) yields the nonlinear information of the problem image, where $|S|$ and $\phi$ are the modulus and phase of its Fourier transform after the invariance processing. Steps (9) and (10) show the correlation procedure that produces the digital correlation invariant to position, rotation and scale using a nonlinear filter (step 11). Combining the linear stages with the nonlinear filter in this invariant process makes it more powerful in recognizing patterns at different scales and rotations.
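Steps (9)–(11) reduce to a product in the Fourier domain followed by an inverse FFT. A minimal sketch with synthetic stand-ins for the invariant signatures (hypothetical data, not the chapter's images), where a rotation or scale change of the object appears as a displacement of the signature:

```python
import numpy as np

# Stand-ins for the invariant (log-polar) signatures of the target and of
# a problem image containing the same pattern, displaced by (5, 9).
rng = np.random.default_rng(2)
target = rng.random((64, 64))
problem = np.roll(target, (5, 9), axis=(0, 1))

k = 0.3
T = np.fft.fft2(target)
P = np.fft.fft2(problem)

# Step (10): multiply the problem spectrum by the target's k-th law
# filter H = |T|^k exp(-j*phase(T)); step (11): invert to get the plane.
H = np.abs(T)**k * np.exp(-1j * np.angle(T))
corr = np.fft.ifft2(P * H)

peak = tuple(int(i) for i in np.unravel_index(np.argmax(np.abs(corr)),
                                              corr.shape))
print(peak)   # (5, 9): the displacement, i.e. the rotation/scale offset
```

The location of the correlation peak recovers the displacement of the signature, which in the log-polar domain corresponds to the rotation angle and scale change of the object.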

### **5. Computer simulations**

In this section, we present a numerical statistical experiment to find the best nonlinearity strength factor *k* for the nonlinear filter. The results were box-plotted as peak-to-correlation energy (PCE) *vs k*, one plot for rotation and one for scale. Image **A** was used as the target (Fig. 3). The best *k* value obtained was the same as in (Coronel-Beltrán & Álvarez-Borrego, 2008), that is, *k*=0.3.
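The sweep over *k* can be outlined as a loop: build the *k*-th law filter of the target, correlate it against each transformed problem image, record the PCE, and summarize mean ± SE per *k*. A schematic 1-D stand-in with synthetic data; the chapter's optimum *k*=0.3 comes from its image set, not from this sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
target = rng.random(256)

def pce(corr):
    intensity = np.abs(corr)**2
    return np.max(intensity) / np.mean(intensity)

T = np.fft.fft(target)
results = {}
for k in np.round(np.arange(0.0, 1.01, 0.1), 1):   # k = 0, 0.1, ..., 1.0
    H = np.abs(T)**k * np.exp(-1j * np.angle(T))
    vals = []
    for shift in range(0, 180, 20):      # stand-in for the rotated copies
        P = np.fft.fft(np.roll(target, shift))
        vals.append(pce(np.fft.ifft(P * H)))
    vals = np.asarray(vals)
    results[k] = (vals.mean(), vals.std(ddof=1) / np.sqrt(len(vals)))
```

Each entry of `results` holds the mean PCE and its standard error, which is what the box plots of Figs. 4 and 5 summarize for the real image set.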

Fig. 3. Images used for obtaining the best *k* value. Image **A** was used as the target


A box plot of the peak-to-correlation energy PCE *vs* the nonlinearity *k* values is shown in Fig. 4 for the rotation case, from *k*=0 to *k*=1, for the target **A** correlated with the 15 problem images denoted **A**, **B**, **C**, …, **O** in Fig. 3, rotated in one-degree increments from 0 to 179 degrees. The graph shows the mean PCE with one standard error (±SE) and two standard errors (±2\*SE). The number of images processed for each *k* value was 2700; with 11 different *k* values (0, 0.1, 0.2, …, 1.0), a total of 29,700 images were statistically processed.

From Fig. 4, we observe that the nonlinear filter performs best for values 0.1 ≤ *k* ≤ 0.4.

Fig. 4. Box plot of the peak-to-correlation energy PCE *vs* the nonlinearity *k* values, for the image **A** correlated with the 15 rotated problem images in Fig. 3

In a similar manner, a box plot of the peak-to-correlation energy PCE *vs* the nonlinearity *k* values is shown in Fig. 5 for the scale case, from *k*=0 to *k*=1, for the target **A** correlated with the 15 problem images scaled in one-percent increments from 75% to 125%. The graph shows the mean PCE with one standard error (±1\*SE) and two standard errors (±2\*SE). The number of images processed for each *k* value was 765; with the same 11 values of *k* used for the rotation problem, a total of 8,415 images were statistically processed.

Fig. 5. Box plot of the peak-to-correlation energy PCE *vs* the nonlinearity *k* values, for the image **A** correlated with the 15 scaled problem images in Fig. 3

From Fig. 5, we again observe that the nonlinear filter performs best for values 0.1 ≤ *k* ≤ 0.4.

For both the rotated and scaled cases, the best *k* value was the same, *k*=0.3.

### **5.1 DISNF applied to alphabetic letters**

We present some results obtained using a DISNF to test this method against changes of rotation and scale of the letter **E**. In another example, the cross-correlations were calculated for a nonlinear filter and a phase-only filter, and their performance was compared.

Fig. 6. Performance of the filter **E** for rotation

Fig. 7. Performance of the filter **E** for scale from 75% to 125%


We used the strength factor *k*=0.3 for the nonlinear filter, with the letter **E** as the target for both rotation and scale. The results for the filter **E** under rotation and scale are shown in Figs. 6 and 7. The filter is correlated with each of the 26 alphabet letters, rotated in one-degree increments from 0 to 359 deg and scaled from 75% to 125% in one-percent increments, the filter itself being at 100% scale.

With respect to rotation, the algorithm performs well to within ±2\*SE when recognizing the letter **E** (Fig. 6). More results were obtained using different characters as targets (Coronel-Beltrán & Álvarez-Borrego, 2008). From Fig. 7 we observe an overlap of the letter **E** with the letters **F** and **L**, but within ±1\*SE the invariant correlation still performed well.

### **5.2 Comparison of an invariant digital system using nonlinear and phase only filters**

We present other correlation examples using a nonlinear filter with the same *k*=0.3, compared against the phase-only filter. Figs. 8 and 9 show the output correlation performance of these filters when both the filter and the problem image contain the letter E. We observe a less noisy correlation plane (Fig. 8a), with well-defined peaks p1, p2 and p3, when the nonlinear filter is used; the PCE value was 0.0215. When a phase-only filter is used in the invariant correlation, the output correlation plane shows more background noise (Fig. 8b); in this case the PCE value was 0.0057.

Because the letters in the problem image have the same scale as the target, in this example we analyze the rotation angles only. Fig. 8c shows one transect along the rotation axis of Fig. 8a, in which we can observe three peaks, p1, p2 and p3, and a small peak p4.

The peaks in Fig. 8c lie along the rotation axis and correspond to each rotated letter E in the problem image.


Fig. 8. Invariant correlation to rotation using (a) a nonlinear filter with *k*=0.3 and PCE=0.0215, (b) a phase only filter with PCE=0.0057, and (c) a rotation transect of (a)


In the problem image of Fig. 8a we have two E letters, one rotated -20 deg and the other 110 deg. The peak p1 corresponds to -20 deg, p2 to 110 deg, p3 to -70 deg, and p4 to 160 deg. Peaks p1 and p4, and peaks p2 and p3, are complementary angles: when we rotate an object to a certain angle, there is another angle that yields the same object (Fig. 8c). So, in correlation values, p1+p4 = p2+p3.

In Fig. 8b we also have the appearance of peak p4. This peak corresponds to 160 deg, which is complementary to the -20 deg of peak p1; so p1+p4 = p2+p3 again.

In the problem image of Fig. 9 we have two E letters; one is rotated 110 deg and the other is at zero deg and scaled to 150% with respect to the filter. The peak p1 corresponds to zero deg; the peak p2 corresponds to 110 deg and its corresponding peak p3 to -70 deg.

We can observe from these figures that when the rotation angle of the character is small, the secondary peak tends to be smaller.

Figs. 9a and 9b show the output correlation plane when the problem image contains one character with a different rotation and scale, using a nonlinear and a linear filter, respectively, to recognize the letter. One transect along the scale axis is shown in Fig. 9c; the peak p1 is observed exactly at the 150% value on the scale axis.

Because we are studying the behavior of the nonlinear versus the linear filter, a quantitative comparison between these filters was made using the PCE metric. A desirable attribute of a correlator is the ability to produce sharp correlation peaks, and this is exactly what the PCE captures. Compared with the phase-only filter, the nonlinear filter concentrates most of the energy that passes through the filter in the correlation peak. Denoting by $PCE_{NLF}$ and $PCE_{POF}$ the PCE values of the nonlinear filter and the phase-only filter, respectively, the ratios were $PCE_{NLF}/PCE_{POF}=3.77$ from Fig. 8 and $PCE_{NLF}/PCE_{POF}=4.1$ from Fig. 9. In all these cases, the PCE value for the nonlinear filter was better than for the phase-only filter.

Fig. 9. Rotation and scale invariant correlation using (a) nonlinear filter with *k*=0.3 and PCE=0.025, (b) a phase only filter with PCE=0.0061, and (c) a scale transect of (a)

#### **6. Target with noise**


In this section we treat two kinds of noise applied to the target. One is the additive Gaussian noise and the other is the S&PP noise (Fig. 10).

Fig. 10. Target free of noise (a), with additive Gaussian noise of mean zero and variance of 0.1 (b), 0.2 (c) and 0.3 (d), and with S&PP noise of noise density of 0.1 (e), 0.2 (f) and 0.3 (g)
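The two noise models of Fig. 10 can be generated with a few lines of numpy (a sketch; the [0, 1] image range and the block-shaped target are our assumptions for the example, while the variance and density values mirror the figure):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian(img, variance):
    """Additive Gaussian noise, mean zero; image values assumed in [0, 1]."""
    return np.clip(img + rng.normal(0.0, np.sqrt(variance), img.shape), 0.0, 1.0)

def add_salt_pepper(img, density):
    """Salt & pepper: a fraction `density` of the pixels is forced to 0 or 1."""
    noisy = img.copy()
    flip = rng.random(img.shape) < density
    noisy[flip] = rng.integers(0, 2, int(flip.sum())).astype(img.dtype)
    return noisy

target = np.zeros((128, 128)); target[48:80, 48:80] = 1.0
g = add_gaussian(target, 0.1)       # cf. Fig. 10(b)
sp = add_salt_pepper(target, 0.1)   # cf. Fig. 10(e)
```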

Figs. 11 and 12 show the graphs of the discrimination coefficient using 30 numerical experiments to a 95% level of confidence for a nonlinear filter with *k*=0.3 when an additive Gaussian noise, with mean zero, and a S&PP noise are considered. In both graphs we have the DC mean *vs.* the variance and the density noise. From these graphs we observe that the filter can recognize the object with a noise variance of approximately 0.27 and a noise density of approximately 0.3 for additive Gaussian and S&PP noise, respectively.

Fig. 11. Performance of a nonlinear filter with *k*=0.3 in the presence of additive Gaussian noise



Fig. 12. Performance of a nonlinear filter with *k*=0.3 in the presence of S&PP noise

Note that the maximum value that can be obtained with the DC (discrimination coefficient) is unity, while values below zero indicate that the filter does not recognize the object.
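The DC itself is not defined in this excerpt; one commonly used form (a sketch, not necessarily the exact expression used by the authors) compares the target and background correlation peaks, and reproduces the bounds just stated:

```python
import numpy as np

def discrimination_coefficient(corr_target, corr_background):
    """DC = 1 - |background peak|^2 / |target peak|^2 (one common form).

    Reaches unity when the background gives no response, and drops below
    zero when the background peak exceeds the target peak, i.e. the
    filter does not recognize the object."""
    ct = float(np.abs(corr_target).max()) ** 2
    cb = float(np.abs(corr_background).max()) ** 2
    return 1.0 - cb / ct

ct_plane = np.zeros((8, 8)); ct_plane[4, 4] = 2.0   # target peak = 2
cb_plane = np.zeros((8, 8)); cb_plane[1, 1] = 1.0   # background peak = 1
print(discrimination_coefficient(ct_plane, cb_plane))  # 0.75
```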

### **7. DISNF and the Spearman correlation**

To study the improvement of the method when the images are embedded in additive Gaussian noise and in S&PP noise, we used a nonparametric method, called rank statistics, to calculate the correlation between two images. Substituting the value of each pixel by its corresponding rank in the normalized correlation expression, a nonlinear correlation expression is obtained [Spearman's rank correlation (SRC)]. Taking this into consideration, the 2-D SRC can be expressed as (Guerrero-Moreno & Álvarez-Borrego, 2009)

$$R(k,l) = 1 - \frac{6\sum\_{m,n\in W}\left(r\_t(m,n) - r\_s(m+k,n+l)\right)^2}{|W|\left(|W|^2 - 1\right)}\,,\tag{9}$$

where {*rt(m,n), m* = 1,2,…*N*; *n* = 1,2,…*M*} is the rank of the target, {*rs(m+k,n+l), k* = 1,2,…*N*; *l* = 1,2,…*M*} is the rank of the problem image, |*W*| is the number of pixels of each image, *R* is the Spearman rank correlation and *m*, *n*, *k*, *l* are the spatial coordinates.
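Eq. (9) at a single shift (*k*,*l*) can be sketched as follows (using `scipy.stats.rankdata` for the ranks; with no ties this coincides with the classical Spearman coefficient):

```python
import numpy as np
from scipy.stats import rankdata

def spearman_2d(target, problem):
    """2-D Spearman rank correlation, Eq. (9), evaluated at zero shift:
    pixel values are replaced by their ranks before correlating."""
    rt = rankdata(target.ravel())
    rs = rankdata(problem.ravel())
    W = rt.size
    return 1.0 - 6.0 * np.sum((rt - rs) ** 2) / (W * (W ** 2 - 1))

img = np.arange(16, dtype=float).reshape(4, 4)
print(spearman_2d(img, img))    # 1.0  (identical ranks)
print(spearman_2d(img, -img))   # -1.0 (reversed ranks)
```

Sweeping (*k*,*l*) over all shifts, as in the SDISNF, amounts to evaluating this quantity for each displacement of the problem window.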

Fig. 13 shows a block diagram representing the SRC in conjunction with the nonlinear method (SDISNF) in order to increase the tolerance to noise. In this figure, from step (1) to step (10) the procedure is the same as in Fig. 2, except that now, in step (11), we compute the inverse FFT of the outputs of steps (9) and (10). Step (12) is the SRC in the spatial domain, and step (13) gives us the output correlation plane.

From Figs. 14 and 15 we observe the improvement of the filter performance when additive Gaussian noise, with mean zero, and S&PP noise are added, respectively. From these graphs we observe that the filter can support a noise variance of approximately 0.92 and a noise density of approximately 0.54 for additive Gaussian and S&PP noise, respectively.


Fig. 13. Simplified block diagram representing the invariant correlation system with a nonlinear filter using the SDISNF

Fig. 14. Performance comparison with SDISNF when the target is embedded in additive Gaussian noise



Fig. 15. Performance comparison with SDISNF when the target is immersed in S&PP noise

These figures show the performance comparison when we used the SDISNF for the target immersed in additive Gaussian and S&PP noise. From these figures we can see that the discrimination coefficient is greater than with the nonlinear filter alone by a factor of three for additive Gaussian noise, and by a factor of approximately two for S&PP noise.

### **8. Comparative analysis between different font types and letter styles using a nonlinear invariant digital correlation**

In this section we present a comparative analysis of the letters in Times New Roman (TNR), Courier New (CN) and Arial (Ar) font types in Plain & Italicized style and the effects of 5 foreground/background color combinations, using an invariant digital correlation system with a nonlinear filter with *k*=0.3 (Coronel-Beltrán & Álvarez-Borrego, 2010). The evaluation of the output plane with this filter is given by the peak-to-correlation energy (PCE) metric. The results show that the letters in TNR font have a better mean PCE value when compared with the CN and Ar fonts. This result is in agreement with studies (Hill, 1997) of text legibility and readability, where the reaction time (RT) of participant individuals reading a text is measured. We conclude that the PCE metric is proportional to 1/RT.


Fig. 16 shows the mean PCE values obtained versus the letters in Arial (Ar), Courier New (CN) and Times New Roman (TNR) font types in Italicized and Plain style with 5 foreground/background color combinations: black-on-white (BK/W), green-on-yellow (GN/Y), red-on-green (R/GN), white-on-blue (W/BL) and yellow-on-blue (Y/BL), using an invariant digital correlation system with a nonlinear filter with *k*=0.3. The foreground/background colors were selected with color coordinates RGB(r,g,b), normalized to [0-255] (bytes), as follows: red (255,0,0), green (0,128,0), blue (0,0,255), yellow (255,255,0), black (0,0,0) and white (255,255,255). The letters are 512x512 pixels in size, in bitmap file format with 256 colors. The results show that the letters in TNR font have a better mean PCE when compared with the Ar and CN fonts.

For the italicized style we can observe from this graph that the TNR font had a greater mean PCE value compared with the Ar and CN fonts. All fonts in this style behaved consistently across colors, and the color combinations were irrelevant.

For the plain style we can observe that the TNR font also had a greater mean PCE value compared with the Ar and CN fonts. But contrary to the italicized style, in the plain style we found that the color combinations affect the mean PCE value of the font type letters. The black-on-white and the white-on-blue color combinations had the highest mean PCE values. For the green-on-yellow, red-on-green and yellow-on-blue color combinations there were no significant changes in the mean PCE values; as in the italicized case, their behaviour was the same, the color combinations were irrelevant, and the differences in their mean PCE were statistically not significant.

Using the RT of participant individuals reading a text, the effects of six color combinations and three font types (Arial, Courier New & Times New Roman) in Italicized and Plain styles were studied (Hill, 1997), and the results showed that for green-on-yellow, in italicized style, the TNR font had better RT values than the Arial font. For all color combinations in italicized and plain style, the TNR font had a smaller RT than the Ar font, except for black-on-white and red-on-green in italicized style, and for black-on-white and white-on-blue in plain style.

Fig. 16. Performance of the letters Arial (Ar), Courier New (CN) and Times New Roman (TNR) with font types in Italicized & Plain style and 5 foreground/background color combinations


This algorithm was compared with the one published by Álvarez-Borrego & Castro-Longoria (2003), both being similar in their use for pattern recognition; our new algorithm has a computing time that is 20% lower. To perform the simulations, a Dell PS\_420 computer with an Intel(R) Core(TM) 2 Quad CPU Q6700 @ 2.66 GHz processor and 2.00 GB of RAM was used.

## **9. Conclusion**

A digital system for invariant correlation using a nonlinear filter was tested with the maximum nonlinearity strength factor value *k*=0.3, determined experimentally for both scale and rotation. We used the alphabet letters in Arial font type, where each letter was taken as a target and correlated with each of the other letters, and showed that our system discriminates objects efficiently. This nonlinear invariant correlation method was applied using nonlinear and phase-only filters for different scaled and rotated objects (the letter E in Times New Roman font type), and we found a better PCE performance for the nonlinear filters. The DC performance for the noisy target, with additive Gaussian noise and impulse salt-and-pepper noise, was improved significantly by applying the SDISNF.

This new combination (SDISNF) is thus more powerful than other methods when images with noise are analyzed.

In addition, we presented an analytical method, using a digital invariant correlation, to compare and analyze letters of different fonts and styles with five color combinations. Our results showed a better output for the TNR than for the Ar font letters in italicized & plain style, and in general the italicized letter style had greater PCE values than the plain style. These results are in accord with other studies carried out with subjective methods. Our results can be useful for studying the effects of background color combinations on the legibility of text displayed on a computer screen from the Web, email, or other texts written in books, newspapers, magazines, etc., as well as on its readability. We conclude that the PCE metric is proportional to 1/RT. It would be interesting to investigate the relationship between our method, based on analytical mathematical functions, and the subjective empirical methods used by psychologists and Web designers, among others.

## **10. Acknowledgment**

This work was partially supported by grant 102007 from CONACYT and a grant from Universidad de Sonora.

## **11. References**

Casasent, D. & Psaltis, D. (1976b). Scale invariant optical transform. *Opt. Eng.,* Vol. 15, No. 3, May-June 1976, pp. 258-261, ISSN 0091-3286

Casasent, D. & Psaltis, D. (1976c). Position, rotation, and scale invariant optical correlation. *Appl. Opt.,* Vol. 15, No. 7, July 1, pp. 1795-1799, ISSN 0003-6935

Cohen, L. (1993). The scale representation. *IEEE-SP,* Vol. 41, No. 12, Dec. 1993, pp. 3275-3292, ISSN 1053-587X

Cohen, L. (1995). *Time Frequency Analysis* (A.V. Oppenheim), Prentice Hall, ISBN-10 0135945321, New Jersey

Coronel-Beltrán, A. & Álvarez-Borrego, J. (2008). Nonlinear filter for pattern recognition using the scale transform, *Proc. SPIE,* Vol. 7073, 70732H, San Diego, California, USA, August 2008

Coronel-Beltrán, A. & Álvarez-Borrego, J. (2010). Comparative analysis between different font types and letter styles using a nonlinear invariant digital correlation. *J. Mod. Optics,* Vol. 57, No. 1, (January 2010), pp. 58-64, ISSN 0950-0340 print/ISSN 1362-3044 online

Cristóbal, G.; Cuesta, J. & Cohen, L. (1998). Image filtering and denoising through the scale transform, *Proc. IEEE-SP International Symposium on Time-Frequency and Time-Scale Analysis,* ISBN 0-7803-5073-1, pp. 617-620, Pittsburgh, PA, USA, October 1998

De Sena, A. & Rocchesso, D. (2004). A fast Mellin transform with applications in DAFx, *Proceedings of the 7th International Conference on Digital Audio Effects (DAFx'04),* Naples, Italy, October 2004, pp. 65-69

De Sena, A. & Rocchesso, D. (2007). A Fast Mellin and Scale Transform. *EURASIP Journal on Advances in Signal Processing,* Vol. 2007, pp. 1-9

Goodman, J.W. (2005). *Introduction to Fourier Optics*, Roberts & Company, ISBN 0-9747077-2-4, Greenwood Village, CO, USA, pp. 435-436

Guerrero-Moreno, R. & Álvarez-Borrego, J. (2009). Nonlinear composite filter performance. *Opt. Eng.* Vol. 48, No. 6, 067201, June 2009, ISSN 0091-3286

Hill, A.L. (1997). Readability of websites with various foreground/background color combinations, font types and word styles. Stephen F. Austin State University. http://www.laurenscharff.com/research/AHNCUR.html

Horner, J.L. & Gianino, P.D. (1984). Phase-only matched filtering. *Appl. Opt.* Vol. 23, No. 6, March 15, pp. 812-816, ISSN 0003-6935

Javidi, B. (1989a). Nonlinear joint power spectrum based optical correlation. *Appl. Opt.* Vol. 28, No. 12, June 15, pp. 2358-2367, ISSN 0003-6935

Javidi, B. (1989b). Synthetic discriminant function-based binary nonlinear optical correlator. *Appl. Opt.* Vol. 28, No. 13, July 1, pp. 2490-2493, ISSN 0003-6935

Javidi, B. (1989c). Nonlinear matched filter based optical correlation. *Appl. Opt.* Vol. 28, No. 21, November 1, pp. 4518-4520, ISSN 0003-6935

Javidi, B. (1990). Comparison of nonlinear joint transform correlator and nonlinear matched filter based correlator. *Opt. Commun.* Vol. 75, No. 1, 1 February 1990, pp. 8-13, ISSN 0030-4018

Javidi, B. & Horner, J.L. (1994). *Real-Time Optical Information Processing*, Academic Press, Boston, ISBN 0123811805

Pech-Pacheco, J.L.; Cristóbal, G., Álvarez-Borrego, J. & Cohen, L. (2000). Power cepstral image analysis through the scale transform, *Proc. SPIE*, Vol. 4113, pp. 68-79



Álvarez-Borrego, J. & Castro-Longoria, E. (2003). Discrimination between *Acartia* (Copepoda: Calanoida) species using their diffraction pattern in a position, rotation invariant digital correlation. *J. Plankton Res.,* Vol. 25, No. 2, pp. 229-233, ISSN 0142-7873

Casasent, D. & Psaltis, D. (1976a). Scale invariant optical correlation using Mellin transforms. *Opt. Commun.,* Vol. 17, No. 1, April 1976, pp. 59-63, ISSN 0030-4018


Pech-Pacheco, J.L.; Cristóbal, G., Álvarez-Borrego, J. & Cohen, L. (2001). Automatic system for phytoplanktonic algae identification. *Limnetica,* Vol. 20(1), pp. 143-157, ISSN 0213-8409

Pech-Pacheco, J.L.; Álvarez-Borrego, J., Cristóbal, G. & Mathias, S.K. (2003). Automatic object identification irrespective to geometric changes. *Opt. Eng.,* Vol. 42, No. 2, February 2003, pp. 551-559, ISSN 0091-3286

Schwartz, E.L. (1977). Afferent Geometry in the Primate Visual Cortex and the Generation of Neuronal Trigger Features. *Biol. Cybernetics,* Vol. 28, No. 1, pp. 1-14, ISSN 0340-1200

Schwartz, E.L. (1980). Computational anatomy and functional architecture of striate cortex: A spatial mapping approach to perceptual coding. *Vision Res.,* Vol. 20, No. 8, Aug. 1980, pp. 645-669, ISSN 0042-6989

Schwartz, E.L.; Christman, D.R. & Wolf, A.P. (1984). Human primary visual cortex topography imaged via positron tomography. *Brain Res.,* Vol. 294, No. 2, Mar. 1984, pp. 225-230, ISSN 0006-8993

Vander Lugt, A.B. (1964). Signal detection by complex spatial filtering. *IEEE Trans. Inf. Theory,* Vol. 10, No. 2, Apr. 1964, pp. 139-145, ISSN 0018-9448

Vijaya Kumar, B.V.K. & Hassebrook, L. (1990). Performance measures for correlation filters. *Appl. Opt.,* Vol. 29, No. 20, July 10, pp. 2997-3006, ISSN 0003-6935

**13**

## **Pattern Recognition of Digital Images by One-Dimensional Signatures**

Selene Solorza1, Josué Álvarez-Borrego2 and Gildardo Chaparro-Magallanez2

*1UABC, Facultad de Ciencias*

*2CICESE, División de Física Aplicada, Departamento de Óptica*

*México*

### **1. Introduction**

Since the computer's evolution in the middle of the last century, pattern recognition of digital images based on correlations has been applied in science as well as in technology areas. Its applications are broad and varied (Gonzalez & Woods, 2002). The techniques developed are used to identify micro-objects, for example virus inclusion bodies, bacteria, chromosomes, etc. (Álvarez-Borrego & Castro-Longoria, 2003; Álvarez-Borrego et al., 2002; Álvarez-Borrego & Solorza, 2010; Bueno et al., 2011; Forero-Vargas et al., 2003; Pech-Pacheco & Álvarez-Borrego, 1998; Zavala & Álvarez-Borrego, 1997). The analysis of those samples requires experience; moreover, the samples analyzed frequently contain material with different degrees of fragmentation, which can lead to confusion and loss of information. There are other works in which pattern recognition is based on probabilistic methodologies; here the objects are represented by their statistical characteristic features (Fergus et al., 2003; Holub et al., 2005). These are used for face identification and image restoration (Alon et al., 2009; Kong et al., 2010a;b; Ponce et al., 2006) and fingerprint classification (Jain & Feng, 2011; Komarinski et al., 2005; Moses et al., 2009). These works are also useful to classify or count micro-objects, where object variation, background, scale, illumination, etc., are taken into account (Arandjelovic & Zisserman, 2010; Barinova et al., 2010; Lempitsky & Zisserman, 2010).

Recently, digital correlation systems invariant to position, rotation and scale have been utilized in the pattern recognition field (Álvarez-Borrego & Solorza, 2010; Coronel-Beltrán & Álvarez-Borrego, 2010; Lerma-Aragón & Álvarez-Borrego, 2009a;b; Solorza & Álvarez-Borrego, 2010). Such invariances are built from the Fourier and Fourier-Mellin transforms in conjunction with non linear filters (the *k*-law filter). The non linear filters have advantages compared with the classical filters (POF, BAPOF, VanderLugt, CHF) due to their great capacity to discriminate objects: the maximum value of the correlation peak is well localized, and the output plane is less noisy (Guerrero-Moreno & Álvarez-Borrego, 2009). In the same decade, different contributions have been proposed to identify the target when different kinds of noise are present in the input scene; one of the techniques with very good performance is the adaptive synthetic discriminant functions (ASDF) (González-Fraga et al., 2006), also used in an optical system (Díaz-Ramírez et al., 2006).
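As an illustration of the *k*-law nonlinearity mentioned above, the following numpy sketch builds a nonlinear filter from a target and correlates it with a shifted scene (the 64x64 size and the block-shaped target are arbitrary choices for the example, not the chapter's setup):

```python
import numpy as np

def k_law_filter(target, k=0.3):
    """k-law nonlinear filter: keep the conjugate Fourier phase of the
    target, raise the modulus to the power k (0 < k < 1)."""
    F = np.fft.fft2(target)
    return (np.abs(F) ** k) * np.exp(-1j * np.angle(F))

def correlate(scene, target, k=0.3):
    """Correlation plane: inverse FFT of the scene spectrum times the filter."""
    return np.fft.ifft2(np.fft.fft2(scene) * k_law_filter(target, k))

# A 10x10 block as target; the scene holds the same block shifted by (5, 3).
target = np.zeros((64, 64)); target[20:30, 12:22] = 1.0
scene = np.zeros((64, 64)); scene[25:35, 15:25] = 1.0
plane = np.abs(correlate(scene, target))
r, c = np.unravel_index(plane.argmax(), plane.shape)
print(int(r), int(c))  # 5 3: the correlation peak sits at the shift
```

Lowering *k* toward 0 weighs high frequencies more strongly, which is what sharpens the peak and improves discrimination relative to a classical matched filter.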

There are many works about digital systems invariant to position, rotation and scale, but they have a high computational cost. Composite non linear filters like the ASDF filters can be used to obtain a digital correlation invariant to rotation, but they require the information of all rotation angles of the object to be recognized. In order to obtain a new digital correlation system invariant to position, rotation and scale with low computational cost, this work presents a new methodology based on one-dimensional signatures of the images.

Another important aspect in the study of digital systems is information leakage. The objective is to discard only the irrelevant information when the mask is applied. Hence, we use binary rings masks associated with the image. The mask is constructed from the real part of the Fourier transform of the given image; therefore each image has its own unique binary rings mask, called an adaptive mask. In this form, information leakage is avoided because the filtered frequencies are not always the same (Solorza & Álvarez-Borrego, 2010).

This chapter presents three digital systems based on the Fourier plane, adaptive binary rings masks, one-dimensional signatures and non linear correlations. Section 2 presents the methodology to obtain the adaptive binary rings mask of an image, the technique to associate a signature to each image, and the non linear correlation used in the classification. Also, a position and rotation invariant digital system is presented and tested in the classification of fingerprint digital images. This section shows that the modulus of the Fourier transform of the image is the natural and easiest way to achieve translation invariance in the digital system, taking advantage of the translation invariance property of the Fourier transform; it also shows how the rotational invariance is obtained by the binary rings mask filters. In section 3, a digital system invariant to position, rotation and scale is built and used to classify images of Arial font type letters which are shifted, scaled and rotated. Here, the scale transform (which is a particular case of the Mellin transform) is introduced and utilized for the scale recognition. Section 4 gives an alternative method to obtain scale invariance using composite filters. Finally, concluding remarks are presented.

### **2. The digital system**

An important aspect in the study of digital systems is the information leak. In this work, the frequencies are filtered using binary rings masks, and the objective is to discard only the irrelevant information when the mask is applied. Therefore, a methodology to build a mask associated with the image is proposed; in this manner, the information leak is avoided because the filtered frequencies are not always the same. Next, the procedure to construct the binary rings mask of a chosen image is explained.

#### **2.1 Binary rings mask of the image**

The mask associated with a given image, named *I*, is built by taking the real part of its 2*D* fast Fourier transform (*FFT*), given by

$$f(x, y) = \text{Re}(FFT(I(x, y))),\tag{1}$$

Fig. 1. Binary rings mask construction example. The asterisk means the point to point multiplication of both images.

Pattern Recognition of Digital Images by One-Dimensional Signatures

where (*x*, *y*) represents a pixel of the image. For example, Fig. 1(b) shows the real part of the fast Fourier transform (Eq. (1)) of the 417 × 417 gray-scale image in Fig. 1(a). This 3*D*-graph is mapped to 2*D* by the normalization

$$\left[ \min\_{1 \leq x,y \leq 417} f(x,y),\ \max\_{1 \leq x,y \leq 417} f(x,y) \right] \rightarrow [0,1],$$

shown in Fig. 1(c).
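Eq. (1) and the [0, 1] normalization used for Fig. 1(c) can be sketched in Python with NumPy. This is a minimal sketch; centering the plane with `fftshift` is our assumption, chosen so that the rings built later are concentric with the plane's center.

```python
import numpy as np

def real_fft_plane(image):
    """Real part of the centered 2D FFT of an image, Eq. (1)."""
    # fftshift moves the zero-frequency component to the image center
    # (assumption: the rings mask is built around the center pixel)
    return np.real(np.fft.fftshift(np.fft.fft2(image)))

def normalize_01(f):
    """Map the range [min f, max f] onto [0, 1], as displayed in Fig. 1(c)."""
    return (f - f.min()) / (f.max() - f.min())
```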


Next, this result is filtered by the binary disk of 417 pixels diameter presented in Fig. 1(d). From the result of this operation, we obtain 180 profiles of 417 pixels length that pass through the (209, 209) pixel, the center of the image (separated by Δ*θ* = 1 degree, thus sampling the entire circle). The next step is to compute the sum of the intensity values in each profile and then select the profile whose sum has the maximum value; it is called the maximum intensity profile. Fig. 1(e) displays the zero-degree profile (blue line) and the maximum intensity profile in red (*θ* = 174 degrees). Fig. 1(f) shows the graph of the maximum intensity profile in the Cartesian plane, where **x** = 1, 2, 3, . . . , 417 represents the points (*r* cos *θ*, *r* sin *θ*), −209 ≤ *r* ≤ 209 and *θ* = 174 degrees. Notice the symmetry of the graph about the **x** = 209 axis, which will be preserved in the one-variable binary function

$$Z(\mathbf{x}) = \begin{cases} 1, f(\mathbf{x}) > 0, \\ 0, \text{otherwise.} \end{cases} \tag{2}$$

The graph of the *Z* function is plotted in Fig. 1(g); for clarity, and due to the symmetry of the *Z* function, only the points 209 ≤ **x** ≤ 300 are plotted. Next, taking **x** = 209 as the rotation axis, the graph of *Z* is rotated 180 degrees to obtain concentric cylinders of height one and different widths, centered at (209, 209). Finally, mapping those cylinders onto two dimensions, we build the binary rings mask associated with the image. The mask for the fingerprint digital image given in Fig. 1(a) is shown in Fig. 1(h).
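The construction above can be sketched as follows. This is a simplified NumPy sketch; the nearest-neighbor profile sampling and the handling of image borders are our own assumptions, not details given in the text.

```python
import numpy as np

def binary_rings_mask(image):
    """Adaptive binary rings mask of Sec. 2.1 (simplified sketch).

    Steps: real part of the centered FFT, binary disk filter, selection of
    the diametral profile with maximum summed intensity among 180 angles,
    binarization with Eq. (2), and revolution of Z around the center.
    """
    n = image.shape[0]                      # assumes a square n x n image
    c = n // 2                              # center pixel
    f = np.real(np.fft.fftshift(np.fft.fft2(image)))

    yy, xx = np.mgrid[0:n, 0:n]
    rad = np.sqrt((xx - c) ** 2 + (yy - c) ** 2)
    f = np.where(rad <= c, f, 0.0)          # binary disk filter (Fig. 1(d))

    # 180 diametral profiles through the center, one per degree
    r = np.arange(-c, c + 1)
    best_profile, best_sum = None, -np.inf
    for theta in np.deg2rad(np.arange(180)):
        px = np.clip(np.round(c + r * np.cos(theta)).astype(int), 0, n - 1)
        py = np.clip(np.round(c + r * np.sin(theta)).astype(int), 0, n - 1)
        profile = f[py, px]
        if profile.sum() > best_sum:
            best_sum, best_profile = profile.sum(), profile

    z = (best_profile > 0).astype(np.uint8)  # Eq. (2)
    z_half = z[c:]                           # symmetric half, radii 0..c
    # revolve Z around the center: the ring at radius d takes z_half[d]
    return z_half[np.clip(np.round(rad).astype(int), 0, c)]
```

Because the mask depends only on the radius, the returned 2*D* array is radially symmetric, as the concentric rings of Fig. 1(h) require.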

Each fingerprint digital image used in this work (Fig. 2) has its own binary rings mask (Fig. 3). Notice how the mask changes with the image; for this reason they are called adaptive masks. The next step in the construction of the digital system is to build a signature for the image.

Fig. 2. Fingerprint digital images. The images were downloaded from the Biometrics Ideal Test web page (biometrics.idealtest.org).

#### **2.2 The signature of the image**

The aim of this work is to identify a specific target (the object to be recognized) no matter the position or the angle of rotation presented in the plane. For example, using the fingerprint image in Fig. 4(a), the invariance to position is obtained using the modulus of the frequency content of the image, |*FFT*(*I*)| (Fig. 4(b)). Next, the mask (Fig. 4(c)) is applied in the Fourier plane to sample the frequency pattern of the object (Fig. 4(d)). Finally, the modulus values of the Fourier transform within each ring are summed and then assigned to the corresponding ring index to obtain the signature of the image (Fig. 4(e)). Here, the rings are numbered from the center to the outside of the circles.
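With the mask expressed through its generating half-profile, the signature step can be sketched as below. The `z_half` argument and the ring-numbering loop are our own interface choices; the text only specifies that rings are numbered from the center outwards.

```python
import numpy as np

def signature(image, z_half):
    """1D signature of Sec. 2.2: sum of |FFT| over each ring of the mask.

    z_half[d] = 1 if radius d belongs to a ring; a ring is a maximal run
    of consecutive radii equal to 1, numbered from the center outwards.
    """
    n = image.shape[0]
    c = n // 2
    mag = np.abs(np.fft.fftshift(np.fft.fft2(image)))  # translation invariant

    yy, xx = np.mgrid[0:n, 0:n]
    rad = np.clip(np.round(np.sqrt((xx - c) ** 2 + (yy - c) ** 2)).astype(int), 0, c)

    ring_id = np.zeros(c + 1, dtype=int)    # 0 means "outside every ring"
    k = 0
    for d in range(c + 1):
        if z_half[d]:
            if d == 0 or not z_half[d - 1]:
                k += 1                       # a new ring starts here
            ring_id[d] = k

    labels = ring_id[rad]
    return np.array([mag[labels == i].sum() for i in range(1, k + 1)])
```

Because a circular shift of the image leaves |*FFT*| unchanged, the signature is strictly invariant to cyclic translations.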


Fig. 3. Binary masks associated with the images in Fig. 2.


In the target images database we have twenty images without rotation (Fig. 2); thus we calculated twenty different binary rings masks (Fig. 3). Each mask is applied to Fig. 4(b) to obtain twenty different signatures, shown in Fig. 5(a) as black dashed curves. Because different rings masks are used, the lengths of the signatures differ; hence zeros are added at the end of the shorter signatures to match the length of the longest one. Finally, taking the average of those twenty signatures, the average signature of the image is obtained, displayed as the red curve in Fig. 5(a). This procedure has the advantage that more information is incorporated into the average signature, because relevant information is filtered by twenty different masks instead of one mask only. When the image changes, the average signature is expected to change. However, because the adaptive rings masks associated with the target images database are used, the hypothesis is that when the image presents a rotation, its average signature will be similar to the average signature of the non-rotated image; hence the digital system will have the capability to identify rotated objects. Fig. 5(b) shows the average signature of Fig. 2(a) without rotation (called a0) and with a rotation angle of 10 degrees (a10); the average signature of Fig. 2(b) is also presented (b0). As expected, the average signatures a0 and a10 are very similar where the signature has high values, which is not the case for the average signature b0. Notice that the construction of the signature is related to the adaptive binary rings masks; hence if the image resolution changes, the masks also change. This could be a problem for the digital system that should be taken into account; however, this topic is beyond this work, and hereafter we assume that the images always have the same resolution.
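The zero-padding and averaging step can be sketched as follows, assuming one signature per database mask, as in the twenty-mask example above.

```python
import numpy as np

def average_signature(signatures):
    """Average signature (Fig. 5): pad the shorter signatures with zeros to
    the length of the longest one, then average them point by point."""
    m = max(len(s) for s in signatures)
    padded = np.array([np.pad(s, (0, m - len(s))) for s in signatures])
    return padded.mean(axis=0)
```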

Fig. 4. Signature of a given fingerprint. Only for visualization purposes, figures (b) and (d) are shown in log scale. The asterisk means the point to point multiplication of both images.

Fig. 5. Average signatures. (a) Average signature associated with the fingerprint in Fig. 4(a). (b) Some average signature examples.

Once the average signatures are assigned to each image, the next step is the classification.

#### **2.3 The non linear correlation**

The average signature of the problem image (PI) is compared with the average signature of the target (T) to decide whether or not the target is recognized, using the non linear correlation, *CNL*,

$$\mathbf{C}\_{NL}(\mathbf{PI}, \mathbf{T}) = \mathbf{PI} \otimes \mathbf{T} = FFT^{-1} \left( |FFT(\mathbf{PI})|^{k} \, e^{i\phi\_{PI}} \, |FFT(\mathbf{T})|^{k} \, e^{-i\phi\_{T}} \right), \tag{3}$$

where <sup>⊗</sup> means correlation, *<sup>i</sup>* <sup>=</sup> √−1, *<sup>φ</sup>PI* and *<sup>φ</sup><sup>T</sup>* are the phases of the fast Fourier transforms of the problem image and the target, respectively, and 0 *< k <* 1 is the non linear coefficient factor (Solorza & Álvarez-Borrego, 2010). Fig. 6(a) shows the result of Eq. (3) with *k* = 0.1 using Fig. 2(a) as the target (its average signature is given in Fig. 5(b) as a0); the problem images taken were Fig. 2(a) without rotation (autocorrelation), its image rotated ten degrees (average signature a10), and Fig. 2(b) (average signature b0). If the maximum value of the magnitude of the correlation (plotted as dots, Fig. 6(b)) is significant, that is, similar to the autocorrelation maximum value, then the PI contains the target; otherwise it has a fingerprint different from the target.
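Eq. (3) applied to the one-dimensional average signatures can be sketched as below; a 1D FFT is used because the correlation is between signatures, not images.

```python
import numpy as np

def nonlinear_correlation(pi_sig, t_sig, k=0.1):
    """Non linear (k-law) correlation of two 1D signatures, Eq. (3)."""
    PI, T = np.fft.fft(pi_sig), np.fft.fft(t_sig)
    # k-law filtering: keep the phase, raise the magnitude to the power k
    prod = (np.abs(PI) ** k) * np.exp(1j * np.angle(PI)) \
         * (np.abs(T) ** k) * np.exp(-1j * np.angle(T))
    return np.fft.ifft(prod)
```

The classification value is the maximum of the magnitude of the returned vector; for the autocorrelation the phases cancel, leaving a non-negative real spectrum, so the peak sits at zero shift.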


Fig. 6. The non linear correlation vector examples. (a) Fig. 2(a) was chosen as target. (b) Amplification around the center.

#### **2.4 The position and rotation invariance digital system applied to the fingerprint data base**

The position and rotational invariance non linear correlation digital method described by the algorithm in Fig. 7 (steps explained in the sections above) was applied to the twenty 417 × 417 gray-scale fingerprint digital images (Fig. 2). Each image was rotated ±15 degrees, one degree at a time; hence in the problem images database we worked with 620 images. Moreover, the saw tooth effect (noise) is incorporated into the problem; therefore, the digital system is more robust in the pattern recognition. Each image in Fig. 2 was selected as target; thus the target database has only twenty images, and the corresponding average signature was constructed. Those twenty average signatures were correlated with the 620 average signatures of the rotated images using *k* = 0.1 in Eq. (3). The maximum value of the magnitude of the corresponding correlation was assigned to the image. The results were box plotted using the mean correlation with two standard errors (± 2SE) and outliers.


It is impossible to show the box plot developed for each target used; hence, to exemplify, we chose Fig. 2(a), (b), (c) and (d) as targets. The means of the maxima of the magnitude of the non linear correlation values are shown in Fig. 8. The horizontal axis shows the letters corresponding to the images in Fig. 2. The vertical axis represents the mean of the maximum values of the non-normalized magnitude of the non linear correlation for the images of each fingerprint and their corresponding rotated images.

According to the statistical analysis, the digital system has a confidence level of 95.4% when the target is the fingerprint in Fig. 2(f). When any of the other images is chosen as target, the system gives a confidence level of 100%. Therefore, the digital system in Fig. 7 shows an excellent performance in the identification of fingerprint images.

Fig. 7. Position and rotation invariant non linear correlation digital system algorithm. The asterisk means the point to point multiplication of both images.


Fig. 8. Box plot examples for some fingerprint images.

### **3. The position, rotation and scale invariance methodology**

In some analyses of real images it is important to consider scale invariance. In order to analyze this case, the images of Fig. 9 were used (hence, the target database has five images). They are 257 × 257 black and white images of white contours of Arial font type letters. We chose this font type and these images because of the similarity between some of them. Each image was scaled ±5%, one percent at a time; hence we have eleven images for each letter, obtaining a total of 55 images. Each of those 55 images was rotated 360 degrees, degree by degree, so we worked with 19,800 images in the problem images database to test the methodology. Fig. 9(f) shows a 257 × 257 image with the B letter's contour rotated 315 degrees and enlarged 5% from its original size (Fig. 9(a)).

To introduce scale invariance into the methodology described in Fig. 7, the average signatures for the target and the problem image (PI) are computed similarly to those in Fig. 7(f) and Fig. 7(j). The difference in this procedure is that the binary rings mask of the problem image (Fig. 10(c)) is also computed, and the signature obtained from this mask, Fig. 10(g), also contributes to the average signature of the target, Fig. 10(h). Figure 10(i) shows an example of the average signature when the B letter is used as target. Analogously, to obtain the average signature of the PI in Fig. 10(c), the same procedure is followed, but instead of the |*FFT*| of the target, the |*FFT*| of the PI is used, and the same six binary rings masks are utilized to construct the corresponding average signature. Therefore, when the PI changes, the average signature of the target changes too, because the associated mask is different. However, the adaptive masks of the target and the PI are both utilized in the construction of their corresponding average signatures; hence, when the PI is a rotated image of the target, the same frequencies are filtered, so the average signatures of both images are similar, which is not the case when the target and the PI have different letters.


Fig. 9. 257 × 257 black and white images with Arial font type letter contours.

Fig. 10. Position, rotation and scale invariant average signature.


Fig. 11 shows the digital system algorithm modified to incorporate scale invariance into the algorithm of Fig. 7. Once we have the average signatures for the target (Fig. 11(g)) and the PI (Fig. 11(l)), in order to determine whether the PI is the rotated and/or scaled target or another image, the scale transforms of the average signature of the target, named *DT*(*c*) (Fig. 11(h)), and of the problem image, *DPI*(*c*) (Fig. 11(m)), are obtained by the scale transform function (De Sena & Rocchesso, 2004)

$$D\_s(c) = \frac{1}{\sqrt{2\pi}} \int\_0^\infty s(t) e^{(-ic - 1/2)\ln t} dt,\tag{4}$$

where *s*(*t*) represents the average signature of the image. Then, |*DT*(*c*)| (Fig. 11(i)) is compared with |*DPI*(*c*)| (Fig. 11(n)) to decide whether or not the PI contains the target, using the nonlinear correlation of Eq. (3) with *k* = 0.3. Finally, if the maximum value of the magnitude of the correlation is significant, the PI contains the target; otherwise, it contains an image different from the target.
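As a rough illustration of how Eq. (4) can be evaluated numerically, the sketch below approximates the scale transform of a sampled signature by a Riemann sum and checks the property the digital system relies on: the magnitude |*D*(*c*)| is unchanged when the signal is scaled. The Gaussian signature, the sampling grid and the scale factor are hypothetical stand-ins, not the chapter's data.

```python
import numpy as np

def scale_transform(s, t, c_values):
    """Approximate Eq. (4): D_s(c) = (2*pi)**-0.5 * integral over t > 0 of
    s(t) * t**(-1j*c - 0.5) dt, via a Riemann sum on a uniform grid."""
    dt = t[1] - t[0]
    D = np.empty(len(c_values), dtype=complex)
    for k, c in enumerate(c_values):
        D[k] = np.sum(s * t ** (-1j * c - 0.5)) * dt / np.sqrt(2.0 * np.pi)
    return D

# hypothetical 1-D "average signature": a Gaussian bump on t in (0, 10]
t = np.linspace(1e-3, 10.0, 4000)
sig = np.exp(-(t - 3.0) ** 2)

# scaled signal sqrt(a)*s(a*t); its scale-transform magnitude should equal
# the original's, which is the invariance the algorithm exploits
a = 1.5
sig_scaled = np.sqrt(a) * np.exp(-(a * t - 3.0) ** 2)

c = np.linspace(-5.0, 5.0, 51)
D1 = np.abs(scale_transform(sig, t, c))
D2 = np.abs(scale_transform(sig_scaled, t, c))
print(np.max(np.abs(D1 - D2)))  # close to zero
```

Analytically, substituting *u* = *at* shows that scaling only multiplies *D*(*c*) by the pure phase *a*^(ic), so the magnitudes coincide up to quadrature error.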

The statistical analysis using ±2SE and outliers shows that the digital system has a confidence level of 95.4% when the target is the image of B or of D, and of 100% when the target is E, F or X (Fig. 12). Hence, the digital system invariant to position, rotation and scale shows an excellent performance.

### **4. Composite filters for scale invariance**

An alternative method to achieve scale invariance is the use of training images to generate a composite filter, also based on the Fourier transform. This methodology carries information about the target at different sizes. Fig. 13 shows the procedure to train the composite filter that will be used in the digital system invariant to rotation, scale and position. Once the number of training images is established, for example *I*1, *I*2, ..., *In* (Fig. 13(a)), the modulus of the fast Fourier transform of each image is calculated, that is, |*FFT*(*I*1)|, |*FFT*(*I*2)|, ..., |*FFT*(*In*)| (Fig. 13(c)). Then, all those moduli are summed to generate a new composite Fourier plane (Fig. 13(d)), which has more information than that obtained using only one image. Mathematically, this new Fourier plane is expressed as

$$f\_c = \sum\_{k=1}^{n} |FFT(I\_k)| \,. \tag{5}$$

The Fourier plane obtained by *fc* (Fig. 13(e)) can substitute the modulus plane used in the procedure shown in Fig. 4(b). Then, using the binary rings mask, the one-dimensional signature for a given target image is obtained.
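The construction of *fc* in Eq. (5) and the subsequent ring sampling can be sketched as follows. The ring geometry here is uniform for simplicity, whereas the chapter's binary rings masks are adaptive, built from the image itself; the toy images and the `rescale_pad` helper are hypothetical stand-ins for the scaled training set.

```python
import numpy as np

def composite_fourier_plane(training_images):
    """Eq. (5): sum of |FFT| of each training image (the target at
    several sizes) to build a single composite Fourier plane."""
    return sum(np.abs(np.fft.fftshift(np.fft.fft2(img)))
               for img in training_images)

def ring_signature(plane, n_rings):
    """Average the plane over n_rings concentric rings to obtain a 1-D
    signature (uniform rings for illustration only; the chapter's masks
    are adaptive)."""
    h, w = plane.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2)
    edges = np.linspace(0, r.max(), n_rings + 1)
    return np.array([plane[(r >= lo) & (r < hi)].mean()
                     for lo, hi in zip(edges[:-1], edges[1:])])

def rescale_pad(img, factor):
    """Crude nearest-neighbour rescale, padded/cropped back to the
    original frame (stand-in for the 90%-107% scalings in the text)."""
    n = int(img.shape[0] * factor)
    idx = (np.arange(n) / factor).astype(int).clip(0, img.shape[0] - 1)
    small = img[np.ix_(idx, idx)]
    out = np.zeros_like(img)
    m = min(n, img.shape[0])
    out[:m, :m] = small[:m, :m]
    return out

rng = np.random.default_rng(0)
base = rng.random((64, 64))            # hypothetical 64x64 target
planes = [rescale_pad(base, f) for f in (0.9, 1.0, 1.07)]
fc = composite_fourier_plane(planes)   # composite Fourier plane, Eq. (5)
sig = ring_signature(fc, 6)            # 1-D signature from 6 rings
print(sig.shape)  # -> (6,)
```

Because the plane sums spectra from several scales, the resulting signature varies less when the input size changes than a signature built from a single |*FFT*|.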

The composite filter was applied to the 256 × 256 gray-scale diatom images shown in Fig. 14. Diatoms are a class of microscopic unicellular algae; they are photosynthetic organisms that live in sea water and are a quite important part of the food chain. Each image in Fig. 14 was scaled in the range of 90% to 107% (one percent at a time) and all those images were rotated through 360 degrees (every 45 degrees). Hence, 528 images were processed in this example.

Pattern Recognition of Digital Images by One-Dimensional Signatures 311

Fig. 11. Position, rotation and scale invariant non linear correlation digital system algorithm.

Fig. 12. Box plot analysis of images in Fig. 9.

For example, Fig. 15(a) is chosen as the target. Its Fourier plane, made from the training images using Eq. (5), is shown in Fig. 15(b). Then, the corresponding binary rings mask is applied to it (Fig. 15(c)) to obtain the signature of the target (Fig. 15(d)). Analogously, the problem image is selected (Fig. 15(e)) and |*FFT*(*PI*)| is calculated (Fig. 15(f)). The respective binary rings mask is applied (Fig. 15(g)) to construct the signature corresponding to the PI (Fig. 15(h)). Once the signatures are obtained, the nonlinear correlation in Eq. (3) with *k* = 0.1 is applied (Fig. 15(i)). If the maximum value of the magnitude of the correlation of the problem image is significant, that is, similar to the autocorrelation maximum value of the target, then the PI contains the target; otherwise, it contains an image different from the target (Fig. 15(j)).
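The comparison step can be sketched with a k-law nonlinearity of the form commonly used in this family of filters (each spectrum's modulus raised to the power *k*, phase preserved); the chapter's Eq. (3) is not reproduced in this excerpt, so this exact form, like the toy signatures below, is an assumption.

```python
import numpy as np

def klaw_correlation(sig_target, sig_pi, k=0.1):
    """Hedged sketch of a nonlinear correlation between two 1-D
    signatures: keep each spectrum's phase, raise its modulus to the
    power k, then correlate via the inverse FFT."""
    F = np.fft.fft(sig_target)
    G = np.fft.fft(sig_pi)
    Fk = np.abs(F) ** k * np.exp(1j * np.angle(F))
    Gk = np.abs(G) ** k * np.exp(1j * np.angle(G))
    return np.abs(np.fft.ifft(Fk * np.conj(Gk)))

# toy signatures: the autocorrelation peak of a signature with itself
# should dominate its correlation with a different signature
rng = np.random.default_rng(1)
s1 = rng.random(64)
s2 = rng.random(64)
auto = klaw_correlation(s1, s1).max()
cross = klaw_correlation(s1, s2).max()
print(auto > cross)
```

Lowering *k* flattens the spectral moduli, which sharpens the correlation peak and improves discrimination, at the cost of more sensitivity to noise.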

Fig. 16 shows the box plot results when Fig. 14(a) is taken as the target. As expected, the correlations for Fig. 14(a) are well separated from the rest, showing that the digital system has a good performance in discriminating between images that do not correspond to this diatom.

In general, if we compare these techniques with other methodologies, we can affirm that these algorithms are robust and have a low computational cost. The algorithms were programmed in Matlab 7.1 on a MacBook Pro 3.1 with an Intel Core 2 Duo processor at 2.4 GHz, 2 GB of 667 MHz DDR2 SDRAM, a 4 MB L2 cache and an 800 MHz bus speed. The machine time for the correlation per image was around 0.25 seconds (Álvarez-Borrego & Solorza, 2010; Solorza & Álvarez-Borrego, 2010).

Fig. 13. Procedure to train a composite filter.

Fig. 14. Examples of diatoms.

Fig. 15. Non linear correlation between signatures.

Fig. 16. Box plot example for diatom images in Fig. 14.

### **5. Conclusion**

This chapter presented a simple and efficient nonlinear correlation digital system invariant to position and rotation, together with two other digital systems that incorporate scale invariance. The invariance to position was achieved by taking advantage of the translation property of the Fourier transform of a function. The relevant information of the images is captured in the average signatures via the adaptive mask, which helps in the discrimination between rotated objects. Moreover, the masks were constructed based on the given image; hence, the mask is adapted to the problem, avoiding in this form the information leak. Two forms of obtaining scale invariance were presented here. The first uses the scale transform applied to the average signature of the images. The second is based on the Fourier plane to build composite filters.

Because the signatures of the images are vectors, the computational time cost was reduced considerably compared to those systems that use bi-dimensional signatures (matrices). One of the advantages of these kinds of methodologies is that the entire process can be repeated over and over in the same way without mistakes.

In the particular examples presented in this chapter, the three digital systems present an excellent performance, with a confidence level of 95.4% or greater. The digital system in Fig. 7 identified fingerprint images rotated ±15 degrees. The algorithm in Fig. 11 classified contour letters which are translated, rotated and scaled. The composite filters with training images (Fig. 15) recognize the positionally shifted, rotated and scaled diatom images. The biggest advantage of these methodologies is the simplicity of their construction.

### **6. Acknowledgments**

This work was partially supported by CONACyT grant No. 102007 and PROMEP. The authors thank Professors A. De Sena and D. Rocchesso for the Matlab routines to calculate the one-dimensional scale transform.

### **7. References**

Alon, J., Athitsos, V., Yuan, Q. & Sclaroff, S. (2009). A unified framework for gesture recognition and spatiotemporal gesture segmentation. *IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)*, Vol. 31, No. 9, 1685-1699, ISSN 0162-8828.

Álvarez-Borrego, J. & Castro-Longoria, E. (2003). Discrimination between Acartia (Copepoda: Calanoida) species using their diffraction pattern in a position, rotation invariant digital correlation. *Journal of Plankton Research*, Vol. 25, No. 2, 229-233, ISSN 0142-7873.

Álvarez-Borrego, J., Mouriño-Pérez, R.R., Cristóbal-Pérez, G. & Pech-Pacheco, J.L. (2002). Invariant recognition of polychromatic images of Vibrio cholerae 01. *Optical Engineering*, Vol. 41, No. 4, 827-833, ISSN 0091-3286.

Álvarez-Borrego, J. & Solorza, S. (2010). Comparative analysis of several digital methods to recognize diatoms. *Hidrobiológica*, Vol. 20, No. 2, 158-170, ISSN 0188-8897.

Arandjelovic, R. & Zisserman, A. (2010). Efficient image retrieval for 3D structures. *Proceedings of the British Machine Vision Conference (BMVA)*, 30.1-30.11, ISBN 1-901725-40-5, Aberystwyth, UK, September 2010, BMVA Press, UK.

Barinova, O., Lempitsky, V. & Kohli, P. (2010). On the detection of multiple object instances using Hough transforms. *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)*, 2233-2240, ISBN 978-1-4244-6984-0, San Francisco, CA, USA, June 2010, IEEE Computer Society Press, USA.

Bueno-Ibarra, M.A., Chávez-Sánchez, M.C. & Álvarez-Borrego, J. (2011). K-law spectral signature correlation algorithm to identify white spot syndrome virus in shrimp tissues. *Aquaculture*, Vol. 318, 283-289, ISSN 0044-8486.

Coronel-Beltrán, A. & Álvarez-Borrego, J. (2010). Comparative analysis between different font types and letter styles using a nonlinear invariant digital correlation. *Journal of Modern Optics*, Vol. 57, No. 1, 1-7, ISSN 0950-0340.

De Sena, A. & Rocchesso, D. (2004). A study on using the Mellin transform for vowel recognition. *Proceedings of the International Conference on Digital Audio Effects*, 5-8, ISBN 88-901479-0-3, DAFx'04, Naples, Italy, October 2004.

Díaz-Ramírez, V.H., Kober, V. & Álvarez-Borrego, J. (2006). Pattern recognition with an adaptive joint transform correlator. *Applied Optics*, Vol. 45, No. 23, 5929-5941, ISSN 1559-128X.

Fergus, R., Perona, P. & Zisserman, A. (2003). Object class recognition by unsupervised scale-invariant learning. *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)*, 264, ISBN 0-7695-1900-8, Madison, Wisconsin, USA, June 2003, IEEE Computer Society Press, USA.

Forero-Vargas, M., Cristóbal-Pérez, G. & Álvarez-Borrego, J. (2003). Automatic identification techniques of tuberculosis bacteria. *Proceedings of SPIE: Applications of Digital Image Processing XXVI*, 5203, 71-81, ISBN 0819450766, San Diego, CA, USA, August 2003, SPIE Press, USA.

González-Fraga, J.A., Kober, V. & Álvarez-Borrego, J. (2006). Adaptive SDF filters for pattern recognition. *Optical Engineering*, Vol. 45, No. 5, 057005, ISSN 0091-3286.

Gonzalez, R.C. & Woods, R.E. (2002). *Digital image processing*, Prentice Hall, ISBN 0-201-18075-8, USA.

Guerrero-Moreno, R.E. & Álvarez-Borrego, J. (2009). Nonlinear composite filter performance. *Optical Engineering*, Vol. 48, No. 6, 06720, ISSN 0091-3286.

Holub, A.D., Welling, M. & Perona, P. (2005). Combining generative models and Fisher kernels for object recognition. *Proceedings of the IEEE International Conference on Computer Vision (ICCV)*, 136-143, ISBN 0-7695-2334-X, Beijing, China, October 2005, IEEE Computer Society Press, USA.

Jain, A.K. & Feng, J. (2011). Latent fingerprint matching. *IEEE Transactions on Pattern Analysis and Machine Intelligence*, Vol. 33, No. 1, 88-100, ISSN 0162-8828.

Komarinski, P., Higgins, P.T., Higgins, K.M. & Fox, L.K. (2005). *Automated fingerprint identification system (AFIS)*, Elsevier Academic Press, ISBN 0-12-418351-4, San Diego, CA, USA.

Kong, H., Audibert, J. & Ponce, J. (2010a). Detecting abandoned objects with a moving camera. *IEEE Transactions on Image Processing*, Vol. 19, No. 8, 2201-2210, ISSN 1057-7149.

Kong, H., Audibert, J. & Ponce, J. (2010b). General road detection from a single image. *IEEE Transactions on Image Processing*, Vol. 19, No. 8, 2211-2220, ISSN 1057-7149.

Lempitsky, V. & Zisserman, A. (2010). Learning to count objects in images. *Advances in Neural Information Processing Systems 23*, Curran Associates Inc, ISBN 1615679111, NY, USA.


**14** 

**Vectorial Signatures for Pattern Recognition** 

Jesús Ramón Lerma-Aragón1 and Josué Álvarez-Borrego2 *1Facultad de Ciencias, Universidad Autónoma de Baja California, 2CICESE, División de Física Aplicada, Departamento de Óptica* 

*México* 

**1. Introduction** 

Due to the variety of shapes and sizes presented by both living organisms and static objects, the necessity to look for automated identification systems, both in industry and in scientific research, has arisen. Since the 1960s, the scientific community in the field of optics has used the Fourier transform and other types of mathematical transformations for pattern recognition, taking advantage of their different properties; for example, invariance to position, rotation and scale.

Many invariant descriptors use the Fourier transform to extract invariant features. The Fourier transform has been a powerful tool for pattern recognition. One important property of the Fourier transform is that a shift in the time domain causes no change in the magnitude spectrum; this can be used to extract invariant features in pattern recognition. In the last few years, such transforms have been used as tools in the digital processing of pattern recognition. Recently, several books have been published (Pratt, 2007; Gonzalez et al., 2008 and 2009; Cheriet et al., 2007; Obinata, 2007) that give us a general view of the progress and development of pattern recognition, as well as different tools for image pre-processing, extraction, selection and creation of features, and classification methods using different types of transforms. However, some studies have focused on solving the problem of invariance to rotation and scale (Cohen, 1993; Casasent, 1976a; Pech et al., 2001; Pech et al., 2003; Solorza and Álvarez-Borrego, 2010; Solorza and Álvarez-Borrego, 2011).

For the practical implementation of the theory, according to Casasent and Psaltis (1976a, 1976b, 1976c), a system should be considered that is invariant to scale through the manipulation of the Fourier transform, directly changing the input function. We must also consider the practical realization of the Mellin transform, given by a logarithmic mapping of the input stage followed by a Fourier transform.

Schwartz (1994) found that there is strong evidence that many physiological and psychophysical visual systems (including the human one) use such a logarithmic mapping between the retina and the visual cortex. The notion of changing a scaling into a shift by a logarithmic transformation of the coordinates occurs in many areas. In fact, the log-polar coordinate system described above seems to have a biological analogue. Since biological systems can shift objects to the centre of the field of view by movement of the eyes, it seems logical that a mapping that facilitates scale and rotation invariant recognition would be most useful. This is not to say that biological systems also compute Fourier transforms of these representations.
