50 Advances in Object Recognition Systems

Fig. 4. Modified NNET block of the M-HONN system for multiple objects recognition of different classes

Thus, for two classes there will be N transformed images created for class 1 and N transformed images created for class 2. Then, both sets of transformed images are used for the synthesis of the system's composite image. The M-HONN system for multiple objects recognition of different class objects is written as follows:

$$\begin{aligned} \text{M-HONN} &= \sum_{i=1}^{N \times N_{classes}} a_i^{class} \cdot S_i^{class}(m,n) \\ &= a_1^{class} \cdot \Gamma_c \times X_1(m,n) + a_2^{class} \cdot \Gamma_c \times X_2(m,n) + \dots + a_{N \times N_{classes}}^{class} \cdot \Gamma_c \times X_{N \times N_{classes}}(m,n) \end{aligned} \tag{31}$$

or, in the frequency domain, the above equation is re-written as:

$$\text{M-HONN} = \sum_{i=1}^{N \times N_{classes}} \mathbf{a}_i^{class} \cdot S_i^{class}(m,n) \tag{32}$$

**5. Performance analysis**

We have constructed a data set of input images of an S-type Jaguar car model at 10° increments of out-of-plane rotation at an elevation angle of approximately 45° to be used for the M-HONN system. A second set of images was constructed for the Police car model Mazda Efini RX-7 at the same elevation angle to serve as the out-of-class data for discrimination tests (see Fig. 5). A third data set was created from background images of typical car parks (see Fig. 6), with the images of the S-type car model and the Mazda RX-7 car model added to the background scene. All the images were of size 256 × 256 and in grey-scale bitmap format. All the input training images (and all the input test set images) for the M-HONN system are concatenated row-by-row into a vector of size 1 × (256 × 256) prior to input to the NNET block. Normally an image of this size is too large for processing by an artificial neural network architecture, since implementing it would require the following number of input weights:

$$\begin{aligned} \text{N}_{\text{iw}} &= 10 \times \left[ 256 \times 256 \right] \\ &= 10 \times 65{,}536 \\ &= 655{,}360 \end{aligned} \tag{33}$$
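The count in Eq. (33) can be checked with a short sketch. The Python/NumPy fragment below is only illustrative and is not the authors' code: it flattens a placeholder 256 × 256 grey-scale image row by row into a 1 × 65,536 vector, as described above, and counts the input weights implied by Eq. (33) for N = 10 training vectors. The image contents and the helper names `flatten_row_by_row` and `input_weight_count` are assumptions made for this example.

```python
import numpy as np

IMAGE_SIZE = 256   # images are 256 x 256 pixels (from the text)
N_TRAIN = 10       # N = 10 training vectors (from Eq. 33)


def flatten_row_by_row(image: np.ndarray) -> np.ndarray:
    """Concatenate the image rows into a single 1 x (rows * cols) vector."""
    # order="C" is row-major: row 0 first, then row 1, and so on.
    return image.reshape(1, -1, order="C")


def input_weight_count(n_train: int, rows: int, cols: int) -> int:
    """Input weights as counted in Eq. (33): one per element of each training vector."""
    return n_train * rows * cols


# Placeholder image (hypothetical data): every pixel holds its own row index.
image = np.repeat(np.arange(IMAGE_SIZE, dtype=np.uint8)[:, None], IMAGE_SIZE, axis=1)
vector = flatten_row_by_row(image)
n_iw = input_weight_count(N_TRAIN, IMAGE_SIZE, IMAGE_SIZE)

print(vector.shape)  # (1, 65536)
print(n_iw)          # 655360
```

Because the placeholder image stores the row index in each pixel, the first 256 entries of `vector` are 0 and the next 256 are 1, which confirms the row-by-row ordering.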

For a training set of N = 10 individual vectors of size 256 × 256 there would therefore be, in total, more than half a million input weight connections. To overcome this problem we developed a novel selective weight connection architecture (see Section 2). Also, applying the heuristic training algorithm with momentum and an adaptive learning rate in the NNET training session (Nguyen & Widrow, 1989; Nguyen & Widrow, 1990) has sped up the learning phase and reduced the memory needed to complete the training session. It is worth mentioning that the NNET block and, overall, the M-HONN system is able to process input still images and video frames for all the test series in a few milliseconds on a dual-core CPU at 2.4 GHz with 4.0 GB RAM. Additionally, due to the generalization properties exhibited by a NNET architecture, fewer training images are needed than the typical number required for the training set of linear combinatorial filters (such as the SDF filter).

Fig. 5. RX-7 Mazda Efini Police patrol car used in the training and test sets

Fig. 6. Car park scene used in the training and test sets

It was shown experimentally that by choosing different values of the classification levels for the true-class $Cl_T$ and false-class $Cl_F$ objects, one can control the M-HONN system's behaviour to suit different application requirements, as with all the HONN-type systems. Thus we define:

$$\Delta Cl = \left| Cl_T - Cl_F \right| \tag{34}$$

where $\Delta Cl$ is the absolute distance between the classification levels of the true-class objects and the false-class objects. When $\Delta Cl$ increases, the resulting M-HONN system behaves more like a high-pass biased filter, which generally gives sharp correlation peaks and good clutter suppression but is more sensitive to intra-class distortions. When $\Delta Cl$ decreases, the resulting M-HONN system behaves more like a minimum variance synthetic discriminant function (MVSDF) filter (Kumar, 1986), with relatively good intra-class distortion invariance but broad correlation peaks. In effect, when $\Delta Cl$ increases the M-HONN system possesses better discriminatory properties, but when $\Delta Cl$ decreases it has better generalising properties. Plotting the isometric correlation planes of the M-HONN system for different $\Delta Cl$ values shows that increasing $\Delta Cl$ places more emphasis on the high spatial frequency content of the composite images comprising the M-HONN system, which in turn leads to a more localised response, sharper peaks, and reduced sidelobes in the correlation plane. Decreasing $\Delta Cl$ places more emphasis on the peripheral lower spatial frequency content of the composite images, which in turn leads to broader peaks in the correlation plane.

Next, we summarise the test series for assessing the M-HONN system's peak sharpness and detectability, distortion range, and discrimination ability, all of which we have described in full detail in our previous work (Kypraios et al., 2008). Afterwards we focus on analysing the performance of the M-HONN object recognition system within cluttered scenes.

Performance Analysis of the Modified-Hybrid Optical Neural Network Object Recognition System Within Cluttered Scenes 53

**5.1 Peak sharpness and detectability**

Here we assessed (Jamal-Aldin et al., 1997; Jamal-Aldin et al., 1998; Kumar & Hassebrook, 1990) the M-HONN system's ability to detect non-training in-class images that are oriented at intermediate angles of view between the training images (Refregier, 1990; Refregier, 1991). The training set consisted of still images out-of-plane rotated between 20° and 70° at increments of 20°. We tested the M-HONN system with the true-class object's intermediate car poses over the same range at 10° increments. Two randomly chosen intermediate car poses, at 130° and at 140°, were added to the training set of the M-HONN system to create a false class. We set the target of the false-class object to be $T^{false} = -40$ and of the true-class object to be $T^{true} = +40$. The M-HONN system had no information on the non-training, intermediate car images in the construction of its composite image. We explicitly constrained the correlation peaks in the constraint matrix to be 1 for the images of the true-class object and 0 for the images of the false-class object. The randomly chosen mask $\Gamma_c$ applied on both the training set and the test set was built from the training set image at 60°, i.e. $c = 60^\circ$:

$$
\Gamma_{60^\circ} = W^{x_{60^\circ}} \times L^{x_{60^\circ}} =
\begin{bmatrix}
w_{11}^{x_{60^\circ}} & w_{12}^{x_{60^\circ}} & \cdots & w_{1\,n-1}^{x_{60^\circ}} & w_{1n}^{x_{60^\circ}} \\
w_{21}^{x_{60^\circ}} & w_{22}^{x_{60^\circ}} & \cdots & w_{2\,n-1}^{x_{60^\circ}} & w_{2n}^{x_{60^\circ}} \\
\vdots & \vdots & & \vdots & \vdots \\
w_{m1}^{x_{60^\circ}} & w_{m2}^{x_{60^\circ}} & \cdots & w_{m\,n-1}^{x_{60^\circ}} & w_{mn}^{x_{60^\circ}}
\end{bmatrix}
\times
\begin{bmatrix}
l_{11}^{x_{60^\circ}} & \cdots & l_{1q}^{x_{60^\circ}} \\
l_{21}^{x_{60^\circ}} & \cdots & l_{2q}^{x_{60^\circ}} \\
\vdots & & \vdots \\
l_{n1}^{x_{60^\circ}} & \cdots & l_{nq}^{x_{60^\circ}}
\end{bmatrix}
\tag{35}
$$

where $W^{x_{60^\circ}}$ and $L^{x_{60^\circ}}$ are the matrices of the input and layer weights. $w_{mn}^{x_{60^\circ}}$ are the input weights from the input neuron of the input vector element at row m and column n to the associated hidden layer for the training image $x_{60^\circ}(m,n)$ at 60° angle of view. $l_{mn}^{x_{60^\circ}}$ are the layer weights from the hidden neuron of the layer vector element at row m and column n to the associated output neuron. We set q = 1 since the output layer had only one neuron for a single class of objects. In the M-HONN system, instead of multiplying each training image with the corresponding weight connections as done for the constrained-HONN (C-HONN)
