The thermal differences among all values of the 40 square regions were evaluated, thus creating a 40 × 40 matrix. We then compared the difference matrices corresponding to the same person when sober and after consuming alcohol (Figure 4). The maximum variation between the corresponding differences is monitored and reveals the regions whose temperature changes with alcohol consumption. It was found that for the drunk person the nose and mouth have increased temperature in relation to the forehead.

The main finding of this approach is that two locations, as shown in Figure 3, are good candidates for proving intoxication, namely the forehead and the nose. For the drunk person,

Figure 3. Each of the black regions has an area of 10 × 10 pixels. A total of 8 × 5 regions are taken on each face. The temperature difference between the regions is monitored as the person consumes alcohol.

Figure 4. Three difference matrices: (a) for the sober person, (b) for the drunk person, and (c) the difference of the difference matrices (values normalized to full grayscale). Large changes in the thermal differences on the face are indicated by white points on this matrix. The white circle corresponds to the largest difference, equal to 29.8.

#### 7.1. Top-hat transformation

154 Human-Robot Interaction - Theory and Application

The basic morphological operation is that of erosion [26, 27]. Erosion is a shrinking procedure carried out when a signal A (binary or gray scale) is affected by another signal S, the structuring element:

$$A \ominus S = \{ (i, j) \in A : S(i, j) \subset A \} \tag{11}$$

where (i, j) is the position of A on which S lies.

Figure 6. (a) Hot and cold spots of area 5 × 5 pixels on a varying grayscale background, (b) top-hat transformation of the image in (a) using the summation of (15) and (16). The structuring element used was a flat disk of radius 5.

A complementary operation to erosion is dilation, which is a kind of expansion of the signal. It is defined as the complement of the erosion of the complement of A:

$$A \oplus S = \left( A^{c} \ominus S \right)^{c} \tag{12}$$
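The duality in Eq. (12) can be checked numerically. The following is a minimal sketch for binary images using only NumPy. The function names `erode` and `dilate`, the `border` parameter, and the 3 × 3 square structuring element are assumptions made for illustration, and the structuring element is taken to be symmetric so that no reflection is needed.

```python
import numpy as np

def erode(a, s, border=False):
    """Binary erosion, Eq. (11): keep (i, j) only if S centred there lies inside A."""
    rows, cols = a.shape
    r, c = s.shape[0] // 2, s.shape[1] // 2
    # Pixels outside the image are treated as `border` (background by default).
    padded = np.pad(a, ((r, r), (c, c)), constant_values=border)
    out = np.ones((rows, cols), dtype=bool)
    for di, dj in zip(*np.nonzero(s)):
        out &= padded[di:di + rows, dj:dj + cols]
    return out

def dilate(a, s):
    """Binary dilation, Eq. (12): complement of the erosion of the complement."""
    # Complementing an image whose outside is background makes the outside foreground.
    return ~erode(~a, s, border=True)

a = np.zeros((7, 7), dtype=bool)
a[2:5, 2:5] = True               # a 3 x 3 square of foreground pixels
s = np.ones((3, 3), dtype=bool)  # symmetric structuring element (assumed)

eroded = erode(a, s)    # shrinks the square to its centre pixel
dilated = dilate(a, s)  # expands the square to 5 x 5
print(eroded.sum(), dilated.sum())  # prints: 1 25
```

Erosion shrinks the 3 × 3 square to a single pixel, while dilation, computed only through erosion and complements, grows it to 5 × 5.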


Intoxication Identification Using Thermal Imaging http://dx.doi.org/10.5772/intechopen.72128 157


When an erosion is followed by a dilation, the smoothing-shrinking morphological operation called opening is obtained. Opening removes from the signal A all details that are smaller than the structuring element S. It is denoted as

$$A_S = (A \ominus S) \oplus S \tag{13}$$

Furthermore, when a dilation is followed by an erosion, the smoothing-expanding operation called closing is obtained. Closing covers (smooths) all details (intrusions) of the signal A that are smaller than the structuring element S:

$$A^S = (A \oplus S) \ominus S \tag{14}$$

By employing a top-hat transformation (hot or cold), one can extract small features from a signal A. Specifically, protrusions in the signal can be obtained by subtracting the opened signal from the original (hot top-hat transformation)

$$\text{Top-hat}_{\text{hot}} = A - A_S \tag{15}$$

which extracts white (hot) features against a dark background. Conversely, intrusions in the signal can be obtained by subtracting the original signal from the closed one (cold top-hat transformation)

$$\text{Top-hat}_{\text{cold}} = A^S - A \tag{16}$$

which extracts dark (cold) features against a brighter background (see Figure 6a and b).

However, before applying top-hat transformation on the image, anisotropic diffusion is performed to eliminate noise.
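The hot and cold top-hat transformations of Eqs. (15) and (16) can be sketched for grayscale images as follows. This is an illustrative NumPy-only version, not the code used in the experiments: the helper names, the border handling (pixels outside the image are ignored), and the synthetic ramp image are assumptions. Only the flat disk of radius 5 and the 5 × 5 hot and cold spots follow the description of Figure 6.

```python
import numpy as np

def disk(radius):
    """Flat disk-shaped structuring element as a boolean mask."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x * x + y * y <= radius * radius

def _shifts(a, s, pad_value):
    """Stack the copies of `a` shifted to every offset covered by `s`."""
    rows, cols = a.shape
    r, c = s.shape[0] // 2, s.shape[1] // 2
    padded = np.pad(a.astype(float), ((r, r), (c, c)), constant_values=pad_value)
    return np.stack([padded[i:i + rows, j:j + cols]
                     for i, j in zip(*np.nonzero(s))])

def erode(a, s):
    return _shifts(a, s, np.inf).min(axis=0)    # grayscale erosion: local minimum

def dilate(a, s):
    return _shifts(a, s, -np.inf).max(axis=0)   # grayscale dilation: local maximum

def top_hat_hot(a, s):
    return a - dilate(erode(a, s), s)           # Eq. (15): A minus its opening

def top_hat_cold(a, s):
    return erode(dilate(a, s), s) - a           # Eq. (16): closing of A minus A

# Synthetic image (assumed): a horizontal gray-level ramp with one hot and
# one cold spot of 5 x 5 pixels each.
image = np.tile(np.arange(64.0), (64, 1))
image[10:15, 10:15] += 50.0
image[40:45, 40:45] -= 50.0

s = disk(5)
hot = top_hat_hot(image, s)
cold = top_hat_cold(image, s)
combined = hot + cold   # summation of (15) and (16), as described for Figure 6(b)
```

Because both spots are smaller than the disk, the opening removes the hot spot and the closing fills the cold spot, so `combined` responds strongly at both spots while the slowly varying background is suppressed.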

#### 7.2. Anisotropic diffusion


Thermal infrared images contain noise, which often distorts significant information and details that are important for the interpretation of the image. The anisotropic diffusion technique [28] can filter out noise while leaving unchanged the parts of the image that are most important in perceptual vision, such as edges or lines.

The physical background of diffusion is based on the concentration distribution u (pixel distribution), whose gradient causes a flux j according to Fick's law:

$$j = -D \cdot \nabla u \tag{17}$$

where D is the diffusion tensor, which is in general a positive definite symmetric matrix, and is a function of the structure of the image. Diffusion corresponds to mass transport (gray values in images) without destroying mass or creating new mass. So,

$$
\partial_t u = -\operatorname{div} j = -\left(\frac{\partial j_x}{\partial x} + \frac{\partial j_y}{\partial y}\right) \tag{18}
$$

where $\partial_t u$ is the time partial derivative of the concentration distribution u, and $j_x$, $j_y$ are the components of the flux j. Substituting Eq. (17) into Eq. (18), the two minus signs cancel and we have:

$$
\partial_t u = \operatorname{div}(D \cdot \nabla u) \tag{19}
$$

In anisotropic nonlinear diffusion, the diffusion tensor is not constant over the image, so smoothing takes place only along edges, leaving the information across edges unchanged. Specifically, if the diffusion tensor D is defined to be a function of the gradient of u, that is,

$$D = \operatorname{g}\left(|\nabla u|^2\right) \tag{20}$$

then the diffusion preserves edges, since no diffusion is performed perpendicular to edges, only parallel to them. In real problems, anisotropic nonlinear diffusion can even sharpen edges if the function g(.) is chosen properly.
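To illustrate how the choice of g controls edge preservation, the sketch below applies a simple explicit update with four nearest-neighbour differences to a noisy step edge. The exponential form of g, the function and parameter names, the replicated border, and the synthetic image are assumptions for illustration, since the text does not state which g was used. The values k = 20 and λ = 0.2 are illustrative as well.

```python
import numpy as np

def g(d, k=20.0):
    """Edge-stopping function (assumed form): near 1 for small differences,
    near 0 for differences much larger than k."""
    return np.exp(-(d / k) ** 2)

def diffuse(u0, iterations=20, lam=0.2, k=20.0):
    """Explicit nonlinear diffusion using the four nearest neighbours."""
    u = u0.astype(float)
    for _ in range(iterations):
        p = np.pad(u, 1, mode="edge")  # replicated border: zero flux at the boundary
        diffs = (
            p[2:, 1:-1] - u,   # neighbour one row below
            p[:-2, 1:-1] - u,  # neighbour one row above
            p[1:-1, 2:] - u,   # neighbour one column to the right
            p[1:-1, :-2] - u,  # neighbour one column to the left
        )
        # Each difference is weighted by g, so large jumps (edges) barely diffuse.
        u = u + lam * sum(g(d, k) * d for d in diffs)
    return u

# Synthetic test image (assumed): a vertical step edge of height 100 plus noise.
rng = np.random.default_rng(0)
clean = np.full((32, 32), 50.0)
clean[:, 16:] = 150.0
noisy = clean + rng.normal(0.0, 5.0, clean.shape)
smoothed = diffuse(noisy)

noise_before = noisy[:, :12].std()    # noise level in a flat region
noise_after = smoothed[:, :12].std()
contrast = smoothed[:, 16:].mean() - smoothed[:, :16].mean()
```

With these settings the noise in the flat regions is strongly reduced, while the step of about 100 gray levels is left essentially intact because g is close to zero for differences much larger than k.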

The implementation of Eq. (19) in the experimental procedure can be carried out in the following way.

Let $u_0(x, y)$ be the original input image and $u_t(x, y)$ the digital image at iteration t. The discrete-time implementation of (19) is carried out by employing the four nearest neighbors and the Laplacian operator, as used in [28]:

$$u_{t+1}(x, y) = u_t(x, y) + \lambda \sum_{i=1}^{4} \left[ g\left( \nabla u_t^i(x, y) \right) \cdot \nabla u_t^i(x, y) \right] \tag{21}$$

where $0 \le \lambda \le \frac{1}{4}$ was used in the experimental procedure, and

$$
\nabla u_t^1(x, y) = u_t(x, y + 1) - u_t(x, y) \tag{22}
$$

is the gradient in the south direction,

$$
\nabla u_t^2(x, y) = u_t(x, y - 1) - u_t(x, y) \tag{23}
$$

is the gradient in the north direction,

$$
\nabla u_t^3(x, y) = u_t(x + 1, y) - u_t(x, y) \tag{24}
$$

is the gradient in the east direction, and

$$
\nabla u_t^4(x, y) = u_t(x - 1, y) - u_t(x, y) \tag{25}
$$

is the gradient in the west direction.

The nonlinear anisotropic diffusion method was applied to all 41 faces corresponding to sober and intoxicated persons. So that diffusion takes place only along edges, the value of k, which affects the degree of smoothing, was set to 20.

If thresholding is applied to the images after diffusion and top-hat transformation, the image obtained is richer for the intoxicated person than for the sober person. In our experiments, the threshold was set to 100. In Figure 7, two images obtained for the intoxicated (right) and the sober person (left) are shown. Image registration was applied in order to compare the images. Discrimination between the images of sober and intoxicated persons was achieved based on the number of bright pixels. For the intoxicated persons, the number of bright pixels is larger than for sober persons. This concept is the main idea supporting the contribution of the proposed method to forensic science. Brighter vessels constitute clear evidence for suspecting alcohol consumption and proceeding to further checkup and inspection of the person.

Figure 7. Binary images obtained using a threshold equal to 100. Sober left and intoxicated right. Vessels on the drunk person are more distinct compared to those on the sober person.

It is worth mentioning that an image like those in Figure 7 makes it possible to infer intoxication, since white pixels for the drunk person are more intense around the nose, the mouth, and on the forehead. The fact that the corresponding image from the sober person is not required for comparison constitutes the substantial forensic contribution of this method.

#### 8. Neural networks for discriminating drunk persons

Neural networks have been used as a classification tool in a variety of machine vision techniques such as face recognition [29] and thermal infrared pattern recognition [30–33]. In particular, a thermovision application for biometric recognition is addressed in [32], while neural structures are employed in [33] for recognition of facial expressions using thermal maps of the face.

This method offers a way of discriminating sober from drunk persons using thermal infrared images and neural networks. The neural networks are employed as a black box to discriminate intoxication by means of the values of simple pixels from the thermal images of the persons' faces. In this work, the neural networks were used by means of two different approaches. According to the first approach, a different neural structure is used from location to location on the thermal image of the face, and the convergence capabilities of the network are monitored. A successful convergence characterizes the corresponding location of the face as a good candidate for intoxication identification. According to the second approach, a single neural structure is trained with data from the thermal images of the whole face of a person (sober and drunk), and its capability to operate with high classification success on other persons is tested. Its generalization performance is also assessed.

In the first approach, different networks are trained on different locations of the same face. This gives a strong indication of the suitability of the specific face locations for drunk identification. Consequently, the face of each person is partitioned into a matrix of square regions of 10 × 10 pixels each, like the one depicted in Figure 8. There is a complete correspondence between these locations on the images of sober and drunk persons. Figure 8 illustrates one of these square regions of 100 pixels on a pair of infrared images (sober-drunk) of a specific person. A simple neural network is trained using the data in the two black regions shown in Figure 8. The vectors used as input to the neural structure have nine elements, obtained when a small 3 × 3 window moves over each of the two regions of 10 × 10 pixels. In this way, 200 vectors are obtained to train a three-level neural structure of [9 30 1] neurons for these two specific regions of 10 × 10 pixels. Furthermore, a larger network of [49 49 1] neurons was employed, using input vectors of 49 elements. These elements were obtained when a
