**5. Multivariate data processing**

The sequence of the eight sampled pulses consists of 217 sampling points. There are four positions of the movable table, which leads to 4 × 217 = 868 data points per object. These pulses are modified by many factors: the shape, the position, the rotation, and the intrinsic variables of the material under test. However, as can be seen in Figure 8, the value to be measured has only a relatively low influence on the apparent shape of the pulses, and practically all data points are modified when one or more of the factors mentioned above changes. Indeed, each factor modifies the curve shape in a different manner. Therefore all measured points of the curve contribute in part to the variable(s) to be measured. Often these values consist only of one variable (for instance water content), a small set of variables (complex permittivity, quality) or an abstract class (shape of the object). Hence the challenge of data processing for the application discussed here is to extract the (hidden) relevant information from a huge data array. Due to the complexity, physical modelling is impracticable.

Multivariate calibrations are established techniques for the extraction of relevant information from observed (measured) data without physical modelling. In the following, principal component analysis and regression (PCA/PCR), artificial neural networks (ANN) and partial least squares regression (PLSR) are applied to the data measured during the experiments described in section 4. Multivariate calibration methods have the disadvantage that they require a calibration procedure, i.e. training. This means that a portion of the measurements carried out on known samples needs to be used to determine the parameters or coefficients that enable a determination of the variable to be measured for unknown samples.

For this reason the measurements are divided randomly into a calibration and a validation group. In general these two groups are of equal size. For test series 1 the number of data sets in the calibration and validation groups is *nc* = *nv* = 45, and for test series 2 it is *nc* = *nv* = 10. The more samples are available, the more robust the calibration and the more meaningful the validation.

In order to reduce the amount of data, points having a low variance may be removed from the input variables in a pre-processing step. For the experiments described here, a majority of the 868 time points can be neglected when the threshold is set to 20% of the maximum standard deviation<sup>8</sup>. Thus for test series 1 and 2 the number of points used is *m*<sub>1</sub> = 305 and *m*<sub>2</sub> = 187, respectively. The raw matrix of the calibration data contains one measured and pre-processed pulse sequence per row; hence the columns contain the data of the selected time points of a measurement:

$$Y\_{\mathcal{C}} = \begin{bmatrix} y\_{11} & \cdots & y\_{1m} \\ \vdots & \ddots & \vdots \\ y\_{n\_{\mathcal{C}}1} & \cdots & y\_{n\_{\mathcal{C}}m} \end{bmatrix} . \tag{2}$$
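The variance-threshold pre-selection described above can be sketched in a few lines of NumPy. This is a minimal sketch with synthetic data standing in for the measured pulse sequences (the real 868-point data is not reproduced here); the array names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the raw data: 45 objects x 868 time points.
Y_raw = rng.normal(0.0, 1.0, size=(45, 868))
Y_raw[:, :500] *= 0.05  # emulate time points with little variance

# Keep only time points whose standard deviation exceeds
# 20% of the maximum standard deviation.
sd = Y_raw.std(axis=0, ddof=1)
keep = sd > 0.2 * sd.max()
Y = Y_raw[:, keep]  # reduced raw matrix, m retained columns
```

With the real data this selection yields *m*<sub>1</sub> = 305 and *m*<sub>2</sub> = 187 retained points for the two test series.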


For numerical reasons it is advantageous to standardize the raw data. Firstly, the means of each column are calculated with

$$\overline{Y}_c = \left[ \frac{1}{n_c} \sum_{i=1}^{n_c} y_{i1} \quad \cdots \quad \frac{1}{n_c} \sum_{i=1}^{n_c} y_{im} \right]. \tag{3}$$

The matrix of the normalized, standardized calibration data is calculated by subtracting the means of each column from each value of the column and then dividing by the standard deviations of the columns *σ*<sub>c1</sub> ··· *σ*<sub>cm</sub>:

$$X_c = \left[ Y_c - \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \cdot \overline{Y}_c \right] \cdot \begin{bmatrix} \frac{1}{\sigma_{c1}} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \frac{1}{\sigma_{cm}} \end{bmatrix}. \tag{4}$$

<sup>8</sup> which is 0.0037 and 0.0017 for test series 1 and 2, respectively.

The matrix of the pre-processed validation data is calculated similarly. However the data is normalized and standardized using the means and the standard deviations of the calibration set.
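Equations (3) and (4), including the rule that the validation data reuses the calibration statistics, can be sketched as follows. The matrices here are synthetic stand-ins for *Yc* and *Yv*; the names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic calibration and validation matrices (n x m).
Yc = rng.normal(5.0, 2.0, size=(45, 12))
Yv = rng.normal(5.0, 2.0, size=(45, 12))

mean_c = Yc.mean(axis=0)          # eq. (3): column means
sigma_c = Yc.std(axis=0, ddof=1)  # sigma_c1 ... sigma_cm

# Eq. (4): centre and scale the calibration data; the validation
# data is standardized with the *calibration* statistics.
Xc = (Yc - mean_c) / sigma_c
Xv = (Yv - mean_c) / sigma_c
```

Because *Xv* is scaled with the calibration means and deviations, its columns are not exactly zero-mean, which is intended: the validation set must be treated exactly like later unknown samples.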

### **5.1. Principal component analysis and regression**


As can be seen in Figure 8, neighbouring data points are highly correlated. Therefore it is not possible to use the selected data points directly in a linear regression to estimate the variable of interest<sup>9</sup>. For the two test series described in section 4 such a linear calibration equation would have *m*<sub>1</sub> = 305 and *m*<sub>2</sub> = 187 coefficients, respectively. Furthermore, and this aspect is more relevant, the calculation of the coefficients is numerically unstable because a matrix containing correlated data needs to be inverted in the regression algorithm.

A solution to this problem is found using principal component analysis (PCA). The original data is linearly transformed into a new set of variables

$$H\_{\mathcal{C}} = \mathcal{X}\_{\mathcal{C}} \cdot \mathcal{P}.\tag{5}$$

*Hc* comprises the so-called principal components (the scores) and *P* is the matrix of the so-called loadings. The scores have the advantage that they are uncorrelated and arranged in such a way that the first principal component has the highest variance, with the remaining components following in decreasing order of variance.

The matrix of the loadings *P* is composed of the eigenvectors, which are orthogonal and of unit length. The transformation can be interpreted as a transformation into a new orthogonal coordinate system: the basis vectors of the new coordinate system are the eigenvectors, and they point along the directions of decreasing variance.

The properties of the principal components, their orthogonality and their arrangement regarding the variance, enable data reduction because the relevant information of the matrix is already described by the first few principal components. In order to obtain the properties of the transformed data as described above, the matrix *P* needs to be calculated by an eigenvalue decomposition [36–38] but this is not described in detail here. For the results calculated here the statistical toolbox of MATLAB is used. The eigenvalue decomposition of the PCA is processed without any consideration of the variable(s) of interest. This will be done in the next step of the data processing.
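The eigenvalue decomposition behind eq. (5) can be sketched with NumPy instead of the MATLAB statistical toolbox mentioned above. The data here is synthetic, with deliberately correlated columns mimicking neighbouring time points; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
# Standardized calibration matrix Xc with strongly correlated columns.
base = rng.normal(size=(45, 3))
Xc = np.hstack([base, base + 0.05 * rng.normal(size=(45, 3))])
Xc = (Xc - Xc.mean(axis=0)) / Xc.std(axis=0, ddof=1)

# Eigenvalue decomposition of the covariance matrix of Xc.
eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigval)[::-1]  # sort by decreasing variance
P = eigvec[:, order]              # loadings: orthonormal eigenvectors

Hc = Xc @ P                       # eq. (5): the scores

# The scores are uncorrelated and ordered by decreasing variance.
score_cov = np.cov(Hc, rowvar=False)
```

The covariance matrix of the scores is diagonal, with the sorted eigenvalues on the diagonal, which is exactly the decorrelation property used in the next step.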

As mentioned above a multiple linear regression of the untransformed (therefore correlated) data is numerically unstable, but after the transformation (see eq. (5)) the data is uncorrelated and the value to be determined can be estimated by a linear combination of the principal components (principal component regression, PCR):

$$\hat{z}_c = \tilde{H}_c \cdot \beta, \tag{6}$$

where *z*ˆ*<sup>c</sup>* is the estimated variable of interest (objective variable), *H*˜*<sup>c</sup>* is the matrix of the selected principal components and the vector *β* contains the coefficients of the linear equation.

<sup>9</sup> The variable of interest or objective variable is the parameter to be determined later, e.g. the moisture content.

The entries in the first column of *H*˜*<sup>c</sup>* are all unity in order to describe the mean of the value of interest in the linear equation.

In *H*˜*<sup>c</sup>* only the first *k* principal components are included. This selection leads to the desired data reduction. The value *k* needs to be determined heuristically. Here, for test series 1 the number of principal components used is *k*<sub>1</sub> = 12 and for test series 2 it is *k*<sub>2</sub> = 2.

The coefficients of the linear equation can be calculated by the following equation:

$$\beta = \left( \tilde{H}_c^{T} \cdot \tilde{H}_c \right)^{-1} \cdot \tilde{H}_c^{T} \cdot z_c, \tag{7}$$

| *RER* | Classification | Application |
| --- | --- | --- |
| Up to 6 | Very poor | Not recommended |
| 7-12 | Poor | Very rough screening |
| 13-20 | Fair | Screening |
| 21-30 | Good | Quality control |
| 31-40 | Very good | Process control |
| 41+ | Excellent | Any application |

**Table 1.** *Classification using RER-values according to [39].*


where *zc* consists of the variable of interest determined by a reference method, e.g. oven drying for moisture content.
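Equations (6) and (7) together form the principal component regression. A minimal sketch with synthetic scores (the names and the chosen coefficients are illustrative, not the measured ones):

```python
import numpy as np

rng = np.random.default_rng(3)
n_c, k = 45, 3
H = rng.normal(size=(n_c, k))  # stand-in for the first k selected scores
z_c = 2.0 + H @ np.array([0.5, -1.0, 0.25]) + 0.01 * rng.normal(size=n_c)

# Prepend the unit column so the mean of z enters the linear model.
H_tilde = np.hstack([np.ones((n_c, 1)), H])

# Eq. (7): least-squares coefficients via the normal equations.
beta = np.linalg.solve(H_tilde.T @ H_tilde, H_tilde.T @ z_c)

z_hat = H_tilde @ beta         # eq. (6): estimated variable of interest
```

Using `np.linalg.solve` on the normal equations is numerically acceptable here precisely because the scores are uncorrelated; with the raw correlated time points this inversion would be unstable, as noted above.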

After the calibration data is processed the system is essentially calibrated and ready to handle unknown samples. Before that, however, the performance of the calibration still needs to be evaluated using the validation data. The target variables of interest are also determined for the pre-processed validation data using a reference method. The validation data (or, later in use, the data of a measurement of an unknown sample) is processed in the following manner:

1. the scores are estimated using the loadings determined during the calibration procedure: *Ĥ<sub>v</sub>* = *X<sub>v</sub>* · *P*,
2. the unused principal components are removed and the unit column is added: *Ĥ<sub>v</sub>* ⇒ *H̃̂<sub>v</sub>*,
3. the value of interest is estimated by the linear equation *ẑ<sub>v</sub>* = *H̃̂<sub>v</sub>* · *β*.


For the evaluation of the quality of the calibration, the root mean square errors of the calibration group (*RMSEc*) and of the validation group (*RMSEv*) are calculated.
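The error measure itself is the standard root mean square error; a minimal helper, shown here for completeness:

```python
import numpy as np

def rmse(z_true, z_pred):
    """Root mean square error, as used for RMSEc and RMSEv."""
    z_true = np.asarray(z_true, dtype=float)
    z_pred = np.asarray(z_pred, dtype=float)
    return float(np.sqrt(np.mean((z_true - z_pred) ** 2)))
```

Applied to the reference values and the predictions of the calibration group it gives *RMSEc*; applied to those of the validation group it gives *RMSEv*.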


**Figure 9.** Results obtained with PCA/PCR for both test series. (a) Test series 1: clay granules (PCR: *RMSEc* = 0.731%, *RMSEv* = 1.04%); (b) test series 2: ethanol-water mixtures (PCR: *RMSEc* = 2.12%, *RMSEv* = 2.73%).

The results obtained with PCA/PCR for both test series are shown in Figure 9. The predicted moisture or water content is plotted versus its true values. With perfect prediction, all points of the calibration and validation group would lie on the so-called quality line. For test series 1, *RMSEc* = 0.731% and *RMSEv* = 1.04% are achieved. For test series 2 the errors are *RMSEc* = 2.12% and *RMSEv* = 2.73%. The meaningfulness of the *RMSE* depends on the range of the variable to be predicted. Therefore the range error ratio *RER* is a better choice to evaluate the calibration. It is the ratio between the variable range Δ*z* = max *z* − min *z* and the *RMSE*<sup>10</sup>:

$$RER = \frac{\Delta z}{RMSE}.\tag{8}$$

The quality of the performance can be assessed using the ranges suggested in Table 1. For test series 1 the *RERc* = 26.7 and *RERv* = 18.75, hence the performance is *good*. But for test series 2 the accuracy obtained with PCA/PCR is only *poor* because *RERc* = 8.5 and *RERv* = 6.6.
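Eq. (8) and the bands of Table 1 can be expressed directly in code. This is a sketch: Table 1 lists integer bands, so treating the band edges as thresholds for non-integer values is an interpretation, and the 4.5-24% moisture range used in the check below is taken from Table 2.

```python
import numpy as np

def rer(z, rmse_value):
    """Eq. (8): range error ratio of a variable z for a given RMSE."""
    return (np.max(z) - np.min(z)) / rmse_value

def classify_rer(r):
    """Performance bands following Table 1 ([39])."""
    if r <= 6:
        return "very poor"
    if r <= 12:
        return "poor"
    if r <= 20:
        return "fair"
    if r <= 30:
        return "good"
    if r <= 40:
        return "very good"
    return "excellent"
```

For test series 1, a moisture range of 4.5-24% and *RMSEc* = 0.731% reproduce the quoted *RERc* ≈ 26.7, which falls into the *good* band.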

### **5.2. Artificial neural networks**

Although PCA/PCR is a linear operation it is more or less capable of processing non-linear data. However, when the unknown function describing the relationship between the pulse sequence and the value of interest is non-linear a purely linear method may not be the best choice. Artificial neural networks (ANN) can approximate unknown non-linear functions. For this application multilayer-feed-forward (MLFF) networks have a suitable architecture [40].

Such a network is shown in Figure 10. The input variables are weighted and processed by the neurons of the hidden layer. The activation functions of the neurons are non-linear<sup>11</sup>. This enables the non-linear function approximation. The output variable of the hidden layer is weighted again and processed by the neuron(s) of the output layer. The output of this layer needs to be post-processed (scaling and mean value), after which the estimated variable of interest is available.

Due to their architecture ANN have several degrees of freedom: the number of hidden layers, the kind of activation function in each layer, and the number of neurons in the hidden layer. For the application discussed here one hidden layer and *nHL* = 10 neurons in this hidden layer are sufficient. The problem is that the number of weighting factors between the layers increases with the number of neurons and for an optimal determination of the weighting factors a relatively large number of samples for the calibration (training) is necessary.

<sup>10</sup> In [39] the standard error is used instead of the root mean square error; for large numbers of samples there is practically no difference.

<sup>11</sup> tansig-function
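The forward pass of such a one-hidden-layer MLFF network can be sketched as follows. The weights here are random and untrained (the actual training, e.g. by backpropagation in the MATLAB toolbox, is not shown), and the dimensions follow the values quoted in the text: *k*<sub>1</sub> = 12 inputs and *nHL* = 10 hidden neurons.

```python
import numpy as np

rng = np.random.default_rng(4)
k, n_hl = 12, 10  # selected principal components, hidden neurons

# Untrained, randomly initialized weights and biases.
W1 = rng.normal(scale=0.5, size=(n_hl, k))
b1 = np.zeros(n_hl)
W2 = rng.normal(scale=0.5, size=(1, n_hl))
b2 = np.zeros(1)

def mlff(h):
    """One hidden layer with tansig (tanh) activation, linear output."""
    a = np.tanh(W1 @ h + b1)      # non-linear hidden layer
    return (W2 @ a + b2).item()   # raw output, still to be rescaled

z = mlff(rng.normal(size=k))
```

Because tanh is bounded, the raw network output is bounded by the output weights; the scaling and mean-value post-processing mentioned above maps it back to the physical range of the variable of interest.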


**Figure 10.** Architecture of the used multilayer-feed-forward-ANN.

For this reason a pre-processing of the data is recommended. If all selected time points were fed into the input layer, *m*<sub>1</sub>*nHL* = 3050 and *m*<sub>2</sub>*nHL* = 1870 weighting factors would need to be found, with only *nc* = 45 (test series 1) or *nc* = 10 (test series 2) training data sets. Therefore it is useful to feed the ANN with the selected principal components because they include the relevant information. This means the linear principal component regression is replaced by the non-linear ANN.

The training of the ANN has a relatively high calculation effort. Furthermore the starting values for the weighting factors are set randomly at the beginning of the training. This means the method is not strongly deterministic and it is not known for example, whether the optimal weighting factors were found because the training stopped in a local minimum of the error function. The training of the ANN was effected using the artificial neural network toolbox of MATLAB.

The results of the ANN are plotted in Figure 11. In comparison to the results of PCA/PCR an improvement is observable:

- for test series 1 the *RERc* increases to 37.8 (rating: *very good*) and the *RERv* to 22.4 (*good*); this means there is a slight overfitting,
- for test series 2 the following ratings are obtained: *RERc* = 25.5 (*good*) and *RERv* = 18.1 (*fair*).

**Figure 11.** Results obtained with ANN for both test series. (a) Test series 1: clay granules (ANN: *RMSEc* = 0.516%, *RMSEv* = 0.87%); (b) test series 2: ethanol-water mixtures (ANN: *RMSEc* = 0.706%, *RMSEv* = 0.99%).


Despite the much higher calculation effort of ANN the improvements are not very satisfactory.

### **5.3. Partial least squares regression**

PCA decorrelates the data by eigenvalue decomposition. Therefore the variable(s) of interest are not considered in this procedure. Only at the stage of PCR are they taken into account, and a selection of relevant principal components is necessary. With the partial least squares regression (PLSR) the data is decorrelated with regard to the variable(s) of interest. Several PLSR algorithms exist and sometimes the data is pre-processed non-linearly. Although PLSR was developed, more or less intuitively, in order to analyze economic data, in the meantime this method has also been used for several applications in other fields.


The algorithm used here for the processing of the measured data is described in [7] in detail and is only summarized in the following.

- Firstly, the input values are weighted in such a way that the covariance between them and the variable of interest is maximal.
- Secondly, the projection of the input values on the vector of the weighting values is called a *factor* or a *hidden path variable*. In the following, two regression analyses are considered:
  1. between the input variables and the factor, and
  2. between the variable of interest and the factor.
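These steps can be sketched as a classical PLS1 (NIPALS-style) iteration; this is one common variant of the family of PLSR algorithms mentioned above, not necessarily the exact algorithm of [7], and the data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, n_factors = 45, 8, 3
X = rng.normal(size=(n, m))
z = X @ rng.normal(size=m) + 0.1 * rng.normal(size=n)
X = X - X.mean(axis=0)  # centre inputs and target
z = z - z.mean()

T, Q = [], []
Xr, zr = X.copy(), z.copy()
for _ in range(n_factors):
    w = Xr.T @ zr            # weights: maximal covariance with z
    w /= np.linalg.norm(w)
    t = Xr @ w               # the factor ("hidden path variable")
    p = Xr.T @ t / (t @ t)   # regression 1: input variables on the factor
    q = (zr @ t) / (t @ t)   # regression 2: variable of interest on the factor
    Xr = Xr - np.outer(t, p) # deflate and iterate
    zr = zr - q * t
    T.append(t)
    Q.append(q)

z_hat = np.column_stack(T) @ np.array(Q)  # prediction from the factors
```

Each pass extracts one factor, removes its contribution from both the inputs and the target, and repeats; the successive factors are mutually orthogonal, and the residual of *z* shrinks with every added factor.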

**Figure 12.** Influence of the number of factors *H* on the performance of the PLSR.

| Publication | Range [%] | *RMSEc* [%] | *RMSEv* [%] | *RERc* | *RERv* |
| --- | --- | --- | --- | --- | --- |
| [41]: Tobacco, PLSR | 10-50 | - | 2 | - | 20 |
| [42]: Scots pine, PLSR | 0-15 | 0.46 | 0.74 | 32.6 | 20.3 |
| [42]: Scots pine, PLSR | 0-175 | 15.92 | 12.52 | 11 | 14 |
| [9]: Clay granules, ANN | 6.3-34.2 | 1.6 | 2.1 | 17.4 | 13.3 |
| **ISOPerm:** | | | | | |
| Clay granules, PLSR | 4.5-24 | 0.38 | 0.69 | 52 | 28.1 |
| Ethanol-water mix. in bottle, ANN | 2-20 | 0.71 | 1 | 25.5 | 18.1 |
| [14], PLSR | 70-100 | 1.28 | 2.55 | 23.4 | 11.8 |
| [16], ANN | 5-29.2 | 1.29 | 1.88 | 18.8 | 12.9 |
| [17], PLSR | 4.6-24.1 | 0.35 | 0.69 | 55.3 | 28.2 |
| [19], PLSR | 1.8-20.2 | 0.39 | 0.61 | 47.3 | 30.2 |
| [20], PLSR | 4.3-23.4 | 0.31 | 0.55 | 61.7 | 34.7 |
| **Other technologies:** | | | | | |
| [43]: Wheat, admittance, PCR | 9-20 | - | 0.39 | - | 28.2 |
| [44]: Salmon, NIR, PLSR | 61-70.8 | - | 0.98 | - | 10 |
| [45]: Paper, NIR, PLSR | 0-2.4 | - | 0.056 | - | 43.1 |
| [46]: Theophyllin, NIR, ANN | 1-22 | 0.45 | 0.83 | 46.7 | 25.3 |

**Table 2.** Comparison to other publications regarding the accuracy of the determination of moisture or water content. Except the method investigated in ISOPerm, all others are contacting and/or require a defined shape of the object under test.

**Figure 13.** Results obtained with PLSR for both test series.


All determined regression coefficients and weighting factors are finally used for the calculation of the regression equation. This means that for the validation (and the later application) only a linear combination of the input values needs to be calculated. Hence the calculation effort is much smaller in comparison to the ANN. The only degree of freedom is *H*, the number of factors to be used (the number of iterations). When *H* is too high, the *RMSEc* is significantly smaller than the *RMSEv*, which means that overfitting occurs. As shown in Figure 12, *H* should therefore be selected where *RMSEv* has its minimum. Furthermore *RMSEc* should not be much smaller than *RMSEv* (not below about half of it), otherwise the PLSR calibration could not handle unknown samples.
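This selection rule for *H* can be sketched as follows. The RMSE-versus-*H* curves below are hypothetical values in the spirit of Figure 12, not the measured ones.

```python
import numpy as np

# Hypothetical curves: RMSEc falls monotonically with H,
# RMSEv passes through a minimum and then rises (overfitting).
H = np.arange(1, 11)
rmse_c = np.array([2.0, 1.5, 1.1, 0.9, 0.7, 0.55, 0.45, 0.35, 0.30, 0.25])
rmse_v = np.array([2.2, 1.7, 1.3, 1.0, 0.9, 0.95, 1.05, 1.20, 1.40, 1.60])

i = int(np.argmin(rmse_v))
best_H = int(H[i])                          # H at the RMSEv minimum
suspicious = rmse_c[i] < 0.5 * rmse_v[i]    # RMSEc far below RMSEv -> overfit
```

At the chosen *H* the calibration error is still above half of the validation error, so by the rule of thumb above the calibration can be trusted on unknown samples.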

The performance of the PLSR is shown in Figure 13. For test series 1, *RERc* = 52 (*excellent*) and *RERv* = 28.1 (*good*). This is a further improvement in comparison to the ANN. For test series 2 the results stay similar to those of the ANN: *RERc* = 26.5 (*good*) and *RERv* = 17.1 (*fair*).

**6. Conclusions**

Many industrial and scientific applications require extensive on-line process monitoring and quality control. Often the composition of goods (e.g. moisture content) is of great interest, but abstract parameters, for example quality or freshness, also play an important role. The microwave sensor described is able to penetrate the investigated materials, and by using UWB techniques it is possible to gain information at various frequencies. The applied time domain techniques operate with low hardware effort and fast measurement speed while having a high accuracy. Using commercial MMICs, signals exceeding a bandwidth of 10 GHz can be generated and sampled with cheap and compact dedicated hardware. Today it is possible to employ multivariate calibration methods like artificial neural networks, which have a high computational effort, in real time. These methods are well established in, for example, NIR or image processing and have been successfully adopted. The feasibility of the method has been successfully proven, with accuracy even greater than in many previous publications using contacting methods. It has a great potential for many kinds of future applications in microwave sensing.

**Author details**

Henning Mextorf, Frank Daschner, Mike Kent and Reinhard Knöchel

*University of Kiel, Germany*
