correlation coefficient, indicating that the chosen pattern presented a direct linear relationship with time.

**Figure 2.** PC1 versus PC2 showing a more evident distinction between the samples. Axes: PC1 (86.9%) and PC2 (4.8%); legend: Sample 1 to Sample 5.

However, other applications of spectroscopic techniques require more robust processing. One such case is the other example cited above, where the goal is to quantify various compounds in a complex sample containing various interferents; this most often occurs in regions where the absorption (or transmission and emission) bands of the compounds of interest overlap. For these cases, the application of artificial neural networks is a powerful solution.

**3. Quantification of a composite with overlapping bands - Using vibrational spectroscopy in a beer sample**

Beer brewing is a relatively long and complex biotechnological process that can yield a range of products with distinct quality and organoleptic characteristics, all of which may be relevant in determining the type of product to be made. Failures during important steps such as saccharification and fermentation can lead to major financial losses, i.e. the loss of a whole batch of beer. Currently, the physicochemical processes are analyzed offline using traditional tests that do not provide an immediate response, e.g. HPLC (High Performance Liquid Chromatography). In microbreweries, whose number is currently increasing, some of these tests cannot be performed at all because of their prohibitive cost. Therefore, many breweries cannot identify errors during production and take corrective action early on. Today, problems are detected only at a late stage, towards the end of the brewing process. Most systems currently used in breweries are time-consuming and can potentially compromise the quality of a whole batch. A solution to this problem is a new method consisting of a system that monitors the saccharification and fermentation steps of the wort in real time (online). The amounts of alpha-amylase and beta-amylase in the grain are correlated with the time required to convert all grain starch into sugars. [3, 9-11]

The brewing process is based on traditional recipes with defined times and temperatures. The amount of the different types of enzymes in the grain when the wort is produced is, however, not known, since it depends on many factors, e.g. storage conditions, temperature, humidity and transport. Because of these factors, the saccharification step could be stopped too early, leaving a significant amount of starch in the wort and resulting in a poor wort. Conversely, all the starch may already have been converted to sugar while the process continues longer than necessary. It is therefore critical to obtain data on the amount of sugar and alcohol in the wort quickly. These data can be obtained from absorbance measurements in the mid-infrared region (MIR), analyzed statistically using PCA and an Artificial Neural Network (ANN) to determine the amounts of sugars and alcohol in the wort during saccharification and fermentation. These optical techniques offer great advantages because they can be easily adapted to industrial equipment and provide real-time responses with high specificity and sensitivity. By applying them, saccharification and fermentation can be adjusted at each brewing step to increase the quality of the wort and, eventually, of the beer. This routine analysis during processing can also be used for other liquid samples.

A main feature of an ANN is its ability to learn from examples without having been specifically programmed for the task. In spectroscopy, satisfactory results can be achieved when the ANN is used with supervised training algorithms: an external supervisor (the researcher) provides the desired response for each input pattern, i.e. there is "a priori knowledge" of the problem. A neural network can be defined as a non-linear mapping between the input and output vector spaces. This mapping is built from layers of neurons and activation functions, where the input values of each neuron are summed according to specific weights and a "bias", producing a single output value [12-14]. A "feedforward" (progressive) network shows no recursion: the input vector and the intermediate layers precede the output layer, as shown in figure 3.

Formally, the activation function of the i-th neuron in the j-th layer is denoted by $F\_{i,j}(\cdot)$; its output $s\_{i,j}$ can be calculated from the outputs of the previous layer $s\_{k,j-1}$, the weights $w\_{i,k,j-1}$ (the index k indicates the connected neuron in the preceding layer) and the bias $b\_{i,j}$ according to the following formula:

$$s\_{i,j} = F\_{i,j}\left( b\_{i,j} + \sum\_{k} w\_{i,k,j-1}\, s\_{k,j-1} \right) \tag{6}$$

**Figure 3.** Schematic architecture of a neural network (multilayer perceptron).
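As a minimal sketch of equation 6, the layer-by-layer calculation can be written in a few lines of Python. The layer sizes, the random initial values and the helper `random_layers` are assumptions made for illustration only; they are not taken from the chapter. The logistic sigmoid is used as the activation function.

```python
import math
import random

def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_output(prev, weights, biases):
    """Equation 6: s[i] = F(b[i] + sum_k w[i][k] * prev[k]) for every neuron i."""
    return [sigmoid(b + sum(w_k * s_k for w_k, s_k in zip(row, prev)))
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Apply equation 6 layer by layer; `layers` is a list of (weights, biases)."""
    s = list(inputs)
    for weights, biases in layers:
        s = layer_output(s, weights, biases)
    return s

def random_layers(sizes, seed=0):
    """Hypothetical helper: small random weights and biases for the given sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers

# Illustrative architecture: 5 inputs, two intermediate layers, 2 outputs.
layers = random_layers([5, 5, 4, 2])
outputs = forward([0.1, 0.2, 0.3, 0.4, 0.5], layers)
```

Because each layer only needs the outputs of the layer before it, the whole mapping is a loop over layers; with sigmoid activations every output lies strictly between 0 and 1.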

With the input and output values of the network denoted by $\eta\_i$ and $\xi\_i$ respectively, the mapping is obtained by successive application of equation 6, which results, for the previous case, in the following equation:

$$\xi\_i = F\_{i,3}\left( b\_{i,3} + \sum\_{k=1}^{4} w\_{i,k,2}\, F\_{k,2}\left( b\_{k,2} + \sum\_{m} w\_{k,m,1}\, F\_{m,1}\left( b\_{m,1} + \sum\_{n} w\_{m,n,0}\, \eta\_n \right) \right) \right) \tag{7}$$

Since the activation function is usually chosen to be the logistic sigmoid because of some of its mathematical properties (being of class C∞, for example), the above expression shows that the relationship between $\eta\_i$ and $\xi\_i$ is defined by the weights and the biases. A very important characteristic of a NN is its ability to learn, that is, to reproduce predetermined input-output pairs by properly adjusting the weights and biases from training data according to an adjustment rule. The "*backpropagation*" rule is probably the best-known training method, and it is especially suited to progressive architectures. This rule is based on the successive application of the steepest-descent algorithm, using the first derivatives, with respect to the internal parameters of the network, of the error between the desired outputs and those obtained. Backpropagation can be summarized in the following steps: (1) initialize the network parameters $b\_{i,j}$ and $w\_{i,k,j}$; (2) select an input $\eta\_i^p$ from the training data and form the pair $(\xi\_i^p, \delta\_i^p)$ of obtained and desired outputs; (3) calculate the error with a convenient norm, e.g. the Euclidean norm:

$$e = \sqrt{\sum\_l (\delta\_l^p - \xi\_l^p)^2} \tag{8}$$

(4) calculate the derivatives of the error given by the above equation with respect to $b\_{i,j}$ and $w\_{i,k,j}$; (5) modify the parameters of the network according to the following rule, where $\alpha$ is the learning rate:

$$b\_{i,j} \gets b\_{i,j} - \alpha \frac{\partial e}{\partial b\_{i,j}} \qquad \text{and} \qquad w\_{i,k,j} \gets w\_{i,k,j} - \alpha \frac{\partial e}{\partial w\_{i,k,j}} \tag{9}$$

(6) iterate steps (2) through (5) until a set number of training cycles or another stopping criterion has been reached. [12, 13, 15, 16]
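The six steps can be sketched as follows for a network with one intermediate layer. This is an illustrative sketch and not the authors' implementation: the training pairs are made up, the layer sizes and learning rate are arbitrary, and the derivatives are taken of half the squared Euclidean error rather than of the Euclidean error itself. That substitution is an assumption chosen because it has the same minimum and avoids a division by the error when the error approaches zero.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_network(n_in, n_hidden, n_out, seed=1):
    """Step 1: initialise weights and biases with small random values."""
    rng = random.Random(seed)
    def layer(rows, cols):
        return {"w": [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                      for _ in range(rows)],
                "b": [rng.uniform(-0.5, 0.5) for _ in range(rows)]}
    return [layer(n_hidden, n_in), layer(n_out, n_hidden)]

def forward(net, x):
    """Return the activations of every layer, the input included."""
    acts = [list(x)]
    for layer in net:
        prev = acts[-1]
        acts.append([sigmoid(b + sum(w * s for w, s in zip(row, prev)))
                     for row, b in zip(layer["w"], layer["b"])])
    return acts

def train_step(net, x, target, rate):
    """Steps 2 to 5 for one training pair; returns the Euclidean error."""
    acts = forward(net, x)
    out = acts[-1]
    error = math.sqrt(sum((d - y) ** 2 for d, y in zip(target, out)))
    # Derivative of 0.5 * error**2 with respect to each output pre-activation.
    delta = [(y - d) * y * (1.0 - y) for y, d in zip(out, target)]
    for j in range(len(net) - 1, -1, -1):
        layer, prev = net[j], acts[j]
        if j > 0:
            # Propagate the derivative backwards before changing the weights.
            back = [sum(layer["w"][i][k] * delta[i] for i in range(len(delta)))
                    * prev[k] * (1.0 - prev[k]) for k in range(len(prev))]
        else:
            back = []
        for i, d_i in enumerate(delta):
            layer["b"][i] -= rate * d_i                  # bias update
            for k, s_k in enumerate(prev):
                layer["w"][i][k] -= rate * d_i * s_k     # weight update
        delta = back
    return error

def train(net, data, rate=0.5, cycles=3000):
    """Step 6: repeat steps 2 to 5 for a fixed number of training cycles."""
    history = []
    for _ in range(cycles):
        errors = [train_step(net, x, t, rate) for x, t in data]
        history.append(sum(errors) / len(errors))
    return history

# Toy training pairs (made up for illustration, not data from the chapter).
data = [([0.0, 1.0], [0.2]), ([1.0, 0.0], [0.8]), ([1.0, 1.0], [0.5])]
net = init_network(n_in=2, n_hidden=3, n_out=1)
history = train(net, data)
```

`history` holds the mean error of each training cycle, so plotting it, or comparing its first and last entries, gives a quick check that the adjustment rule is converging. A fixed number of cycles is used here as the stopping criterion; a threshold on the error would work equally well.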


Our beer analysis shows how powerful this processing technique, which has been widely applied in the interpretation of spectral data, can be. In this case, infrared absorption spectra were obtained with a Fourier Transform Infrared (FTIR) spectrometer [1, 2]; the spectra are shown in figures 4 and 5. The research objective was to provide a new method to determine, in a short processing time, the concentration of sugars and ethanol in beer wort during saccharification and fermentation. In our example, the compounds to be quantified are ethanol and the four main types of sugar present in the sample, which differ in chain length: glucose, maltose, maltotriose and dextrin. It is important to note that maltose is composed of two glucose units, maltotriose of three glucose units, and dextrins of a large number of glucose units. Thus, the fundamental building block of these sugars is the same, glucose, and they differ only in the number of units connected.

The absorption bands of these compounds are expected to be so close that they overlap in the spectra, making detection and quantification very complex. Figure 4 shows an example of the absorption spectra of ethanol, maltose 10%, and beer wort, which contains several types of sugars. It is quite difficult to distinguish between the absorption spectrum of the beer wort and that of the maltose.

If we also consider the presence of ethanol (which has an absorption band in the same spectral region as the sugars) in the fermentation step, the procedure becomes even more complex. Figure 5 shows the evolution of the absorption during the fermentation step: the sample initially contains all the sugars and no ethanol, and ends up containing only a portion of dextrin (a non-fermentable sugar) and ethanol.

In this case, we first use principal component analysis to reduce the number of variables to be analyzed. These spectra, which originally had about 1000 variables (the wavelengths at which the absorbance is measured), can thereby be represented by a few (two, in this case) variables, or principal components, that retain a high proportion of the information: 97.9%. The relationship between the first two principal components is presented in figure 6.
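The reduction from many spectral variables to two principal components can be illustrated with a short, self-contained sketch. Everything in it is an assumption made for illustration: the spectra are synthetic (two Gaussian bands with made-up positions and widths, one decaying and one growing over time), only 40 spectral points are used instead of about 1000 to keep the example fast, and the eigenvectors are found by power iteration with deflation, which is just one of several ways to compute a PCA.

```python
import math
import random

def mean_center(rows):
    """Subtract the mean spectrum so that PCA describes variance around it."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of mean-centred data (variables in columns)."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / (n - 1) for b in range(p)]
            for a in range(p)]

def leading_eigenpair(matrix, iterations=200, seed=0):
    """Power iteration: dominant eigenvalue and unit eigenvector."""
    rng = random.Random(seed)
    vec = [rng.random() for _ in matrix]
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    value = sum(v * sum(m * u for m, u in zip(row, vec))
                for row, v in zip(matrix, vec))
    return value, vec

def pca(rows, n_components=2):
    """Return the scores and the fraction of variance carried by each component."""
    centred = mean_center(rows)
    cov = covariance(centred)
    total = sum(cov[a][a] for a in range(len(cov)))
    components, fractions = [], []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        components.append(vec)
        fractions.append(value / total)
        # Deflation: remove the component just found before the next search.
        cov = [[cov[a][b] - value * vec[a] * vec[b] for b in range(len(vec))]
               for a in range(len(vec))]
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centred]
    return scores, fractions

# Synthetic spectra (made up): one band decays while another grows.
rng = random.Random(0)
axis = [950 + 350 * i / 39 for i in range(40)]

def band(center, width):
    return [math.exp(-((x - center) / width) ** 2) for x in axis]

sugar, ethanol = band(1030.0, 40.0), band(1085.0, 25.0)
spectra = []
for t in range(12):
    frac = t / 11
    spectra.append([math.exp(-3 * frac) * s + 0.6 * frac * e + rng.gauss(0, 0.002)
                    for s, e in zip(sugar, ethanol)])

scores, fractions = pca(spectra)
```

Each row of `scores` is the (PC1, PC2) pair of one spectrum, and `sum(fractions)` is the share of the total variance retained by the two components, which plays the role of the 97.9% quoted in the text.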

Each spectrum of figure 5 is represented in figure 6 by a single point. In this new basis, the wavelengths no longer have any significance; what matters now is the variance. Each pair (PC1, PC2) represents a specific concentration of sugars and ethanol, which changes during the fermentation process. At this point it becomes computationally feasible to apply an artificial neural network to the pairs of values of each of these points. The first time, this experiment should be performed as previously described, as a case of a supervised NN (e.g. a multilayer perceptron network). A gold-standard method is therefore required to calibrate, or train, our neural network. One of the most widely accepted methods is HPLC (High Performance Liquid Chromatography), with which we can accurately quantify all the types of sugars of interest and the ethanol. The compounds of interest were measured using this standard method in order to assemble the ANN: the network input (ηi) consists of the ordered pairs of principal components, and the output (ξi) is trained against the HPLC results, which give the amounts of the compounds of interest, in this case the sugars and the ethanol.

**Figure 4.** Absorption spectrum of a sample of ethanol, maltose 10%, and beer wort which contains certain types of sugars. Axes: wavenumber (cm-1), 950 to 1300, versus absorbance (arb. units).

A certain part of the data (approximately 1/3) must first be set aside for a later validation step. With the remaining 2/3 of the data, the neural network is run in the training stage, following the equations and structures described before, where the weights of each neural layer are adjusted until the network converges. The adjustment can be repeated as often as necessary, until the output (ξi) is as close to the true (real) value as required.

The training is complete once the weights of the neurons have been adjusted and the network has converged with the desired error. The weight values should then be saved before proceeding to the next step, the validation, in which the neural network is used to provide results for new spectra. With the data that were originally set aside (1/3 of the data) and the weights defined in the training stage, the neural network is run again. Note that at this stage backpropagation should not be executed: simply use the saved weight matrix and feed the data set aside for validation as inputs to the network. The network thus runs only in the forward direction, supplying the output values ξi in a very short processing time. These output values are compared with the expected values obtained by HPLC, using a correlation curve between the two techniques. If the results are satisfactory, the process of setting up the system to quantify the compounds of interest is complete, and the system can be passed on for practical use.
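The hold-out procedure described above (set aside about 1/3 of the data, store the trained weights, then run the network forward only) might be organised as in the following sketch. The data set, the weight values and the function names `split_dataset` and `predict` are hypothetical; the weights shown are an arbitrary stand-in for the result of a training stage, and JSON is just one convenient storage format.

```python
import json
import math
import random

def split_dataset(samples, validation_fraction=1 / 3, seed=0):
    """Shuffle the samples and set aside a fraction of them for validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

def predict(layers, inputs):
    """Forward direction only: no error derivatives, no weight updates."""
    s = list(inputs)
    for layer in layers:
        s = [1.0 / (1.0 + math.exp(-(b + sum(w * v for w, v in zip(row, s)))))
             for row, b in zip(layer["w"], layer["b"])]
    return s

# Hypothetical data set: 30 (PC1, PC2) pairs with one reference value each.
rng = random.Random(42)
samples = [([rng.uniform(-1, 1), rng.uniform(-1, 1)], [rng.uniform(0, 1)])
           for _ in range(30)]
training, validation = split_dataset(samples)

# Stand-in for the weights produced by the training stage (arbitrary values).
trained = [{"w": [[0.4, -0.3], [0.1, 0.8], [-0.6, 0.2]], "b": [0.0, 0.1, -0.1]},
           {"w": [[0.5, -0.7, 0.3]], "b": [0.05]}]

saved = json.dumps(trained)      # store the weights once training is finished
restored = json.loads(saved)     # reload them for the validation stage

# Pair each forward-only prediction with its reference value for comparison.
comparison = [(predict(restored, x)[0], reference[0])
              for x, reference in validation]
```

Keeping the validation samples out of the training set is what makes the later comparison with the reference method meaningful: the network has never seen those inputs.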

**Figure 5.** Absorption during the fermentation process.


**Figure 6.** PC1 versus PC2 showing the time evolution of the fermentation process.

Using spectroscopy with neural processing requires a standard method for calibration. Once the network has been trained, we can simply use PCA to reduce the number of processing variables, enter the values of the ordered pairs into the network together with the weight values, and collect the output results, in this case the amounts of sugars and ethanol. In the case of fermentation of the wort, using around three principal components and a neural network comprising an input layer with 23 neurons and an output layer of 5 neurons, it is possible to quantify each type of sugar and the ethanol with a quoted error of ± 0.2%. Here we exemplify our results by showing the correlation between the maltose concentration determined by spectroscopy and that determined by HPLC (figure 7), where R2 and the slope are 0.991 and 0.999, respectively. The linear fit shows good agreement between the proposed new method and the standard procedure. This result allows the use of our technique in breweries, as it enables quality monitoring and makes process control less time-consuming.

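**Figure 7.** Correlation between the standard method (HPLC) and proposed procedure (MIR absorption). Axes: Maltose (MIR absorption) (%) versus Maltose HPLC (%), both from 0 to 10. Linear fit y = a + b\*x, adjusted R-square 0.99138:

| Parameter | Value | Standard error |
|---|---|---|
| Intercept (a) | 0.01702 | 0.17346 |
| Slope (b) | 0.99818 | 0.0258 |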
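A correlation curve of this kind is an ordinary least-squares line fitted through paired measurements. The sketch below shows one way to compute the intercept, the slope, R2 and the adjusted R2; the function name `linear_fit` and the sample values are hypothetical and are not the measurements behind figure 7.

```python
def linear_fit(x, y):
    """Least-squares fit y = a + b*x; returns (a, b, R2, adjusted R2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R2 for a model with one predictor (n - 2 degrees of freedom).
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return intercept, slope, r2, adjusted_r2

# Made-up paired values: proposed method on x, reference method on y (both in %).
proposed = [0.5, 2.1, 3.9, 6.2, 7.8, 9.9]
reference = [0.6, 2.0, 4.1, 6.0, 8.0, 10.0]
intercept, slope, r2, adjusted_r2 = linear_fit(proposed, reference)
```

A slope close to 1 together with an intercept close to 0 indicates that the two methods agree in absolute terms, while R2 only measures how tightly the points follow a straight line.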

#### **4. Conclusion**

In the analysis of spectroscopic data, not only is the technique used to obtain the values of different properties important, but the correct mathematical processing of the data is actually the main issue in obtaining the correct information. In particular, the distinctions among multiple values that are correlated with a specific class of phenomena are the hidden information that can be conveniently extracted. Throughout this chapter, we have concentrated on demonstrating how powerful spectroscopic analysis can be when the raw data have been correctly arranged, allowing a mathematical procedure that treats the information as a whole instead of concentrating on individual values. Many techniques are available today for such procedures, but Principal Component Analysis is especially powerful when the spectral information is not restricted to a single wavelength but spread over a large portion of the spectrum.

We have concentrated on a relevant case in which the UV-VIS portion of the fluorescence spectrum is obtained and used to determine its correlation with the postmortem interval (PMI) in an animal model. The fluorescence in this case is subject to many effects caused by the modification of the biological tissue, a natural evolution once metabolic activity has been interrupted. This is clearly a case where biochemical modification alters the spectrum as a whole, and an attempt to concentrate the observation on individual features may fail. Applying PCA to the collected data yielded information-rich patterns that made possible a high correlation between the extracted information and the real postmortem interval. The classification of patterns and the clustering of the collected information separate the data into groups of distinct PMI. Even though we have used the method for PMI determination, it has been shown to be equally powerful in applications such as cancer diagnostics, fermentation processing in beverage production, quality control in industry, and the identification of pests and other features of interest in agriculture. The application of the PCA technique can go beyond the identification of patterns and their correlation with values, and can also provide specific quantification of individual chemical components of the system under investigation.

To demonstrate this feature, we considered as an example sugar quantification during beer production. Such cases represent a major challenge for numerous systems in several areas. Using the PCA procedure associated with a Neural Network (NN), we can quantify the compounds in a sample and obtain results comparatively quickly. Using MIR absorption spectroscopy of liquid samples, without any type of sample pre-treatment, we detected and quantified specific compounds (glucose, maltose, maltotriose, dextrin and ethanol) during the production of beer. The NN was used to determine the amounts of these sugars and of alcohol in the wort during saccharification and fermentation. In the correlation between the maltose concentration determined by spectroscopy and that determined by HPLC, we find R2 and the slope to be 0.991 and 0.999, respectively. Finally, the aim of this chapter is to show the real power of combining spectroscopy techniques with data analysis. The field is clearly growing in diversity and importance.

## **Author details**


E. S. Estracanholli, G. Nicolodelli, S. Pratavieira, C. Kurachi and V. S. Bagnato
*Institute of Physics of São Carlos, University of São Paulo, SP, Brazil*
