#### **4. Proposed method, results and analysis**

The first step is to collect samples of kerosene and diesel oil over a reasonable period of time, in order to obtain a data set that best reflects all possible operational variability, such as changes in the mix of crude oils processed and in the operating conditions of the units.

In the second step, laboratory-scale experiments are performed to characterize the product and to determine, from the samples, the actual kinematic viscosity to be modeled. The samples were also analyzed by infrared spectroscopy.

In the third step, the mathematical models were developed using The Unscrambler® and Excel® software, relating the absorbed infrared radiation to the physicochemical property. Finally, the model is implemented on an industrial scale to predict viscosity in real time, giving the production area strong support for decision-making and enabling an increase in the profitability of the blending process.
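The captions of Fig. 23 and Fig. 24 identify the regression as PLS. As a rough illustration of how such a model links spectral variables to a single property, the sketch below implements a basic PLS1 fit by the NIPALS procedure in plain Python. It is not the routine used by The Unscrambler®; the function names `fit_pls1` and `predict_pls1` and the toy data are hypothetical and chosen only for this example.

```python
from math import sqrt


def fit_pls1(X, y, n_components):
    """Fit a PLS1 model (single response) by NIPALS on mean-centred data."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the y residual.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in y that X can explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                          # y loading
        # Deflate: remove what this latent variable explains.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict the property for one new sample by repeating the deflation steps."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat


# Hypothetical toy data: 5 samples, 2 "spectral" variables, exact linear response.
X_toy = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y_toy = [2.0 * a - b + 1.0 for a, b in X_toy]
toy_model = fit_pls1(X_toy, y_toy, n_components=2)
```

With as many latent variables as independent input variables, PLS reproduces ordinary least squares, which is why the toy model recovers the exact linear relation; with 800 correlated spectral variables, far fewer latent variables are retained.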

Multivariate Modeling in Quality Control of Viscosity in Fuel: An Application in Oil Industry 43

For each fuel studied, a mathematical model with 800 input variables was developed. To help determine the number of latent variables and minimize the residual variance, the full cross-validation method was used: samples are removed from the data set in turn, and a model built from the remaining samples is tested by comparing its predictions with the true values of the excluded samples. The models were developed using The Unscrambler® program. Several forms of preprocessing were evaluated to obtain the minimum value of RMSECV (Root Mean Square Error of Cross Validation).
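The cross-validation loop described above can be sketched as follows, assuming that "full cross-validation" means leaving out one sample at a time, as the term is commonly used. To keep the example short, a one-variable least-squares line stands in for the PLS model; the function names are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares line y = a + b*x (a stand-in for the PLS model)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_mean - b * x_mean, b


def rmsecv_leave_one_out(xs, ys):
    """Hold each sample out once, refit on the rest, and predict the held-out sample."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        squared_errors.append((ys[i] - (a + b * xs[i])) ** 2)
    # Root mean square of the held-out prediction errors.
    return sqrt(sum(squared_errors) / len(squared_errors))
```

Comparing this value across preprocessing options, or across numbers of latent variables, is what allows the option with the lowest RMSECV to be selected.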

Fig. 12. Representation of online determination of viscosity by mathematical modeling (adapted from Early Jr, 1990)

The preprocessing that gave the best result was the first derivative computed with the second-degree polynomial method of Savitzky-Golay (Galvão et al., 2007), which highlights the differences between samples and so helps the model explain the variance between them. Fig. 13 and Fig. 14 show the original spectra for jet fuel and diesel, and Fig. 15 and Fig. 16 show the same data after preprocessing with this derivative.
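A minimal sketch of a Savitzky-Golay first derivative with a second-degree polynomial is given below. The window width is not stated in the text, so the default of five points (`half_width=2`) is an assumption, as are the function name and the choice to leave edge points undefined.

```python
def savgol_first_derivative(y, half_width=2, spacing=1.0):
    """First derivative from a quadratic least-squares fit in a moving window.

    For a symmetric window of 2*half_width + 1 equally spaced points, the
    slope of the fitted second-degree polynomial at the window centre is
    sum(i * y[k + i]) / sum(i * i), divided by the point spacing.
    Edge points without a full window are returned as None.
    """
    offsets = range(-half_width, half_width + 1)
    denom = sum(i * i for i in offsets) * spacing
    out = [None] * len(y)
    for k in range(half_width, len(y) - half_width):
        out[k] = sum(i * y[k + i] for i in offsets) / denom
    return out
```

Because the window is symmetric, the quadratic term of the fit does not affect the slope at the centre, so the derivative reduces to a fixed set of weights applied along the spectrum. A constant baseline offset between spectra is removed by this step, which is one reason derivatives emphasize differences between samples.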

Fig. 13. Set of spectra of kerosene used in modeling


Fig. 14. Set of spectra of diesel used in modeling


The Savitzky-Golay derivative method smooths the spectrum with a moving polynomial fit. The derivative of the absorbance with respect to wavenumber is then calculated from the fitted polynomial.

Fig. 15. Derivative (Savitzky-Golay) spectra of kerosene

Fig. 16. Derivative (Savitzky-Golay) spectra of diesel

After the spectra were differentiated, the optimal number of latent variables was determined for each product. For kerosene, four latent variables were adopted. Above this number, the explained variance decreases due to the incorporation of noise in the model, as shown in Table 1 and Fig. 17.

| Latent Variable | Cumulative Variance Explained (%) |
|---|---|
| 1 | 8.20 |
| 2 | 28.46 |
| 3 | 55.96 |
| 4 | 75.67 |
| 5 | 78.71 |
| 6 | 83.61 |
| 7 | 85.50 |
| 8 | 87.48 |
| 9 | 87.60 |
| 10 | 86.00 |
| 11 | 84.90 |
| 12 | 84.72 |

Table 1. Cumulative explained variance (%) versus latent variables (kerosene)

Fig. 17. Explained variance (%) versus latent variables (kerosene).
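One way to read a cumulative explained-variance table is to look at the gain contributed by each additional latent variable. The sketch below uses the values tabulated above; the stopping rule and the `min_gain` threshold are assumptions made for illustration and are not the authors' stated criterion.

```python
# Cumulative explained variance (%) for 1 to 12 latent variables, as tabulated above.
cumulative = [8.20, 28.46, 55.96, 75.67, 78.71, 83.61,
              85.50, 87.48, 87.60, 86.00, 84.90, 84.72]


def marginal_gains(cum):
    """Gain in explained variance contributed by each latent variable."""
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]


def choose_latent_variables(cum, min_gain):
    """Smallest count k such that adding variable k + 1 gains less than min_gain."""
    gains = marginal_gains(cum)
    for k in range(1, len(cum)):
        if gains[k] < min_gain:  # gains[k] is the gain of latent variable k + 1
            return k
    return len(cum)
```

With a hypothetical threshold of 5 percentage points, `choose_latent_variables(cumulative, 5.0)` returns 4; with a threshold of 1 point it returns 8. The choice of threshold therefore matters as much as the table itself, and in practice it is weighed against the RMSECV.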


For diesel, following the same procedure, eight latent variables were adopted, because above that number there is no significant gain in explained variance, as shown in Table 2 and Fig. 18.

| Latent Variable | Cumulative Variance Explained (%) |
|---|---|
| 1 | 8.20 |
| 2 | 28.46 |
| 3 | 55.96 |
| 4 | 75.67 |
| 5 | 78.71 |
| 6 | 83.61 |
| 7 | 85.50 |
| 8 | 87.48 |
| 9 | 87.60 |
| 10 | 86.00 |
| 11 | 84.90 |
| 12 | 84.72 |

Table 2. Cumulative explained variance (%) versus latent variables (diesel)

Fig. 18. Explained variance (%) versus latent variables (diesel).

Outliers were sought by examining the score plots of the first two principal components and the influence plots (residual variance in Y versus leverage). The first two principal components captured the largest variability in the data, and for both diesel and kerosene the samples are statistically close to one another, as shown in Fig. 19 and Fig. 20.

Fig. 19. Scores of the first two latent variables for kerosene

Fig. 20. Scores of the first two latent variables for diesel

Among the statistical tools used to detect outliers, the plot of Studentized residuals versus leverage stands out. The leverage can be interpreted as the distance of a sample from the centroid of the data set. A high leverage value means that the sample lies far from the mean and has a major influence on the model. The Studentized residuals can be interpreted as the difference between the actual values and the values predicted by the model, scaled by its estimated standard deviation. For both products (jet fuel and diesel) there were no evident outliers, and the samples with high influence have low residuals (Fig. 21 and Fig. 22).
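The two quantities plotted against each other can be illustrated with a simplified case. The sketch below assumes a single explanatory variable per sample (for instance one score value) and an ordinary least-squares line, rather than the full PLS model; the function name and this simplification are assumptions for illustration only.

```python
from math import sqrt


def leverage_and_studentized(xs, ys):
    """Leverage and internally Studentized residual of each sample for a fitted line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    a = y_mean - b * x_mean
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    # Leverage grows with the squared distance of the sample from the centroid.
    leverages = [1.0 / n + (x - x_mean) ** 2 / sxx for x in xs]
    # Residual standard deviation with n - 2 degrees of freedom (two fitted parameters).
    s = sqrt(sum(e * e for e in residuals) / (n - 2))
    # Each residual is scaled by its own estimated standard deviation.
    studentized = [e / (s * sqrt(1.0 - h)) for e, h in zip(residuals, leverages)]
    return leverages, studentized
```

A sample is a concern mainly when it combines high leverage with a large Studentized residual; high leverage with a small residual, as reported here, indicates an influential sample that the model nevertheless fits well.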


The correlation coefficient (r) indicates the degree of correlation between the estimated values and those obtained experimentally. In developing the model, the objectives were to minimize the RMSECV and to maximize the coefficient of multiple determination (R²) or the correlation coefficient (r). The graphs in Fig. 23 and Fig. 24 show the actual values versus the predicted values, along with the correlation coefficients.
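The two figures of merit named above can be computed as in the following sketch. The function names are hypothetical, and R² is taken here as one minus the ratio of the residual to the total sum of squares, which is one common definition and may differ from the one reported by the modeling software.

```python
from math import sqrt


def pearson_r(reference, predicted):
    """Pearson correlation between reference and predicted values."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((a - mean_ref) * (b - mean_pred) for a, b in zip(reference, predicted))
    var_ref = sum((a - mean_ref) ** 2 for a in reference)
    var_pred = sum((b - mean_pred) ** 2 for b in predicted)
    return cov / sqrt(var_ref * var_pred)


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((a - b) ** 2 for a, b in zip(reference, predicted))
    ss_tot = sum((a - mean_ref) ** 2 for a in reference)
    return 1.0 - ss_res / ss_tot
```

The two measures are not interchangeable: predictions that are offset by a constant bias still give r close to 1, while R² computed this way drops, which is one reason the RMSECV is tracked alongside them.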

The results of the models are statistically equivalent to those from the laboratory methods. Table 3 summarizes the modeling results.

| Product | Number of samples | Latent variables | RMSECV (cSt) | Correlation | Range (cSt) |
|---|---|---|---|---|---|
| Kerosene | 115 | 4 | 0.09 | 0.889854 | 3.1 to 4.6 |
| Diesel | 131 | 8 | 0.11 | 0.959387 | 3.1 to 5.3 |

Table 3. Summary of modeling results

Fig. 21. Residual versus leverage for kerosene

Fig. 22. Residual versus leverage for diesel

Fig. 23. Comparison of the results provided by PLS regression model and the results obtained by the reference laboratory (kerosene)

Fig. 24. Comparison of the results provided by PLS regression model and the results obtained by the reference laboratory (diesel)

48 Fuel Injection in Automotive Engineering

#### **5. Conclusions**

It was possible to model mathematically the kinematic viscosity of jet fuel and diesel oil by multivariate analysis. The values of the standard error of cross-validation showed that the products meet the specifications.

It was possible to solve a real problem by combining academia with industry. The results were used in an industrial plant at a refinery in Brazil, where they helped to speed up decision-making in the blending system, reduce process variability and increase the profitability of production.

There was a significant reduction in the number of analyses performed in the laboratory, because the proposed method is faster, more practical, does not generate chemical waste, and minimizes reprocessing, reducing costs, including energy costs. In addition, better quality fuel reduces the environmental impact of combustion.

This chapter opens up further possibilities, such as the simultaneous determination of several other properties of oil products.

#### **6. References**


< http://www.ril.com/html/business/refining\_processes.html>

