4.1. Database acquisition and preprocessing

Biodiesels from soybean, corn, palm and babassu (an oleaginous plant abundant in the Northeast of Brazil) were synthesized by transesterification via the methylic route with homogeneous alkaline catalysis. They were used to prepare 70 binary, ternary and quaternary mixtures (in volumetric fractions) designed according to simplex-lattice and simplex-centroid designs.
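As an illustration of how such mixture designs enumerate compositions, the following minimal sketch generates the points of a simplex-lattice and a simplex-centroid design for four components. The function names and the lattice degree of 3 are assumptions made for the example; the text does not state the degree used, and the sketch is not meant to reproduce the 70 mixtures of this work.

```python
from fractions import Fraction
from itertools import combinations, product


def simplex_lattice(q, m):
    """All q-component mixtures whose fractions are multiples of 1/m and sum to 1."""
    return [
        tuple(Fraction(level, m) for level in levels)
        for levels in product(range(m + 1), repeat=q)
        if sum(levels) == m
    ]


def simplex_centroid(q):
    """Equal-fraction blends of every non-empty subset of the q components."""
    points = []
    for k in range(1, q + 1):
        for subset in combinations(range(q), k):
            points.append(
                tuple(Fraction(1, k) if i in subset else Fraction(0) for i in range(q))
            )
    return points


components = ("soybean", "corn", "palm", "babassu")
lattice = simplex_lattice(len(components), 3)  # degree 3 is an illustrative assumption
centroid = simplex_centroid(len(components))
```

Exact fractions are used so that every design point sums to exactly 1. Both designs also contain the pure components, which would be filtered out if only blends are wanted.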

The oxidative stabilities of the samples were determined according to EN 14112:2003 [33] using a Metrohm model 873 Rancimat instrument. Each sample was measured twice and the average of the two measurements was taken. The oxidative stabilities of the mixtures ranged from 4.81 to 25.47 h.

The spectra were acquired using a PerkinElmer Frontier™ Fourier transform NIR spectrometer with a near-infrared reflectance accessory (NIRA), equipped with a fast recovery deuterated triglycine sulfate (FR-DTGS) detector. All spectra were recorded as an average of 16 scans at a spectral resolution of 2 cm<sup>−1</sup>. The measured wavenumber range was 4000–12,000 cm<sup>−1</sup>, but the working range was restricted to 4000–6100 cm<sup>−1</sup> because the remaining region showed noninformative signal (close to baseline) and increasing noise as the wavenumber approached 12,000 cm<sup>−1</sup>.
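The restriction of the working range can be sketched as a simple mask on the wavenumber axis. The example below uses synthetic spectra and assumes a data spacing of 2 cm<sup>−1</sup> (the resolution and the data interval of an instrument need not coincide); the function name `restrict_range` is hypothetical. Under this assumption the restricted range contains 1051 points, which agrees with the number of variables reported for the X-matrix.

```python
import numpy as np


def restrict_range(wavenumbers, spectra, low=4000.0, high=6100.0):
    """Keep only the columns whose wavenumber lies in [low, high]."""
    mask = (wavenumbers >= low) & (wavenumbers <= high)
    return wavenumbers[mask], spectra[:, mask]


# Synthetic stand-in for the measured data (rows = samples, columns = wavenumbers).
wavenumbers = np.arange(4000.0, 12002.0, 2.0)  # assumed 2 cm-1 spacing
spectra = np.random.default_rng(0).normal(size=(5, wavenumbers.size))

wn_work, X_work = restrict_range(wavenumbers, spectra)
```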

The raw spectra (Figure 1a) showed bands characteristic of the first overtone of C–H stretching (5550–6100 cm<sup>−1</sup>) and of the combination of C–H and C=O stretching modes (4640–4700 cm<sup>−1</sup>) [34]. The bands around 4262 and 4334 cm<sup>−1</sup> can be assigned to the second overtone of C–H bending and to the combination of C–H and C=C stretching modes, respectively [35].

To correct spectral baseline deviations caused by systematic variations, the first derivative was calculated with the Savitzky-Golay filter [36] using a 15-point quadratic smoothing function. The window size, that is, the number of points used to fit the Savitzky-Golay polynomial, depends on how noisy the spectra are. In this case, a 15-point window was sufficient to smooth the spectral noise. The derivative NIR spectra of the full database are shown in Figure 1b.
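The effect of this step on an additive baseline can be illustrated with a short sketch based on `scipy.signal.savgol_filter`, using the 15-point window and second-order polynomial stated above. The synthetic Gaussian band, the 0.3 offset, the assumed 2 cm<sup>−1</sup> spacing and the helper name `sg_first_derivative` are illustrative assumptions, not the data or the software used in this work.

```python
import numpy as np
from scipy.signal import savgol_filter


def sg_first_derivative(spectra, spacing, window=15, polyorder=2):
    """Savitzky-Golay first derivative along the wavenumber axis (columns)."""
    return savgol_filter(
        spectra,
        window_length=window,
        polyorder=polyorder,
        deriv=1,
        delta=spacing,  # sample spacing, so the derivative is per cm-1
        axis=1,
    )


wn = np.arange(4000.0, 6102.0, 2.0)  # assumed 2 cm-1 spacing
band = np.exp(-((wn - 5800.0) / 60.0) ** 2)  # synthetic absorption band
spectra = np.vstack([band, band + 0.3])  # same band with a constant baseline offset

d1 = sg_first_derivative(spectra, spacing=2.0)
```

The two rows of `d1` coincide because a constant offset has zero derivative, which is why the first derivative removes additive baseline shifts. A wider window smooths more strongly but can distort narrow bands.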

After the Savitzky-Golay filter was applied, the spectra were mean-centered and used as input data (X-matrix) consisting of 1051 variables. The raw oxidative stabilities (h) were used as the output variable (response, y-vector). From this point on, only the preprocessed data were used.
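Mean-centering subtracts the average spectrum from each sample, so that every variable (column) of the X-matrix has zero mean. A minimal sketch is given below with random numbers standing in for the 70 derivative spectra and their oxidative stabilities; the helper name `mean_center` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def mean_center(X):
    """Subtract the column means; return the centered matrix and the mean spectrum."""
    mean_spectrum = X.mean(axis=0)
    return X - mean_spectrum, mean_spectrum


rng = np.random.default_rng(0)
X = rng.normal(size=(70, 1051))  # synthetic stand-in for the derivative spectra
y = rng.uniform(4.81, 25.47, size=70)  # synthetic stand-in; the response is left uncentered

X_centered, mean_spectrum = mean_center(X)
```

The mean spectrum is returned so that any later sample can be centered with the same values rather than with its own mean.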
