**3. The spectrum analysis method**

An absorption spectrum carries information about the sample at each wavelength. In other words, it is a multidimensional vector of information, so it is natural to analyze spectra by multivariate analysis. Because multivariate analysis uses multiple explanatory variables, it draws on a larger amount of information. Noise is thereby reduced in relative terms, and a more precise calibration curve can be built. We applied corrections to the absorption spectra, and then used PLSR for the quantitative analysis and the SIMCA and KNN methods for the pattern analysis.

#### **3.1 Spectral correction**

Correcting the absorption spectrum allows the maximum amount of information to be extracted from it. We used two spectral corrections: normalization and differentiation. In the normalization correction, the absorbance of a designated peak is set to "1" by multiplying the absorbance at every wavenumber by the coefficient "1/(absorbance of the designated peak)". For example, consider the measured spectra of a sample containing materials A and B. When only the results for material A are wanted, the information on material B in the spectrum is noise. If all spectra are normalized by the absorption peak of material B, only the information on material A appears. This correction can also minimize the effect of measurement error.
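As a minimal sketch of the normalization correction, assuming the spectrum is held as NumPy arrays of wavenumbers and absorbances, the coefficient "1/(absorbance of the designated peak)" can be applied as follows. The function name, the wavenumbers and the absorbance values are hypothetical and chosen only for illustration.

```python
import numpy as np

def normalize_by_peak(wavenumbers, absorbance, peak_wavenumber):
    """Scale a spectrum so the absorbance at the designated peak equals 1."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    absorbance = np.asarray(absorbance, dtype=float)
    # Index of the sampled wavenumber closest to the designated peak.
    idx = int(np.argmin(np.abs(wavenumbers - peak_wavenumber)))
    peak_value = absorbance[idx]
    if peak_value == 0:
        raise ValueError("absorbance at the designated peak is zero")
    # Multiply every point by 1 / (absorbance of the designated peak).
    return absorbance / peak_value

# Hypothetical data: material B peaks at 1000 cm^-1, material A at 1100 cm^-1.
# The second spectrum is the first one with every absorbance scaled by 1.2,
# which stands in for a measurement-to-measurement change in overall intensity.
wn = np.array([900.0, 1000.0, 1100.0, 1200.0])
spec_1 = np.array([0.02, 0.50, 0.25, 0.01])
spec_2 = 1.2 * spec_1

norm_1 = normalize_by_peak(wn, spec_1, 1000.0)
norm_2 = normalize_by_peak(wn, spec_2, 1000.0)
```

After normalization the two spectra coincide, so an overall intensity change between measurements no longer appears in the data, while the peak of material A is expressed relative to the peak of material B.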

The differential correction, on the other hand, corrects the slope of the spectrum and can thereby eliminate the effects of the baseline. It can also separate absorption peaks whose wavenumbers are very close to each other. The first differential correction calculates the slope of the spectrum at each wavenumber, so the intensity at the wavenumber of an absorption peak is "0". In higher-order derivative spectra the half-width becomes narrower, but the noise increases when the S/N ratio is low. For these reasons, we used the first differential correction in this study.
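The two properties of the first differential correction mentioned above (a constant baseline disappears, and the derivative is "0" at the peak wavenumber) can be checked with a short sketch. It assumes a numerical derivative by finite differences (`numpy.gradient`), which is one possible choice and not necessarily the one used in this study; the Gaussian peak and the baseline offset are hypothetical.

```python
import numpy as np

def first_derivative(wavenumbers, absorbance):
    """Slope of the spectrum at each wavenumber (finite differences)."""
    return np.gradient(np.asarray(absorbance, dtype=float),
                       np.asarray(wavenumbers, dtype=float))

# Hypothetical spectrum: one Gaussian peak centred at 1100 cm^-1.
wn = np.arange(1000.0, 1201.0, 1.0)
peak = np.exp(-((wn - 1100.0) / 10.0) ** 2)
# The same peak sitting on a constant baseline offset.
with_offset = peak + 0.3

d_peak = first_derivative(wn, peak)
d_offset = first_derivative(wn, with_offset)
peak_index = int(np.argmax(peak))

# The derivative changes sign at the peak position and is unaffected by the offset.
baseline_removed = np.allclose(d_peak, d_offset)
slope_at_peak = d_peak[peak_index]
```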

#### **3.2 Partial Least Squares Regression (PLSR)**

We used PLSR for the quantitative analysis. PLSR does not suffer from the multicollinearity problem that exists in Multiple Linear Regression (MLR), and with a small number of factors its measurement accuracy is better than that of Principal Components Regression (PCR). In PLSR, the explanatory variables are the infrared spectra and the objective variables are the reference values. Both the explanatory variables and the objective variables are assumed to contain an error margin. PLS factors are extracted, a regression is calculated, and new objective variables are calculated. Next, a similar calculation is carried out with the new objective variables and the explanatory variables: a PLS factor is added and the objective variables are recalculated. As the number of PLS factors increases, the Standard Error of Calibration (SEC) becomes smaller.


SMBG relies on blood sampling, which causes the patient pain and stress and raises issues such as the risk of infection. The economic burden on patients is also very large, because the medical needles and measurement kits are disposable. Medical expenses for diabetes and its complications are estimated at about 3,000 billion dollars worldwide, and are expected to continue to increase in the future. Because these medical expenses have become a large economic market, various studies have been conducted around the world. However, an effective blood glucose measurement method that overcomes these problems has not yet been developed. It is therefore desirable to develop a non-invasive method of measuring blood glucose. Over the past few years, several studies have been made on non-invasive blood glucose measurement based on ATR infrared spectroscopy. The purpose of this study is to examine the accuracy of blood glucose measurement in a clinical trial.

#### **4.2 Measurement system for non-invasive measurement of blood glucose**

Fig. 3. Non-invasive measurement system for blood glucose (diagram labels: Prism, Finger)

This study used FT-IR (Travel-IR) with the ATR method. The block diagram of the measurement system is shown in Fig. 3. The ATR prism was a diamond prism mounted on ZnSe (3 reflections).

The measurement site was the tip of the subject's left middle finger. The middle finger was washed with ethanol. For each measurement, 5 μl of squalene oil was applied to the prism with a micropipette. The squalene oil is used as the internal standard, as described below. The measurement site was placed on the prism and pressed from above with a constant pressure. We measured in this state and obtained an absorption spectrum containing the blood glucose value information.

ATR correction was applied to the absorption spectrum. We used 1800 cm⁻¹ as the standard wavenumber of the ATR correction, because there is neither an absorption peak nor noise at this wavenumber. Next, data correction was applied to all measured spectra between 2700 cm⁻¹ and 1750 cm⁻¹ to remove the absorption noise of the diamond prism, and normalization correction was applied at the absorption peak of squalene oil. We used these corrected absorption spectra in the analysis. The measurement conditions were a wavenumber range of 4000 to 700 cm⁻¹, a resolution of 4 cm⁻¹, and 30 accumulations.

However, if the number of factors is increased too far, the Standard Error of Prediction (SEP) increases (overfitting). We therefore validated the optimal number of PLS factors by the leave-one-out method. The Prediction Residual Error Sum of Squares value (PRESS1) is calculated from the objective variables by equation (2):

$$\text{PRESS}_{n} \equiv \sum \left( y_{\text{obs}} - y_{\text{ref}} \right)^{2} \tag{2}$$

$y_{\text{obs}}$: new objective variables; $y_{\text{ref}}$: reference values

Next, new objective variables are calculated from a calibration curve to which a new PLS factor has been added, and the PRESS2 value is calculated from the objective variables obtained in this way. If the change in the residual (the difference between PRESS1 and PRESS2) is significant, the PLS factor is added. If the change is not significant, the model built before the addition is selected. We then measured the spectrum of a sample of unknown amount and calculated its value from this spectrum using the calibration curve. By the above process, an unknown sample can be measured quantitatively.
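The factor selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: it uses a single-response PLS fit written directly in NumPy, synthetic spectra with one hypothetical analyte band and one overlapping interferent band, and a simple relative-drop threshold (`min_relative_drop`, a hypothetical parameter) in place of the significance test, which the text does not specify.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """Fit a single-response PLS model and return a prediction function."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ yk                    # direction of maximum covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                       # scores of this PLS factor
        p = Xk.T @ t / (t @ t)           # spectral loading of the factor
        c = (yk @ t) / (t @ t)           # regression of y on the scores
        Xk = Xk - np.outer(t, p)         # deflate the spectra
        yk = yk - c * t                  # deflate the reference values
        weights.append(w)
        loadings.append(p)
        y_loadings.append(c)
    W, P, q = np.array(weights).T, np.array(loadings).T, np.array(y_loadings)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in spectral space
    return lambda X_new: (X_new - x_mean) @ coef + y_mean

def loo_press(X, y, n_factors):
    """PRESS by leave-one-out: sum of squared (y_obs - y_ref) over left-out samples."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        predict = pls1_fit(X[keep], y[keep], n_factors)
        y_obs = predict(X[i:i + 1])[0]
        press += (y_obs - y[i]) ** 2
    return press

def select_n_factors(X, y, max_factors, min_relative_drop=0.5):
    """Add factors while PRESS keeps dropping by at least the given fraction.

    The threshold is an assumption standing in for a significance test.
    """
    press = {1: loo_press(X, y, 1)}
    selected = 1
    for n in range(2, max_factors + 1):
        press[n] = loo_press(X, y, n)
        if press[n] < (1.0 - min_relative_drop) * press[n - 1]:
            selected = n
        else:
            break
    return selected, press

# Synthetic data (hypothetical): 30 spectra, 50 channels, two overlapping bands.
rng = np.random.default_rng(0)
channels = np.arange(50)
band_analyte = np.exp(-((channels - 20) / 5.0) ** 2)
band_other = np.exp(-((channels - 26) / 5.0) ** 2)
y = rng.uniform(1.0, 5.0, size=30)            # reference values
other = rng.uniform(1.0, 5.0, size=30)        # interferent amounts
X_clean = np.outer(y, band_analyte) + np.outer(other, band_other)
X = X_clean + rng.normal(0.0, 0.001, size=X_clean.shape)

selected, press = select_n_factors(X, y, max_factors=4)
```

Because the synthetic spectra contain two independent sources of variation, the second factor reduces PRESS sharply, and further factors mainly fit noise. With real spectra the appropriate stopping rule and maximum number of factors have to be chosen for the data at hand.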

#### **3.3 Soft Independent Modeling of Class Analogy (SIMCA)**

We used SIMCA for the qualitative analysis. A class is formed from the infrared spectra of known samples, and a classification model is built. Each class is analysed by principal component analysis and a discrimination space is set. This space is called the SIMCA box. An unknown sample is assigned to the best-fitting class by applying its infrared spectrum to the SIMCA boxes. In addition, the residual error is calculated by applying each spectrum that makes up a class to the other classes. From this, the Discrimination Power, which identifies the factors that distinguish the classes, is obtained. We confirmed the validity of the constructed classification model by using the Discrimination Power.
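The idea behind SIMCA can be illustrated with a simplified sketch: one principal component model per class, with an unknown spectrum assigned to the class whose model leaves the smallest residual. Several things here are assumptions made for illustration: the synthetic two-class data, the use of one principal component per class, the omission of the class boundaries (critical distances) that define a full SIMCA box, and the per-variable ratio used for the discrimination power, whose formula is not given in the text.

```python
import numpy as np

def fit_class_model(spectra, n_components):
    """PCA model of one class: mean spectrum and leading loading vectors."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_components]

def residuals(model, spectra):
    """Part of each spectrum that the class model cannot reproduce."""
    mean, loadings = model
    centered = np.atleast_2d(spectra) - mean
    return centered - centered @ loadings.T @ loadings

def classify(models, spectrum):
    """Assign the spectrum to the class with the smallest residual distance."""
    distances = {name: float(np.linalg.norm(residuals(model, spectrum)))
                 for name, model in models.items()}
    return min(distances, key=distances.get), distances

def discrimination_power(model_a, spectra_a, model_b, spectra_b):
    """Per-variable ratio of cross-class to within-class residual variance."""
    cross = ((residuals(model_b, spectra_a) ** 2).mean(axis=0)
             + (residuals(model_a, spectra_b) ** 2).mean(axis=0))
    within = ((residuals(model_a, spectra_a) ** 2).mean(axis=0)
              + (residuals(model_b, spectra_b) ** 2).mean(axis=0))
    return np.sqrt(cross / within)

# Synthetic data (hypothetical): class A absorbs near channel 10, class B near 30.
rng = np.random.default_rng(1)
v = np.arange(40)
band_a = np.exp(-((v - 10) / 3.0) ** 2)
band_b = np.exp(-((v - 30) / 3.0) ** 2)

def simulate(band, n):
    intensity = rng.uniform(1.0, 2.0, size=(n, 1))
    return intensity * band + rng.normal(0.0, 0.01, size=(n, band.size))

class_a, class_b = simulate(band_a, 15), simulate(band_b, 15)
models = {"A": fit_class_model(class_a, 1), "B": fit_class_model(class_b, 1)}

unknown = 1.5 * band_a + rng.normal(0.0, 0.01, size=v.size)
label, distances = classify(models, unknown)
dp = discrimination_power(models["A"], class_a, models["B"], class_b)
```

In this sketch the discrimination power is large at the channels where the two classes differ (around the two bands) and close to 1 where only noise is present, which is how it points to the variables that separate the classes.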

#### **3.4 K-Nearest Neighbor method (KNN)**

The KNN method is a pattern analysis technique that determines the class of a sample by comparing the similarity between patterns, without relying on a specific statistical distribution. The class of an unknown sample is decided by "voting". First, the Euclidean distance between the unknown sample and each sample of known class is calculated. Next, the "K" known-class samples closest to the unknown sample are selected. "K" is an odd number. The unknown sample is assigned to the class that is most numerous among these "K" samples.

For example, if "K=5", the classes of the 5 known samples closest to the unknown sample are analysed. The sets of five classes (1,2,3,3,3) and (1,3,3,2,1) are classified as Class "3", and (1,1,2,1,2) and (2,1,3,1,1) are classified as Class "1". In this kind of analysis, the accuracy is affected by the combination of variables used to calculate the distance and by the value of "K".
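The voting procedure can be sketched in a few lines of standard-library Python. The sample coordinates below are hypothetical. With more than two classes, an odd "K" does not rule out ties, and the text does not specify a tie-breaking rule; the sketch resolves a tie in favour of the tied class whose member is nearest to the unknown sample, which is an assumption.

```python
import math
from collections import Counter

def majority_vote(labels):
    """Most numerous class among labels ordered from nearest to farthest."""
    votes = Counter(labels)
    top = max(votes.values())
    # Tie-break (assumption): the first tied class met in nearest-first order.
    for label in labels:
        if votes[label] == top:
            return label

def knn_classify(known_samples, known_classes, unknown, k=5):
    """Classify `unknown` by voting among its k nearest known samples."""
    by_distance = sorted(
        (math.dist(sample, unknown), label)   # Euclidean distance
        for sample, label in zip(known_samples, known_classes)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return majority_vote(nearest_labels), nearest_labels

# Hypothetical two-variable samples for three classes.
known_samples = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0),   # class 1
    (5.0, 5.0), (6.0, 5.0), (5.0, 6.0),   # class 2
    (0.0, 5.0), (1.0, 5.0), (0.0, 6.0),   # class 3
]
known_classes = [1, 1, 1, 2, 2, 2, 3, 3, 3]

predicted, neighbours = knn_classify(known_samples, known_classes, (0.5, 5.0), k=5)
```

Here the five nearest neighbours belong to classes 3, 3, 3, 1 and 2, so the unknown sample is assigned to Class "3".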
