| Factors | RMPE |
|---|---|
| 1 | 27.03 |
| 1,2 | 20.69 |
| 1,2,3 | 19.28 |
| 1,2,3,4 | 19.69 |
| 1,2,3,4,5 | 19.82 |

chosen *C*×*ε* pair was *C = 4.2222* and *ε = 0.2223* (filled circle in Fig. 1).

**3. Results** 

$$\text{Maximize } \sum\_{i=1}^{N} y\_i (\alpha'\_i - \alpha\_i) - \varepsilon \sum\_{i=1}^{N} (\alpha'\_i + \alpha\_i) - \frac{1}{2} \sum\_{i,j=1}^{N} (\alpha'\_i - \alpha\_i)(\alpha'\_j - \alpha\_j) \left( \left\langle \mathbf{x}\_i \cdot \mathbf{x}\_j \right\rangle + \frac{\delta\_{ij}}{C} \right) \tag{11}$$

$$\text{subject to } \begin{cases} \sum\_{i=1}^{N} (\alpha'\_i - \alpha\_i) = 0\\ \alpha'\_{i}, \alpha\_i \ge 0, \ i = 1 \ldots N \end{cases} \tag{12}$$

where $\alpha\_i$, $\alpha'\_i$ are Lagrangian multipliers satisfying $\alpha\_i \alpha'\_i = 0$, and $\delta\_{ij}/C = 0$ for *p=1*. The following Karush-Kuhn-Tucker conditions should also be satisfied:

$$\alpha\_i \left( \left\langle \boldsymbol{\beta} \cdot \mathbf{x}\_i \right\rangle + \beta\_0 - y\_i - \varepsilon - \xi\_i \right) = 0, \ i = 1 \ldots N$$

$$\alpha'\_i \left( y\_i - \left\langle \boldsymbol{\beta} \cdot \mathbf{x}\_i \right\rangle - \beta\_0 - \varepsilon - \xi'\_i \right) = 0, \ i = 1 \ldots N$$

Then the link between the dual and primal representations is given by

$$\hat{\boldsymbol{\beta}}\_{SVM} = \sum\_{i=1}^{N} (\alpha'\_i - \alpha\_i) \mathbf{x}\_i$$

where $\alpha\_i, \alpha'\_i \ge 0$ (Cristianini and Shawe-Taylor, 2000).

In our application, for the SVM case, both input and output training data were centered and scaled to have zero mean and unit standard deviation. The values of the **β** coefficients in the raw data domain were calculated as follows:

$$\hat{\mathbf{Y}} = sd\_{\mathbf{y}} \mathbf{V}^{-1} \hat{\boldsymbol{\beta}}\_{SVM} \mathbf{X}\_{raw} - sd\_{\mathbf{y}} \mathbf{V}^{-1} \hat{\boldsymbol{\beta}}\_{SVM} \overline{\mathbf{X}}\_{raw} + \overline{\mathbf{Y}} = \hat{\beta}\_{0} + sd\_{\mathbf{y}} \mathbf{V}^{-1} \hat{\boldsymbol{\beta}}\_{SVM} \mathbf{X}\_{raw} \tag{13}$$

where $\hat{\mathbf{Y}}$ is the estimated Ueq, **V** is a diagonal matrix of standard deviations for each column of **X**, and $\overline{\mathbf{X}}\_{raw}$ is the vector of column means of **X**. The mean and standard deviation of Ueq in the training data set are $\overline{\mathbf{Y}}$ and $sd\_{\mathbf{y}}$, respectively. The intercept is expressed as $\hat{\beta}\_{0} = \overline{\mathbf{Y}} - sd\_{\mathbf{y}} \mathbf{V}^{-1} \hat{\boldsymbol{\beta}}\_{SVM} \overline{\mathbf{X}}\_{raw}$.
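The back-transformation from the standardized domain to the raw data domain can be illustrated with a small sketch. It assumes the standardized model has the form (y − mean_y)/sd_y = Σ_j β_j (x_j − mean_j)/sd_j, so each raw-scale slope is sd_y·β_j/sd_j and the intercept absorbs the centering terms. The function names and the numbers are made up for illustration; this is Python written for this example, not the authors' R code.

```python
# Sketch: map coefficients fitted on standardized data back to the raw scale.
# All names and numbers are illustrative assumptions, not taken from the study.

def to_raw_scale(beta_std, x_means, x_sds, y_mean, y_sd):
    """Return (intercept, slopes) that give raw-scale predictions.

    Assumed standardized model:
        (y - y_mean) / y_sd = sum_j beta_std[j] * (x[j] - x_means[j]) / x_sds[j]
    """
    slopes = [y_sd * b / s for b, s in zip(beta_std, x_sds)]
    intercept = y_mean - sum(m * w for m, w in zip(x_means, slopes))
    return intercept, slopes


def predict_raw(intercept, slopes, x):
    """Predict directly from raw predictors."""
    return intercept + sum(w * v for w, v in zip(slopes, x))


def predict_via_standardized(beta_std, x_means, x_sds, y_mean, y_sd, x):
    """Predict by standardizing x, applying beta_std, then un-scaling y."""
    z = [(v - m) / s for v, m, s in zip(x, x_means, x_sds)]
    return y_mean + y_sd * sum(b * zi for b, zi in zip(beta_std, z))


# Made-up values purely for illustration.
beta_std = [0.8, -0.1]
x_means, x_sds = [60.0, 70.0], [15.0, 12.0]
y_mean, y_sd = 45.0, 10.0

intercept, slopes = to_raw_scale(beta_std, x_means, x_sds, y_mean, y_sd)
x_new = [55.0, 82.0]
direct = predict_raw(intercept, slopes, x_new)
two_step = predict_via_standardized(beta_std, x_means, x_sds, y_mean, y_sd, x_new)
```

Both routes should give the same prediction, which is a convenient check that the intercept and slopes were converted consistently.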

#### **2.5 Statistical modeling of equilibrated urea**

The three estimation procedures (OLS, PLS, and SVM) for obtaining the regression coefficients **β** of a linear model were applied to build bedside equations that estimate equilibrated urea from intradialysis urea samples and anthropometric data in 109 hemodialyzed patients. Estimation, selection and validation of the model were implemented in the R language (www.r-project.org) (see appendix). Prior to fitting a model, the appropriate number of factors (*A*) and the best cost (*C*) and epsilon (*ε*) pair were chosen for PLS and SVM, respectively. For this purpose, a 15-fold cross-validation strategy was applied over a randomly chosen 70% of the patients in the data set. In the PLS case, models including 1 to *A* factors with *A = 1, 2, 3, 4* and *5* were tested. For each model the cross-validation root mean prediction error (RMPE) was calculated, and its mean over all partitions was obtained. The model achieving the smallest mean RMPE was chosen. For the linear SVM case, a 10×10 grid search over *C*×*ε* was performed. The ranges were from *4* to *6* for *C* and from *0.001* to *2* for *ε*. A linear SVM model was built for each (*C*, *ε*) pair and the cross-validation RMPE was calculated and compared. The smallest mean RMPE was used as the selection criterion. The predictive ability of the fitted models was evaluated using a 20-fold cross-validation strategy over the whole data set. The data set was split into 20 consecutive sets of equal size; in turn, 19 were used for **β** estimation and the remaining one for prediction from the estimated model.
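The selection and validation procedure can be sketched as follows. This is a minimal Python illustration rather than the authors' R implementation, and it rests on several assumptions: the grid is taken to be evenly spaced (the source gives only the ranges), RMPE is taken to be the square root of the mean squared prediction error, folds are consecutive blocks whose sizes may differ by one when the sample size is not a multiple of the number of folds, and a one-predictor least-squares line stands in for the PLS or SVM fit. All function names are hypothetical.

```python
import math


def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive (assumed grid spacing)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def consecutive_folds(n, k):
    """Split indices 0..n-1 into k consecutive blocks; sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def rmpe(y_true, y_pred):
    """Root mean squared prediction error (assumed definition of RMPE)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def cv_mean_rmpe(xs, ys, k, fit, predict):
    """Mean RMPE over k folds: fit on k-1 blocks, predict the held-out block."""
    scores = []
    for test_idx in consecutive_folds(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [predict(model, xs[i]) for i in test_idx]
        scores.append(rmpe([ys[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


def fit_line(xs, ys):
    """Stand-in model: ordinary least squares with one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return (my - slope * mx, slope)


def predict_line(model, x):
    return model[0] + model[1] * x


# 10x10 grid over the ranges given in the text (even spacing is an assumption).
c_grid = linspace(4.0, 6.0, 10)
eps_grid = linspace(0.001, 2.0, 10)
grid = [(c, e) for c in c_grid for e in eps_grid]

# Synthetic, noiseless data used only to exercise the cross-validation loop.
xs = [float(i) for i in range(40)]
ys = [2.0 + 0.5 * x for x in xs]
score = cv_mean_rmpe(xs, ys, 20, fit_line, predict_line)
fold_sizes = [len(f) for f in consecutive_folds(109, 20)]
```

In an actual grid search, `fit` would train a linear SVM with a given (*C*, *ε*) pair, `cv_mean_rmpe` would be evaluated once per pair in `grid`, and the pair with the smallest mean RMPE would be kept.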
