*3.3.2 Cubist regression*

Cubist (CB) is a rule-based approach that builds rules to generate regression solutions. A rule is generated for each leaf of a regression tree and linked to the training data it covers; once all rules are constructed, final predictions are obtained from a linear combination of the rules [13]. The CB model incorporates boosting with training committees, which is comparable to boosting in that it successively generates a sequence of trees with adjusted weights. The number of neighbors in the CB model is used to modify the rule-based prediction. The combination of the two linear models in the CB model is written as in Eq. (5) [14]:

$$
\hat{y}_{par} = (1 - a) \times \hat{y}_p + a \times \hat{y}_c \tag{5}
$$

where $\hat{y}_c$ is the forecast of the current model, $\hat{y}_p$ is the prediction of the parent model, and $a$ is the mixing coefficient that weights the two predictions.
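The blending in Eq. (5) can be sketched as a simple weighted average of the parent and current (child) model predictions. The function name and array-based interface below are illustrative choices, not part of any Cubist library API:

```python
import numpy as np

def smooth_prediction(y_current, y_parent, a):
    """Blend the current model's prediction with its parent's,
    following Eq. (5): y_par = (1 - a) * y_p + a * y_c."""
    y_current = np.asarray(y_current, dtype=float)
    y_parent = np.asarray(y_parent, dtype=float)
    return (1.0 - a) * y_parent + a * y_current
```

For example, with a parent prediction of 20, a current prediction of 10, and $a = 0.25$, the blended value is $0.75 \times 20 + 0.25 \times 10 = 17.5$.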

## **3.4 Error metrics**

We randomly divided the data into training and testing sets to evaluate the investigated models and measure their predictive power. Eqs. (6)–(8) define the error metrics used to assess the accuracy of the predictive models.

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(Pout_i - \widehat{Pout_i}\right)^2}{\sum_{i=1}^{n} \left(Pout_i - \overline{Pout}\right)^2} \tag{6}$$

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(Pout_i - \widehat{Pout_i}\right)^2} \tag{7}$$

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| Pout_i - \widehat{Pout_i} \right| \tag{8}$$
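The three metrics in Eqs. (6)–(8) can be computed directly from the observed and predicted values. The sketch below assumes NumPy arrays of equal length; the function names are illustrative:

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination, Eq. (6)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, Eq. (7)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error, Eq. (8)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))
```

As a quick check, a perfect prediction gives $R^2 = 1$ and RMSE = MAE = 0, while larger errors drive $R^2$ down and the other two metrics up.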
