**4. Methodology**

| **No.** | **Reference** | **Method** | **Determinant** | **Time scale** |
|---|---|---|---|---|
| 1 | [16] | Linear regression | Seasonal dummies, derivatives of weather and price | Monthly demand |
| 2 | [17] | Linear regression | Density, building size, lot size, household size, income, price, temp, rain, drought dummies | Bimonthly demand |
| 3 | [18] | Regression using Bayesian moment entropy | Population density | Annual demand |
| 4 | [13] | Decomposed daily demand forecasts | Daily demand and hourly demand | Daily demand |
| 5 | [19] | Univariate time series | *Y*<sub>*t*−1</sub> | Annual residential demand |
| 6 | [22] | Regression and ANN | Temp, rainfall, and lags of peak demand | Peak weekly demand |
| 7 | [23] | ANN | Temperature, rainfall, and delayed demand | Daily demand |
| 8 | [2] | Time series | Univariate demand series, temperature | Daily, weekly, monthly, annual |
| 9 | [6] | Time series and ANN | Delayed demands, temperature, and rainfall | Weekly demand |
| 10 | [24] | Holt-Winters multiplicative smoothing modified regression | Precipitation, temperature, humidity, lagged demand in a multivariate model | Weekly (6 days) |
| 11 | [26] | Weighted average regression and ANN | GDP, population, temperature, greenery coverage, delayed demand | Annual demand |
| 12 | [27] | Decomposed annual demand, followed by composite regression and ANN | Historical demand and time | Annual demand |
| 13 | [31] | Wavelet denoising and ANN | 7-year long time series of demand | Monthly demand |
| 14 | [28] | SVM with RBF function is compared with ANN | Delayed demand, population | Daily demand |
| 15 | [29] | ANN, SVM, Monte Carlo | Rain, demand, wind speed, atmospheric pressure | Hourly demand |
| 16 | [30] | SVM and adaptive Fourier series | Wind speed, temperature, demand, humidity, and rainfall | Hourly demand |

**Table 1.** Literature on water demand forecasting.

### **4.1. Model development**

To determine water demand (*D*) in millions of liters (ML), this research used population (*P*) and hotel occupancy factor (HOR) as socioeconomic parameters (the City of Kelowna is one of the hot spots for tourism in North America), and temperature (*T*) in °C, relative humidity (RH) in percent, and rainfall (*R*) in millimeters as weather parameters. As these parameters did not have the same order of magnitude, they were normalized prior to model development by

$$X = \frac{x - \mu}{\sigma} \tag{1}$$

where *X* is the standardized magnitude of parameter *x*, and *μ* and *σ* are the corresponding mean and standard deviation, respectively. Phase space reconstruction of each explanatory variable was used prior to GEP modeling to define the structure of the model inputs and to identify the stochastic or deterministic nature of the collected data. For a given proper lag time, the phase space was built by applying Takens' theorem [33], which transforms the time-series data into the geometry of a single moving point along a trajectory, where each point corresponds to a datum. Average mutual information (AMI) was used to determine the proper lag time of water demand for phase space reconstruction of all input factors. This was done to achieve a comprehensive understanding of the input factors and variable self-interaction, and to assess the use of lag times in demand forecasting models.

A total of 27 models were created (**Table 2**), labeled *MaDb*OP*c*, where *a*, *b*, and *c* ∈ {1, 2, 3}. They combined three input types [*M*1: demand data only; *M*2: demand and climatic data; *M*3: demand, climatic, and demographic data], three lag times [*D*1: 1 month lag; *D*2: 1 and 2 month lags; *D*3: 1, 2, and 3 month lags], and three types of genetic operators [OP1: {+, −, ×}; OP2: {+, −, ×, *x*<sup>2</sup>, *x*<sup>3</sup>}; OP3: {+, −, ×, *x*<sup>2</sup>, *x*<sup>3</sup>, √, e<sup>*x*</sup>, log, ln}] used in developing the GEP models.
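The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`standardize`, `lagged_samples`, `gep_model_labels`) and the toy demand values are hypothetical, and the population form of the standard deviation is assumed because the text does not say which form was used.

```python
from statistics import mean, pstdev


def standardize(values):
    """Eq. (1): X = (x - mu) / sigma for one parameter's series."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - mu) / sigma for v in values]


def lagged_samples(series, max_lag):
    """Pair each value D_t with its lags [D_{t-1}, ..., D_{t-max_lag}]."""
    samples = []
    for t in range(max_lag, len(series)):
        lags = [series[t - k] for k in range(1, max_lag + 1)]
        samples.append((lags, series[t]))
    return samples


def gep_model_labels():
    """Enumerate the 27 labels MaDbOPc with a, b, c in {1, 2, 3}."""
    return [f"M{a}D{b}OP{c}"
            for a in (1, 2, 3) for b in (1, 2, 3) for c in (1, 2, 3)]


# Toy monthly demand values (ML), invented for illustration only.
demand = [310.0, 295.0, 340.0, 420.0, 610.0, 790.0]
z = standardize(demand)
d2_samples = lagged_samples(z, max_lag=2)  # D2: 1- and 2-month lags
print(len(d2_samples), len(gep_model_labels()))
```

Each lag setting shortens the usable series by `max_lag` samples, because the first months have no complete set of lagged predecessors.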



**Figure 1.** Time series of water demand in the City of Kelowna District (CKD) for 1966–2008.

The data were partitioned into 144 monthly samples for training (1996–2007) and 35 samples for validation (2008–2010). The time series of water demand for 1996–2010 (**Figure 1**) shows a relatively regular periodic cycle of water demand in CKD that is mainly due to seasonal changes.

### **4.2. Gene expression programming (GEP)**

Introduced by Ferreira, GEP is an emerging soft computing technique [34]. The learning strategy was optimal evolution using the genetic operators. Following Ferreira, this research defined the overall structure of the GEP model with 30 chromosomes, a head size of eight, and three genes per chromosome [35]. The selected head size determined the complexity of each term in the model. Each gene head underwent a set of different arrangements to model the input data. New random populations were selected and then reproduced in order to reach the most suitable model under optimized stopping conditions. Models were developed from three genes linked together by an addition function. The number of genes per chromosome specified the layers or blocks involved in building the whole model. Although a large gene was useful, dividing the chromosomes into simpler units resulted in a more efficient and manageable learning process. Root mean square error (RMSE) was used as the fitness function to fit a curve to the target values. The stopping condition was a maximum fitness and coefficient of determination (*R*<sup>2</sup>). Ten floating-point numerical constants were used in each gene.
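To make the chromosome structure concrete, the sketch below decodes three genes, links them by addition, and scores the result with RMSE. It is an illustration under assumptions rather than the configuration used in this study: it covers only expression and fitness evaluation (no selection, mutation, or recombination), it uses the OP1 function set, and the terminal symbols `a` and `b` (standing for two lagged inputs) and the three example genes are hypothetical.

```python
import math

ARITY = {"+": 2, "-": 2, "*": 2}  # OP1-style function set
HEAD = 8                          # head size reported in the text
TAIL = HEAD * (2 - 1) + 1         # tail length for a maximum arity of 2


def decode(gene):
    """Decode a K-expression (read breadth-first) into a (symbol, children) tree."""
    nodes = [(symbol, []) for symbol in gene]
    next_free = 1  # index of the next symbol not yet attached to the tree
    i = 0
    while i < next_free:
        symbol, children = nodes[i]
        for _ in range(ARITY.get(symbol, 0)):
            children.append(nodes[next_free])
            next_free += 1
        i += 1
    return nodes[0]


def evaluate(node, inputs):
    """Evaluate a decoded tree; terminals are looked up in the inputs dict."""
    symbol, children = node
    if symbol in ARITY:
        left, right = (evaluate(child, inputs) for child in children)
        if symbol == "+":
            return left + right
        if symbol == "-":
            return left - right
        return left * right
    return inputs[symbol]


def predict(chromosome, inputs):
    """Sum the values of all genes: the genes are linked by addition."""
    return sum(evaluate(decode(gene), inputs) for gene in chromosome)


def rmse(predicted, observed):
    """Root mean square error used as the fitness measure."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


# Hypothetical genes of length HEAD + TAIL = 17; unused tail symbols are ignored.
CHROMOSOME = [
    "+abaaaaa" + "a" * TAIL,  # a + b
    "*a-baaaa" + "a" * TAIL,  # a * (b - a)
    "b" * (HEAD + TAIL),      # b
]
print(predict(CHROMOSOME, {"a": 2.0, "b": 3.0}))
```

Because the tail holds only terminals and has length `HEAD * (max_arity - 1) + 1`, every gene decodes to a valid expression no matter how the head symbols are rearranged.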

| **Classification** | **Model** | **Input variables combination\*** |
|---|---|---|
| Demand Data Based | *M*1*D*1 | *D*<sub>*t*−1</sub> |
| | *M*1*D*2 | *D*<sub>*t*−1</sub>, *D*<sub>*t*−2</sub> |
| | *M*1*D*3 | *D*<sub>*t*−1</sub>, *D*<sub>*t*−2</sub>, *D*<sub>*t*−3</sub> |
| Demand + Weather Data Based | *M*2*D*1 | *D*<sub>*t*−1</sub>, *T*<sub>*t*−1</sub>, *R*<sub>*t*−1</sub>, RH<sub>*t*−1</sub> |
| | *M*2*D*2 | *D*<sub>*t*−1</sub>, *D*<sub>*t*−2</sub>, *T*<sub>*t*−1</sub>, *T*<sub>*t*−2</sub>, *R*<sub>*t*−1</sub>, *R*<sub>*t*−2</sub>, RH<sub>*t*−1</sub>, RH<sub>*t*−2</sub> |
| | *M*2*D*3 | *D*<sub>*t*−1</sub>, *D*<sub>*t*−2</sub>, *D*<sub>*t*−3</sub>, *T*<sub>*t*−1</sub>, *T*<sub>*t*−2</sub>, *T*<sub>*t*−3</sub>, *R*<sub>*t*−1</sub>, *R*<sub>*t*−2</sub>, *R*<sub>*t*−3</sub>, RH<sub>*t*−1</sub>, RH<sub>*t*−2</sub>, RH<sub>*t*−3</sub> |
| Demand + Weather + Population Data Based | *M*3*D*1 | *D*<sub>*t*−1</sub>, *T*<sub>*t*−1</sub>, *R*<sub>*t*−1</sub>, RH<sub>*t*−1</sub>, *P*, HOR |
| | *M*3*D*2 | *D*<sub>*t*−1</sub>, *D*<sub>*t*−2</sub>, *T*<sub>*t*−1</sub>, *T*<sub>*t*−2</sub>, *R*<sub>*t*−1</sub>, *R*<sub>*t*−2</sub>, RH<sub>*t*−1</sub>, RH<sub>*t*−2</sub>, *P*, HOR |
| | *M*3*D*3 | *D*<sub>*t*−1</sub>, *D*<sub>*t*−2</sub>, *D*<sub>*t*−3</sub>, *T*<sub>*t*−1</sub>, *T*<sub>*t*−2</sub>, *T*<sub>*t*−3</sub>, *R*<sub>*t*−1</sub>, *R*<sub>*t*−2</sub>, *R*<sub>*t*−3</sub>, RH<sub>*t*−1</sub>, RH<sub>*t*−2</sub>, RH<sub>*t*−3</sub>, *P*, HOR |

\**t* is current month; *D* is demand; HOR is hotel occupancy factor; *P* is population; *R* is rainfall; RH is relative humidity; *T* is temperature.

**Table 2.** Structure of classified models.

### **4.3. Lag time**

The literature lists three methods for estimating lag time: AMI, the autocorrelation function (ACF), and the correlation integral (CI) [36–38]. AMI is considered the best because ACF reflects only linear properties and CI requires a large data set [39]. Consequently, the present study employed AMI, defined as:

$$I_{\tau} = \sum_{i=1}^{n} P(X_i, X_{i+\tau}) \cdot \log_2 \frac{P(X_i, X_{i+\tau})}{P(X_i) \cdot P(X_{i+\tau})} \tag{2}$$

where the joint probability of two successive time series, *P*(*X*<sub>*i*</sub>, *X*<sub>*i*+*τ*</sub>), and the product of their individual marginal probabilities, *P*(*X*<sub>*i*</sub>) · *P*(*X*<sub>*i*+*τ*</sub>), were used to find the optimum lag time *τ*. This lag contributes the maximum information added to *X*<sub>*i*</sub> by the successive time series *X*<sub>*i*+*τ*</sub>. The prime objective of using this approach was to make sure these time series were independent and thereby better represented the dynamics of the system in the phase space. In other words, a balanced degree of independence was desirable in identifying an optimum delay time.
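A histogram-based estimate of the AMI at a given lag can be sketched as follows. This is a hedged illustration, not the estimator used in this study: the equal-width binning, the default of eight bins, and the helper names are assumptions. Picking the first local minimum of the AMI curve is a common convention for phase space reconstruction, and it is also assumed here because the text does not state the exact selection rule.

```python
import math
from collections import Counter


def average_mutual_information(series, lag, bins=8):
    """Estimate I_tau from equal-width histograms of X_i and X_{i+tau}."""
    low, high = min(series), max(series)
    width = (high - low) / bins
    if width == 0:
        return 0.0  # a constant series carries no information

    def bin_index(value):
        return min(int((value - low) / width), bins - 1)

    pairs = [(bin_index(series[i]), bin_index(series[i + lag]))
             for i in range(len(series) - lag)]
    n = len(pairs)
    joint = Counter(pairs)
    marginal_x = Counter(x for x, _ in pairs)
    marginal_y = Counter(y for _, y in pairs)

    ami = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = marginal_x[x] / n
        p_y = marginal_y[y] / n
        ami += p_xy * math.log2(p_xy / (p_x * p_y))
    return ami


def first_minimum_lag(ami_by_lag):
    """Return the first lag at a local minimum; ami_by_lag[k - 1] is the AMI at lag k."""
    for k in range(1, len(ami_by_lag) - 1):
        if ami_by_lag[k] < ami_by_lag[k - 1] and ami_by_lag[k] <= ami_by_lag[k + 1]:
            return k + 1
    # Fall back to the global minimum when no interior local minimum exists.
    return ami_by_lag.index(min(ami_by_lag)) + 1
```

A typical use would compute `average_mutual_information(demand, lag)` for lags 1 through some maximum and pass the resulting list to `first_minimum_lag`. The estimate is sensitive to the number of bins, so a short monthly series may call for fewer bins than the default.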

### **4.4. Support vector machines (SVM)**

For the SVM models, which do not use genetic operators, the input types remained *M*1, *M*2, or *M*3, and the lag times remained *D*1, *D*2, or *D*3. This study compared the performance of radial basis function (RBF), polynomial (Poly), and linear (Lin) kernels. The kernel name was appended to the input type and lag, e.g., *M*1*D*1RBF, *M*1*D*1Poly, or *M*1*D*1Lin. **Figure 2** shows the structure of the SVM model. The kernel functions (RBF, Poly, or Lin) were used to map the input vectors into a higher-dimensional space.
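The three kernels compared here have standard closed forms, sketched below in plain Python. The hyperparameter defaults (`gamma`, `degree`, `coef0`) are placeholders chosen for illustration; the values used in this study are not given in the text. The label enumeration assumes the same 3 × 3 × 3 combination implied by the naming scheme.

```python
import math


def linear_kernel(x, z):
    """K(x, z) = x . z"""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """K(x, z) = (gamma * x . z + coef0) ** degree"""
    return (gamma * linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """K(x, z) = exp(-gamma * ||x - z||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


KERNELS = {"RBF": rbf_kernel, "Poly": polynomial_kernel, "Lin": linear_kernel}


def svm_model_labels():
    """Enumerate labels such as M1D1RBF, M1D1Poly, and M1D1Lin."""
    return [f"M{a}D{b}{kernel}"
            for a in (1, 2, 3) for b in (1, 2, 3) for kernel in KERNELS]
```

Each kernel returns the inner product of two input vectors in the mapped feature space without computing the mapping explicitly, which is what lets the regression stay linear in that space.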

**Figure 2.** Support vector machine structure.

In this method, the input vectors are considered as supports forming the backbone of the whole model structure through a training process. Given *N* samples {*X*<sub>*k*</sub>, *Y*<sub>*k*</sub>}, *k* = 1, …, *N*, with *X* ∈ ℝ<sup>*m*</sup> and *Y* ∈ ℝ, the SVM regression estimator can be written as:

$$f(X) = W^{T}\phi(X) + b \tag{3}$$

where *X* is an input vector with *m* components, *Y* is its response output variable, *W* is a weight vector, *b* is a bias, and *φ* is a nonlinear transfer function that maps the input vectors into a higher-dimensional space. As these mapped vectors can represent the complex nonlinear regression of the input space, Cortes and Vapnik introduced the convex optimization problem with an *ε*-insensitive loss function [40]:

$$\min_{W,\, b,\, \xi,\, \xi^{*}} \quad \frac{1}{2}\|W\|^{2} + C \sum_{k=1}^{N} \left(\xi_k + \xi_k^{*}\right) \tag{4}$$

$$\text{subject to} \begin{cases} Y_k - W^{T}\phi(X_k) - b \le \varepsilon + \xi_k \\ W^{T}\phi(X_k) + b - Y_k \le \varepsilon + \xi_k^{*} \end{cases} \quad k = 1, 2, \dots, N \tag{5}$$

where *ξ*<sub>*k*</sub> and *ξ*<sub>*k*</sub><sup>\*</sup> are slack variables that penalize training errors exceeding the error tolerance *ε* of the loss function, and *C* is a positive trade-off parameter that determines the weight of the empirical error in the optimization problem. Following previous researchers [41, 42], the optimization was solved using Lagrangian multipliers under the Karush-Kuhn-Tucker (KKT) conditions.
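The role of the slack variables and of *C* can be illustrated by evaluating the objective of Eq. (4) for a fixed candidate model. The sketch below is not a solver and not the procedure used in this study. It computes the smallest non-negative slacks that satisfy the constraints of Eq. (5) for given predictions, then evaluates the objective. The function names and the numbers in the usage lines are hypothetical.

```python
def slack_variables(observed, predicted, eps):
    """Smallest non-negative (xi, xi_star) satisfying Eq. (5) for one sample."""
    xi = max(0.0, observed - predicted - eps)       # under-prediction beyond the tube
    xi_star = max(0.0, predicted - observed - eps)  # over-prediction beyond the tube
    return xi, xi_star


def primal_objective(weights, slacks, C):
    """Eq. (4): 0.5 * ||W||^2 + C * sum(xi_k + xi_k_star)."""
    regularization = 0.5 * sum(w * w for w in weights)
    empirical_error = sum(xi + xi_star for xi, xi_star in slacks)
    return regularization + C * empirical_error


# Hypothetical observations and predictions with tolerance eps = 0.5.
observed = [5.0, 3.0, 4.0]
predicted = [3.0, 5.0, 4.2]
slacks = [slack_variables(y, f, 0.5) for y, f in zip(observed, predicted)]
print(slacks, primal_objective([3.0, 4.0], slacks, C=2.0))
```

Samples whose error stays within *ε* contribute nothing to the sum, so only points outside the tolerance tube affect the fit. A larger *C* penalizes those points more heavily relative to the flatness term.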
