1. Introduction

Climate change significantly affects water availability around the world, and its effect is especially critical in arid and semiarid regions. In addition, urban development, population growth, industrial development and economic expansion increase water scarcity concerns worldwide. Governments therefore have to prepare in advance for the consequences of water problems, especially those concerning drinking water. The efficient operation and management of an urban water supply requires information about future consumption. Simulating hydraulic conditions in pipeline systems under different standards (to improve the reliability of the system) likewise requires an accurate estimate of consumption in a specific period. In other words, "The purpose of water demand forecast is to demonstrate futuristic information available for public water suppliers as they conduct their business" [1, 2]. Short-term (e.g., less than a week), mid-term (e.g., weekly to monthly) and long-term (e.g., greater than monthly) demand forecasts are critical for daily operations and future management of the system. Long-term urban demand forecasts (up to 25 years), mid-term forecasts (up to 2 years) and short-term forecasts (up to 2 days) support water supply planning, pipeline maintenance, and water distribution system optimization (e.g., optimized pumping, pipeline maintenance, minimizing energy and water supply costs, and improving system reliability and water quality), respectively [3–5]. While studies have advanced the understanding of the nonlinear characteristics and high complexity of water consumption factors, further research is still required. Present knowledge of these factors remains limited and depends on (1) accurate estimation and forecasting of water consumption and (2) determination of the type and degree of nonlinearity among the effective variables [6].
Over the past decades, two groups of methods, deterministic and probabilistic, have been proposed to forecast urban water demand. A deterministic approach is based solely on the input variables and their initial conditions, whereas a probabilistic model relies on modeling the uncertainty and randomness of the input variables.


Application of Wavelet Decomposition and Phase Space Reconstruction in Urban Water Consumption Forecasting:…

http://dx.doi.org/10.5772/intechopen.76537



Given the significant challenges and complexity of probabilistic methods, and the fact that preprocessing methods can provide a useful approximation to their probabilistic counterparts, this research focuses on the application of preprocessing to forecast short-term consumption.

2. Literature review

Mid-term water demand forecasting helps water management authorities develop an integrated plan that balances supply and demand in a given period. The water stress of an area can be reduced by accurately estimating drinking water demand [3, 7–9]. Moreover, managers can support water sustainability by drawing on their experience together with accurate and reliable estimates of future demand [10].


Wavelet Theory and Its Applications


Compared with other hydrological forecasting problems (e.g., river discharge, sedimentation and rainfall), water consumption is less strongly influenced by its input factors. The most significant input variables, used in most studies, are temperature, precipitation, and past demand values [11–13]. Two types of variables affect water demand: climatic (e.g., temperature, relative humidity and rainfall) and socioeconomic (e.g., population and income) [14]. Climatic variables can affect short-term and mid-term values, while socioeconomic variables are useful for long-term forecasting [11, 15, 16]. However, only a few studies have investigated the impact of climatic variables on demand forecasting [17–19]. The literature lists various deterministic and probabilistic techniques for forecasting urban drinking water demand. In general, conventional methods were prevalent for understanding the determinants of water demand [20–22]; these methods assume linear relationships between the effective variables and water demand, although the relationship is nonlinear. The studies can be broadly divided into two categories: physically based and black-box models. Without analyzing the physical processes, black-box models apply artificial intelligence techniques (artificial neural networks, genetic programming, etc.), fuzzy-based techniques (fuzzy logic, neuro-fuzzy, etc.), soft computing (support vector machines, etc.), and nonlinear deterministic techniques (nonlinear local approximation, etc.) to identify the relationship between the input and output variables.
The methods reported include conventional regression models [3], autoregressive integrated moving average (ARIMA) [23], autoregressive integrated moving average with explanatory variable (ARIMAX) [24, 25], artificial neural networks (ANN) [9, 26–29], combinations of conventional methods and ANN [11, 12, 30], feedforward neural networks [12, 31], general regression neural networks [32, 33], support vector machines [9, 14, 34–37], gene expression programming [14, 38], fuzzy regression [39], neuro-fuzzy systems [40, 41], Fourier analysis [4], hybrid models (e.g., combined wavelet-ANN and wavelet-GEP) [13, 38], and the fuzzy cognitive map learning method [42, 43]. This research applies a probabilistic ANN, a GEP approach and a conventional method, multiple linear regression (MLR), to compare the performance of the methods with and without phase space reconstruction and wavelet decomposition in the test case.
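To make the idea of wavelet decomposition as a preprocessing step concrete, the following is a minimal sketch of a single-level Haar transform. It is an illustration under stated assumptions, not the procedure used in this chapter: the Haar wavelet, the function names `haar_dwt` and `haar_idwt`, and the sample demand values are all hypothetical choices made here. The sketch splits a series into a smooth approximation and a fluctuating detail component, each of which could then be passed to a forecasting model.

```python
import numpy as np


def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient arrays, each half the
    length of the input. The input length must be even.
    """
    x = np.asarray(signal, dtype=float)
    if x.size % 2 != 0:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # smooth, low-frequency part
    detail = (even - odd) / np.sqrt(2.0)  # fluctuating, high-frequency part
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt and recover the original signal."""
    approx = np.asarray(approx, dtype=float)
    detail = np.asarray(detail, dtype=float)
    x = np.empty(approx.size * 2)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


# Hypothetical daily demand values (arbitrary units), for illustration only.
demand = np.array([52.0, 54.0, 60.0, 58.0, 70.0, 66.0, 61.0, 59.0])
approx, detail = haar_dwt(demand)
reconstructed = haar_idwt(approx, detail)
```

Because the transform is orthogonal, `reconstructed` matches `demand` up to floating-point error, so no information is lost by modeling the two components separately.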

Chaotic behavior has been investigated in various systems [44–49]. A chaotic system is deterministic, yet minor changes in its initial conditions can lead to entirely different behavior in later periods [44]. Chaos theory has been used successfully to understand the nonlinear dynamics of such systems. Models based on chaos theory and nonlinear dynamics better represent the dynamic behavior of observed data [50]. In general, chaos theory improves the understanding of nonlinear dynamics [51]. Ng et al. applied chaos theory to a noisy discharge time series from the Saugeen River (Canada) [52]. They argued that noise in a time series not only complicates the data but also yields a high embedding dimension. Sivakumar et al. used the concept of nonlinear dynamic behavior to classify rivers from the perspective of phase-space reconstruction [53].
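Phase-space reconstruction and the embedding dimension mentioned above are commonly realized through time-delay embedding, in which each reconstructed state vector collects values of the series separated by a fixed lag. The sketch below shows that construction. The function name `delay_embed`, its parameters `dim` and `lag`, and the toy input are hypothetical; how the lag and embedding dimension should be chosen for real data is not addressed here.

```python
import numpy as np


def delay_embed(series, dim, lag):
    """Time-delay embedding of a scalar series.

    Row t of the result is [x(t), x(t + lag), ..., x(t + (dim - 1) * lag)].
    """
    x = np.asarray(series, dtype=float)
    n_vectors = x.size - (dim - 1) * lag
    if dim < 1 or lag < 1 or n_vectors < 1:
        raise ValueError("series too short for this dim and lag")
    columns = [x[i * lag : i * lag + n_vectors] for i in range(dim)]
    return np.column_stack(columns)


# Toy input so that the structure of the embedding is easy to read.
series = np.arange(10.0)
embedded = delay_embed(series, dim=3, lag=2)
# embedded[0] is [0, 2, 4] and embedded[-1] is [5, 7, 9]
```

Each row of `embedded` is one point in the reconstructed phase space, and the rows can serve directly as input vectors for a forecasting model.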

Genetic programming (GP) and gene expression programming (GEP) are heuristic algorithms based on Darwin's theory of evolution [53]. GP has been employed to fill in missing data in wave records and for forecasting [55–57]. Aytek and Kishi used a GP model to estimate suspended sediment in the Tongue River (United States) and found GP more accurate than sediment rating curves and multiple linear regression (MLR) [58]. Ghorbani et al. investigated chaos theory, artificial neural networks (ANN) and GEP for estimating suspended sediment in the Mississippi River (United States) [59]. GEP has an advantage over GP in that its results are easier to interpret through the expression tree that accompanies the output. GEP also performs better at extracting a mathematical equation that relates the input and output variables [59–61]. Nasseri et al. developed a hybrid model combining the extended Kalman filter with genetic programming for monthly water demand forecasting in Tehran [62]. Shabani et al. proposed a new rationale and a novel technique for forecasting water demand, using lag times to feed the determinants of water demand into GEP and SVM models [14]. Yousefi et al. implemented sophisticated mathematical models to forecast the water demand of the City of Kelowna at a monthly time scale. Their study assessed the performance of GEP using wavelet decomposition [38].
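The idea of feeding lagged values into a model can be illustrated with the simplest of the methods named in this review, multiple linear regression. The sketch below is an assumption-laden example, not the configuration of any cited study: the helper names `make_lagged`, `fit_mlr` and `predict_mlr`, the choice of two lags, the split point, and the synthetic weekly-periodic series are all hypothetical. It builds lagged inputs, fits the regression on the earlier part of the series, and evaluates one-step-ahead predictions on the later part.

```python
import numpy as np


def make_lagged(series, n_lags):
    """Build (X, y) where each row of X holds the n_lags values preceding y."""
    x = np.asarray(series, dtype=float)
    if n_lags < 1 or x.size <= n_lags:
        raise ValueError("series too short for the requested number of lags")
    X = np.column_stack([x[i : x.size - n_lags + i] for i in range(n_lags)])
    y = x[n_lags:]
    return X, y


def fit_mlr(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... ; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply fitted coefficients to new lagged inputs."""
    return coef[0] + X @ coef[1:]


# Synthetic series with a 7-step cycle (illustration only, not real demand).
t = np.arange(200)
series = 50.0 + 10.0 * np.sin(2.0 * np.pi * t / 7.0)

X, y = make_lagged(series, n_lags=2)
split = 150  # chronological split: earlier rows calibrate, later rows test
coef = fit_mlr(X[:split], y[:split])
pred = predict_mlr(coef, X[split:])
rmse = float(np.sqrt(np.mean((pred - y[split:]) ** 2)))
```

The split is chronological rather than random, so the model is evaluated only on values that come after its calibration period. A noise-free sinusoid is exactly predictable from two lags, so the error here is negligible. Real consumption data would not behave this way.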

Among the variety of methods examined, artificial neural networks (ANNs) have been applied over various time horizons to a wide variety of hydrological problems. The main reason for the frequent use of ANNs is their ability to capture the complex relationships in time series, even when little data is available to train the models. Therefore, most studies in the area of water demand apply ANNs to forecast short-, mid- and long-term demand values [13, 30, 31].

In their literature review, Nourani et al. noted the dominant application of wavelet-based models [63]. Moreover, Labat reported that wavelets improve model performance [64]. Wavelets have therefore drawn researchers' attention in areas such as denoising [65]; streamflow and water resources [66]; evaporation and climatic models [67]; groundwater level modeling [68]; and water demand forecasting [13, 38]. In most of these studies, hybrid wavelet-ANN models were more accurate than conventional models without a wavelet component (e.g., ARIMA, MLR and ANN).

The objectives of this study are four-fold: (1) to investigate the chaotic behavior of the case data and find the proper lag time; (2) to determine the accuracy of one-day-ahead forecasts with various input combinations; (3) to study whether phase space reconstruction (PSR) based on the optimum embedding dimension improves the accuracy of the models; and (4) to apply wavelet decomposition with five different transform functions in combination with all the mentioned models, with and without PSR.

… specific period is the first step for any management plan beyond urban drinking water supply and allocation. This chapter investigates the first step of every long-term plan development in urban drinking water, as discussed below. Water utility management needs long-term forecasts of drinking water demand for several purposes: (1) water distribution network design; (2) supply and consumption management; (3) efficient application of the distribution network; (4) pipeline pressure management; (5) network development; (6) optimizing the cost of water supply and network maintenance.

3.1.2. Study area

The present research selected the water consumption of the City of Kelowna (BC, Canada) as the test case. The City of Kelowna water utility serves approximately 65,000 residents. The Poplar Point, Eldorado, Cedar Creek and Swick Road pump stations serve 99% of the population of the area [69]. A few areas within the boundary, designated "Future City", have no population yet, but the land development plan provides for water servicing there. Water quality, pump operation, reservoir water levels and pipeline pressure are monitored using Supervisory Control and Data Acquisition Software (SCADA).

3.1.3. Review of data records

Hourly water demand for the above-mentioned stations was made available by the City of Kelowna utility. The data cover 6 years (approximately 52,464 hourly consumption values), from 1 January 2011 to 30 December 2016. Figure 1 shows the variation of daily and monthly water demand and the consumption pattern. Of the 6 years of daily water demand samples (2186 points), the first 5 years (1882 points) are used to calibrate the models and the last year (365 points, 2016) is used as the test period. Table 1 shows the characteristics of the dataset in the test case.

Figure 1. Time series plot of (a) daily water demand; (b) average of the consumption pattern in 24 h within 6 years.

3.2. Phase space reconstruction (PSR)

Given a set of physical variables and their interactions, the dynamics of a system (e.g., water consumption) can be defined by a single point moving on a trajectory, where each of its points
