#### **4.2. Regional Peaking Responses Versus System Peaking Response**

**Figure 2.** Hourly Aggregate Electricity Demands of Load Centres from January 1, 2005 to December 31, 2008.

Demand for electricity is not static and varies according to a multitude of variables. Load centres peak at certain periods during the day, usually conforming to business-cycle and weather-related influences. The system peak response is the aggregate peak of the values of all the load centres within a control area, which may occur at a different time from the peak response of an individual load centre. A significant difference in peak response between regions and the system constitutes evidence for a diverse load environment. This evidence can provide motivation for the development of multi-region models.

To determine whether the observed load swings of the studied load centres occurred within the same time frame as the aggregate system response, the coincidence factor C [7, 9] is adopted, which is defined as

$$C = \frac{\sum_i P_i}{P_A} \qquad (1)$$

Where $P_i$ is the peak load of a single load centre, and $P_A$ is the system peak load.

The coincidence factor describes the degree of discrepancy between regional peaking responses and the system peaking response. If C is greater than 1 and continues to appreciate across an increasing timeline, the load centres peak at different times than the aggregate system load, which provides evidence for the existence of load diversity among the regions. If C is greater than 1 but remains consistent, an aggregate model can be used to accurately predict load swings. In a consistent or non-diverse system, C will oscillate about 1, and a multi-region model is likely to be of little value in predicting load swings.
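To make the computation of Eq. (1) concrete, the following sketch (not code from the study) evaluates C for a single window of hourly data; the array layout and class name are assumptions made for illustration only.

```java
// Minimal sketch: coincidence factor C of Eq. (1) for one evaluation window.
public final class CoincidenceFactor {

    /**
     * @param regionalLoads hourly loads per load centre; regionalLoads[r][h] is the
     *                      load of centre r at hour h (same hours for all centres)
     * @return C = (sum of individual peak loads P_i) / (system peak load P_A)
     */
    public static double compute(double[][] regionalLoads) {
        int hours = regionalLoads[0].length;
        double sumOfPeaks = 0.0;                        // sum_i P_i
        double[] systemLoad = new double[hours];

        for (double[] region : regionalLoads) {
            double peak = Double.NEGATIVE_INFINITY;
            for (int h = 0; h < hours; h++) {
                peak = Math.max(peak, region[h]);
                systemLoad[h] += region[h];             // aggregate load of the control area
            }
            sumOfPeaks += peak;
        }

        double systemPeak = Double.NEGATIVE_INFINITY;   // P_A
        for (double load : systemLoad) {
            systemPeak = Math.max(systemPeak, load);
        }
        return sumOfPeaks / systemPeak;                 // C >= 1; values well above 1 indicate load diversity
    }
}
```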

**5. Load Forecasting Models**

Three load forecasting models were developed: (1) Similar Day Aggregate Load Model, (2) ANN Aggregate Load Model, and (3) ANN Multi-Region Load Model. The similar day aggregate load model provides the industry benchmark. The ANN aggregate load model serves as the baseline to show the performance enhancement achieved by the ANN approach. The ANN multi-region load model demonstrates the performance enhancement achieved by the multi-region approach. All models were evaluated according to the same performance evaluation methods, which will be described in section 6. The models were tested with the same case study, which will be presented in section 7. A comparison of the characteristics of the models is presented in Table 4 and a comparison of the input variables to the models is presented in Table 5. The research methodology and modelling process for each of the three models will be described in this section.



| Model Name | Model Type | Methodology | Model Output | Training Type |
|---|---|---|---|---|
| Aggregate Similar Day Model | Aggregate | Similar Day | Aggregate electrical demand | Knowledge Discovery in Databases |
| Aggregate ANN Model | Aggregate | ANN | Aggregate electrical demand | Supervised training |
| Multi-Region ANN Model | Multi-Region | ANN | Aggregate electrical demand | Supervised training |

**Table 4.** Summary of Model Properties and Methodologies.


| Model | Input | System | Area01 | Area02 | Area03 | Area04 | Area05 | Area06 | Area07 | Area08 | Area09 | Area10 | Area11 | Area12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Aggregate Similar Day | Past Hour Load | X | | | | | | | | | | | | |
| | Air Temp. | | X | X | | | | | | | | | | |
| | Rel. Humidity | | | | | | | | | | | | | |
| | Wind Speed | | | | | | | | | | | | | |
| Aggregate ANN | Past Hour Load | X | | | | | | | | | | | | |
| | Air Temp. | | X | X | | | | | | | | | | |
| | Rel. Humidity | | X | X | | | | | | | | | | |
| | Wind Speed | | X | X | | | | | | | | | | |
| Multi-Region ANN | Past Hour Load | | X | X | X | X | X | X | X | X | X | X | X | X |
| | Air Temp. | | X | X | X | X | X | X | X | X | X | X | X | X |
| | Rel. Humidity | | X | X | X | X | X | X | X | X | X | X | X | X |
| | Wind Speed | | X | X | X | X | X | X | X | X | X | X | X | X |

**Table 5.** Summary of Model Inputs.

#### **5.1. Development of a Similar Day Model**

The domain expertise for this research project was drawn from the grid control operating staff of the Saskatchewan utility, including the power system supervisors, capacity management engineers, and system operators. They were consulted to identify load patterns, select predictive parameters, and assist in development and pre-processing of both the load and weather datasets.

The load history of the control area and temperature variables from Area01 and Area02 are the generalized inputs to the automated similar day model. These variables are used as an index for searching a Supervisory Control and Data Acquisition (SCADA) database to obtain the best-fit days, weighted according to the temporal distance from the forecasted day. Most similar day models are aggregated load models, driven by one or more regional temperature forecasts, typically corresponding to the weather of the largest load centres. They do not require training data in the sense of a model that learns automatically; instead, the system combines historical data with expert predictions. Figure 1 illustrates the traditional forecast procedure based on the similar day model.

After consultation with experts including system operators and capacity management engineers, a similar day-based load forecasting model was developed. An examination of the hourly observations of system load over the period of 2005 – 2010 revealed that the data patterns can be summarized into four day types. The four day types, with their associated electric demand behaviour and external influencing variables, were represented in a parameterized rule base, shown in Table 6. This rule base can be used in conjunction with a database of parameter values derived from the SCADA database to obtain the best-fit days, weighted according to the temporal distance from the target day for which the load is predicted. The weather variables were further subjected to sensitivity analysis to quantify each parameter's influence on the aggregate load. The sensitivity analysis served to confirm the significance of the parameters identified by the experts, and the less significant ones were omitted from the input dataset.
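As an illustration of the best-fit day search described above, the hypothetical sketch below ranks candidate days of the same day type by a weight that decays with temporal distance from the target day. The weighting function, class names, and data structures are assumptions for illustration and are not taken from the implemented system; any monotonically decreasing weighting would fit the description equally well.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of best-fit day selection: candidate days of the same day type
// are ranked by a weight that decays with temporal distance from the target day.
public final class SimilarDaySelector {

    public record CandidateDay(LocalDate date, double[] hourlyLoad) {}

    /** Weight a candidate inversely to its distance (in days) from the target day. */
    static double weight(LocalDate target, LocalDate candidate) {
        long distance = Math.abs(ChronoUnit.DAYS.between(candidate, target));
        return 1.0 / (1.0 + distance);
    }

    /** Return the k best-fit days, most heavily weighted first. */
    public static List<CandidateDay> bestFit(LocalDate target, List<CandidateDay> sameTypeDays, int k) {
        return sameTypeDays.stream()
                .sorted(Comparator.comparingDouble((CandidateDay d) -> weight(target, d.date())).reversed())
                .limit(k)
                .toList();
    }
}
```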


| Module | Functions |
|---|---|
| Data Module | Communicates with SCADA database, updating data entries and database filtering; updates and processes knowledge databases; responds to data requests issued by controller module with ADO Recordset; and pre-filters data requests initiated by controller module. |
| Controller Module | Accepts input from user; provides interface between user and data module; instructs data module and view to perform actions based on user input; initiates data requests to the data module; retrieves data from the data module; negotiates weighting of best-fit days with the data module through parameterized rule base; translates user forecast queries into well-formatted SQL, which is then sent to the data module; transforms data from data module and sends results to the view module; and creates, opens, closes, and deletes projects. |
| View Module | Receives data and commands from the controller module; modifies the user interface to accommodate new data or applications requested by the user; and acts as the graphical user interface, while hiding non-essential information from the user. |

**Table 6.** Functions of Similar Day System Modules.


The input variables filtered the dataset to select the normalized aggregate load, which was further modified to correspond with the user-defined load pattern logic. The load pattern logic was generalized from the parameterized rule base. The implemented similar day model consists of three modules: data, control, and view. The functions of the similar day modules are listed in Table 6, while Figure 4 provides a screenshot of the application as viewed by the user through the view module. An example of the similar day model rule base is provided in Figure 5.


**Figure 4.** Screenshot of Similar Day Load Forecast Application.

The data module leverages a database of normalized aggregated loads, pre-filtered to correspond with identified hourly and weekday groups. The module encapsulates the data storage and interface between the application and the database. The responsibilities of this module are: responding to data requests issued by the control module, updating and processing data entries within the database, and performing data filtering.

The control module provides the interface between the user and the data. The controller accepts input from the user and instructs the data and view modules to perform actions based on those inputs. Its responsibilities include: initiating data requests to the data module from the user, retrieving data from the data module, and outputting the data to the view module. The controller translates the user's input into a well-formatted SQL query, which is then applied to the database. The database responds with an ActiveX Data Object (ADO) Recordset, which is then translated by the control module and output to the view module. The control module also creates, opens, closes, and deletes projects, including load pattern changes instigated by the user.
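The implemented controller works with ADO Recordsets; as a rough, hypothetical analogue of the query-translation step described above, the JDBC sketch below builds a parameterized SQL query from a user's forecast request. The table and column names are invented for illustration and do not reflect the actual database schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

// Rough JDBC analogue of the controller's query translation (the described system uses ADO).
// Table and column names are hypothetical.
public final class ForecastQueryController {

    public static List<Double> loadsForDayType(Connection conn, String dayType,
                                               int startHour, int endHour) throws Exception {
        String sql = "SELECT hour, normalized_load FROM aggregate_load "
                   + "WHERE day_type = ? AND hour BETWEEN ? AND ? ORDER BY hour";
        List<Double> loads = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, dayType);
            ps.setInt(2, startHour);
            ps.setInt(3, endHour);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    loads.add(rs.getDouble("normalized_load"));  // pre-filtered, normalized load
                }
            }
        }
        return loads;   // handed on for display by the view module
    }
}
```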

The view module receives data and commands from the control module, which directs the view module to modify the user interface to accommodate new data or applications requested by the user. All data transactions outside of this module are opaque to the user. The view module constitutes the graphical user interface of the aggregate similar day forecasting system, outputting data to the user for consideration.

In order to evaluate and modify the pattern set chosen by the user, a training dataset was used as initial testing data for model tuning. Initial results obtained during preliminary testing approximated those of the experts. However, to further improve predictive accuracy, experts can be given the option to modify the model if a weather phenomenon such as a heat wave or a cold snap is forecasted. When the pattern set has been configured and stored in the data module, the user can view the results of the pattern set against the test set.

**Figure 5.** Example of Similar Day Model Represented in a Decision Tree Structure.

#### **5.2. Limitations of a Similar Day Model**


While extensively used, similar day models are susceptible to the following limitations:

**•** As they are based on expert knowledge, similar day models may be difficult to develop given the possibility of expert contradictions and bias [10];

**•** Similar day models rely on the expert to be correct in the knowledge engineering and training stages;

**•** Linear models tend to be produced by similar day models that do not account for dynamic environments [10];

**•** The prediction capabilities of a similar day model are only as good as the historical data and degree of specificity in the operators' reasoning knowledge, which has been captured and represented in the similar day model; and

**•** For the same reason, similar day models tend to be restricted to aggregate models due to the extensive knowledge acquisition required for developing a multi-region model [10].


#### **5.3. Development of ANN Models**

In order to deal with the considerable load diversity presented in Table 3 and Figure 3, as well as the weather diversity presented in Tables 1 and 2 and Figure 1, a new modelling structure was developed, consisting of individual load centre models fed by region-specific weather variables. The multi-region model is used to forecast the regional loads individually; the results may then be aggregated to forecast the system load.
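A minimal sketch of this multi-region structure is given below. The RegionalModel interface and the weather-map representation are hypothetical and serve only to show how individual load-centre forecasts would be aggregated into the system forecast.

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch: one model per load centre, driven by that region's weather,
// with the regional forecasts summed into the system forecast.
public final class MultiRegionForecaster {

    public interface RegionalModel {
        double forecastLoad(Map<String, Double> regionalWeather); // e.g. temperature, humidity, wind speed
    }

    public static double forecastSystemLoad(List<RegionalModel> models,
                                             List<Map<String, Double>> weatherByRegion) {
        double systemLoad = 0.0;
        for (int r = 0; r < models.size(); r++) {
            systemLoad += models.get(r).forecastLoad(weatherByRegion.get(r));
        }
        return systemLoad;  // aggregate of the individual load-centre forecasts
    }
}
```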


The ANN models were all created and evaluated using the Weka data mining software package. Weka (Waikato Environment for Knowledge Analysis), version 3.7.5, is a data mining tool written in Java and produced by the University of Waikato. It is a collection of machine learning algorithms, statistical tools, and data transforms for data mining tasks, including: data pre-processing, classification, regression, clustering, association rules, and visualization.

**Figure 6.** Topology of Aggregate ANN Model.

**Figure 7.** Topology of Multi-Region ANN Model.

The ANN models utilize three weather variable categories: ambient air temperature, relative humidity, and wind speed. Each of the implemented ANN models has a unique topology, which was manually configured for each model using the Weka Perceptron GUI. Figure 6 shows the topology of the Aggregate ANN model and Figure 7 shows the topology of the Multi-Region ANN model.

Each ANN model utilized a common training history and the same weather inputs; however, the Aggregate ANN model only used weather variables from the two largest regions, whereas the multi-region model used weather variables from all twelve regions.

In this research, a multi-layer (three-layer) perceptron classifier was chosen for the ANN model. This network architecture was chosen due to its conceptual simplicity, computational efficiency, and its ability to train by both supervised and unsupervised learning. The classifier uses backpropagation, binary classification, and a sigmoid activation function. The Aggregate ANN model used 7 inputs, while the multi-region model used a total of 48 inputs. The ANN model inputs are summarized in Table 5.

After the architecture and topology of the neural networks were determined, optimization of model coefficients was achieved by systematically varying model parameters and observing the response of each network. Both the Aggregate ANN model and the Multi-Region ANN model provided the single output of the forecasted system load, and the inputs were the conditional variables of weather and system load. Through a process of trial and error, the model configurations were updated until optimum values were realized.
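For readers who prefer a programmatic view of this workflow, the sketch below trains and evaluates a Weka MultilayerPerceptron on an assumed ARFF training file. The file names, hidden-layer size, and learning parameters are illustrative only; the chapter's topologies were actually configured by hand through the Weka Perceptron GUI.

```java
import weka.classifiers.Evaluation;
import weka.classifiers.functions.MultilayerPerceptron;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Sketch of training and evaluating an aggregate load MLP in Weka.
// File names and parameter values are hypothetical.
public class LoadForecastAnn {
    public static void main(String[] args) throws Exception {
        // Training data: weather and past-hour-load attributes, system load as the last attribute.
        Instances train = new DataSource("aggregate_load_train.arff").getDataSet();
        train.setClassIndex(train.numAttributes() - 1);      // forecast target: system load

        MultilayerPerceptron mlp = new MultilayerPerceptron();
        mlp.setHiddenLayers("10");     // one hidden layer; size found by trial and error
        mlp.setLearningRate(0.3);
        mlp.setMomentum(0.2);
        mlp.setTrainingTime(500);      // training epochs
        mlp.buildClassifier(train);

        // Evaluate on a held-out test set.
        Instances test = new DataSource("aggregate_load_test.arff").getDataSet();
        test.setClassIndex(test.numAttributes() - 1);
        Evaluation eval = new Evaluation(train);
        eval.evaluateModel(mlp, test);
        System.out.println(eval.toSummaryString());
    }
}
```

For numeric targets, Weka's evaluation summary reports the correlation coefficient, MAE, RMSE, RAE, and RRSE, which align with the five measures used for benchmarking in section 6.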

#### **5.4. Limitations of an ANN Model**


Despite their learning capabilities, ANNs are subject to a number of limitations, including:

**•** The size of the training set tends to be proportional to the accuracy of predictions, and a large training set is required. Since the network can become over-trained, care must be taken by the designer to tune the network and terminate the training at appropriate moments.

**•** The training set must cover the range of all possible events which the network is expected to predict. Common events may dampen the response to critical, yet rare, scenarios. Yet in order to respond appropriately to these critical scenarios, transformations of the dataset may be required. Unfortunately, the insufficient exposure to scenarios is only revealed after the network has been trained and tested. Therefore the designer must be cognizant of the contents of the training set.

**•** Benchmarking efforts for neural network performance are difficult since the model may be optimized to locate local, rather than global, minima/maxima [8].

**•** Network layers and connections are often implemented on a trial and error basis. While domain knowledge is an important aspect of any modelling efforts, neural networks often expose unconventional connections which lead to significant performance enhancements. Linear connections are often redundant when using an ANN [8].


The ANN models developed during this research were subject to the aforementioned limitations; however, efforts were taken to mitigate these impediments. The training set met or exceeded the size of similar STLF ANN models [1, 3, 7, 9, 10] and contained a number of scenarios, both common and diverse with respect to weather conditions and load response. The benchmarking process considered a case study of the 2011 year across all hours and weekdays, which exceeded the evaluation events used in similar STLF ANN models [1, 3, 7, 9], and utilized five statistical measures for model benchmarking, further described in section 6. Finally, a systematic analysis of model optimization was enacted: parameters were changed methodically and performance was noted. Ultimately, the best modelling parameters were chosen based upon the analytical review of the model configurations.

**6. Performance Evaluation Methods**

The correlation value remains untransformed, even if the output is scaled. Its assessment tracks the behaviour of the model, rather than its error [11]. Thus, a large correlation value is desirable, whereas a low error value is also desirable.

Correlation is defined as:

$$\mathrm{Correlation} = \frac{S_{PA}}{\sqrt{S_P S_A}}, \quad \text{where} \quad S_{PA} = \frac{\sum_i (p_i - \bar{p})(a_i - \bar{a})}{n-1}, \quad S_P = \frac{\sum_i (p_i - \bar{p})^2}{n-1}, \quad S_A = \frac{\sum_i (a_i - \bar{a})^2}{n-1} \qquad (2)$$

Where $a_i$ is the actual value; $p_i$ is the predicted value; $\bar{a}$ and $\bar{p}$ are the mean values of the actual and predicted; and n is the total number of values predicted.

Accuracy performance of the three load forecasting systems was established by comparing MAE and RMSE results. MAE is the magnitude of individual errors, irrespective of their sign. MAE does not exaggerate the effect of outliers, treating all errors equally according to their magnitude. MAE, however, does mask the tendency of a model to over or under predict values.

MAE is defined as:

$$MAE = \frac{\sum_i |p_i - a_i|}{n} \qquad (3)$$

Where $a_i$ is the actual value; $p_i$ is the predicted value; and n is the total number of values predicted.

RMSE, like MAE, does not exaggerate large errors as is the case in squared error and root squared error measurements. By computing the square root in RMSE, the dimensionality of the prediction is reduced to that of the predictor [11]. These two methods equally consider all prediction errors.

RMSE is defined as:

$$RMSE = \sqrt{\frac{\sum_i (p_i - a_i)^2}{n}} \qquad (4)$$

Where $a_i$ is the actual value; $p_i$ is the predicted value; and n is the total number of values predicted.

In order to evaluate the consistency of the predictions, RAE and RRSE are utilized. RAE normalizes the total absolute error of the predictor against the average results to provide a distance-weighted result.