#### **2.1. Probability distribution function and data sampling**

To generate sample data for random variables, their random behavior should be simulated so that the model follows the pattern of the historical data as closely as possible. Specifying the pattern of a random variable requires its PDF. There are two classes of methods for determining the PDF of a random variable: parametric and non-parametric methods [15]. In parametric methods, the data samples are fitted to one of the well-known standard PDFs (such as Normal, Beta, or Weibull) so that the PDF matches the existing data as closely as possible. The values of the PDF parameters are evaluated using Goodness of Fit (GoF) methods such as the Kolmogorov-Smirnov test [16]. Non-parametric methods, on the other hand, do not employ specific well-known PDF models.

Parametric methods have been used in several studies that simulate probabilistic models of wind and solar data, as in [6, 12]. Similarly, the authors in [9] employ a fixed experimental equation to represent the PDF of wind data. However, this approach to PDF estimation can bring about some defects as follows:


Based on the aforementioned facts, this study aims to obtain the most accurate distribution model by taking advantage of Kernel Density Estimation (KDE), which is categorized as a non-parametric method.

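**Figure 1.** Solar irradiation data histogram for samples at 11 AM during a season and the parametric and nonparametric PDF fits to the data (axes: data in p.u. versus density; curves: empirical histogram, Beta PDF, Normal PDF, Rayleigh PDF, and non-parametric PDF)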

#### **2.2. Kernel Density Estimation (KDE)**

The simplest and most frequently used nonparametric method is a histogram of the historical samples. In brief, the range covered by the samples is divided into equal sections called "bins". For each bin, a sample value is taken as the kernel of that bin, and a number of rectangular blocks equal to the number of samples in the bin, each with unit area, are placed on it. The result is a discrete curve that roughly describes the probability distribution of the samples. However, the overall curve depends strongly on the bin size and the bin edges, because changing the bin size changes the number of samples in each bin [17]. The resulting curve is also highly ragged. The KDE method was introduced to overcome these drawbacks. In this method, each sample is treated as the kernel of its own bin: a block of unit width and height 1/*N* (the inverse of the number of samples) is placed on each sample value, so that the blocks together have unit area. Accumulating these blocks builds the PDF curve. To smooth the curve and remove the dependence on the block width, continuous kernels such as Gaussian or Cosine are used along with kernel smoother methods [18]. Figure 2 demonstrates a KDE with a Gaussian kernel. The overall PDF is obtained by:

$$\hat{f}(x) = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{h} K\left(\frac{x - X_i}{h}\right) \tag{1}$$

Improved Stochastic Modeling: An Essential Tool for Power System Scheduling in the Presence of Uncertain Renewables http://dx.doi.org/10.5772/45849 105

**Figure 2.** Demonstration of a KDE with Gaussian kernel


104 New Developments in Renewable Energy


where *h* is the smoothing factor. In the Gaussian case, the kernel evaluated for each sample is given by:

$$K\left(\frac{x - X_i}{h}\right) = \frac{1}{\sqrt{2\pi}} e^{-\frac{(x - X_i)^2}{2h^2}} \tag{2}$$

In the present study, the KDE method is used to obtain the PDF of the seasonal wind speed and solar irradiation data for each hour of the day. The method is implemented using the *dfittool* function in MATLAB.
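As a rough illustration of Gaussian kernel density estimation, the following sketch evaluates the estimator in plain Python. It is not the MATLAB workflow used in the study. The data are synthetic, and the rule-of-thumb bandwidth is an assumption, since the chapter does not state how *h* is selected.

```python
import math
import random


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2*pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h):
    """Kernel density estimate at x: (1 / (N*h)) * sum_i K((x - X_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth 1.06 * std * N^(-1/5) (an assumed choice of h)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)


# Hypothetical stand-in for 90 irradiation samples in per unit (synthetic data).
rng = random.Random(0)
samples = [rng.betavariate(5, 2) for _ in range(90)]
h = rule_of_thumb_bandwidth(samples)

# Evaluate the density on a grid over [-0.5, 1.5] and approximate its integral.
step = 0.01
grid = [-0.5 + step * k for k in range(201)]
density = [kde(x, samples, h) for x in grid]
area = sum(density) * step  # should be close to 1 for a valid PDF estimate
```

A larger *h* gives a smoother but flatter curve, while a smaller *h* follows individual samples more closely. This is the same trade-off as the bin-size dependence of the histogram, but with a continuous result.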

#### **2.3. Correlation of random variables**

Samples of a single random variable can be generated simply by Monte Carlo simulation of its PDF. However, this is more cumbersome for a group of random variables that may have an underlying dependence or correlation. Neglecting the correlation yields an inaccurate multivariate PDF and, in turn, irrelevant and deviated samples.

There are several correlation coefficients to quantify the correlation among a number of random variables, among which the most famous one is the Pearson coefficient:

$$\rho = \frac{\text{cov}(i, j)}{\sqrt{\text{cov}(i, i) \, \text{cov}(j, j)}} \tag{3}$$

where *cov* is the covariance function. This analysis reflects only the *linear* correlation among the random variables. Nevertheless, in many cases random variables exhibit nonlinear correlation and more complicated relationships among themselves, especially when the PDFs of the variables do not follow similar patterns [19]. With a large number of random variables, neglecting the nonlinear relation causes the output samples to deviate more significantly from what they should be. For instance, the solar irradiation in two different regions of a power system may not be linearly related, although the two are not completely independent. The nonlinear correlation concept is presented in Figure 3.
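The limitation of the Pearson coefficient can be illustrated with a small sketch. It uses synthetic data (not data from the study) and compares an exactly linear relation with an exactly quadratic one.

```python
import math


def covariance(x, y):
    """Sample covariance of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def pearson(x, y):
    """Pearson coefficient: cov(x, y) / sqrt(cov(x, x) * cov(y, y))."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Synthetic data: x is symmetric around zero.
x = [k / 10.0 for k in range(-50, 51)]
y_linear = [2.0 * v + 1.0 for v in x]  # exact linear dependence
y_square = [v * v for v in x]          # exact but nonlinear dependence

rho_linear = pearson(x, y_linear)  # close to 1
rho_square = pearson(x, y_square)  # close to 0 although y is fully determined by x
```

In the second case the coefficient is near zero even though one variable is a deterministic function of the other, which is why a purely linear measure can miss strong dependence.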

**Figure 3.** A representation of linear vs. nonlinear correlation between two random variables.

In the problem under study, i.e., power system scheduling in the presence of uncertain renewables, we consider multiple wind farms and solar farms throughout the power system. Solar power, wind power, and load demand are three distinct stochastic processes. Each can be discretized into 24 random variables representing the 24 hours of the day.

The random variables within each random process have their own temporal relation, which can be modeled by time-series prediction methods [14]. However, there may also be spatial correlation among the random variables from different processes, although it might seem unlikely. Here, we deal with the tangible nonlinear correlation between the hourly random variables of a wind farm and those of another wind farm located in a nearby region, as well as between a PV farm and another PV farm located in a nearby region. The interested reader may examine other possible dependence structures between random variables or processes. Taking these relations into account makes the models and solutions more accurate. Figure 4 shows how neglecting the correlations and directly using single-variable PDFs to generate samples for a multivariate process may lead to model malfunction. For two random variables with similar Log-normal distributions and a Pearson correlation of 0.7 (with diagonal covariance matrix), 1000 samples have been generated assuming total independence (Figure 4 (a)) and linear dependence (Figure 4 (b)). In Figure 4 (b), the X1 values tend to be closer to the X2 values than in Figure 4 (a), especially in the upper range.

To describe the correlations between random variables, including nonlinear correlations, a method named Copula can be employed, which is described in the following section.

#### *2.3.1. Copula method*

The Copula concept measures the correlation between random variables or samples. Embrechts & McNeil introduced Copula functions for application in financial risk and portfolio assessment problems [20]. Recently, this method has received much attention in statistical modeling and simulation problems.

Copulas provide a way to generate distribution functions that model correlated multivariate processes and describe the dependence structure between their components. The cumulative distribution function of a vector of random variables can be expressed in terms of the marginal distribution functions of each component and a copula function.


**Figure 4.** The effect of correlation on generation of random samples: (a) the samples are generated assuming complete independence and (b) the samples are generated considering the correlation.

The basic idea behind the copula method is described by Sklar's theorem [21]. It shows that a multivariate cumulative distribution function (CDF) can be expressed in terms of a multivariate distribution function whose margins are uniform, U(0,1). In fact, if we have *n* random variables with an *n*-variable CDF, *F*, with margins *F<sub>1</sub>, F<sub>2</sub>, …, F<sub>n</sub>*, there is an *n*-variable distribution function, *C*, given by:

$$F(x_1, x_2, \dots, x_n) = C(F_1(x_1), \dots, F_n(x_n)) \tag{4}$$

This equation can be rewritten to extract the Copula of the joint distribution function of the random variables, as follows:

$$C(u_1, \dots, u_n) = F(F_1^{-1}(u_1), \dots, F_n^{-1}(u_n)) \tag{5}$$

where *F<sub>i</sub>*<sup>-1</sup>(*u<sub>i</sub>*) is the inverse CDF of the *i*-th margin. If *F* has a continuous multivariate PDF *f* with continuous single-variable PDFs *f<sub>i</sub>*, the implicit copula density function is obtained as follows:

$$c(u) = \frac{f(F_1^{-1}(u_1), \dots, F_n^{-1}(u_n))}{f_1(F_1^{-1}(u_1)) \, f_2(F_2^{-1}(u_2)) \cdots f_n(F_n^{-1}(u_n))} \tag{6}$$


This equation can be restated as:

$$c(F_1(x_1), \dots, F_n(x_n)) = \frac{f(x_1, \dots, x_n)}{f_1(x_1) \cdots f_n(x_n)} \tag{7}$$

where *c* is the PDF corresponding to *C*. Therefore, a multivariate PDF can be written in terms of the product of its single-variable marginal distributions and its underlying copula (*c*):

$$f(x_1, \dots, x_n) = c(F_1(x_1), \dots, F_n(x_n)) \, f_1(x_1) \cdots f_n(x_n) \tag{8}$$
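The separation of dependence structure and margins suggests a simple sampling recipe: draw uniform pairs from a copula, then map each coordinate through the inverse CDF of its margin. The sketch below illustrates this with a Gaussian copula, chosen here only because it needs nothing beyond the Python standard library; it is not the t-Student copula used in this study. The log-normal parameters and the value 0.7 for the copula parameter are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sample_gaussian_copula(rho, n, rng):
    """Draw n pairs (u1, u2) with U(0,1) margins linked by a Gaussian copula."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Correlate the second normal variate with the first one.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # The standard normal CDF turns each variate into a uniform one.
        pairs.append((STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)))
    return pairs


def lognormal_inv_cdf(u, mu=0.0, sigma=0.5):
    """Inverse CDF of a log-normal margin (mu and sigma are illustrative values)."""
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(u))


rng = random.Random(42)
uniforms = sample_gaussian_copula(rho=0.7, n=1000, rng=rng)
# Map each uniform through the inverse marginal CDF: x_i = F_i^{-1}(u_i).
samples = [(lognormal_inv_cdf(u1), lognormal_inv_cdf(u2)) for u1, u2 in uniforms]
```

Replacing `lognormal_inv_cdf` with the inverse CDF of any other margin (for example one obtained from a KDE) changes the marginal behavior without touching the dependence structure.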

Various copula functions have been introduced to date. They are generally classified into explicit and implicit types. The implicit copulas are derived from standard distribution functions and have complicated expressions, whereas the explicit ones are simpler and do not follow specific distribution functions. Among the most widely used implicit copulas are the Gaussian copula and the t-Student copula; among the explicit ones are the Clayton copula and the Gumbel copula. Selecting the most appropriate copula is a complicated problem in itself. Here, the t-Student copula is employed because of its simplicity and flexibility. The t-Student copula is formulated as [21]:

$$C_{\rho,\upsilon}(u, v) = \int_{-\infty}^{t_{\upsilon}^{-1}(u)} \int_{-\infty}^{t_{\upsilon}^{-1}(v)} \frac{1}{2\pi (1-\rho^{2})^{1/2}} \left\{ 1 + \frac{s^{2} - 2\rho s t + t^{2}}{\upsilon (1-\rho^{2})} \right\}^{-(\upsilon+2)/2} ds \, dt \tag{9}$$

where *(ρ,υ)* are the copula parameters and *t<sub>υ</sub>*<sup>-1</sup>(·) is the inverse of the Student's t distribution function with *υ* degrees of freedom, zero mean, and variance *υ*/(*υ* − 2). The best values for these parameters can be estimated using the Inference Functions for Margins (IFM) or Canonical Maximum Likelihood (CML) methods. In both methods, the parameters of the single-variable marginal distribution functions are first determined computationally or experimentally. These functions are then substituted into the copula likelihood function, and the copula parameters are chosen so that the likelihood is maximized. Further discussion of the mathematical background and the calculation methods can be found in [21-23], which the interested reader is encouraged to consult.

In the current study, the authors employed the two-dimensional Copula method to represent the correlation of the wind speed patterns between two wind farms and the correlation of the solar irradiation patterns between two PV farms, for every hour of the day. The available data for three years are initially normalized:


$$s = \frac{x - \mu_x}{\sigma_x} \tag{10}$$

where *μ<sub>x</sub>* and *σ<sub>x</sub>* are the mean and standard deviation of the data *x*, respectively. The simulation results for the wind speed data distribution in farm 1 (x axis) and farm 2 (y axis) at 7 AM and 11 AM in the fall season are shown in Figure 5. Figure 6 shows the samples for the solar irradiation data distribution on two farms. The linear correlation coefficients, the Kendall's tau correlation, and the t-Student copula parameters for wind speed are presented in Table 1. The results indicate that the correlation of wind speed (and similarly solar irradiation) between two farms may not be negligible, since the farms have similar climatic conditions.
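The normalization step and the Kendall's tau measure can be sketched as follows. The wind speed values are synthetic and the farm parameters are hypothetical. The conversion of tau into a correlation parameter uses the known relation τ = (2/π) arcsin(ρ) for elliptical copulas such as the Gaussian and t-Student copulas. It is shown as a simple moment-style estimate, not as the IFM or CML procedure referred to in this chapter.

```python
import math
import random


def normalize(data):
    """Standardize data: s = (x - mean) / standard deviation."""
    n = len(data)
    mean = sum(data) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in data) / (n - 1))
    return [(v - mean) / std for v in data]


def kendall_tau(x, y):
    """Kendall's tau (tau-a): (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


# Synthetic wind speeds (m/s) for two hypothetical farms sharing a weather driver.
rng = random.Random(1)
common = [rng.gauss(0.0, 1.0) for _ in range(270)]
farm1 = [8.0 + 2.0 * (0.8 * c + 0.6 * rng.gauss(0.0, 1.0)) for c in common]
farm2 = [7.0 + 1.5 * (0.8 * c + 0.6 * rng.gauss(0.0, 1.0)) for c in common]

s1, s2 = normalize(farm1), normalize(farm2)
tau = kendall_tau(s1, s2)
# Correlation parameter implied by tau for an elliptical copula.
rho_from_tau = math.sin(math.pi * tau / 2.0)
```

Because Kendall's tau depends only on the ordering of the samples, it is not affected by the normalization or by any other increasing transformation of the data, which makes it a convenient companion to copula models.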


**Figure 5.** Correlated samples for wind speed of two farms for 7 and 11 AM in fall season.

**Figure 6.** Correlated samples for solar irradiation of two farms for 7 and 11 AM in fall season

**Table 1.** The parameters of the two-variable t-Student copula distribution for wind speed of two farms at different hours of the day in fall season

#### **2.4. Time-series prediction of wind speed and solar irradiation**

As mentioned earlier, besides the spatial correlation among different farms, the wind speed and solar irradiation random variables assigned to the scheduling time steps exhibit temporal correlation, i.e., they depend on the condition of the random variables at previous time steps (hours). Time-series prediction models can be used to take this temporal correlation into account. Here, for the purpose of day-ahead scheduling of the power system, an initial prediction of the random variables is performed using an ANN. Other forecast tools such as the ARMA model have been reported [6, 24], but the ANN is preferred due to its capability of reflecting nonlinear relations among the time-series samples and its better performance for long-term applications. Afterwards, the distribution of forecast errors is analyzed to determine the confidence interval around the forecasted values for the upcoming potential wind speed and solar irradiation data on the scheduling day.

The forecast process is performed using two Multi-Layer Perceptron (MLP) neural networks [25], one for wind speed and one for solar irradiation. Each network is configured with three layers, including one hidden layer. The input is a 24-hour structure in which each array is a vector of 90 data samples (representing each hour of the day for the three months of a season). The available data is divided into three groups in proportions of 70%, 15% and 15% for the training, validation and test steps, respectively. The hourly data of wind speed and solar irradiation for the first farm are presented in Figure 7 and Figure 8, respectively. The forecast results along with the actual data for one week are plotted in Figure 9 and Figure 10.

**Figure 7.** Hourly wind speed data in farm 1 (axes: time in hours versus wind speed in m/s)

#### **2.5. Estimation of the confidence interval and risk analysis for wind speed and solar irradiation scenarios**

From the power system planning viewpoint, the important aim is to reduce as far as possible the uncertainty and risk associated with generation and power supply. The risk is more crucial when the ex-ante planned generation is less than the ex-post actual generation. The error in the forecast data can be estimated with a level of confidence (LC) in order to determine a reliable level of generation to be considered in the planning stage. Here, the confidence interval method [26], known from risk assessment problems, is proposed as a constraint to specify a lower and upper band for the wind speed and solar irradiation scenarios. For example, *LC=90%* means that the probability of forecast error (*Powerrisk*) being less than a definite value obtained from the distribution of forecast error
