3. Informatics solutions for monitoring and analyzing the power plants' KPIs

In order to analyze and monitor the key performance indicators (KPIs), power plant executives require an advanced decision support system (DSS). We propose an informatics solution based on a three-level architecture that comprises data management models, analytical models and interfaces (Figure 1).

The architecture components are described in the following subsections.

Figure 1. SIPAMER's architecture.

## 3.1. Level 1—data management

Data gathered from the wind and photovoltaic power plants are extracted, transformed and loaded into a central relational database running Oracle Database 12c, which enables user access through cloud computing. The sources are heterogeneous: measuring devices for climate conditions (wind speed, direction, temperature, atmospheric pressure and humidity), sensors for photovoltaic cells and wind turbines, and a SCADA API that measures real-time parameters of the power plant output. These sources are mapped into a relational data staging area; the extract, transform and load (ETL) process is then applied, and the data are finally loaded into a relational data mart that organizes objects as dimensions and facts. This approach simplifies the development of the analytical model with the KPI framework and enables advanced roll-up/drill-down interfaces.

Based on the executives' requirements regarding the KPIs, we designed the main structural entities (objects) that enable multidimensional data exploration. They are organized as dimensions (subject entities) whose descriptive attributes are structured in multi-level hierarchies to enable typical OLAP operations: roll-up/drill-down, slicing and dicing. The data mart contains the following dimensions: DIM\_STAKEHOLDER, DIM\_POWERPLANT, DIM\_REGION, DIM\_TURBINE, DIM\_PV and DIM\_TIME.

Figure 2. Snowflake schema for the KPIs data mart.

Fact tables are objects that contain measures (metrics) and foreign keys to the dimension tables. Facts are usually numerical data that can be aggregated and analyzed across the levels of the dimensions. The model contains the following facts: FACT\_PV\_OUTPUT and FACT\_WIND\_OUTPUT. The objects are organized in a snowflake schema, as shown in Figure 2.

The data mart allows us to design the KPIs framework in a subject-oriented and multidimensional view.
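As a rough illustration of how a snowflake layout supports roll-up, the sketch below builds a cut-down version of the data mart in an in-memory SQLite database (standing in for Oracle) and aggregates wind output from turbine level to plant and region level. Only the table names come from the text; every column name, the `hour_id` key and the sample rows are assumptions made for the example.

```python
import sqlite3

# Column names and sample rows are hypothetical; only the table names
# (DIM_REGION, DIM_POWERPLANT, DIM_TURBINE, FACT_WIND_OUTPUT) come from the text.
SCHEMA = """
CREATE TABLE DIM_REGION (region_id INTEGER PRIMARY KEY, region_name TEXT);
CREATE TABLE DIM_POWERPLANT (
    plant_id INTEGER PRIMARY KEY,
    plant_name TEXT,
    region_id INTEGER REFERENCES DIM_REGION(region_id)
);
CREATE TABLE DIM_TURBINE (
    turbine_id INTEGER PRIMARY KEY,
    plant_id INTEGER REFERENCES DIM_POWERPLANT(plant_id)
);
CREATE TABLE FACT_WIND_OUTPUT (
    turbine_id INTEGER REFERENCES DIM_TURBINE(turbine_id),
    hour_id INTEGER,
    energy_mwh REAL
);
"""


def build_demo_mart():
    """Create the demo schema and load a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO DIM_REGION VALUES (1, 'Region A')")
    conn.executemany("INSERT INTO DIM_POWERPLANT VALUES (?, ?, ?)",
                     [(10, "Plant 1", 1), (11, "Plant 2", 1)])
    conn.executemany("INSERT INTO DIM_TURBINE VALUES (?, ?)",
                     [(100, 10), (101, 10), (102, 11)])
    conn.executemany("INSERT INTO FACT_WIND_OUTPUT VALUES (?, ?, ?)",
                     [(100, 1, 1.5), (101, 1, 2.0), (102, 1, 0.5)])
    return conn


def energy_by_plant(conn):
    """Roll up turbine-level facts to plant level."""
    rows = conn.execute("""
        SELECT p.plant_name, SUM(f.energy_mwh)
        FROM FACT_WIND_OUTPUT f
        JOIN DIM_TURBINE t ON t.turbine_id = f.turbine_id
        JOIN DIM_POWERPLANT p ON p.plant_id = t.plant_id
        GROUP BY p.plant_name
    """).fetchall()
    return dict(rows)


def energy_by_region(conn):
    """Roll up one level further, from plants to regions."""
    rows = conn.execute("""
        SELECT r.region_name, SUM(f.energy_mwh)
        FROM FACT_WIND_OUTPUT f
        JOIN DIM_TURBINE t ON t.turbine_id = f.turbine_id
        JOIN DIM_POWERPLANT p ON p.plant_id = t.plant_id
        JOIN DIM_REGION r ON r.region_id = p.region_id
        GROUP BY r.region_name
    """).fetchall()
    return dict(rows)
```

The chain of joins from turbine to plant to region is what distinguishes a snowflake layout from a flat star schema: each hierarchy level lives in its own normalized table, and a roll-up adds one join per level.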

## 3.2. Level 2—models


18 Recent Improvements of Power Plants Management and Technology


This level contains models for forecasting the power plant output in the short term (hourly, up to 3 days) and the KPI analytical framework.

Forecasting models are built separately for each type of renewable power plant, wind power plants (WPP) and photovoltaic power plants (PPP), because different factors influence their operation and generation. The aim of the models is to improve the short-term predictions currently made and transmitted by the producer. The deviations between forecast and recorded production are currently about 30–35% for wind power plants and 15–20% for photovoltaic power plants [15, 16]. Because imbalances must be paid for, minimizing these deviations lowers costs for stakeholders. The model consists of a set of experimental methods based on data mining algorithms, developed, validated and tested on WPP and PPP data sets. We developed artificial neural networks (ANN) based on three algorithms: the Levenberg-Marquardt algorithm (LM), the Bayesian regularization algorithm (BR) and the scaled conjugate gradient algorithm (SCG).
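The text does not state how the deviation percentages are computed. One plausible reading, taken here purely as an assumption, is the total absolute forecast error relative to the total recorded production. The function name and the sample values in this sketch are hypothetical.

```python
def relative_deviation_pct(forecast, actual):
    """Total absolute forecast error as a percentage of total recorded production.

    This definition is an assumption; the source does not give a formula.
    """
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual must have the same length")
    total_actual = sum(actual)
    if total_actual == 0:
        raise ValueError("total recorded production is zero")
    abs_error = sum(abs(f - a) for f, a in zip(forecast, actual))
    return 100.0 * abs_error / total_actual


# Illustrative hourly values in MWh (not measured data).
forecast = [10.0, 12.0, 8.0, 6.0]
actual = [8.0, 10.0, 10.0, 12.0]
deviation = relative_deviation_pct(forecast, actual)
```

With these made-up values the absolute errors sum to 12 MWh against 40 MWh produced, which gives a deviation of 30%.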

## 3.2.1. Forecasting the photovoltaic power plants' output

We identified the input parameters (irradiance, temperature, wind speed and direction, tilt, exposure) and the output (power). For training and validation, we used a data set of 50,631 samples from direct measurements taken every 10 minutes in a PPP located in Giurgiu County, Romania, from January 1, 2014 to December 31, 2014. The plant has two types of ABB PSV800 inverters, rated at 600 kW and 760 kW, and 30,888 solar panels; each solar module has a rated power of 245 W, and the plant has a 20-kV connection. This configuration is widely used in other PPPs; therefore, the developed ANN can be easily implemented in power plants with a similar configuration.

Since solar energy presents seasonal variations related to the climate conditions over the year, we designed the neural networks to adapt to irregular seasonal variations by changing the number of neurons in the hidden layers, and we developed two types of ANN.

First, we designed one neural network for each of the three algorithms (LM, BR and SCG) based on the whole-year data. The results were good, with an average mean squared error (MSE) of 0.19 and an average correlation coefficient R of 0.95 (0.9573 for LM).
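For reference, the two indicators can be computed from target and output series as sketched below, using the standard definitions of mean squared error and the Pearson correlation coefficient. Whether the authors' tooling applies any normalization before computing them is not stated, so the functions are an illustration, not the method used in the study.

```python
import math


def mean_squared_error(targets, outputs):
    """Average of the squared differences between targets and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def correlation_coefficient(targets, outputs):
    """Pearson correlation coefficient R between targets and network outputs."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)
```

An R close to 1 means the outputs track the targets almost linearly, but a network can score a high R while still being biased, which is why the MSE is reported alongside it.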

Then, we considered the second option, which takes the seasonal variations of solar energy into account, and designed neural networks based on LM, BR and SCG for each month. We thus obtained 36 neural networks (three algorithms for each of the 12 months), with much better results than the yearly ANNs. Comparing the results from the monthly data, we found that the prediction accuracy is excellent in all months and that the monthly performance indicators have comparable values. The MSE is between 0.03 and 0.1, and the coefficient R is between 0.997 and 0.999. For example, Figure 3 shows the correlation coefficient for the neural network SFebruaryLM, developed with the Levenberg-Marquardt algorithm.

Figure 3. Regression between target values and the output values of the neural network SFebruaryLM.
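The monthly approach amounts to partitioning the samples by calendar month and training one network per month and algorithm. The sketch below shows only the partitioning step and enumerates the resulting training jobs; the sample layout `(timestamp, features, power)`, the function names and the synthetic timestamps are assumptions, and the training itself is left out.

```python
from collections import defaultdict
from datetime import datetime, timedelta

ALGORITHMS = ("LM", "BR", "SCG")


def split_by_month(samples):
    """Group (timestamp, features, power) samples by calendar month."""
    by_month = defaultdict(list)
    for timestamp, features, power in samples:
        by_month[timestamp.month].append((features, power))
    return dict(by_month)


def training_jobs(samples):
    """Return one entry per (month, algorithm) pair, i.e. per network to train."""
    by_month = split_by_month(samples)
    return {(month, algo): data
            for month, data in sorted(by_month.items())
            for algo in ALGORITHMS}


# Synthetic 10-minute timestamps covering 2014; feature and power values
# are placeholders, not measurements.
start = datetime(2014, 1, 1)
samples = [(start + timedelta(minutes=10 * i), (0.0,), 0.0)
           for i in range(365 * 24 * 6)]
jobs = training_jobs(samples)
```

With a full year of data, `jobs` has 12 × 3 = 36 entries, matching the number of networks reported above.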

By comparing the forecasting results of the neural networks developed with the three algorithms, we found that in 69% of cases the networks developed with Bayesian regularization generalized better than those developed with the Levenberg-Marquardt and SCG algorithms. In the other 31% of cases, the highest forecasting accuracy was obtained with the Levenberg-Marquardt algorithm.

If new elements are added as input data in order to improve the accuracy of the forecasting model, the LM algorithm offers a higher training rate than the BR algorithm, at the cost of increased memory consumption. When new inputs are added and high speed and performance are required, the best solution is to develop the ANN with the SCG algorithm, which is faster than the other two algorithms (LM and BR) and requires little memory, with the drawback of a lower level of prediction accuracy.

## 3.2.2. Forecasting the wind power plants' output


We identified the input parameters (temperature, wind speed and direction at 50 m, 55 m, 75 m and 90 m, humidity, atmospheric pressure, turbine height, soil orography, slipstream effect) and the output (power). For ANN training and validation, we used a data set of 17,491 samples from hourly measurements in a WPP located in Tulcea, Romania, over 2 years (January 1, 2013 to December 31, 2014). This WPP has two types of wind turbines, V90 2MW/3MW IEC IA/IIA, with a height of 90 meters. These types of wind turbines are commonly used, so we can consider the data set suitable for training a generalized neural network, as described in [17].

Since wind energy presents seasonal variations over a 1-year period, we designed two sets of ANN based on the three algorithms: the Levenberg-Marquardt algorithm (LM), the Bayesian regularization algorithm (BR) and the scaled conjugate gradient algorithm (SCG).

First, we designed a neural network for each algorithm (LM, BR and SCG) based on the data set covering the 2 years of records. For the second solution, we took into account the seasonal variations that affect wind energy and designed neural networks for each season, dividing the data into four sets corresponding to the four seasons specific to Romania. The results of the ANN trained for the whole year and the ANN trained for the corresponding season are compared in Table 1.

The best approach is to develop and train the neural networks with seasonally adjusted data, because the prediction accuracy is excellent in all seasons and the performance indicators have comparable values. Comparing the results for each algorithm (LM, BR, SCG), in most cases the neural networks based on Bayesian regularization generalized better than those based on the Levenberg-Marquardt or SCG algorithms, but LM performed faster and with minimum memory consumption.

The KPI analytical framework provides methods for calculating the key performance indicators used by executives to monitor the power plants in terms of technological and business processes. For technological processes, we built the KPIs presented in Section 2 based on formulas (1) to (19). For business processes, we included commonly used KPIs such as income, cost and profit/loss. The KPIs are implemented directly in the fact tables as derived measures and are accessible at the interface level.
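The idea of KPIs as derived measures can be sketched as properties computed from the base measures of a fact row. The class, its field names and the sample values below are hypothetical, and the load factor uses the common definition (average power divided by installed power), which may differ in detail from formulas (1) to (19) in Section 2.

```python
from dataclasses import dataclass


@dataclass
class OutputFact:
    """One hypothetical fact row: base measures for a plant over a period."""
    energy_mwh: float
    hours: float
    installed_mw: float
    income: float
    cost: float

    @property
    def average_power_mw(self) -> float:
        # Energy produced divided by the length of the period.
        return self.energy_mwh / self.hours

    @property
    def load_factor(self) -> float:
        # Assumed definition: average power divided by installed power.
        return self.average_power_mw / self.installed_mw

    @property
    def profit(self) -> float:
        # A negative value represents a loss.
        return self.income - self.cost


# Illustrative values only.
row = OutputFact(energy_mwh=120.0, hours=24.0, installed_mw=10.0,
                 income=9000.0, cost=6500.0)
```

Keeping the derived measures next to the base measures means that the interface level only reads values and never re-implements a formula.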


Table 1. Comparison between ANN developed for one year and ANN with seasonal adjustments.

## 3.3. Level 3—interface

The forecasting and analytical models are integrated into an online dashboard developed in Java with Application Development Framework (ADF). The dashboard is built as a business intelligence (BI) portal with a user-friendly interface and interactive charts, reports, pivot tables, maps and narrative elements that allow executives and stakeholders to easily analyze the KPIs. The dashboard contains three sections:


• Production management: contains reports on the power plant's current operations and maintenance plans, and displays the configuration and location of the generation groups together with real-time data gathered from measuring devices, SCADA and generation groups or the entire power plant;

• Forecasting: provides access to the forecasting models and offers reports and charts that display the estimations versus actual values for different periods of time selected by the user. For example, Figure 4 shows a chart that displays, for a one-day interval on an hourly basis, the forecasted energy (orange line) versus the actual produced energy (green line) for a WPP group. The chart also displays two other generation groups (grey and light blue lines) situated in the same region as the green-marked group, and the difference between estimated and actual values (light orange line).

Figure 4. Forecast versus actual energy for WPP groups.

• KPIs Analytics: contains analytical business intelligence elements (interactive charts, gauges, reports, maps, pivot tables) that enable advanced KPI analysis through the dimensions' hierarchies, allowing executives to compare indicators over different periods of time, regions and locations, and to view aggregate or detailed KPIs over power plant groups or modules/turbines. For example, Figure 5 shows the average power, installed power load factor, installed power load duration and maximum power load duration for a wind power plant with two groups of 5 and 10 MW.

Figure 5. KPIs dashboard.

The dashboard is developed in a cloud computing architecture and is accessible as a service, customized and configured according to the stakeholders' interests.
