**2. Methodology**

This section presents the main tools used in this work: the LCOE methodology provided by the IEA and the support vector machine (SVM), the machine learning technique adopted. Before the SVM is presented, a very brief reminder is given of machine learning (ML) and its use in energy systems and in CO2 emission estimates.

#### **2.1 Levelised cost of energy**

The Levelised Cost of Energy (LCOE) is the tool selected to measure the cost of a unit of energy produced by the technologies considered. The LCOE methodology is described in the joint report by the International Energy Agency (IEA) and the OECD (Organisation for Economic Co-operation and Development) Nuclear Energy Agency (NEA), now in its ninth edition in a series of studies on electricity generating costs [1]. This report includes cost data on power generation from natural gas, coal, nuclear, and a broad range of renewable technologies.

#### *Machine Learning in Estimating CO2 Emissions from Electricity Generation DOI: http://dx.doi.org/10.5772/intechopen.97452*

The plant-level cost metric chosen is the well-known levelised cost of electricity (LCOE). The IEA now also considers system effects and system costs through the broader value-adjusted LCOE (VALCOE) metric, which is not considered here.

The LCOE is widely considered the principal tool for comparing the plant-level unit costs of different base load technologies over their operating lifetimes, since it indicates the economic costs of a technology family rather than the financial costs of a specific project in a specific market. Because it equates discounted average costs with a stable remuneration over lifetime electricity production, the LCOE is closer to the cost of electricity production in regulated electricity markets with stable tariffs than to the variable prices of deregulated markets.

Despite many limitations, the LCOE has maintained its utility and appeal because it is a uniquely straightforward, transparent, comparable, and well-understood metric, and it remains a widely used tool for modeling, policy making and public debate.

The calculation of the LCOE is based on the equivalence of the present value of the sum of discounted revenues and the present value of the sum of discounted costs. Put another way, the discounted sum of benefits must equal the discounted sum of costs; solving this equality for the constant price $P_{MWh}$ gives:

$$LCOE = P_{MWh} = \frac{\sum \left(Capital_t + O\&M_t + Fuel_t + Carbon_t + D_t\right) \cdot \left(1 + r\right)^{-t}}{\sum MWh \cdot \left(1 + r\right)^{-t}} \tag{1}$$

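where:

- $P_{MWh}$ is the constant lifetime remuneration to the supplier for electricity;
- $MWh$ is the amount of electricity produced annually, in MWh;
- $(1 + r)^{-t}$ is the discount factor for year $t$ at discount rate $r$;
- $Capital_t$ is the total capital (investment) cost in year $t$;
- $O\&M_t$ is the operation and maintenance cost in year $t$;
- $Fuel_t$ is the fuel cost in year $t$;
- $Carbon_t$ is the carbon cost in year $t$;
- $D_t$ is the decommissioning and dismantling cost in year $t$.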

Eq. (1) is the formula used here to calculate average lifetime levelised costs from the costs of investment, operation and maintenance, fuel, carbon emissions, and decommissioning and dismantling provided by OECD countries and selected non-member countries.
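As an illustration of Eq. (1), the following minimal Python sketch computes a levelised cost from yearly cost and production series. The `YearlyCosts` container, the `lcoe` function and all numbers are hypothetical and introduced only for this example. The sketch also assumes that the first entry of each series corresponds to year $t = 0$, and it lets production vary by year, which is a slight generalisation of the constant $MWh$ term in Eq. (1).

```python
from typing import NamedTuple, Sequence


class YearlyCosts(NamedTuple):
    """Costs incurred in one year (hypothetical container, one currency)."""
    capital: float = 0.0
    o_and_m: float = 0.0
    fuel: float = 0.0
    carbon: float = 0.0
    decommissioning: float = 0.0


def lcoe(costs: Sequence[YearlyCosts], mwh: Sequence[float], r: float) -> float:
    """Discounted lifetime costs divided by discounted lifetime production."""
    if len(costs) != len(mwh):
        raise ValueError("costs and mwh must cover the same years")
    # Assumption: index 0 is year t = 0, so first-year flows are not discounted.
    discounted_costs = sum(sum(c) / (1 + r) ** t for t, c in enumerate(costs))
    discounted_energy = sum(e / (1 + r) ** t for t, e in enumerate(mwh))
    if discounted_energy == 0:
        raise ValueError("no electricity produced over the lifetime")
    return discounted_costs / discounted_energy


# Hypothetical plant: one construction year followed by two operating years.
costs = [
    YearlyCosts(capital=1000.0),
    YearlyCosts(o_and_m=50.0, fuel=30.0, carbon=20.0),
    YearlyCosts(o_and_m=50.0, fuel=30.0, carbon=20.0),
]
mwh = [0.0, 10.0, 10.0]

print(lcoe(costs, mwh, r=0.0))   # 60.0 without discounting: 1200 / 20
print(lcoe(costs, mwh, r=0.07))  # higher, since up-front capital is not discounted
```

With a positive discount rate the result rises, because the up-front capital cost keeps its full weight while later production is discounted.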

#### **2.2 Machine learning**

Machine learning (ML) is the field of artificial intelligence (AI) that provides methods for learning from data over time, producing algorithms that improve without being explicitly programmed to do so.

The literature on ML is relatively recent but so vast that only a few pointers to review works can be given here as an entry point<sup>1</sup>.

Machine learning approaches are normally categorized as follows.

<sup>1</sup> Here we just point to a recent review of the state of the art in machine learning techniques [3].

- **Supervised machine learning** trains itself on a labeled data set.
- **Unsupervised machine learning** uses unlabeled data, with algorithms that extract the features required to label, sort, and classify the data in real time, without human intervention.
- **Semi-supervised learning** (SsL) is a middle ground between supervised and unsupervised learning: it uses a smaller labeled data set during training to guide classification and feature extraction from a larger, unlabeled data set.
- **Reinforcement machine learning** is similar to supervised learning but does not require sample data for training, since it learns by trial and error.

Machine learning algorithms for use with labeled data include:

- **regression algorithms**, such as linear and logistic regression;
- **decision trees**, which perform classification based on a set of decision rules;
- **instance-based algorithms**, which use classification to estimate how likely a data point is to be a member of one group or another, based on its proximity to other data points.

Methods for use with unlabeled data include:

- **clustering algorithms**, such as K-means, TwoStep, and Kohonen clustering;
- **association algorithms**, which find patterns in data by identifying 'if-then' relationships, namely association rules;
- **neural networks**, which create a layered network of calculations featuring an input layer, where data enter; one or more hidden layers, where calculations are performed; and an output layer, where each conclusion is assigned a probability;
- **deep neural networks**, which use multiple hidden layers, each of which successively refines the results of the previous layer.

Deep learning models are typically unsupervised or semi-supervised. Certain types of deep learning models, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), are driving progress in areas such as computer vision, natural language processing (including speech recognition), and self-driving cars.

In this work, the machine learning approach used is the support vector machine (SVM).

SVMs<sup>2</sup> are machine learning algorithms built on statistical learning theory and structural risk minimization. In pattern recognition, classification, and regression analysis, SVMs are reported to outperform many other methodologies. The wide range of SVM applications in load forecasting is due to their ability to generalize; in addition, local minima pose no problem for SVMs.

SVM was chosen in this work for the sake of simplicity: the Support Vector Regression (SVR) [5] performed here is extremely easy to understand when comparing a traditional statistical tool with a competing machine-learning-based one.
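The text does not spell out the SVR formulation. In its standard form, SVR differs from least-squares regression mainly in the loss it minimises: the ε-insensitive loss ignores residuals smaller than a tolerance ε and penalises larger ones linearly rather than quadratically. The sketch below only contrasts the two loss functions; it is not the fitting procedure used in this work, and the function names, residuals and ε value are hypothetical.

```python
def squared_loss(residual: float) -> float:
    """Loss minimised by ordinary least-squares regression."""
    return residual ** 2


def epsilon_insensitive_loss(residual: float, epsilon: float) -> float:
    """Standard SVR loss: zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(residual) - epsilon)


# Hypothetical residuals (observed minus predicted cost) and tube half-width.
residuals = [0.25, -0.25, 2.0, -4.0]
epsilon = 0.5

for res in residuals:
    print(res, squared_loss(res), epsilon_insensitive_loss(res, epsilon))
```

Small residuals cost nothing under the ε-insensitive loss, and large ones grow linearly, so a single extreme observation weighs less on the fit than it does under squared loss.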

The available applications of SVM in the energy sector are often oriented towards engineering<sup>3</sup>, whereas in this work the approach is oriented towards supporting decisions in the energy policy field.

Using one of the possibilities offered by SVMs, namely SVR, the following sections show how more accurate forecasts of the cost per unit of energy produced can be obtained, using LCOE as the metric.

The most accurate forecast available is then used in a cost-effectiveness analysis.

In the following, a method is presented for selecting, among competing options (options that may differ only by slight changes in some significant LCOE parameters), the one with the best Incremental Cost-Effectiveness Ratio (ICER).

Being able to make this choice during the lifetime of the plant makes it possible to identify, year by year, the best available technology and to obtain the corresponding profile of the associated CO2 emissions.
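The selection step described above can be sketched as follows. The ICER is the ratio of the incremental cost to the incremental effect of an option relative to a reference. This sketch assumes that cost is measured by the LCOE, that the effect is avoided CO2 emissions per MWh, and that "best" means the lowest incremental cost per unit of effect. These choices, the option names and all numbers are hypothetical.

```python
def icer(cost: float, effect: float, ref_cost: float, ref_effect: float) -> float:
    """Incremental cost per unit of incremental effect versus a reference."""
    delta_effect = effect - ref_effect
    if delta_effect <= 0:
        # Assumption: options that do not improve the effect are excluded.
        raise ValueError("option does not improve on the reference effect")
    return (cost - ref_cost) / delta_effect


# Hypothetical reference technology and competing options.
# "lcoe" is in currency units per MWh, "avoided_co2" in tonnes per MWh.
reference = {"lcoe": 60.0, "avoided_co2": 0.0}
options = {
    "A": {"lcoe": 70.0, "avoided_co2": 0.5},
    "B": {"lcoe": 90.0, "avoided_co2": 0.75},
    "C": {"lcoe": 65.0, "avoided_co2": 0.125},
}

ratios = {
    name: icer(o["lcoe"], o["avoided_co2"], reference["lcoe"], reference["avoided_co2"])
    for name, o in options.items()
}
best = min(ratios, key=ratios.get)  # lowest extra cost per tonne of CO2 avoided

print(ratios)  # {'A': 20.0, 'B': 40.0, 'C': 40.0}
print(best)    # A
```

Repeating the comparison for each year of the plant lifetime, with that year's forecast costs, would give the year-by-year choice of technology.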

<sup>2</sup> For a good introduction to this topic see [4].

<sup>3</sup> See, for example [6].
