**2.1 Introduction**

The three major classes of financial risks are defined as follows:


Note that [8] argued that the probability of an operational risk event increases with a larger number of personnel and with a greater transaction volume. The latter is also based on the study by [9], who investigated the effect of bank size on operational loss amounts and deduced that, on average, for every unit increase in bank size, operational losses are predicted to increase by approximately the fourth root of that increase. Note that there are different classes of operational losses that the financial industry must be aware of; see [2]:


Category (i) has been argued to be implausible in the financial industry, while categories (ii) and (iii) are less important and can often be prevented. However, category (iv) tends to cause the most devastating losses, with the best-known example being the 1995 collapse of Barings Bank (also portrayed in the movie "Rogue Trader"). Consequently, banks must be extremely cautious of these types of losses, as they have driven many financial institutions into bankruptcy. Low-frequency/high-severity operational losses can be extreme in size compared to the rest of the data. In a histogram of the loss distribution, such low-frequency/high-severity events would be placed in the far-right end, which is often referred to as a "tail event". Because operational loss data exhibit such tail events, we say that the data are heavy-tailed.

In different fields that use data science techniques (e.g. insurance, banking, etc.), different types of distributions are used to model data, owing to the different products offered by the various financial institutions. These financial institutions are increasingly measuring and managing their exposure to different types of risks; see [4]. A proper evaluation of risk in any financial institution is an uncertainty problem: a poor evaluation may easily lead to the bankruptcy of the firm, and risk evaluation is consequently a major concern for national and international financial regulatory bodies.

It is important to mention that risk data from different products offered by financial institutions (e.g. micro-insurance, re-insurance, investment, savings, stock exchange, etc.) are distributed differently; see for instance [10]. Consequently, a thorough understanding of a variety of distributions is a must for an aspiring data scientist. According to [2], the most commonly applied basic distributions for quantifying operational risk are those that are skewed to the right (right-tailed). There are two main ways to categorize right-tailed loss distributions, namely parametric and nonparametric approaches. More specifically, in this chapter, we will consider the following common parametric loss distributions: (i) exponential, (ii) gamma, (iii) Weibull, (iv) Pareto, (v) Burr, and (vi) log-normal. It is worth mentioning that these are not the only existing loss distributions; for example, a combination of two of the above, the composite Weibull-Pareto distribution, is discussed in the context of risk in [11].

The remainder of this chapter is organized as follows: Section 2.2 provides some distributional properties of the considered parametric loss distributions and, more importantly, reviews publications that applied these distributions in the context of risk analysis. Next, Section 2.3 gives a brief discussion of nonparametric loss distributions (seldom used), and Section 2.4 discusses some well-known methods of quantifying risk. Section 2.5 discusses other types of risks that use the loss distribution approach (LDA). Finally, Section 2.6 gives some concluding remarks.

### **2.2 Parametric loss distributions**

A summary of publications that discuss the application of parametric loss distributions in operational risk is provided in **Table 1**. The table is constructed so that one can easily identify which type of loss distribution is discussed in each publication. The corresponding loss distribution function properties are listed in **Table 2**, with the expressions adopted from [12, 13] and Chapter 6 of [2].

Note that when the different parameters in **Table 2** are varied, the distributions tend to vary significantly, especially in the tail area. This behavior will be illustrated in detail in the next section.

## *2.2.1 Pareto distribution*

The Pareto distribution is a very heavy-tailed distribution that takes on positive values, and its parameter *α* determines the degree of tail heaviness. The Pareto tail is monotonically decreasing, that is, the tail decreases as *x* increases and is thicker for values of *x* closer to zero. To derive the Pareto distribution, assume that a variate *x* follows an exponential distribution with mean 1/*β* (i.e. rate *β*); furthermore, suppose that *β* follows a gamma distribution; then *x* follows a Pareto distribution; see [17]. Note that when *α* < 1, a very heavy tail is encountered, with the mean and variance being infinite. This means that losses of infinite size are theoretically possible. The extreme heaviness of the Pareto tail makes it ideal for modeling losses of high magnitude; see [2]. There are also different versions of the Pareto distribution used in risk analysis, the most popular variation being the generalized Pareto distribution (GPD).
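The exponential-gamma mixture construction above can be checked numerically. The following sketch (parameter values and variable names are our own, purely illustrative) draws a gamma-distributed rate for each observation, samples an exponential loss with that rate, and compares the resulting tail probability to the Pareto (Lomax) survival function:

```python
import numpy as np

rng = np.random.default_rng(42)
alpha, beta = 2.5, 1.0            # illustrative gamma shape and rate
n = 200_000

# Step 1: draw an exponential rate from a gamma distribution for each loss.
# NumPy parameterizes the gamma by shape and *scale*, so scale = 1/beta.
rates = rng.gamma(shape=alpha, scale=1.0 / beta, size=n)

# Step 2: draw each loss from an exponential with its own rate (mean = 1/rate).
x = rng.exponential(scale=1.0 / rates)

# The marginal of x is Pareto (Lomax) with survival S(t) = (beta/(beta + t))^alpha.
t = 2.0
empirical = np.mean(x > t)
theoretical = (beta / (beta + t)) ** alpha
print(empirical, theoretical)     # the two tail probabilities should agree closely
```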

*Quantifying Risk Using Loss Distributions DOI: http://dx.doi.org/10.5772/intechopen.108856*


#### **Table 1.**

*A summary of publications discussed in this chapter and their classification according to the type of loss distribution.*

The GPD is especially well suited for modeling data above a high threshold, an approach known as estimating the tails of extreme losses.
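A peaks-over-threshold fit of the GPD to threshold excesses can be sketched as follows; the simulated loss data, the 95% threshold, and the simple method-of-moments estimator are all our own illustrative choices, not a procedure from the source:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated heavy-tailed losses (log-normal here, purely for illustration).
losses = rng.lognormal(mean=0.0, sigma=1.2, size=50_000)

# Peaks-over-threshold: keep only the excesses above a high threshold u.
u = np.quantile(losses, 0.95)
excesses = losses[losses > u] - u

# Method-of-moments GPD estimates from the excess mean m and variance s2:
#   xi    = (1 - m**2 / s2) / 2        (shape)
#   sigma = m * (m**2 / s2 + 1) / 2    (scale)
m, s2 = excesses.mean(), excesses.var(ddof=1)
xi = 0.5 * (1.0 - m**2 / s2)
sigma = 0.5 * m * (m**2 / s2 + 1.0)
print(f"shape={xi:.3f}, scale={sigma:.3f}")
```

In practice, maximum-likelihood fitting (e.g. `scipy.stats.genpareto.fit`) is the more common choice; the moment estimator above merely keeps the sketch dependency-free.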

## *2.2.2 Burr distribution*

The Burr distribution is heavy-tailed and skewed to the right; see [19]. The Burr distribution is a special case of the GPD described in subsection 2.2.1. It has three parameters, which give it more flexibility than the traditional Pareto distribution. The Burr distribution has an additional parameter *γ*, and when *γ* = 1, it reduces to a Pareto distribution. One of the well-known uses of the Burr distribution is modeling natural catastrophes, and as a result it is a popular model in the insurance industry for the pricing of premiums; see [2, 22]. The family of Burr distributions goes back to 1941, and it is sometimes referred to as the extended Pareto or beta prime distribution. All the PDFs of the loss distributions in the Burr family have monotonically decreasing, right-skewed tails; see [15]. The Burr distribution is well recognized in probability theory, with many applications in agriculture, biology, etc.; see [20].

## *2.2.3 Gamma distribution*

The gamma distribution is a light-tailed distribution that is skewed to the right. It is a two-parameter generalization of the exponential distribution, where *x* is a random variable, *β* is the scale parameter, *α* is the shape parameter, and *Γ*(∙) is the gamma function; see **Table 2**. The gamma distribution is said to be a generalization of the exponential distribution because for *α* = 1 it becomes the exponential distribution with parameter *λ* = 1/*β*, which is usually used to model time between events; see [2]. According to [7], the gamma distribution is one of the most important loss distributions in risk analysis because it forms the basis for creating many of the popular distributions in use. The exponential distribution is described by a density *f* and distribution function *F* given in **Table 2**, where *λ* represents the "failure" rate. The distribution is tractable and has unique mathematical properties: the failure distribution is described by a single parameter, the mean time to failure, denoted by *θ*, so that the failure rate follows directly from the mean life as *λ* = 1/*θ*. The exponential distribution can be used to model the time elapsed until the next event (e.g. an accident); see [23].

The exponential PDF is monotonically decreasing and has an exponentially decaying, light tail. This means that when it is applied in risk analysis, the event of very high losses is assigned an almost-zero probability; see [2]. Due to this property, [14] stated that the exponential distribution is not used very much for modeling operational losses, but the constant decrease of the tail is useful for modeling lifetime data of items that have a constant failure rate. The exponential distribution has attractive and easily understandable mathematical properties; thus, it is mostly used in risk analysis as a building block for other models.
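The mean-life/failure-rate relationship *λ* = 1/*θ* and the light tail of the exponential distribution can be illustrated with a quick simulation (the value of *θ* below is an arbitrary assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 5.0                # assumed mean time to failure (mean life)
lam = 1.0 / theta          # constant failure rate, lambda = 1/theta

times = rng.exponential(scale=theta, size=100_000)
print(times.mean())        # close to theta = 5

# Light tail: P(X > 5*theta) = exp(-5), so very large values are rare.
print(np.mean(times > 5 * theta), np.exp(-5))
```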

## *2.2.4 Weibull distribution*

Another generalization of the exponential distribution is the Weibull distribution, which has two parameters (see **Table 2**) compared to the one parameter of the exponential distribution. The Weibull distribution is skewed to the right, and the additional parameter gives it more flexibility, allowing a heavier or lighter tail than the exponential distribution. That is, the Weibull distribution has a heavier tail than the exponential distribution if *α* < 1, equals the exponential distribution if *α* = 1, and has a lighter tail than the exponential distribution if *α* > 1; see [2]. Furthermore, [2] stated that in risk analysis the heavy-tailed Weibull distribution is a popular model, as it has been shown to be optimal for modeling asset returns as well as used in reinsurance.
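The tail comparison can be checked directly from the Weibull survival function S(x) = exp(-(x/scale)^shape); the shape values below are illustrative, with the scale fixed at 1. A shape below 1 yields the heavier tail:

```python
import numpy as np

def weibull_sf(x, shape, scale=1.0):
    """Weibull survival function S(x) = exp(-(x/scale)**shape)."""
    return np.exp(-((x / scale) ** shape))

x = 10.0
s_heavy = weibull_sf(x, shape=0.5)   # shape < 1: heavier tail than exponential
s_exp   = weibull_sf(x, shape=1.0)   # shape = 1: exactly the exponential tail
s_light = weibull_sf(x, shape=2.0)   # shape > 1: lighter tail than exponential
print(s_heavy, s_exp, s_light)
assert s_heavy > s_exp > s_light
```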

## *2.2.5 Log-normal distribution*

A log-normal distribution is a moderately heavy-tailed distribution that is skewed to the right. The distribution is derived by taking the natural logarithm of the data and fitting the result to the normal distribution. The distribution is right-tailed and takes on only positive *x* values. The log-normal distribution, like the normal distribution, has parameters *μ* and *σ* (see **Table 2**). The distribution is useful for modeling claim sizes, and its thick tail and right skewness make it fit many situations. The log-normal can also resemble the normal distribution if *σ* is very small, and this property is not always desirable for analyzing risk; see [16].
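The log-transform fitting idea can be sketched as follows: simulate claims with assumed (illustrative) parameters, take natural logs, and recover *μ* and *σ* as the mean and standard deviation of the logged data:

```python
import numpy as np

rng = np.random.default_rng(7)
mu, sigma = 1.0, 0.8                   # assumed true log-scale parameters
claims = rng.lognormal(mean=mu, sigma=sigma, size=100_000)

# Fit by taking natural logs and estimating the normal parameters.
log_claims = np.log(claims)
mu_hat, sigma_hat = log_claims.mean(), log_claims.std(ddof=1)
print(mu_hat, sigma_hat)               # close to (1.0, 0.8)
```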

### **2.3 Nonparametric loss distribution**

In nonparametric loss distributions (e.g. the empirical distribution function), all the data on a certain risk type are considered. In other words, we do not have to estimate any parameters, as all the data are assumed available (which is hardly ever the case); see [2, 16]. According to [7], the CDF of the empirical distribution, *F<sub>n</sub>*(*x*), is given by:

$$F\_n(x) = \frac{1}{n} \# \{ i : x\_i \le x \} \tag{1}$$

where #{∙} denotes the number of observations *x<sub>i</sub>* ≤ *x*, and *n* is the total number of observations in the sample.
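Eq. (1) translates directly into code; the function and sample data below are a minimal illustrative sketch:

```python
import numpy as np

def ecdf(sample, x):
    """Empirical CDF F_n(x) = (1/n) * #{i : sample_i <= x}, as in Eq. (1)."""
    sample = np.asarray(sample, dtype=float)
    return np.count_nonzero(sample <= x) / sample.size

losses = [1.2, 3.4, 0.7, 5.9, 2.1]
print(ecdf(losses, 2.5))   # 3 of the 5 observations are <= 2.5, so 0.6
```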

Some of the advantages of nonparametric loss distributions are as follows:


Below are some of the disadvantages:

• Less efficient to compute than parametric methods and may provide inaccurate results, especially when the underlying distribution is actually known.

### **2.4 Risk quantification**

According to [18, 24], the LDA is widely used to quantify operational risk; moreover, both [18, 24] showed that when quantifying operational risk, the PDF of an occurrence and the frequency of that occurrence are approximated first for a certain risk type or business line and later for the institution as a whole. The process of deriving these probability distributions has three steps: first, the loss severity distribution is derived; second, the loss frequency distribution is derived; and last, the aggregate loss distribution is found by compounding the severity and frequency distributions.
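The three steps can be sketched with a Monte Carlo simulation. The severity (log-normal) and frequency (Poisson) models and all parameter values below are illustrative assumptions, not calibrated to any real data:

```python
import numpy as np

rng = np.random.default_rng(3)
n_sims = 20_000   # number of simulated years

# Steps 1-2: assumed severity (log-normal) and frequency (Poisson) models.
counts = rng.poisson(lam=10, size=n_sims)              # losses per year

# Step 3: compound them -- each year's aggregate loss is the sum of
# `counts[k]` independent severities.
aggregate = np.array(
    [rng.lognormal(mean=8.0, sigma=1.5, size=k).sum() for k in counts]
)
print(aggregate.mean())   # Monte Carlo estimate of the expected annual loss
```

The empirical distribution of `aggregate` is the simulated aggregate loss distribution, from which risk measures such as VaR can then be read off as quantiles.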

The Value-at-Risk (VaR), which combines expected and unexpected losses, is used when approximating the PDFs, and [18, 24] stated that the Capital-at-Risk (CaR) given in Eq. (2) is just the VaR, computed for a certain risk type cell and a certain occurrence type:

$$\text{CaR}(i, j; \alpha) = \text{EL}(i, j) + \text{UL}(i, j; \alpha) \tag{2}$$

Note that in Eq. (2), the indices *i* and *j* denote a given business line and a given event type, EL(*i*, *j*) is the expected loss, and UL(*i*, *j*; *α*) is the unexpected loss at significance level *α*.
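Under one common convention, UL is defined as VaR minus EL, so that CaR in Eq. (2) coincides with VaR, consistent with the statement above. A sketch of this computation from simulated losses (the loss model and the confidence level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
# Simulated aggregate losses for one (business line, event type) cell;
# the log-normal model and its parameters are illustrative assumptions.
losses = rng.lognormal(mean=8.0, sigma=1.5, size=100_000)

alpha = 0.999                        # confidence level (assumed)
var = np.quantile(losses, alpha)     # Value-at-Risk at level alpha
el = losses.mean()                   # expected loss, EL(i, j)
ul = var - el                        # unexpected loss, UL(i, j; alpha)
car = el + ul                        # CaR(i, j; alpha) = EL + UL = VaR
print(el, ul, car)
```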

Another method of quantifying risk is the internal measurement approach (IMA). According to [18], when using IMA, the business type and the event type risk are both quantified using

$$\text{CaR}(i, j) = \text{EL}(i, j) \times \gamma(i, j) \times \text{RPI}(i, j) \tag{3}$$

where *γ* is the scaling factor and RPI is the risk profile index.

The LDA is of great importance when computing regulatory capital, but as noted by [18], even though the LDA is such a great tool, it also has its downside. This is due to a lack of data: even if a bank keeps records of large amounts of losses, they may still be unrepresentative of potential extreme losses. Three popular approaches under the LDA are extreme value theory (EVT), VaR, and IMA, which are briefly discussed in Section 3.

Risks need to be measurable so that they can be evaluated and examined. Ideally, high-quality historical data are available that can be subjected to in-depth statistical analysis; in practice, many quantification models lack such data. The type of risk determines which quantification technique to use. In finance, the main methods to quantify risk are outlined below; see [2, 3]:

• Dynamic financial analysis

This simulates the enterprise's overall risks as well as their interactions. Typically, forecast balance sheets and projected income statements are produced as outputs using cashflows.

• Financial Conditions Reports (FCR)

The Financial Conditions Report (FCR) displays both the current state of solvency and potential future developments. The volume and profitability of new business as well as any special characteristics it might have would typically be projected.

• Quantitative methods

Quantitative methods are employed for risks in insurance and underwriting, markets, and economies, such as interest rate, basis risk, and market fluctuations. Time series and scenario analysis might be included, as well as the fitting of statistical models and subsequent calculation of risk metrics like VaR.

• Credit risk models

Instead of measuring the risk in a credit portfolio, these models assess the credit risk of a single entity (business or person). Such risks may be quantified objectively as well as subjectively, and counterparty risk is one of them. A credit risk model's job is to take the state of the overall economy and the circumstances surrounding the company under consideration as inputs and provide a credit spread as an output. In this context, structural and reduced-form models are the two main groups of credit risk models. Based on the value of a company's assets and obligations, structural models are used to assess the likelihood that a default will occur.

• Asset Liability Modeling (ALM)

This approach, which is common in the insurance industry and primarily measures liquidity and capital requirements, might be used by various types of financial companies.

• Scenario analysis

Operational hazards and other risks that are challenging to measure, such as legal risk, regulatory risk, agency risk, moral hazard, strategic risk, political risk, and reputational risk, are often covered by this approach.

• Sensitivity testing

Each parameter is changed separately to measure how much the model's outputs fluctuate, i.e. how sensitive they are to the different variables.

### **2.5 Other risks**

Another risk that is quantified using the LDA is credit risk, which involves the estimation of expected and unexpected losses from credit defaults; see [10]. Note that [21] used the inverse Gaussian distribution to quantify credit risk, employing copula functions and the Laplace transformation in an algorithm that quantifies the corresponding probabilities. In this chapter, our focus is not on credit risk; hence, a reader interested in how the probability distribution of defaults is quantified can consult the articles by [10, 18].
