Such a distribution is assumed to be an adequate reflection of the past, but needs to be forward-looking in the sense that anticipated future losses are taken into account. The constructed distribution may then be used to answer questions like 'What aggregate loss level will be exceeded only once in *c* years?', 'What is the expected annual aggregate loss level?' or 'If we want to guard ourselves against a one-in-a-thousand-year aggregate loss, how much capital should we hold next year?' The aggregate loss distribution and its quantiles will provide answers to these questions and it is therefore paramount that this distribution is modelled and estimated as accurately as possible. Often it is the extreme quantiles of this distribution that are of interest.

Under Basel II's advanced measurement approach, banks may use their own internal models to calculate their operational risk capital, and the LDA is known to be a popular method for this. A bank must be able to demonstrate that its approach captures potentially severe 'tail' events, and it must hold capital to protect itself against a one-in-a-thousand-year aggregate loss. To determine this capital amount, the 99.9% Value-at-Risk (VaR) of the aggregate distribution is calculated [1]. In order to estimate a one-in-a-thousand-year loss, one would hope that at least a thousand years of historical data is available. However, in reality only between five and ten years of internal data is available, and scenario assessments by experts are often used to augment the historical data and to provide a forward-looking view. The much anticipated implementation of Basel III will require banks to calculate operational risk capital on a new standardised approach, which is simple, risk-sensitive and comparable between different banks [2]. Although the more sophisticated internal models described above will no longer be allowed in determining minimum regulatory capital, these models will remain relevant for the determination of economic capital and for decision making within banks and other financial institutions. It is also suggested that LDA models would form an integral part of the supervisory review of a bank's internal operational risk management process [3]. For this reason, we believe the LDA remains relevant and will continue to be studied and improved on.

In this chapter we provide an exposition of statistical methods that may be used to estimate VaR using historical data in combination with quantile assessments by experts. The proposed approach has been discussed and studied elsewhere (see [4]), but specifically in the context of operational risk and economic capital estimation. In this chapter we concentrate on the estimation of the VaR of the aggregate loss or claims distribution and strive to make the approach more accessible to a wider audience. Also, based on the implementation done for major banks, we include some practical guidelines for the use and implementation of the method in practice. In the next section we discuss two approaches, Monte Carlo and Single Loss Approximation, that may be used for the approximation of VaR assuming known distributions and parameters. Then, in the third section (Historical data and scenario modelling), we discuss the available sources of data and formulate the scenario approach and how scenarios may be created and assessed by experts. This is followed, in section four (Estimating VaR), by the estimation of VaR using three modelling approaches. In the fifth section (Implementation recommendations) some guidelines on the implementation of the preferred approach are given. Some concluding remarks are made in the last section.

**2. Approximating VaR**

Let the random variable *N* denote the annual number of loss events and assume that *N* is distributed according to a Poisson distribution with parameter *λ*, i.e. $N \sim \mathrm{Poi}(\lambda)$. Note that one could use other frequency distributions, such as the negative binomial, but we found that the Poisson is by far the most popular in practice since it fits the data well. Furthermore, assume that the random variables $X_1, \dots, X_N$ denote the loss severities of these loss events and that they are independently and identically distributed according to a severity distribution *T*, i.e. $X_1, \dots, X_N \sim \text{iid } T$. Then the annual aggregate loss is $A = \sum_{n=1}^{N} X_n$ and the distribution of *A* is the aggregate loss distribution, which is a compound Poisson distribution that depends on *λ* and *T* and is denoted by $\mathrm{CoP}(T, \lambda)$. Of course, in practice we do not know *T* and *λ* and have to estimate them. First we have to decide on a model for *T*, which can be a class of distributions $F(x, \theta)$. Then *θ* and *λ* have to be estimated using statistical estimators.
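As a concrete illustration of this construction (a sketch, not from the chapter), one draw of the annual aggregate loss can be simulated as follows; the lognormal severity and the parameter values below are placeholders for the unknown *T* and *λ*:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

lam = 20                                         # assumed annual frequency (Poisson parameter)
severity = stats.lognorm(s=2.0, scale=50_000)    # assumed stand-in for the unknown severity T

n = rng.poisson(lam)                             # number of loss events in the year, N ~ Poi(lam)
x = severity.rvs(size=n, random_state=rng)       # severities X_1, ..., X_N drawn iid from T
aggregate_loss = x.sum()                         # A = X_1 + ... + X_N
print(n, f"{aggregate_loss:,.0f}")
```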

The compound Poisson distribution $\mathrm{CoP}(T, \lambda)$ and its VaR are difficult to calculate analytically, so in practice Monte Carlo (MC) simulation is often used. This is done by generating *N* according to the assumed frequency distribution, then generating $X_1, \dots, X_N$ independently and identically distributed according to the true severity distribution *T*, and calculating $A = \sum_{n=1}^{N} X_n$. This process is repeated *I* times independently to obtain $A_i$, $i = 1, 2, \dots, I$, and the 99.9% VaR is then approximated by $A_{([0.999 I]+1)}$, where $A_{(i)}$ denotes the *i*-th order statistic and $[k]$ the largest integer contained in *k*. Note that three input items are required to perform this, namely the number of repetitions *I* as well as the frequency and loss severity distributions. The number of repetitions determines the accuracy of the approximation: the larger it is, the higher the accuracy. In order to illustrate the Monte Carlo approximation method, we assume that the Burr is the true underlying severity distribution and we use six parameter sets corresponding to an extreme value index (EVI) of 0.33, 0.83, 1.0, 1.33, 1.85 and 2.35, as indicated in **Table 1** below. See Appendix A for a discussion of the characteristics of this distribution and its properties. We take the number of repetitions as *I* = 1 000 000 and repeat the calculation of VaR 1000 times. The 90% band containing the VaR values is shown in **Figure 1** below. Here the lower (upper) bound has been determined by dividing the 5% (95%) percentile of the 1000 VaR values by their median and subtracting 1. In mathematical terms the 90% band is defined as

$$\left[\frac{VaR_{(51)}}{\mathrm{Median}(VaR_1, \dots, VaR_{1000})} - 1,\; \frac{VaR_{(951)}}{\mathrm{Median}(VaR_1, \dots, VaR_{1000})} - 1\right],$$

where $VaR_{(k)}$ denotes the *k*-th order statistic. From **Figure 1** it is clear that the spread, as measured by the 90% band, declines with increasing *λ* but increases with increasing EVI.
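A minimal Monte Carlo sketch of the procedure just described (not the chapter's code) is given below. The Burr XII parameters are assumptions, chosen under scipy's parameterisation only so that the EVI equals $1/(c \cdot d) \approx 0.33$, and *I* is kept smaller than the chapter's 1 000 000 so the example runs quickly.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

lam = 10                                               # assumed Poisson frequency
severity = stats.burr12(c=2.0, d=1.5, scale=10_000)    # assumed Burr XII severity, EVI = 1/(c*d) ~ 0.33
I = 50_000                                             # repetitions (the chapter uses I = 1 000 000)

def simulate_var(i=I):
    """Simulate i annual aggregate losses and return the 99.9% VaR estimate."""
    counts = rng.poisson(lam, size=i)                          # N for each simulated year
    losses = severity.rvs(size=counts.sum(), random_state=rng) # pooled severities
    years = np.repeat(np.arange(i), counts)                    # year index of each individual loss
    agg = np.bincount(years, weights=losses, minlength=i)      # A_1, ..., A_i
    k = int(np.floor(0.999 * i))                               # [0.999*i]; index k is the (k+1)-th order statistic
    return np.sort(agg)[k]

print(f"Approximate 99.9% VaR: {simulate_var():,.0f}")

# Repeating simulate_var() 1000 times and taking the 5% and 95% percentiles of the
# resulting estimates, divided by their median minus 1, reproduces the 90% band above.
```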

In principle, infinitely many repetitions are required to get the exact true VaR. The large number of simulation repetitions involved in the MC approach above motivates the use of other numerical methods such as Panjer recursion, methods based on fast Fourier transforms [5] and the single loss approximation (SLA) method (see e.g. [6]).


**Table 1.** *Parameter sets of Burr distribution.*
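The parameter values themselves are not reproduced here. As an aside, if one works with scipy's Burr XII parameterisation (an assumption; the chapter's own parameterisation is given in Appendix A), the EVI equals $1/(c \cdot d)$, which makes it easy to pick parameters matching the EVI values listed above:

```python
from scipy import stats

def burr12_evi(c: float, d: float) -> float:
    """EVI of the Burr XII with survival function (1 + x**c)**(-d): tail index c*d, so EVI = 1/(c*d)."""
    return 1.0 / (c * d)

print(burr12_evi(2.0, 1.5))    # 0.333..., the smallest EVI used in the chapter's illustration
print(burr12_evi(1.0, 0.75))   # 1.333...; note that for EVI >= 1 the severity mean is infinite

severity = stats.burr12(c=1.0, d=0.75, scale=10_000)   # usable directly as a severity model
```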


**Figure 1.** *Variation obtained in the VaR estimates for different values of EVI and frequency.*

For a detailed comparison of numerical approximation methods, the interested reader is referred to [7]. The SLA has become very popular in the financial industry due to its simplicity and can be stated as follows: if *T* is the true underlying severity distribution function of the individual losses and *λ* the true annual frequency, then the $100(1-\gamma)\%$ VaR of the compound loss distribution may be approximated by $T^{-1}(1-\gamma/\lambda)$ or, as modified by [8] for large *λ*, by $T^{-1}(1-\gamma/\lambda) + \lambda\mu$, where *μ* is the finite mean of the true underlying severity distribution. The first order approximation by [6]

$$CoP^{-1}(1 - \gamma) \approx T^{-1}(1 - \gamma/\lambda),\tag{1}$$



states that the $100(1-\gamma)\%$ VaR of the aggregate loss distribution may be approximated by the $100(1-\gamma/\lambda)\%$ VaR of the severity distribution, if the latter belongs to the sub-exponential class of distributions. This follows from a theorem from extreme value theory (EVT) which states that $P\left(A = \sum_{n=1}^{N} X_n > x\right) \approx P\left(\max\{X_1, \dots, X_N\} > x\right)$ as $x \to \infty$ (see e.g. [9]). The result is quite remarkable in that a quantile of the aggregate loss distribution may be approximated by a more extreme quantile (if *λ* > 1) of the underlying severity distribution. EVT is all about modelling extremal events and is especially concerned with modelling the tail of a distribution (see e.g. [10]), i.e. that part of the distribution we are most interested in. Bearing this in mind, we might consider modelling the body and tail of the severity distribution separately, as follows.
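The SLA is a one-line computation once a severity model is fixed. The sketch below (an assumed lognormal severity and assumed parameter values, not the chapter's) evaluates both Eq. (1) and the mean-corrected version of [8] for the one-in-a-thousand-year level:

```python
from scipy import stats

severity = stats.lognorm(s=2.0, scale=50_000)   # assumed stand-in for the true severity T
lam = 20.0                                      # assumed annual frequency
gamma = 0.001                                   # one-in-a-thousand-year level, i.e. the 99.9% VaR

sla = severity.ppf(1.0 - gamma / lam)           # T^{-1}(1 - gamma/lambda), Eq. (1)
sla_corrected = sla + lam * severity.mean()     # T^{-1}(1 - gamma/lambda) + lambda*mu, per [8]
print(f"SLA: {sla:,.0f}   mean-corrected SLA: {sla_corrected:,.0f}")
```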

Let *q* be a quantile of the severity distribution *T*. We use *q* as a threshold that splices *T* in such a way that the interval below *q* is the expected part and the interval above *q* the unexpected part of the severity distribution. Define two distribution functions

$$T_e(x) = T(x)/T(q) \text{ for } x \le q \text{ and}$$

$$T_u(x) = [T(x) - T(q)]/[1 - T(q)] \text{ for } x > q,\tag{2}$$

i.e. $T_e(x)$ is the conditional distribution function of a random loss $X \sim T$ given that $X \le q$, and $T_u(x)$ is the conditional distribution function given that $X > q$.

Note that we then have the identity


$$T(x) = T(q)\,T_e(x) + [1 - T(q)]\,T_u(x) \text{ for all } x.\tag{3}$$
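To make the splice concrete, the following sketch (not from the chapter; the simulated loss sample, the lognormal generator and the 90th-percentile threshold are all assumptions) models the body of the severity distribution empirically and the tail with a GPD fitted to the exceedances above *q*, recombining the two parts as in Eq. (3); the motivation for using a GPD tail is given in the paragraph that follows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Historical losses: simulated here purely as a stand-in for real internal/external data.
losses = stats.lognorm(s=2.0, scale=50_000).rvs(size=2_000, random_state=rng)

q = np.quantile(losses, 0.90)            # splicing threshold (assumed at the 90th percentile)
body = np.sort(losses[losses <= q])
excess = losses[losses > q] - q          # exceedances above the threshold

# Tail: GPD fitted to the exceedances, with location fixed at 0.
shape, _, scale = stats.genpareto.fit(excess, floc=0)
p_q = np.mean(losses <= q)               # estimate of T(q)

def spliced_cdf(x):
    """T(x) = T(q)*Te(x) + [1 - T(q)]*Tu(x): empirical body, GPD tail, as in Eq. (3)."""
    x = np.atleast_1d(x).astype(float)
    te = np.searchsorted(body, np.minimum(x, q), side="right") / body.size
    tu = stats.genpareto.cdf(x - q, shape, loc=0, scale=scale)
    return np.where(x <= q, p_q * te, p_q + (1 - p_q) * tu)

print(spliced_cdf([q / 2, q, 5 * q]))
```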

This identity represents $T(x)$ as a mixture of the two conditional distributions. Instead of modelling $T(x)$ with a class of distributions $F(x, \theta)$, we may now consider modelling $T_e(x)$ with $F_e(x, \theta)$ and $T_u(x)$ with $F_u(x, \theta)$. Borrowing from EVT, a popular choice for $F_u(x, \theta)$ could be the generalised Pareto distribution (GPD), whilst a host of choices are available for $F_e(x, \theta)$, the obvious being the empirical distribution. Note that the Pickands-Balkema-de Haan limit theorem (see e.g. [11]) states that the conditional tail of all distributions in the domain of attraction of the generalised extreme value distribution (GEV) tends to a GPD. The distributions in the domain of attraction of the GEV form a wide class, which includes most distributions of interest to us. Although one could consider alternative distributions to the GPD for modelling the tail of a severity distribution, this theorem, and the limiting conditions that we are interested in, suggest that the GPD is a good choice. In the fourth section (Estimating VaR) we will discuss this in more detail.

**3. Historical data and scenario modelling**

It is practice in operational risk management to use different data sources for modelling future losses. Banks have been collecting their own data, but realistically, most banks only have between five and ten years of reliable loss data. To address this shortcoming, loss data from external sources and scenario data can be used by banks in addition to their own internal loss data and controls [12]. Certain external loss databases exist, including publicly available data, insurance data and consortium data. The process of incorporating data from external sources requires due consideration because of biases in the external data. One method of combining operational losses collected from various banks of different sizes and loss reporting thresholds is discussed in [13]. In the remainder of our discussion we will only refer to historical data, which may be a combination of internal and external loss data. Three types of scenario assessments are also suggested to improve the estimation of the severity distribution, namely the individual scenario approach, the interval approach and the percentile approach. In the remainder of the chapter we discuss the percentile approach, as we believe it is the most practical of the existing approaches available in the literature [4]. That being said, it should be noted that probability assessments by experts are notoriously difficult and unreliable, as discussed in [14]. We mentioned previously that it is often an extreme quantile of the aggregate loss distribution that is of interest. In the case of operational risk, the regulator requires that the one-in-a-thousand-year quantile of this distribution be estimated, in other words the aggregate loss level that will be exceeded once in a thousand years. Considering that banks only have limited historical data available, i.e. a maximum of ten years of internal data, the estimation of such a quantile using historical data only is a near impossible task. So modellers have suggested the use of scenarios and experts' assessments thereof.

We advocate the use of the so-called 1-in-*c* year scenario approach as discussed in [4]. In the 1-in-*c* years scenario approach, the experts are asked to answer the question: 'What loss level $q_c$ is expected to be exceeded once every *c* years?'. Popular choices for *c* vary between 5 and 100, and often three values of *c* are used. As an example, the bank alluded to at the start of this chapter used *c* = 7, 20 and 100 and motivated the first choice as the number of years of reliable historical data available to them. In this case the largest loss in the historical data may serve as a guide for
