**1. Introduction**

A natural disaster is a sudden and adverse event, caused by natural forces, that significantly disrupts the workings of society [1]. Such events result in loss of life, injury or other negative health impacts, loss of livelihoods, environmental damage, and social and economic disruption. The natural forces causing natural disasters are called natural hazards. A natural hazard can be formally defined as a natural process or phenomenon that may cause loss of livelihoods and services, negative health impacts such as loss of life and injury, social and economic disruption, or environmental damage [2]. A natural hazard therefore becomes a disaster once it occurs. The Centre for Research on the Epidemiology of Disasters database (EM-DAT) generally classifies natural disasters according to the hazards that cause them: geophysical (earthquakes, volcanoes, landslides and tsunamis), hydrological (floods and avalanches), climatological (droughts and wildfires), meteorological (storms and cyclones) or biological (epidemics and pest/insect plagues). Hydrological, meteorological and climatological hazards are collectively termed weather-related hazards. Natural disasters are extreme events, and we therefore adopt the methods of Extreme Value Theory for the analysis of our data.

#### **1.1 Extreme value modelling**

Extreme value theory (EVT) is used to develop stochastic models aimed at solving real-world problems involving extreme and unusual events, for instance stock market crashes, natural catastrophes and major insurance claims. EVT models the tails of loss severity distributions, as it considers only extreme and rare events. Extreme value modelling has been used widely in hydrology, insurance, finance and environmental applications where extreme events are of interest. Given the scope of this research, we limit the literature review to studies relating to natural disasters. The relevant theory and statistical methods behind EVT are presented by [3].


[4, 5] propose the Generalised Extreme Value (GEV) distribution, which is used to model the maxima of a set of independent and identically distributed (iid) events. The resulting model, referred to as the Block Maxima (BM) model, involves dividing the observations into non-overlapping blocks of equal size and restricting attention to the maximum observation in each block. The maxima so obtained are then assumed to follow the GEV [6]. The block size is vital in the EVT model: blocks that are too small lead to poor asymptotic approximation, while blocks that are too large leave very few observations [7]. [8] use the BM method to model hydrological floods and droughts. They find that using block sizes of one year or less to model droughts leads to bias; floods, on the other hand, are modelled using block sizes of less than one year. The same method is used by [9] to study the seasonal variation of extreme precipitation across the UK. They use one-month blocks, having found little improvement from longer blocks, and are able to identify regions with a high risk of extreme precipitation in a given season. A major drawback of the Block Maxima approach is its inefficiency in terms of data usage: dividing the dataset into blocks is wasteful, since not all the available data are used. To overcome this, the data above some sufficiently high threshold are modelled instead, in what is commonly referred to as the excess over threshold (EOT) approach. Several studies, all based on simulated data, have compared the performance of BM and EOT. [10] states that, when the shape parameter is equal to zero, the EOT estimate of a high quantile is better only if the number of exceedances is larger than 1.65 times the number of blocks. [11] find that EOT is preferable when the shape parameter is greater than zero and the number of exceedances is greater than the number of blocks; however, when the shape parameter is less than zero, BM is more efficient. [12] carries out an extensive simulation study comparing the two methods on time series with various characteristics, measuring accuracy in estimating exceedance probabilities by mean squared error. He finds that EOT estimates for a sample averaging two or more exceedances per year are more accurate than the corresponding BM estimates, and that EOT should be used when fewer than 50 years of data are available; with 200 years of data, the two approaches have similar accuracy. In terms of return value estimates, the EOT estimates are significantly more accurate than those obtained by BM. [6], by contrast, find that the BM method at an optimal level gives a lower asymptotic mean squared error than the EOT method, and conclude that BM is more efficient than EOT in practical situations.
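
To make the BM recipe concrete, here is a minimal sketch in Python (using numpy and scipy) that splits a simulated daily loss series into one-year blocks, fits a GEV to the block maxima by maximum likelihood, and reads off a 100-year return level. The data are simulated placeholders, not the chapter's disaster damage records, and note that scipy's `genextreme` uses the sign convention `c = -xi` for the shape parameter.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Simulated stand-in for a daily loss series: 30 "years" of data.
daily_losses = rng.gumbel(loc=10.0, scale=2.0, size=365 * 30)

block_size = 365                                  # one-year blocks
n_blocks = daily_losses.size // block_size
block_maxima = (daily_losses[: n_blocks * block_size]
                .reshape(n_blocks, block_size)
                .max(axis=1))

# Fit the GEV to the block maxima (scipy shape c = -xi).
c, loc, scale = stats.genextreme.fit(block_maxima)
print(f"GEV fit: xi = {-c:.3f}, location = {loc:.3f}, scale = {scale:.3f}")

# 100-year return level: the 1 - 1/100 quantile of the annual-maximum law.
print("100-year return level:", stats.genextreme.ppf(0.99, c, loc, scale))
```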

Although the results are mixed, we can deduce that EOT is, on average, preferred to BM. Two features dominate across these studies: first, EOT is more efficient than BM when the number of exceedances is greater than the number of blocks; secondly, the two methods have comparable performance for large sample sizes.
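
A minimal Monte Carlo sketch in the spirit of these comparisons (not a reproduction of any cited study's design) is given below. It assumes GPD data with shape $\xi = 0.2$, 50 blocks of 100 observations, and a 90% empirical threshold for EOT, and compares the two estimators of the 99.9% quantile by root mean squared error.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
xi = 0.2
p = 0.999                                  # marginal quantile of interest
true_q = stats.genpareto.ppf(p, xi)
bm_err, eot_err = [], []

for _ in range(200):
    x = stats.genpareto.rvs(xi, size=50 * 100, random_state=rng)

    # BM: 50 blocks of 100; the p-quantile of X is the p**100 quantile
    # of the block maximum.
    m = x.reshape(50, 100).max(axis=1)
    c, loc, sc = stats.genextreme.fit(m)
    bm_err.append(stats.genextreme.ppf(p ** 100, c, loc, sc) - true_q)

    # EOT: GPD fitted to exceedances of the empirical 90% threshold.
    u = np.quantile(x, 0.90)
    exc = x[x > u] - u
    c2, _, sc2 = stats.genpareto.fit(exc, floc=0.0)
    eot_err.append(u + stats.genpareto.ppf((p - 0.90) / 0.10, c2, scale=sc2)
                   - true_q)

print("BM  RMSE:", np.sqrt(np.mean(np.square(bm_err))))
print("EOT RMSE:", np.sqrt(np.mean(np.square(eot_err))))
```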

The biggest challenge in using EOT is selecting an appropriate threshold. Threshold selection is always a trade-off between bias and variance [13]. Moreover, the choice of threshold significantly affects the tail of the distribution and hence the GPD parameters. The traditional extreme value modelling approach uses a fixed threshold, chosen and fixed before the model is fitted, usually on the basis of various diagnostic plots that are used to evaluate the model fit. The traditional EVT approach considers only the extreme data above the threshold and discards the rest. [7] states that the motivation for ignoring the non-extreme data is that extreme and non-extreme data are generated by different processes, and the latter is rarely of concern; furthermore, there is concern that the data below the threshold may compromise the examination of the tail fit. However, there has been increasing interest in extreme value mixture models, which use all the data for parameter estimation. This class of models has the potential to overcome some of the difficulties encountered with the traditional EVT approach with regard to threshold selection and the uncertainty associated with it. The next section highlights the development of extreme value mixture models.
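
As an illustration of the fixed-threshold workflow, the sketch below tabulates the mean excess $e(u) = E[X - u \mid X > u]$ over a grid of candidate thresholds (the numbers behind a mean residual life plot, which should be roughly linear in $u$ where the GPD fits), fixes the threshold at the empirical 95% quantile, and fits a GPD to the exceedances. The log-normal data and the 95% choice are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
losses = stats.lognorm.rvs(s=1.0, scale=50.0, size=5000, random_state=rng)

# Mean excess e(u): roughly linear in u on the range where a GPD fits.
grid = np.quantile(losses, np.linspace(0.5, 0.99, 25))
mean_excess = [np.mean(losses[losses > u] - u) for u in grid]
for u, e in zip(grid[::6], mean_excess[::6]):
    print(f"u = {u:8.2f}   mean excess = {e:8.2f}")

u = np.quantile(losses, 0.95)          # fixed threshold (a judgment call)
exceedances = losses[losses > u] - u
xi, _, sigma = stats.genpareto.fit(exceedances, floc=0.0)
print(f"GPD over u = {u:.2f}: shape xi = {xi:.3f}, scale sigma = {sigma:.3f}")
```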

#### **1.2 Extreme value mixture models**


Extreme value mixture models typically comprise two components: a bulk model, which describes all the non-extreme data below the threshold, and a tail model, the traditional extreme value model for the extreme data above the threshold. The advantage of these models is that they use all the data, wasting no information, and provide an automated approach both for estimating the threshold and for quantifying the uncertainty arising from the threshold choice [7].

[14] propose a dynamically weighted mixture model combining a light-tailed distribution for the bulk with a GPD for the tail, using a Weibull distribution for the bulk. Rather than explicitly defining a threshold, they adopt a Cauchy distribution function as the mixing function, which makes the transition between the bulk and tail distributions. They use maximum likelihood for parameter estimation and numerical iteration to calculate empirical quantiles. They test the model on the Danish fire loss dataset and find the parameter estimates to be comparable to those reported in the literature for EOT inference; the quantiles are comparable as well. The model is also compared with an existing robust thresholding approach and is found to be better for very small percentiles in small datasets.
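
A sketch of a density of this dynamically weighted type is given below: a Weibull bulk, a GPD tail anchored at zero, a Cauchy distribution function as the mixing weight, and a normalising constant obtained by numerical integration. The parameter values are illustrative assumptions, not estimates from [14].

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Illustrative parameters (not fitted): the Cauchy-CDF weight shifts
# mass from the Weibull bulk to the GPD tail as x grows.
WB_SHAPE, WB_SCALE = 1.5, 10.0      # bulk (Weibull)
XI, SIGMA = 0.3, 8.0                # tail (GPD anchored at 0)
MU, TAU = 12.0, 3.0                 # mixing function (Cauchy CDF)

def unnormalised(x):
    p = stats.cauchy.cdf(x, loc=MU, scale=TAU)
    return ((1.0 - p) * stats.weibull_min.pdf(x, WB_SHAPE, scale=WB_SCALE)
            + p * stats.genpareto.pdf(x, XI, scale=SIGMA))

# Normalising constant, computed numerically over the support.
Z, _ = quad(unnormalised, 0.0, np.inf)

def dwm_pdf(x):
    """Density of the dynamically weighted mixture."""
    return unnormalised(np.asarray(x, dtype=float)) / Z

print(dwm_pdf([1.0, 10.0, 50.0]))
```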

[15] develop an extreme value mixture model comprising a truncated gamma for the bulk distribution and a GPD for the tail, noting that any other parametric distribution could be used for the bulk. They treat the threshold as an unknown parameter and use Bayesian statistics for inference about the unknown parameters. Inference on the posterior distribution is carried out using Markov chain Monte Carlo methods, and simulation studies are conducted to analyse the performance of the model. They find that the parameter estimates are very close to the true values for large samples; for small samples, however, they encounter problems with convergence, and consequently with parameter estimation. They also apply the model to the Nasdaq-100 dataset, a US stock market index, and compare its performance with the traditional extreme value model. The resulting GPD estimates are close to those obtained using the traditional approach.
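
As a sketch of the likelihood this model rests on, the function below combines a gamma density below an unknown threshold `u` with a GPD for the exceedances of `u`, the tail carrying the gamma survival mass at `u` so that the density integrates to one. [15] place priors on the five parameters and sample the posterior by MCMC; only the shared log-likelihood ingredient is shown here, and the parameter values in the usage lines are arbitrary.

```python
import numpy as np
from scipy import stats

def gamma_gpd_loglik(params, x):
    """Log-likelihood of a gamma-bulk / GPD-tail mixture with threshold u."""
    a, b, u, xi, sigma = params          # gamma(a, scale=b); GPD(xi, sigma)
    if min(a, b, u, sigma) <= 0:
        return -np.inf                   # outside the parameter space
    below, above = x[x <= u], x[x > u]
    phi = stats.gamma.sf(u, a, scale=b)  # tail fraction P(X > u)
    ll = np.sum(stats.gamma.logpdf(below, a, scale=b))            # bulk
    ll += above.size * np.log(phi)                                # tail mass
    ll += np.sum(stats.genpareto.logpdf(above - u, xi, scale=sigma))
    return ll

rng = np.random.default_rng(2)
x = stats.gamma.rvs(2.0, scale=5.0, size=2000, random_state=rng)
print(gamma_gpd_loglik([2.0, 5.0, np.quantile(x, 0.9), 0.1, 4.0], x))
```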

#### **1.3 Compound extreme value distribution**

Unlike extreme value theory, which is concerned only with the tails of loss severity distributions, Compound Extreme Value Distribution (CEVD) models can simultaneously model the frequency and severity of extreme events. The idea was proposed by Tebfu and Fengshi in 1980 to model typhoons in South China [16]. They consider only the events above the threshold, as in traditional EVT, and assume that the number of exceedances (the frequency) is Poisson distributed.

The exceedances (the severity), on the other hand, are assumed to follow a Gumbel distribution, a special case of the generalised extreme value distribution; hence the model is usually referred to as Poisson-Gumbel. Tebfu and Fengshi [7] also used a CEVD to model hurricane characteristics along the Atlantic coast and the Gulf of Mexico, assuming that the number of exceedances follows a Poisson distribution and that the exceedances are Weibull distributed.
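
Under the Poisson frequency assumption with iid Gumbel severities, the annual maximum has the closed form $F(y) = \exp\{-\lambda(1 - G(y))\}$, since $\sum_{k \ge 0} e^{-\lambda} \lambda^k G(y)^k / k! = e^{-\lambda(1 - G(y))}$. The sketch below checks this identity by simulation; $\lambda$ and the Gumbel parameters are illustrative assumptions.

```python
import numpy as np
from scipy import stats

lam, loc, scale = 3.0, 20.0, 5.0
rng = np.random.default_rng(3)

def annual_max_cdf(y):
    """Closed-form CDF of the annual maximum: exp(-lam * (1 - G(y)))."""
    return np.exp(-lam * (1.0 - stats.gumbel_r.cdf(y, loc=loc, scale=scale)))

# Monte Carlo check: draw a Poisson count of events per year, then take
# the yearly maximum (years with no events count as max = -inf <= y).
sims = []
for _ in range(20000):
    n = rng.poisson(lam)
    sims.append(rng.gumbel(loc, scale, size=n).max() if n > 0 else -np.inf)
sims = np.asarray(sims)

for y in (25.0, 30.0, 40.0):
    print(f"y = {y}: closed form {annual_max_cdf(y):.4f}, "
          f"simulated {np.mean(sims <= y):.4f}")
```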

Initially, CEVD was mostly used in hydrology to model wave heights and the resulting extreme events. The model has been used successfully to predict design wave heights; for instance, Hurricane Katrina of 2005 corresponded to a 60-year return period as predicted by the Poisson-Weibull model [17]. As a result, there have been several extensions to this class of models, including the Bivariate Compound Extreme Value Distribution (BCEVD) model [18] and the Multivariate Compound Extreme Value Distribution (MCEVD) model [17]. In addition, the model has been adopted in a wider range of areas, including finance, insurance, disaster and catastrophe modelling.

[19] investigate the global historical occurrences of tsunamis. They compare the distribution of the number of annual tsunami events using a Poisson distribution and a negative binomial distribution; the latter provides a consistent fit to tsunami events whose height is greater than one. They also investigate the interval distribution using gamma and exponential distributions. The former is found to be a better fit, suggesting that the number of tsunami events is not a Poisson process.

[20] study tsunami events in the USA. They assume that the occurrence frequency of tsunamis in each year follows a Poisson distribution. They then identify the distribution of tsunami heights by fitting six distributions, including the Gumbel, log-normal, Weibull, maximum entropy and GPD distributions, and select the GPD, which has the best fit. They use MLE for parameter estimation and investigate the fit of the Poisson compound extreme value distribution using goodness-of-fit statistics. The results are consistent with [19] in that the Poisson-Generalised Pareto distribution is appropriate for disaster modelling.

For a sample $Y_1, \dots, Y_n$ of iid variables with common distribution function $F$, the maximum $M_n = \max\{Y_1, \dots, Y_n\}$ has distribution function $F^n$, which degenerates as $n$ tends to infinity: since $F(y) < 1$ for $y < y_{\text{sup}}$, where $y_{\text{sup}}$ is the upper end-point of $F$, we have $F^n(y) \to 0$ as $n \to \infty$. We can remove the degeneracy problem by allowing a linear re-normalisation of $M_n$. Consider the linear re-normalisation

$$\hat{M}_n = \frac{M_n - d_n}{c_n} \qquad (3)$$

where $\{c_n\}$ and $\{d_n\}$ are sequences of constants with $c_n > 0$. Under a suitable choice of $c_n$ and $d_n$, the distribution of $\hat{M}_n$ can be stabilised, which leads to the extremal types theorem:

**Theorem 1.1** [Extremal Types Theorem]. *If there exist sequences of constants $\{c_n\} > 0$ and $\{d_n\}$ such that, as $n \to \infty$,*

$$P\left( \frac{M_n - d_n}{c_n} \le y \right) \to G(y) \qquad (4)$$

*for a non-degenerate distribution function $G$, then $G$ belongs to one of the following families:*

Gumbel:

$$G(y) = \exp\left\{ -\exp\left[ -\left( \frac{y - d}{c} \right) \right] \right\}, \qquad -\infty < y < \infty \qquad (5)$$

Frechet:

$$G(y) = \begin{cases} 0, & y \le d \\[4pt] \exp\left\{ -\left( \dfrac{y - d}{c} \right)^{-\alpha} \right\}, & y > d \end{cases} \qquad (6)$$

Weibull:

$$G(y) = \begin{cases} \exp\left\{ -\left[ -\left( \dfrac{y - d}{c} \right) \right]^{\alpha} \right\}, & y < d \\[4pt] 1, & y \ge d \end{cases} \qquad (7)$$

for $c > 0$ and $d \in \mathbb{R}$.

The proof of this theorem is presented in [21]. The three classes of distributions are called extreme value distributions, of type I (Gumbel), type II (Frechet) and type III (Weibull) respectively. The extremal types theorem implies that, regardless of the population distribution of $M_n$, if a non-degenerate limit can be obtained by linear re-normalisation, then the limiting distribution will be one of the three extreme value distributions.

In modelling an unknown population distribution, we choose one of the three types of limiting distributions and then estimate the model parameters. This approach is, however, deemed inefficient, as the uncertainty associated with the choice is not considered in the subsequent inference [22]. A better approach is to combine the three models into one single family, with the three distributions being special cases of the universal distribution.

#### **2.1 The generalised extreme value distribution**

Von Mises [4] and Jenkinson [5] combined the three types of extreme value distributions, leading to the generalised extreme value (GEV) distribution.
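
For reference, the resulting family is standard; stated in the notation used above, with location $d$, scale $c > 0$ and shape parameter $\xi$, its distribution function is

$$G(y) = \exp\left\{ -\left[ 1 + \xi \left( \frac{y - d}{c} \right) \right]^{-1/\xi} \right\}, \qquad 1 + \xi \left( \frac{y - d}{c} \right) > 0,$$

with the Frechet case recovered for $\xi > 0$, the Weibull case for $\xi < 0$, and the Gumbel case in the limit $\xi \to 0$.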
