*Natural Hazards - Impacts, Adjustments and Resilience*

Theorem 1.2 If there exist sequences of constants *c<sub>n</sub>* and *d<sub>n</sub>*, such that

$$P\left\{\frac{M\_n - d\_n}{c\_n} \le y\right\} \xrightarrow{n \to \infty} G(y) \tag{8}$$

where G is a non-degenerate distribution function, then G is a member of the GEV family:

$$G(y) = \exp\left\{-\left[1 + \zeta\left(\frac{y - \nu}{\sigma}\right)\right]^{-1/\zeta}\right\} \tag{9}$$

defined on *y* such that 1 + *ζ*(*y* − *ν*)/*σ* > 0 and with parameters: scale *σ* > 0, location *ν* ∈ ℝ and shape *ζ* ∈ ℝ.

The shape parameter determines the tail behaviour under the same values of location and scale, and thus indicates the type of extreme value distribution:

• If *ζ* = 0, the GEV becomes a Gumbel distribution and the tail decays exponentially.

• If *ζ* > 0, the GEV simplifies to a Fréchet distribution with a heavy tail which decays polynomially.

• If *ζ* < 0, the GEV becomes a negative Weibull distribution with the upper endpoint being finite.

**3. Threshold models and the generalised Pareto distribution**

Modelling using block maxima is inefficient, as it wastes data when the complete dataset is available. An alternative approach is to model all the data above some high threshold, in what is commonly referred to as threshold modelling or Excess over Threshold (EOT) modelling.

Given a set of independent and identically distributed random variables *Z*<sub>1</sub>, … , *Z<sub>n</sub>*, having a common distribution function, F, we are interested in estimating the conditional excess distribution function, *F<sub>v</sub>*, of the random variable *Z* above a high threshold *v*:

$$F\_v(y) = P(Z - v \le y \mid Z > v), \quad 0 \le y \le z^{+} - v \tag{10}$$

where *y* = *z* − *v* are the exceedances and *z*<sup>+</sup> is the right endpoint of F. We can express *F<sub>v</sub>*(*y*) in terms of *z* as:

$$F\_v(y) = \frac{F(v + y) - F(v)}{1 - F(v)} = \frac{F(z) - F(v)}{1 - F(v)} \tag{11}$$

Pickands [23] proposed that if the underlying distribution F(z) is in the maximum domain of attraction of an extreme value distribution, then the conditional excess distribution function *F<sub>v</sub>*(*z*), for a large *v*, can be approximated by:

$$F\_v(z) \approx F\_{\left(\zeta,\nu,\sigma\right)}(z), \quad v \to \infty \tag{12}$$

where *F*<sub>(*ζ*,*ν*,*σ*)</sub>(*z*) is the Generalised Pareto Distribution (GPD).

**3.1 Generalised Pareto distribution (GPD)**

Definition 1 The random variable Z has a Generalised Pareto Distribution if the cumulative distribution function of Z is given by:

$$F\_{\left(\zeta,\nu,\sigma\right)}\left(z\right) = \begin{cases} 1 - \left(1 + \frac{\zeta\left(z - \nu\right)}{\sigma}\right)^{-1/\zeta} & \text{for } \zeta \neq 0 \\ 1 - \exp\left(-\frac{z - \nu}{\sigma}\right) & \text{for } \zeta = 0 \end{cases} \tag{13}$$

for *z* > *ν* and 1 + *ζ*(*z* − *ν*)/*σ* > 0, with parameters: location *ν* > 0, scale *σ* > 0 and shape *ζ* ∈ ℝ.

Remark 1 (Special Cases) Under specific conditions, the GPD simplifies to other continuous distributions:


The mean of the GPD is:

$$E(Z) = \nu + \frac{\sigma}{1 - \zeta}, \text{for } \zeta < 1\tag{14}$$

The variance of the GPD is:

$$\text{Var}(Z) = \frac{\sigma^2}{\left(1 - \zeta\right)^2 \left(1 - 2\zeta\right)}, \text{for } \zeta < \frac{1}{2} \tag{15}$$

In general, the *r*-th moment of the GPD exists only if *ζ* < 1/*r*.

The shape parameter, *ζ*, determines the tail behaviour of the GPD, as indicated in Remark 1. When *ζ* = 0, the tail decays exponentially; when *ζ* > 0, the tail is heavy; and when *ζ* < 0, the tail is short, with finite upper endpoint *ν* − *σ*/*ζ*.
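As an illustration only, Eqs. (13) to (15) can be written as a short Python sketch. The function names `gpd_cdf`, `gpd_mean` and `gpd_variance` are hypothetical helpers introduced here, not part of the chapter, and the handling of points outside the support (returning 0 or 1) is an assumption made for convenience.

```python
import math


def gpd_cdf(z, shape, loc, scale):
    """CDF of Eq. (13) for shape zeta, location nu and scale sigma."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    y = (z - loc) / scale
    if y <= 0:
        return 0.0  # assumption: no probability mass at or below the location
    if shape == 0:
        return 1.0 - math.exp(-y)  # exponential case of Eq. (13)
    base = 1.0 + shape * y
    if base <= 0:
        return 1.0  # only when shape < 0: z lies beyond the finite upper endpoint
    return 1.0 - base ** (-1.0 / shape)


def gpd_mean(shape, loc, scale):
    """Mean of Eq. (14); it exists only for shape < 1."""
    if shape >= 1:
        raise ValueError("the mean exists only for shape < 1")
    return loc + scale / (1.0 - shape)


def gpd_variance(shape, scale):
    """Variance of Eq. (15); it exists only for shape < 1/2."""
    if shape >= 0.5:
        raise ValueError("the variance exists only for shape < 1/2")
    return scale ** 2 / ((1.0 - shape) ** 2 * (1.0 - 2.0 * shape))
```

For a negative shape such as −0.5 with location 0 and scale 1, the upper endpoint is 0 − 1/(−0.5) = 2, so the sketch returns 1 for any *z* beyond 2.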

#### **3.2 Relationship between GEV and GPD**

Theorem 1.3 Let *Z*<sub>1</sub>, … , *Z<sub>n</sub>* be a sequence of independent and identically distributed random variables with a common cumulative distribution function F, and let *M<sub>n</sub>* = max{*Z*<sub>1</sub>, … , *Z<sub>n</sub>*} satisfy the conditions to be approximated by the GEV, i.e., for large *n*:

$$P\{M\_n \le z\} \approx G(z), \text{ where } G(z) = \exp\left\{-\left[1 + \zeta\left(\frac{z-\nu}{\sigma}\right)\right]^{-1/\zeta}\right\}\tag{16}$$

Then, for a sufficiently high threshold *v*, the distribution function of (*Z* − *v*), conditional on *Z* > *v*, is approximately given by


$$H(y) = 1 - \left(1 + \frac{\zeta y}{\delta}\right)^{-1/\zeta} \tag{17}$$


where *y* = *z* − *v* > 0, 1 + *ζy*/*δ* > 0 and *δ* = *σ* + *ζ*(*v* − *ν*).

**Proof.** Denote the distribution function of the random variable Z by F. By Theorem 1.2, for large *n*,

$$\begin{split} P\{M\_n \le z\} = F^{n}(z) &\approx \exp\left\{-\left[1 + \zeta\left(\frac{z-\nu}{\sigma}\right)\right]^{-1/\zeta}\right\} \\ n \log F(z) &\approx -\left[1 + \zeta\left(\frac{z-\nu}{\sigma}\right)\right]^{-1/\zeta} \end{split} \tag{18}$$

For large values of *z*, Taylor series expansion implies:

$$\log F(z) \approx -\left\{1 - F(z)\right\} \tag{19}$$

Hence,

$$\begin{aligned} n\{1 - F(z)\} &\approx \left[1 + \zeta\left(\frac{z - \nu}{\sigma}\right)\right]^{-1/\zeta} \\ \{1 - F(z)\} &\approx \frac{1}{n} \left[1 + \zeta\left(\frac{z - \nu}{\sigma}\right)\right]^{-1/\zeta} \end{aligned} \tag{20}$$

So, for *z* = *v* + *y*,

$$\{1 - F(v + y)\} \approx \frac{1}{n} \left[1 + \zeta \left(\frac{v + y - \nu}{\sigma}\right)\right]^{-1/\zeta} \tag{21}$$

Thus,

$$\begin{aligned} P\{Z > v + y | Z > v\} &= \frac{1 - P(Z < v + y)}{1 - P(Z < v)} \\ &\approx \frac{n^{-1}\left[1 + \zeta(v + y - \nu)/\sigma\right]^{-1/\zeta}}{n^{-1}\left[1 + \zeta(v - \nu)/\sigma\right]^{-1/\zeta}} \\ &= \left[\frac{1 + \zeta(v + y - \nu)/\sigma}{1 + \zeta(v - \nu)/\sigma}\right]^{-1/\zeta} \\ &= \left[\frac{1 + \zeta(v - \nu)/\sigma + \zeta y/\sigma}{1 + \zeta(v - \nu)/\sigma}\right]^{-1/\zeta} \\ &= \left[1 + \frac{\zeta y/\sigma}{1 + \zeta(v - \nu)/\sigma}\right]^{-1/\zeta} \\ &= \left[1 + \frac{\zeta y}{\sigma + \zeta(v - \nu)}\right]^{-1/\zeta} \\ &= \left[1 + \frac{\zeta y}{\delta}\right]^{-1/\zeta} \end{aligned} \tag{22}$$


Therefore,


$$P\{Z \le v + y | Z > v\} = 1 - \left[1 + \frac{\zeta y}{\delta}\right]^{-1/\zeta} \tag{23}$$

where *y* > 0, 1 + *ζy*/*δ* > 0 and *δ* = *σ* + *ζ*(*v* − *ν*).

Eq. (23) is the generalised Pareto family of distributions.

Theorem 1.3 implies that, under EOT, we can use the GPD to approximate the distribution of threshold exceedances, as an alternative to fitting the GEV to block maxima. The parameters of the GPD are uniquely determined by those of the GEV. In particular, the shape parameter is equal in both cases. When the block size changes in the GEV, the parameters *ν* and *σ* change, while the shape parameter remains constant.

Eq. (23) makes clear that the GPD scale parameter, *δ* = *σ* + *ζ*(*v* − *ν*), depends on the threshold *v*. Threshold choice is therefore an important part of threshold modelling. As with block sizes in the GEV, the choice of threshold is a trade-off between bias and variance [13]. If the threshold is too low, we violate the asymptotic arguments underlying the GPD; if it is too high, too few exceedances remain to estimate the parameters, resulting in a large variance. We now describe the process of selecting an appropriate threshold for modelling extreme events using the GPD.
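The link between the two parameterisations can be checked numerically. The sketch below, under the assumption *ζ* ≠ 0, evaluates the tail approximation of Eq. (20), forms the conditional survival probability as in Eq. (22), and compares it with the GPD survival function 1 − *H*(*y*) using *δ* = *σ* + *ζ*(*v* − *ν*). All function names and the parameter values in the usage lines are illustrative assumptions, not values from the chapter.

```python
def gpd_scale_from_gev(shape, loc, scale, threshold):
    """delta = sigma + zeta * (v - nu), as defined below Eq. (23)."""
    return scale + shape * (threshold - loc)


def approx_tail(z, shape, loc, scale, n):
    """Tail approximation 1 - F(z) of Eq. (20); assumes shape != 0."""
    return (1.0 + shape * (z - loc) / scale) ** (-1.0 / shape) / n


def conditional_survival(y, threshold, shape, loc, scale, n):
    """P(Z > v + y | Z > v) as the ratio of two tail approximations, Eq. (22)."""
    upper = approx_tail(threshold + y, shape, loc, scale, n)
    lower = approx_tail(threshold, shape, loc, scale, n)
    return upper / lower


def gpd_survival(y, shape, delta):
    """1 - H(y) from Eq. (17); assumes shape != 0."""
    return (1.0 + shape * y / delta) ** (-1.0 / shape)


# Hypothetical GEV parameters and threshold, chosen only for illustration.
zeta, nu, sigma, v, n_block = 0.2, 10.0, 5.0, 30.0, 50
delta = gpd_scale_from_gev(zeta, nu, sigma, v)
lhs = conditional_survival(7.0, v, zeta, nu, sigma, n_block)
rhs = gpd_survival(7.0, zeta, delta)
```

The block size `n_block` cancels in the ratio, which is why it does not appear in the GPD parameters.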

### **3.3 Threshold selection**

In modelling using EOT, the threshold is usually chosen before the model is fitted. We will present three tools that will help in identifying the threshold:

#### *3.3.1 Mean residual life plot*

This method is based on the mean of the GPD: *E*(*Z*) = *σ*/(1 − *ζ*), when *ζ* < 1. If the excesses *z* − *v* are approximated by a GPD for a high threshold *v*, the mean excess, for *ζ* < 1, is:

$$e(v) = E(Z - v | Z > v) = \frac{\sigma\_v}{1 - \zeta} \tag{24}$$

For any higher threshold *r*>*v*:

$$e(r) = E(Z - r | Z > r) = \frac{\sigma\_r}{1 - \zeta} = \frac{\sigma\_v + \zeta(r - v)}{1 - \zeta} \tag{25}$$

Therefore, once a suitably high threshold *v* has been reached, the mean excess function *e*(*r*) is a linear function of *r*. The sample mean residual life plot is drawn using:

$$\left(v, \frac{1}{n\_v} \sum\_{i=1}^{n\_v} (z\_{(i)} - v)\right) \tag{26}$$

where *z*<sub>(*i*)</sub> is the *i*-th observation above the threshold *v*, and *n<sub>v</sub>* is the total number of observations above *v*. Above a sufficiently high threshold *v*, the mean residual life plot changes linearly with the threshold for all *r* > *v*.
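The points of Eq. (26) are straightforward to compute. The following is a minimal sketch; the names `mean_excess` and `mean_residual_life` and the toy data are assumptions introduced for illustration, and the plotting itself is left out.

```python
def mean_excess(data, threshold):
    """Sample mean of (z - v) over the observations strictly above v."""
    excesses = [z - threshold for z in data if z > threshold]
    if not excesses:
        raise ValueError("no observations above the threshold")
    return sum(excesses) / len(excesses)


def mean_residual_life(data, thresholds):
    """Points (v, sample mean excess at v) of Eq. (26), one per threshold."""
    return [(v, mean_excess(data, v)) for v in thresholds]


# Toy data, not the Kenyan disaster data analysed in the chapter.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
points = mean_residual_life(sample, [0.0, 2.0, 3.0])
```

Plotting the second coordinate of `points` against the first gives the mean residual life plot; thresholds above the largest observation are rejected because the mean excess is undefined there.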

Using this result, the procedure for estimating the threshold is as follows:


#### *3.3.2 Parameter stability plot*

Assuming that the exceedances, (*z* − *v*), over a threshold *v* follow a GPD(*ζ*, *σ<sub>v</sub>*), the exceedances will still follow a GPD for any higher threshold *r* > *v*, with the same *ζ*, but with scale parameter:

$$
\sigma\_r = \sigma\_v + \zeta(r - v) \tag{27}
$$
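A small numeric sketch of Eq. (27) may help: it computes the scale at a higher threshold *r* and shows that subtracting *ζr* from it gives a quantity that no longer changes with *r*, which is the modified scale used in the remainder of this subsection. The function names and parameter values are hypothetical and chosen only for illustration.

```python
def scale_at_threshold(r, v, scale_v, shape):
    """sigma_r = sigma_v + zeta * (r - v), Eq. (27), for a threshold r >= v."""
    if r < v:
        raise ValueError("r must not be below the base threshold v")
    return scale_v + shape * (r - v)


def modified_scale(r, v, scale_v, shape):
    """sigma_r - zeta * r, which equals sigma_v - zeta * v for every r >= v."""
    return scale_at_threshold(r, v, scale_v, shape) - shape * r


# Hypothetical values: base threshold 10, scale 4 at that threshold, shape 0.25.
base_v, base_scale, zeta = 10.0, 4.0, 0.25
scales = [scale_at_threshold(r, base_v, base_scale, zeta) for r in (10.0, 14.0, 18.0)]
stable = [modified_scale(r, base_v, base_scale, zeta) for r in (10.0, 14.0, 18.0)]
```

Here `scales` grows with the threshold while `stable` stays constant. In practice both *ζ* and the scale are estimated from data at each candidate threshold, so the plotted values are only approximately constant.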

(This follows from Theorem 1.3.)

Let us re-parametrise the scale parameter *σ<sub>r</sub>*:

$$
\sigma^\* = \sigma\_r - \zeta r \tag{28}
$$

such that:

$$
\sigma\_r = \sigma\_v + \zeta r - \zeta v \tag{29}
$$

$$
\sigma\_r - \zeta r = \sigma\_v - \zeta v
$$

$$
\sigma^\* = \sigma\_v - \zeta v
$$

The parameter *σ*<sup>∗</sup> therefore no longer depends on the higher threshold *r*: it is constant for all thresholds above a sufficiently high *v*. The parameter stability plot involves plotting the GPD parameter estimates for a range of values of *v*. The threshold is chosen to be the point where the shape and the modified scale parameters become stable (that is, the parameter estimates are constant above the threshold at which the GPD becomes valid). We use confidence intervals to select this point.

#### *3.3.3 Gerstengarbe and Werner plot*

The test was proposed by Gerstengarbe and Werner (1989) and is used to select a threshold by detecting the starting point of the extreme region. The idea behind the test is that the behaviour of a series of differences that correspond to the extreme observations is different from the one corresponding to the non-extreme observations. So, given a series of differences Δ<sub>*r*</sub> = *z*<sub>(*r*)</sub> − *z*<sub>(*r*+1)</sub>, *r* = 1, 2, ⋯, *n* − 1, of the order statistics, *z*<sub>(1)</sub> ≤ *z*<sub>(2)</sub> ≤ ⋯ ≤ *z*<sub>(*n*)</sub>, the starting point of the extreme region, and hence the threshold, will be the point at which the series of differences exhibits a statistically significant change.

To detect this point, we apply a sequential version of the Mann-Kendall test to check the null hypothesis that there is no change in the series of differences. Define a series:

$$U\_r = \frac{U\_r^\* - \frac{r(r-1)}{4}}{\sqrt{\frac{r(r-1)(2r+5)}{72}}} \tag{30}$$

where *U*<sub>*r*</sub><sup>∗</sup> = Σ<sub>*j*=1</sub><sup>*r*</sup> *n<sub>j</sub>* and *n<sub>j</sub>* is the number of values in the series of differences Δ<sub>1</sub>, ⋯, Δ<sub>*j*</sub> less than Δ<sub>*j*</sub>. We also define another series, *U<sub>p</sub>*, by applying the same procedure to the series of differences from the end to the start, Δ<sub>*n*−1</sub>, ⋯, Δ<sub>1</sub>, rather than from the start to the end. The starting point of change in the series of differences, and hence the threshold, is the intersection point of the series *U<sub>r</sub>* and *U<sub>p</sub>*.

*On Modelling Extreme Damages from Natural Disasters in Kenya*

*DOI: http://dx.doi.org/10.5772/intechopen.94578*

**4. Data analysis**

We use data for all the natural disasters recorded in Kenya in the period 1964 to 2018. The data were obtained from the CRED database, which is currently the most comprehensive database for natural events. The data were also cross-referenced with other sources, including UN agencies and NDMU. The impact of natural disasters is quantified in terms of the total number of people affected on an annual basis, which we deemed to be more reliable than the total damage in monetary terms. The total number of people affected includes those who were injured, died, left homeless or affected in any other way by natural disasters. A total of 112 events were recorded over that period, with a resulting damage of approximately 62 million people affected.

Descriptive statistics for both the annual occurrence and the impacts are provided in **Table 1**. The minimum number of disasters and the resulting impact is zero, which corresponds to those years where no natural disasters occurred. A total of 22 years recorded no natural disaster events. The average annual number of natural disaster occurrences in Kenya is two, and about 1.1 million people are affected every year. The maximum number of natural disaster occurrences observed in any year is 9, and the worst year on record saw approximately 23,331,469 people affected. The mean is greater than the median for both variables, indicating that the data are right-skewed. We can also observe that the spread of the impacts is large, as suggested by both the standard deviation and the interquartile range (about 3.5 million and 252,718 respectively), as opposed to that of the occurrence.

| Statistic | Annual Occurrence | Annual Impact |
|---|---|---|
| N | 55 | 55 |
| Total | 112 | 62,160,910 |
| Mean | 2 | 1,130,198 |
| Std. Dev | 2.26 | 3,494,302 |
| Minimum | 0 | 0 |
| 1st Quartile | 0 | 0 |
| Median | 1 | 15,000 |
| 3rd Quartile | 4 | 252,718 |
| Maximum | 9 | 23,331,469 |
| Interquartile Range | 4 | 252,718 |

**Table 1.** *Descriptive Statistics.*

We are interested in the distribution of the number of occurrences and the corresponding impact of natural disasters in Kenya in the period of study. **Figure 1** shows the distribution of the number of natural disaster occurrences in the last 55 years. The number of natural disasters between 1964 and the late 1990s was fairly low, with no year experiencing more than two events. We can then observe a sharp
