1. Introduction

In the reinsurance industry, losses for a contract are simulated and represented by the loss cumulative distribution function (CDF), survival function, or quantile function. The plots of these functions are called EP curves, with the following terminology [1]: for a given annual or aggregated loss, the probability that the annual loss exceeds that value is the exceedance probability (EP) or aggregate exceedance probability (AEP). The average of all annual losses exceeding that given loss is the AEP tail value at risk, called the AEP TVaR, or simply TVaR. An EP curve is represented by a table of probability-loss pairs. It is desirable to fit a parametric distribution to this table for a more succinct representation and for more reasonable interpolation of values not in the table. The questions to answer are then which distribution family to use and which characteristics of the data determine the distribution.

The (scaled) Beta distribution is widely used in reinsurance to fit losses or loss ratios, perhaps because it has only two parameters and very simple formulas for the mean and standard deviation in terms of those parameters. These formulas are easily inverted, so the two statistics of mean and standard deviation alone determine the parameters.
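As an illustration of this moment-matching step (a minimal sketch of ours with hypothetical numbers, not code from this chapter), the Beta parameters can be recovered from a given mean and standard deviation as follows:

```python
# Moment-matching fit of a Beta(a, b) distribution on [0, 1] (illustrative sketch).
# For Beta(a, b): mean m = a/(a+b), variance v = a*b / ((a+b)^2 * (a+b+1)).
# Inverting gives nu = m*(1-m)/v - 1, then a = m*nu and b = (1-m)*nu.

def beta_from_moments(m, s):
    """Return (a, b) for the Beta distribution with mean m and std s, 0 < m < 1."""
    v = s * s
    nu = m * (1.0 - m) / v - 1.0
    if nu <= 0:
        raise ValueError("std too large for a Beta distribution with this mean")
    return m * nu, (1.0 - m) * nu

# Hypothetical loss-ratio statistics, then round-trip check.
a, b = beta_from_moments(0.3, 0.15)
mean = a / (a + b)
std = (a * b / ((a + b) ** 2 * (a + b + 1.0))) ** 0.5
print(round(mean, 6), round(std, 6))  # -> 0.3 0.15
```

The round trip recovers the input statistics exactly, which is the convenience the text alludes to.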

For about 85% of the perils, this approach works well, in the sense that the TVaR of the fitted distribution at the quantiles of interest, such as the 0.96, 0.99, or 0.996 quantile TVaR needed for pricing and risk monitoring, is within a few percent of the original data TVaR. The remaining 15% of perils, such as North American Tornado Hail (NATH), Australia Wind Storm (AUWS), Hawaii Wind Storm (HIWS), and Mexico Earthquake (MXEQ), can show deviations of more than 10%.

The maximum likelihood estimation method is one way to find alternative fitting distributions [2, 3]. Instead of approximating the smoothed empirical distribution, we optimize an objective function whose optimal solution gives the candidate distribution form. Suppose the annual losses $x_i$ occurred $n_i$ times in our observations; to find a probability function that assigns probabilities $p_i$ to these losses, we maximize the objective function $\prod_i p_i^{n_i}$. It is easily seen that the optimal solution satisfies $\frac{p_i}{p_j} = \frac{n_i}{n_j}$: the relative occurrence frequencies are preserved by the probability function. If we replace each $p_i$ in the objective function by a power function of $p_i$, the conclusion still holds, but not if we use a logarithm or exponential function.
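The claim that the optimum preserves relative frequencies can be checked numerically. The following is an illustrative sketch of ours (the counts are hypothetical): the closed-form candidate $p_i = n_i / \sum_j n_j$ is compared against many random probability vectors using the equivalent log objective $\sum_i n_i \log p_i$.

```python
# Numerical check (illustrative): the maximizer of sum_i n_i * log(p_i)
# over probability vectors is p_i = n_i / sum(n), i.e. relative frequencies.
import math
import random

counts = [5, 3, 2]                       # hypothetical occurrence counts n_i
total = sum(counts)
p_star = [n / total for n in counts]     # closed-form candidate optimum

def loglik(p):
    return sum(n * math.log(q) for n, q in zip(counts, p))

best = loglik(p_star)
random.seed(0)
for _ in range(1000):
    # Random probability vector via normalized positive draws.
    w = [random.random() + 1e-12 for _ in counts]
    s = sum(w)
    assert loglik([x / s for x in w]) <= best + 1e-12
```

No random competitor beats the frequency-matching solution, consistent with the text's observation.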

While the maximum likelihood approach works well for many perils and identifies a few best-fitting distribution families (Mathematica has more than 200 distribution families available for extensive searches), it did not work for the NATH peril. The NATH has {Mean, StandardDeviation, Skewness, Kurtosis, 0.99TVaR} = {7418611.10904006, 9517336.93024634, 5.99378199789956, 65.8901734355745, 68867612.8345741}.

This does not contradict the maximum likelihood principle, since any implementation can use only known forms of the probability density function (PDF), with as few parameters as possible. To overcome this limitation, we need to look into the particularities of those distributions and select, or devise, more suitable functional forms for the PDF or CDF. In [4] it is found that a distribution with a high coefficient of variation (CV) is hard to fit or simulate, but the NATH has a small CV of 1.28. Skewness and kurtosis alone also do not differentiate these distributions from others.

Trial and error led to the empirical rule that these hard distributions have small values of kurtosis divided by skewness squared; see Table 1. This finding prompted us to study the properties of kurtosis/skewness^2 (K/S^2), henceforth called the shape factor (SF).
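For the NATH statistics quoted above, the CV and the shape factor can be computed directly (a small sketch; the variable names are ours):

```python
# CV and shape factor SF = kurtosis / skewness^2 for the NATH peril,
# using the summary statistics quoted in the text.
mean = 7418611.10904006
std = 9517336.93024634
skewness = 5.99378199789956
kurtosis = 65.8901734355745

cv = std / mean                    # coefficient of variation
sf = kurtosis / skewness ** 2      # shape factor K/S^2

print(round(cv, 2))   # -> 1.28
print(round(sf, 3))   # -> 1.834
```

The CV of about 1.28 matches the value quoted in the text, while the SF of about 1.83 is the kind of small value the empirical rule flags.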

Numerical optimization and numerical solution will be our primary tools for this SF study. Analytical deduction, symbolic algebra, and symbolic limits in the computer algebra system (CAS) Mathematica will be another major tool, together with Mathematica's plotting functions. The plots can help reveal patterns or tendencies of functions, and a discovered pattern can in turn suggest special directional or constrained limits, or substitutions, in the CAS, yielding an analytical formula for the SF bound when one exists.
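As a flavor of the kind of closed-form SF analysis pursued in the later sections, consider the Gamma(k) family (an illustrative example of ours, not a result from this chapter): its skewness is $2/\sqrt{k}$ and its kurtosis is $3 + 6/k$, so SF $= (3k + 6)/4$, which decreases toward $3/2$ as $k \to 0$ and grows without bound as $k \to \infty$.

```python
# SF = kurtosis / skewness^2 for the Gamma(k) family (illustrative).
# Skewness S = 2/sqrt(k), kurtosis K = 3 + 6/k, hence SF = (3k + 6)/4,
# with infimum 3/2 as k -> 0+ within this family.

def gamma_sf(k):
    skew = 2.0 / k ** 0.5
    kurt = 3.0 + 6.0 / k
    return kurt / skew ** 2

for k in (0.01, 0.1, 1.0, 10.0):
    print(k, round(gamma_sf(k), 4))
```

Tabulating SF over a family and then taking a limit analytically, as sketched here, is the pattern the numerical and CAS tools above are used for.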

Table 1. Numerical characteristics of a few hard-to-fit and hard-to-simulate perils.

What Determines EP Curve Shape? DOI: http://dx.doi.org/10.5772/intechopen.82832

The overall lower bound of the SF that we find is presented in Section 2, established by analytical, graphical, and numerical methods together. This is followed by detailed studies of the SF of selected distribution families that are either widely used in practice, such as the Beta distribution in Section 3 and the generalized Gamma distribution in Section 6, or especially simple to simulate, such as the Kumaraswamy distribution in Section 4. The most inclusive distribution, the BetaPrime distribution, is treated in Section 5; for it we do not obtain an analytical formula, so an empirical formula for the SF lower bound is provided. Some distributions with wide matching capabilities whose fits for the NATH face numerical difficulties, such as the Fleishman distribution, whose fit has a non-monotonically increasing polynomial form and is hence hard to invert for the CDF, are only briefly mentioned in Section 7. The top distribution found through maximum likelihood fitting, the generalized hyperbolic (GH) distribution, despite having the most complex PDF, has unexpectedly simple and beautiful analytical formulas for the SF lower bound; the results are in the final Section 8. All our studies focus on SF bound deductions and applications.
