**1. Introduction**

In the analysis of experimental data, one often encounters the following phenomenon: although conventional reasoning based on the central limit theorem suggests that the observations should be normally distributed, statistical procedures reveal noticeable non-normality of the real distributions. Moreover, as a rule, the observed non-normal distributions are more leptokurtic than the normal law, having sharper peaks and heavier tails. These situations are typical in financial data analysis (see, e.g., Chapter 4 in [1] or Chapter 8 in [2] and references therein), in experimental physics (see, e.g., [3]), and in other fields dealing with statistical analysis of experimental data. Many attempts have been made to explain this heavy-tailedness. The most significant theoretical breakthrough is usually associated with the results of B. Mandelbrot and others who proposed replacing the standard central limit theorem with reasoning based on limit theorems for sums of random summands with infinite variances (see, e.g., [4]), which yields non-normal stable laws as heavy-tailed models of the distributions of experimental data. However, first, in most cases the key assumption of this approach, the infiniteness of the variances of elementary summands, can hardly be believed to hold in practice and, second, although heavier-tailed than the normal law, the real distributions often turn out to be lighter-tailed than the stable laws.

*From Asymptotic Normality to Heavy-Tailedness via Limit Theorems for Random Sums…*

*DOI: http://dx.doi.org/10.5772/intechopen.89659*

**2. Notation and definitions: auxiliary results**

Let *r* ∈ ℕ. We will consider random elements taking values in the *r*-dimensional Euclidean space ℝ<sup>*r*</sup>.

Assume that all the random variables and random vectors are defined on one and the same probability space (Ω, A, P). By the measurability of a random field, we will mean its measurability as a function of two variates, an elementary outcome

In this work, in order to give a more realistic explanation of the observed non-normality of the distributions of real data, an alternative approach is developed, based on limit theorems for statistics constructed from samples with random sizes. Within this approach, it becomes possible to obtain arbitrarily heavy tails of the data distributions without assuming the non-existence of the moments of the observed characteristics.

This work was inspired by the publication of the paper [5] in which, based on the results of [6], a particular case of random sums was considered. One more reason for writing this work was the recent publication [7], the authors of which reproduced some results of [8, 9] without citing these earlier papers.

Here we give a more general description of the transformation of the limit distribution of a sum of independent random variables or another statistic (i.e., of a measurable function of a sample) under the replacement of the non-random number of summands or the sample size by a random variable. General limit theorems are proved (Section 3). Section 4 contains some comments on heavy-tailedness of scale mixtures of normal distributions. As examples of the application of general theorems, conditions are presented for the convergence of the distributions of random sums of independent random vectors *with finite covariance matrices* to multivariate elliptically contoured stable and Linnik distributions (Section 5). Also, conditions are presented for the convergence of the distributions of asymptotically normal (in the traditional sense) statistics to multivariate Student distributions (Section 6).

In applied research related to risk analysis, the characteristic known as VaR (Value-at-Risk) is very popular. Formally, VaR is a certain quantile of the observed risky value. Therefore, the joint asymptotic behavior of sample quantiles in samples with random sizes is considered in detail in Section 7 as one more example of the application of the general theorem proved in Section 3. In that section, we also show how the proposed technique can be applied to the continuous-time case, assuming that the sample size increases in time following a Cox process. One more interpretation of this setting is related to the important case where the sample size has a mixed Poisson distribution.

In classical problems of mathematical statistics, the size of the available sample, that is, the number of available observations, is traditionally assumed to be deterministic. In asymptotic settings, it plays the role of an infinitely increasing *known* parameter. At the same time, in practice the data to be analyzed are very often collected or registered during a certain period of time, and the flow of informative events, each of which brings a new observation, forms a random point process. Therefore, the number of available observations is unknown until the end of the registration process and must itself be treated as a (random) observation. For example, this is so in insurance statistics where, during different accounting periods, different numbers of insurance events (insurance claims and/or insurance contracts) occur, and in high-frequency financial statistics where the number of events in a limit order book during a time unit essentially depends on the intensity of order flows. Moreover, contemporary statistical procedures of insurance and financial mathematics do take this circumstance into consideration as one of the possible ways of dealing with heavy tails. In other fields, such as medical statistics or quality control, this approach has not become conventional. Yet the number of patients with a certain disease varies from month to month due to seasonal factors or from year to year due to epidemic reasons, and the number of failed items varies from lot to lot. In these cases, the number of available observations, as well as the observations themselves, is unknown beforehand and should be treated as random to avoid underestimation of risks or error probabilities.

Therefore, it is quite reasonable to study the asymptotic behavior of general statistics constructed from samples with random sizes in order to construct suitable asymptotic approximations. In doing so, to obtain non-trivial asymptotic distributions in limit theorems of probability theory and mathematical statistics, an appropriate centering and normalization of the random variables and vectors under consideration must be used. It should be especially noted that to obtain a reasonable approximation to the distribution of the basic statistics, both the centering and the normalizing values should be non-random. Otherwise, the approximate distribution becomes random itself and, for example, the problem of evaluating quantiles or significance levels becomes meaningless.

In asymptotic settings, statistics constructed from samples with random sizes are special cases of random sequences with random indices. The randomness of indices usually leads to the limit distributions for the corresponding random sequences being heavy-tailed even in the situations where the distributions of non-randomly indexed random sequences are asymptotically normal (see, e.g., [2, 8, 10]).
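The mechanism behind this effect can be sketched heuristically. The following is only an informal illustration under assumptions that are not stated at this point of the text: the sample size $N_n$ is independent of the observations, $N_n/n$ converges in distribution to a positive random variable $U$, and the statistic $T_n$ satisfies $\sqrt{n}\,(T_n-\theta)\Longrightarrow N(0,\sigma^2)$ for non-random sample sizes. The symbols $T_n$, $\theta$, $\sigma$, $U$ and $Z$ are introduced here purely for illustration.

$$
\sqrt{n}\,\bigl(T_{N_n}-\theta\bigr)=\sqrt{\frac{n}{N_n}}\cdot\sqrt{N_n}\,\bigl(T_{N_n}-\theta\bigr)\Longrightarrow\frac{\sigma Z}{\sqrt{U}},\qquad \mathsf{P}\left(\frac{\sigma Z}{\sqrt{U}}<x\right)=\mathsf{E}\,\Phi\left(\frac{x\sqrt{U}}{\sigma}\right),
$$

where $Z$ is a standard normal random variable independent of $U$ and $\Phi$ is the standard normal distribution function. Under these assumptions the limit is a scale mixture of normal laws rather than a normal law. Roughly speaking, the more probability mass $U$ has near zero, the heavier the tails of the mixture.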

Many authors have noted that the asymptotic properties of statistics constructed from samples with random sizes differ from those of statistics that are asymptotically normal in the classical sense. To illustrate this, we refer to [11], where the following example is given. Let $X_{(1)}, \ldots, X_{(n)}$ be the order statistics constructed from the sample $X_1, \ldots, X_n$. It is well known (see, e.g., [12]) that in the standard situation the sample median is asymptotically normal. At the same time, in [11] it was demonstrated that if the sample size $N_n$ has the geometric distribution with expectation $n$, then the normalized sample median $\sqrt{n}\left(X_{([N_n/2]+1)} - \operatorname{med} X_1\right)$ has the limit distribution function

$$\Psi(x) = \frac{1}{2} \left( 1 + \frac{x}{\sqrt{2 + x^2}} \right) \tag{1}$$

(the Student distribution with two degrees of freedom), which has such heavy tails that its moments of orders *δ* ≥ 2 do not exist. In general, as shown in [8], if a statistic that is asymptotically normal in the traditional sense is constructed from a sample whose random size has the negative binomial distribution, then instead of the expected normal law, a Student distribution with heavy, power-type decreasing tails appears as the asymptotic law for this statistic.
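The example from [11] can be checked numerically. The sketch below is a minimal Monte Carlo illustration, not the method of [11]: it assumes observations uniform on (−1, 1), chosen here because the density at the median equals 1/2, so that the asymptotic variance of the sample median is 1 and no extra scale factor should be needed to compare with (1). It also assumes that the geometric sample size is independent of the observations. The function names `psi` and `simulate_normalized_median`, the values of `n` and `reps`, and the evaluation grid are all hypothetical choices made for this illustration.

```python
import numpy as np


def psi(x):
    """Limit distribution function (1): Student's t with two degrees of freedom."""
    x = np.asarray(x, dtype=float)
    return 0.5 * (1.0 + x / np.sqrt(2.0 + x * x))


def simulate_normalized_median(n, reps, seed=0):
    """Draw `reps` values of sqrt(n) * (X_([N_n/2]+1) - med X_1).

    Assumptions made for this sketch only: observations are Uniform(-1, 1)
    (median 0, density 1/2 at the median) and N_n is geometric on {1, 2, ...}
    with mean n, independent of the observations.
    """
    rng = np.random.default_rng(seed)
    sizes = rng.geometric(1.0 / n, size=reps)  # E[N_n] = n
    out = np.empty(reps)
    for i, size in enumerate(sizes):
        sample = rng.uniform(-1.0, 1.0, size=size)
        k = size // 2  # zero-based index of the order statistic X_([N/2]+1)
        out[i] = np.sqrt(n) * np.partition(sample, k)[k]
    return out


values = simulate_normalized_median(n=200, reps=20000)

# Compare the empirical distribution function with Psi on a coarse grid.
grid = np.linspace(-4.0, 4.0, 17)
ecdf = np.array([(values <= g).mean() for g in grid])
max_gap = np.max(np.abs(ecdf - psi(grid)))

# Two-sided tail probability beyond 3, where a normal limit would give about 0.003.
tail = np.mean(np.abs(values) > 3.0)

print(f"max |ECDF - Psi| on grid: {max_gap:.3f}")
print(f"P(|T| > 3): simulated {tail:.3f}, Psi {2 * psi(-3.0):.3f}, normal 0.003")
```

With a non-random sample size the same statistic would be approximately standard normal under these assumptions, so the simulated tail probability gives a direct impression of how much heavier the tails become when the sample size is geometric.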
