**5. Rank measure**

The concept of a rank measure was proposed years ago and analyzed from both the intuitive and the formal points of view. Here, we propose an approach that can be regarded as a justification of the rank measure in a constructive style.

*Statement*. For a trivial metrological model, if the source of randomness is described only by its distribution, and the data elements are statistically independent, the implementation of the 'principle of measuring the probability of origin' leads to a simple 'rank measure'.

*Proof*. From the assumption of data independence, the value of the metric is independent of the permutation of the data elements in the data sample used to identify the trivial model.

In fact, suppose that two data samples of the same length have all elements the same. Should the metric distinguish them? Clearly there is neither a need nor a way to do so.

Now, in each sample, replace one element with an element of a different value, using the same new value and the same position in both samples. As before, the samples are indistinguishable.

Now, in one of the data samples, we change the positions of any two elements. If the data elements are equal, then the samples are indistinguishable. If the data elements are different, then the samples can be distinguished, but should this be done?

If the data are independent, then any position of each element is equally probable. Thus, the probability of origin is unchanged. The metric must be such that a simple permutation of data elements within one of the samples does not change its value. Consequently, the value of the metric is affected by neither the number nor the step of internal permutations.

This creates equivalence classes of data samples that are formally different as records of the data acquisition process but are indistinguishable by the metric within a class. The data sample sorted in ascending order (the rank statistics) is a natural representative of each class and can be used in its place.

Each data element $d_k$, $k = 1 \dots n$, of the ordered sample has its own order distribution density

$$p_{k/n}(x) = \frac{n!}{(k-1)!\,(n-k)!}\, P(x)^{k-1} \left(1 - P(x)\right)^{n-k} p(x),$$

which differs from the densities of the elements in other positions. Here $p(x)$ and $P(x)$ are the probability density and the cumulative distribution function of the model random source, respectively.
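The order density can be evaluated numerically for any source whose density and cumulative distribution function are available. The following is a minimal sketch, assuming a uniform source on [0, 1] chosen purely for illustration; the names `order_density`, `uniform_pdf`, `uniform_cdf` and `integrate` are hypothetical and not part of the original text.

```python
import math

def order_density(x, k, n, pdf, cdf):
    """Density at x of the k-th smallest of n independent draws."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # n! / ((k-1)! (n-k)!) rewritten as n * C(n-1, k-1)
    coeff = n * math.comb(n - 1, k - 1)
    P = cdf(x)
    return coeff * P ** (k - 1) * (1.0 - P) ** (n - k) * pdf(x)

# Assumed illustrative source: uniform on [0, 1]
def uniform_pdf(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def uniform_cdf(x):
    return min(max(x, 0.0), 1.0)

def integrate(f, a, b, steps=20000):
    """Midpoint rule, sufficient for this smooth integrand."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Density of the middle element (k = 2) of a sorted sample of n = 3 at x = 0.5
median_density = order_density(0.5, 2, 3, uniform_pdf, uniform_cdf)

# Each order density should integrate to one over the support of the source
total_mass = integrate(
    lambda x: order_density(x, 2, 3, uniform_pdf, uniform_cdf), 0.0, 1.0
)
```

For the uniform source the density of the middle element reduces to 6x(1 − x), so `median_density` is 1.5 and `total_mass` is close to 1.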

The probability of origin of the value of each data element $d_k$ is calculated from the corresponding order distribution density as $p_{k/n}(d_k)$. The probability associated with the entire data sample $\{d_k\}_n$ is naturally calculated as the product of the origin probabilities of its elements,

$$m(\{d_k\}_n) = \prod_{k=1}^{n} p_{k/n}(d_k),$$

because obtaining the data sample is considered a single event. We call this result the 'rank measure'. *End of proof.*
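As an illustration of the construction in the proof, the sketch below evaluates the logarithm of the rank measure for a small sample. It assumes a normal source with parameters `mu` and `sigma`, which the text does not prescribe at this point, and works in log space to avoid underflow for longer samples. The function names and the sample values are hypothetical.

```python
import math

def normal_log_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def log_rank_measure(sample, mu, sigma):
    """Sum over k of log p_{k/n}(d_k), with d_k the sorted sample.

    Assumes 0 < P(d_k) < 1 for every element; values far in the tails
    would need extra care because log(0) is undefined.
    """
    ordered = sorted(sample)  # canonical representative of the class
    n = len(ordered)
    total = 0.0
    for k, d in enumerate(ordered, start=1):
        P = normal_cdf(d, mu, sigma)
        # log of n! / ((k-1)! (n-k)!)
        log_coeff = math.lgamma(n + 1) - math.lgamma(k) - math.lgamma(n - k + 1)
        total += (
            log_coeff
            + (k - 1) * math.log(P)
            + (n - k) * math.log1p(-P)
            + normal_log_pdf(d, mu, sigma)
        )
    return total

# Made-up sample and a permutation of it
sample = [0.3, -1.2, 0.8, 1.9, -0.4]
permuted = [1.9, 0.3, -0.4, 0.8, -1.2]

lrm_original = log_rank_measure(sample, mu=0.0, sigma=1.0)
lrm_permuted = log_rank_measure(permuted, mu=0.0, sigma=1.0)
```

Because the sample is sorted before evaluation, `lrm_original` and `lrm_permuted` coincide, which mirrors the permutation invariance required of the metric in the proof.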

An important feature of the algorithm for identifying a trivial statistical model under the assumptions made is that there is no need to define the metric explicitly. One can proceed directly to estimating the required probability of origin by comparing the model's prediction, in the form of the distribution densities of each data element, with the sorted experimental data. The rank measure formula can be split into three factors:

$$m(\{d_k\}_n, \mu, \sigma) = \left(\prod_{k=1}^{n} \frac{n!}{(k-1)!\,(n-k)!}\right) \times \left(\prod_{k=1}^{n} P(d_k, \mu, \sigma)^{k-1} \left(1 - P(d_k, \mu, \sigma)\right)^{n-k}\right) \times \left(\prod_{k=1}^{n} p(d_k, \mu, \sigma)\right)$$

Their interpretation is straightforward: the last factor is the formula of the likelihood method, the second is the correction to the likelihood method, and the first is the normalizing factor. For this reason, the rank measure can be considered a corrected likelihood method.
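The three factors can be computed separately to see their roles. The sketch below again assumes a normal source with parameters `mu` and `sigma` and returns the logarithm of each factor; `rank_measure_factors` and the sample values are hypothetical names and data chosen for illustration.

```python
import math

def normal_log_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def rank_measure_factors(sample, mu, sigma):
    """Return (log normalizing factor, log correction, log likelihood).

    Assumes 0 < P(d_k) < 1 for every element of the sample.
    """
    ordered = sorted(sample)
    n = len(ordered)
    log_norm = 0.0
    log_corr = 0.0
    log_lik = 0.0
    for k, d in enumerate(ordered, start=1):
        P = normal_cdf(d, mu, sigma)
        # First factor: depends only on n and k, not on the parameters
        log_norm += math.lgamma(n + 1) - math.lgamma(k) - math.lgamma(n - k + 1)
        # Second factor: rank-dependent correction to the likelihood
        log_corr += (k - 1) * math.log(P) + (n - k) * math.log1p(-P)
        # Third factor: the ordinary likelihood of the sample
        log_lik += normal_log_pdf(d, mu, sigma)
    return log_norm, log_corr, log_lik

sample = [0.3, -1.2, 0.8]  # made-up data
norm_a, corr_a, lik_a = rank_measure_factors(sample, mu=0.0, sigma=1.0)
norm_b, corr_b, lik_b = rank_measure_factors(sample, mu=0.5, sigma=2.0)

log_rank_measure_a = norm_a + corr_a + lik_a
```

Here `norm_a` equals `norm_b`, since the first factor does not depend on the model parameters. When the rank measure is compared across parameter values for a fixed sample, only the correction and likelihood factors change.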

The rank measure is the simplest solution of the identification problem for the simplest model that can be obtained within the framework of calculating the probability of origin. The reason is the availability of an analytical formula for calculating the model's prediction. For more complex models, there is no such formula, and the model's prediction must at least be computed numerically. Studies have shown that an explicit formulation of the metric is not required for two important particular models either: the multifactor extension of the trivial model, and the model in which the parameters of a dynamic deterministic function are identified against a background of noise.
