#### **7.3 $Y\_{L,\min}$ search algorithm**

Assertion 6 is much like Assertion 1, which made it possible to pass from an arbitrary numerical series to an ordered one when searching for the optimal solution (see Section 3). Similarly, Assertion 6 makes it possible to pass to an ordered number series when looking for the minimizing set of numbers of a given length. In our case, considerations similar to those presented in Section 3 apply when searching for a minimizing set. Namely, if $Y\_{L,\min} = \{y\_{j\_1}, \dots, y\_{j\_L}\}$ is the minimizing set of length $L$, then an arbitrary permutation of the numbers $y\_{j\_1}, \dots, y\_{j\_L}$ is also a minimizing set. Thus, the search for $Y\_{L,\min}$ does not depend on the arrangement of the given numbers $y\_j$. This allows us to arrange them in the order most suitable for searching for the minimizing set. We arrange the values $\{y\_j\}$ in ascending order and look for the minimizing set in the ordered sequence. As in Section 3, for brevity, the ordered sequence will also be denoted by $\{y\_j\}$. For simplicity,

we will assume that all $y\_j$ are different, that is, Eq. (10) holds. We rewrite it for convenience:

$$\mathbf{y}\_1 < \mathbf{y}\_2 < \dots < \mathbf{y}\_N$$

Note that due to Assertion 6, if $Y\_{L,\min}$ is the minimizing set and $y\_{j\_1}$ and $y\_{j\_L}$ are its smallest and greatest values, respectively, then all values of $y\_j$ from the interval $[y\_{j\_1}, y\_{j\_L}]$ belong to $Y\_{L,\min}$. Consequently, considering Eq. (10), we have $y\_{j\_L} = y\_{j\_1 + L - 1}$ and therefore

$$\mathbf{Y}\_{\mathbf{L},\min} = \left\{ \mathbf{y}\_{\mathbf{j}\_1}, \mathbf{y}\_{\mathbf{j}\_1+1}, \dots, \mathbf{y}\_{\mathbf{j}\_1+\mathbf{L}-1} \right\}.$$

Thus, in order to find a minimizing set of length $L$, it is sufficient to check the $N - L + 1$ sets

$$\{\mathbf{y}\_1, \dots, \mathbf{y}\_{\mathcal{L}}\}, \{\mathbf{y}\_2, \dots, \mathbf{y}\_{\mathcal{L}+1}\}, \dots, \{\mathbf{y}\_{\mathcal{N}-\mathcal{L}+1}, \dots, \mathbf{y}\_{\mathcal{N}}\}.$$

and choose from them the one with the minimal SD value. In the minimizing-set search procedure, we use the designations $z(k, L)$ and $\sigma^2(k, L)$, expressed by Eqs. (13) and (14), to denote the mean and the square of the SD for sets composed of the ascending-ordered values.

The calculation of $z(k, L)$ and $\sigma^2(k, L)$ when searching for $Y\_{L,\min}$ can be carried out in a manner similar to that of the optimal solution, according to the schematic below, in which only $\sigma^2(k, L)$ is shown:

$$\begin{array}{cl}
\text{Length of set} & \\
N & \sigma^2(1, N) \\
 & \downarrow \\
N - 1 & \sigma^2(1, N - 1) \\
 & \vdots \\
 & \downarrow \\
L & \sigma^2(1, L) \longrightarrow \sigma^2(2, L) \longrightarrow \dots \longrightarrow \sigma^2(N - L + 1, L)
\end{array}$$

**Figure 7.**

*Scheme of calculations when finding the minimizing set of length L.*

At first, we carry out the transitions in the direction of the vertical arrows (see diagram in **Figure 7**) and calculate the sequential values $\sigma^2(1, N), \sigma^2(1, N-1), \dots, \sigma^2(1, L)$ using formulas (17)–(19). Finally, we go in the direction of the horizontal arrows, calculate the values $\sigma^2(1, L), \sigma^2(2, L), \dots, \sigma^2(N-L+1, L)$ using formulas (20)–(22), and choose from them the minimal one.
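The scheme above can be sketched in code. This is a minimal sketch, not the chapter's exact procedure: instead of reproducing the recurrences (17)–(19) and (20)–(22), which are not shown here, running sums of $y\_j$ and $y\_j^2$ are updated in O(1) per window shift; the function name and interface are illustrative.

```python
import math

def find_minimizing_set(y, L):
    """Find the contiguous window of length L in the ascending-sorted data
    with the smallest sample SD (the minimizing set Y_{L,min}).

    Returns (k, sd), where the minimizing set is y_sorted[k:k+L] (0-based).
    Running sums play the role of the recurrences of Figure 7.
    """
    y = sorted(y)                       # Assertion 6: order does not matter
    N = len(y)
    s1 = sum(y[:L])                     # sum of y_j over the current window
    s2 = sum(v * v for v in y[:L])      # sum of y_j^2 over the current window
    best_k = 0
    best_var = (s2 - s1 * s1 / L) / (L - 1)
    for k in range(1, N - L + 1):       # slide over the N - L + 1 windows
        s1 += y[k + L - 1] - y[k - 1]   # O(1) update when shifting the window
        s2 += y[k + L - 1] ** 2 - y[k - 1] ** 2
        var = (s2 - s1 * s1 / L) / (L - 1)
        if var < best_var:
            best_k, best_var = k, var
    return best_k, math.sqrt(max(best_var, 0.0))
```

For example, for the data `[100, 0.1, 0.2, 0.3, 0.15, 50]` and `L = 4`, the window starting at index 0 of the sorted series (`[0.1, 0.15, 0.2, 0.3]`) is selected, leaving the two outlying values outside the minimizing set.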

#### **7.4 Trend search algorithm**

In order to correctly detect outliers in measurement data that include an unknown trend, it is necessary to find and remove the trend from the original data. Determining the unknown trend means finding a suitable profile for the measurement data by adjusting the fitting parameters. This implies, in turn, the need to select from the specified series $y\_j$ the "right" reference values $\{y\_{j\_1}, \dots, y\_{j\_L}\}$ used for fitting. If the data do not contain coarse measurements, all $N$ values of the original data can be treated as reference values and used for fitting. If the data contain rough measurements, the participation of all or some of those values in the fit will result in a fatal distortion of the trend function and, as a result, an incorrect determination of the outliers. Thus, when determining the trend, it is necessary to exclude rough measurements from the reference values used for fitting, by which the trend approximations are built. We have a vicious circle: to determine outliers, it is necessary to find a proper trend approximation, and to find an exact trend approximation, we need to find all outliers and remove them from the values used for fitting. Another vicious circle arises when trying to choose the number of values for fitting. If we take it too small, to minimize the possibility of outliers being included in the subset of values for fitting, the fit may not be accurate enough for the rest of the data outside the subset. If, on the contrary, we take a rather large number of points for fitting, outliers may be among them, and we will again get a bad approximation of the trend.

We describe herein a strategy for finding an unknown trend and detecting outliers. This strategy assumes that the number of outliers in the series presented by Eq. (1) does not exceed a certain value $N\_{\text{maxout}}$ known a priori. Thus, we can suppose that the number of "right" values suitable for fitting in the series $y\_j$ is not less than $N - N\_{\text{maxout}}$.

Below, $L$ is supposed to be fixed and associated with the number of reference values of series (1) used for fitting, with $L \le N - N\_{\text{maxout}}$.

Let us consider the following algorithm. It contains internal iterations, which we will denote with the upper index "s" in parentheses.

Step 0: $n = 0$. We set some $L$ satisfying the condition $L \le N - N\_{\text{maxout}}$.

Step 1: $n$++; $s = 0$; $\text{flag}\_j^{(0)} = 1$ for all $j$.

Step 2: $s$++. We fit a polynomial to the data set and find the fitting coefficients $\overrightarrow{a}^{(s)} = \{a\_0^{(s)}, \dots, a\_n^{(s)}\}$:

$$\overrightarrow{\mathbf{a}}^{(\mathbf{s})} = \underset{\mathbf{a}\_0, \dots, \mathbf{a}\_{\mathbf{n}-1}, \mathbf{a}\_{\mathbf{n}}}{\arg\min} \Phi^{(\mathbf{s}-1)}\left(\overrightarrow{\mathbf{a}}\right),\tag{50}$$

$$\Phi^{(\mathbf{s}-1)}\left(\overrightarrow{\mathbf{a}}\right) \triangleq \sum\_{\mathbf{j}=1}^{N} \left(\mathbf{y}\_{\mathbf{j}} - \mathbf{P}\_{\mathbf{n},\mathbf{j}}\left(\overrightarrow{\mathbf{a}}\right)\right)^{2} \cdot \mathbf{flag}\_{\mathbf{j}}^{(\mathbf{s}-1)}.\tag{51}$$

Step 3: Consider the values $\hat{y}\_j^{(s)}$ obtained by subtracting from $y\_j$ the polynomial values with the coefficients found in Step 2:

$$
\hat{\mathbf{y}}\_{\mathbf{j}}^{(\mathbf{s})} = \mathbf{y}\_{\mathbf{j}} - \mathbf{P}\_{\mathbf{n},\mathbf{j}} \left( \overrightarrow{\mathbf{a}}^{(\mathbf{s})} \right). \tag{52}
$$

Using the algorithm described in Section 7.3, we find from the numbers $\{\hat{y}\_j^{(s)}\}\_{j=1}^{N}$ defined in Eq. (52) a minimizing set $Y\_{L,\min}^{(s)} = \{\hat{y}\_{j\_1^{(s)}}^{(s)}, \dots, \hat{y}\_{j\_L^{(s)}}^{(s)}\}$ of length $L$. For this set, we calculate the corresponding values of the mean $z\_{Y\_{L,\min}^{(s)}}$ and SD $\sigma\_{Y\_{L,\min}^{(s)}}$ according to formulas (40) and (41), in which $\{j\_1, \dots, j\_L\}$ is replaced by $\{j\_1^{(s)}, \dots, j\_L^{(s)}\}$. We redefine the reference points for the next fit process (Step 2) by marking them with $\text{flag}\_j^{(s)}$:

$$\text{flag}\_{\mathbf{j}}^{(\mathbf{s})} = \begin{cases} \mathbf{1}, & \text{if} \quad \mathbf{j} \in \left\{ \mathbf{j}\_1^{(\mathbf{s})}, \dots, \mathbf{j}\_L^{(\mathbf{s})} \right\}, \\ \mathbf{0}, & \text{otherwise.} \end{cases} \tag{53}$$

We transit to Step 2 and repeat until the convergence of $\sigma\_{Y\_{L,\min}^{(s)}}$, $s = 1, 2, \dots$, is achieved. Note that we will consider the issue of convergence in Section 7.5. After reaching convergence of the values $\sigma\_{Y\_{L,\min}^{(s)}}$, we transit to Step 4 for outlier detection.

Step 4: Search for the optimal solution for the data set $\hat{y}\_j^{(s)} = y\_j - P\_{n,j}\left(\overrightarrow{a}^{(s)}\right)$, $j = 1, \dots, N$. We find the optimal solution for the values $\hat{y}\_j^{(s)}$ using the algorithm of Section 3 and determine the number $N\_{\text{out}}$ of detected outliers. If it turns out that $N\_{\text{out}} \le N\_{\text{maxout}}$, then the solution is considered found and the searching process stops: $f\_j = P\_{n,j}\left(\overrightarrow{a}^{(s)}\right)$. Otherwise, if $N\_{\text{out}} > N\_{\text{maxout}}$, we transit to Step 1.

We do this until we find a solution or until $n$ reaches some preset value $N\_{\max}$ (e.g., 10). In the latter case, we may need to select a different functional class in which to search for the trend.
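The iteration of Steps 1–3 for one fixed polynomial degree can be sketched as follows. This is a simplified illustration under stated assumptions, not the chapter's exact procedure: the minimizing-set search of Section 7.3 is approximated here by keeping the $L$ residuals closest to their median, and `numpy.polyfit` over the flagged points stands in for the weighted least-squares solution of Eqs. (50)–(51).

```python
import numpy as np

def fit_trend(y, L, deg, n_iter=20, tol=1e-9):
    """Sketch of the trend-search iteration (Steps 2-3) for a fixed degree.

    Reference points are reselected on each pass; iteration stops when the
    SD over the selected set converges (cf. Assertion 7).
    Returns (coefficients, residuals, reference-point flags).
    """
    y = np.asarray(y, dtype=float)
    N = len(y)
    t = np.arange(N, dtype=float)
    flags = np.ones(N, dtype=bool)          # flag_j^(0) = 1 for all j
    prev_sd = np.inf
    coef = np.zeros(deg + 1)
    for _ in range(n_iter):
        # Step 2: LS fit over the flagged reference points (Eqs. 50-51)
        coef = np.polyfit(t[flags], y[flags], deg)
        resid = y - np.polyval(coef, t)     # Eq. (52)
        # Step 3 (approximation): keep the L residuals nearest their median
        keep = np.argsort(np.abs(resid - np.median(resid)))[:L]
        flags = np.zeros(N, dtype=bool)
        flags[keep] = True                  # Eq. (53)
        sd = np.std(resid[flags], ddof=1)   # SD over the selected set
        if abs(prev_sd - sd) < tol:         # convergence of sigma
            break
        prev_sd = sd
    return coef, y - np.polyval(coef, t), flags
```

With a linear trend plus two large artificial outliers, the outlying epochs are excluded from the reference set after the first pass and their residuals remain large, so the subsequent outlier search of Step 4 can flag them.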

**Example.** Let us consider data simulated in accordance with the formula $y\_j = 10\sqrt{j + 10} + 2\,\text{random}\_j$, where $j = 1, \dots, N$ and $\text{random}\_j$ are pseudorandom numbers uniformly distributed on the segment [0, 1]. Let us set $N = 150$ and model 10 outliers at the points $j = 6, \dots, 10$ and $j = 140, \dots, 144$ (see **Figure 8**). Further, let us set $\sigma\_{\max} = 0.6$ and MINOBS = 10 and search for outliers as described above with $L = 130$. For $n = 1$, we get $N\_{\text{out}} = 88$; for $n = 2$, $N\_{\text{out}} = 33$; for $n = 3$, $N\_{\text{out}} = 17$; and for $n = 4$, $N\_{\text{out}} = 10$. Thus, a suitable trend approximation turned out to be a fourth-degree polynomial. **Figure 9** shows the fourth-degree polynomial fitted to the data set on the eighth iteration, after the convergence discussed above is reached. In **Figure 10**, the differences $\hat{y}\_j = y\_j - P\_{4,j}\left(\overrightarrow{a}\right)$ are plotted. Positions of detected outliers are marked with white circles. Note that the trend was modeled by a function that does not belong to the class of power polynomials.

**Figure 8.**
*Data simulated using $y\_j = 10\sqrt{j + 10} + 2\,\text{random}\_j$, and the fourth-degree polynomial after the first iteration.*

**Figure 9.**
*Simulated data and the fourth-degree polynomial after the eighth iteration.*

**Figure 10.**
*The differences $\hat{y}\_j = y\_j - P\_{4,j}\left(\overrightarrow{a}\right)$: approximation of the unknown trend with a fourth-degree polynomial.*

*Effective Algorithms for Detection Outliers and Cycle Slip Repair in GNSS Data Measurements DOI: http://dx.doi.org/10.5772/intechopen.92658*

#### **7.5 Convergence of iterations in trend search algorithm**

This section explains the convergence of the iterations described in the trend search algorithm (see previous section).

**Assertion 7.** *The sequence of SD values* $\sigma\_{Y\_{L,\min}^{(s)}}$ *calculated in the trend search algorithm is monotonically nonincreasing for s = 1, 2, …:*

$$
\sigma\_{Y\_{L,\min}^{(s)}} \ge \sigma\_{Y\_{L,\min}^{(s+1)}} \ge \dots \tag{54}
$$

and, being bounded below by zero, therefore converges.

**Proof.** We start our consideration with Step 3 of the *s*th iteration, *s* = 1, 2, *…* In Step 3, for the sequence $\{\hat{y}\_j^{(s)}\}\_{j=1}^{N}$, we find a minimizing set of length $L$: $Y\_{L,\min}^{(s)} = \{\hat{y}\_{j\_1^{(s)}}^{(s)}, \dots, \hat{y}\_{j\_L^{(s)}}^{(s)}\}$. Write the expressions for the mean and the square of the SD corresponding to this set. Due to Eqs. (40) and (41), we have:

$$\mathbf{z}\_{\mathbf{Y}\_{\mathbf{L},\min}^{(\mathbf{s})}} = \mathbf{L}^{-1} \sum\_{\mathbf{j} \in \{\mathbf{j}\_1^{(\mathbf{s})}, \dots, \mathbf{j}\_L^{(\mathbf{s})}\}} \mathbf{\hat{y}}\_{\mathbf{j}}^{(\mathbf{s})},\tag{55}$$

$$\sigma\_{\mathbf{Y}\_{\mathbf{L},\min}^{(\mathbf{s})}}^{2} = \left(\mathbf{L} - \mathbf{1}\right)^{-1} \sum\_{\mathbf{j} \in \left\{\mathbf{j}\_{1}^{(\mathbf{s})}, \dots, \mathbf{j}\_{\mathbf{L}}^{(\mathbf{s})}\right\}} \left(\hat{\mathbf{y}}\_{\mathbf{j}}^{(\mathbf{s})} - \mathbf{z}\_{\mathbf{Y}\_{\mathbf{L},\min}^{(\mathbf{s})}}\right)^{2}. \tag{56}$$

Transform the expression on the right side of Eq. (56). Substituting here for $\hat{y}\_j^{(s)}$ the expression in Eq. (52) gives:

$$\sigma\_{\mathbf{Y}\_{\mathbf{L},\min}^{(\mathbf{s})}}^2 = (\mathbf{L} - \mathbf{1})^{-1} \sum\_{\mathbf{j} \in \{\mathbf{j}\_1^{(\mathbf{s})}, \dots, \mathbf{j}\_\mathbf{L}^{(\mathbf{s})}\}} \left(\mathbf{y}\_\mathbf{j} - \mathbf{P}\_{\mathbf{n},\mathbf{j}} \left(\overrightarrow{\mathbf{a}}^{(\mathbf{s})}\right) - \mathbf{z}\_{\mathbf{Y}\_{\mathbf{L},\min}^{(\mathbf{s})}}\right)^2.$$

Using the $\text{flag}\_j^{(s)}$ defined in Eq. (53), rewrite the last equality as follows:

$$\begin{split} \sigma\_{\mathbf{Y}\_{\mathbf{L},\min}^{(\mathbf{s})}}^{2} &= \left(\mathbf{L} - \mathbf{1}\right)^{-1} \sum\_{\mathbf{j}=1}^{N} \left(\mathbf{y}\_{\mathbf{j}} - \mathbf{P}\_{\mathbf{n},\mathbf{j}}\left(\overrightarrow{\mathbf{a}}^{(\mathbf{s})}\right) - \mathbf{z}\_{\mathbf{Y}\_{\mathbf{L},\min}^{(\mathbf{s})}}\right)^{2} \cdot \mathbf{flag}\_{\mathbf{j}}^{(\mathbf{s})} = \\ &= \left(\mathbf{L} - \mathbf{1}\right)^{-1} \sum\_{\mathbf{j}=1}^{N} \left(\mathbf{y}\_{\mathbf{j}} - \mathbf{P}\_{\mathbf{n},\mathbf{j}}\left(\overrightarrow{\mathbf{b}}\right)\right)^{2} \cdot \mathbf{flag}\_{\mathbf{j}}^{(\mathbf{s})} = \left(\mathbf{L} - \mathbf{1}\right)^{-1} \Phi^{(\mathbf{s})}\left(\overrightarrow{\mathbf{b}}\right). \end{split} \tag{57}$$

Here we introduce the designation $\overrightarrow{b} = \left(a\_0^{(s)} + z\_{Y\_{L,\min}^{(s)}},\ a\_1^{(s)}, \dots, a\_n^{(s)}\right)$ and take into account the definition of $P\_{n,j}\left(\overrightarrow{a}\right)$ expressed by Eq. (37) and the definition of the functional $\Phi^{(s)}\left(\overrightarrow{a}\right)$ given in Eq. (51). From Eq. (57), we get:

$$
\sigma\_{\mathbf{Y}\_{\mathbf{L},\min}^{(\mathbf{s})}}^{2} = (\mathbf{L} - \mathbf{1})^{-1} \Phi^{(\mathbf{s})} \left( \overrightarrow{\mathbf{b}} \right) \tag{58}
$$

We transit to Step 2; $s$ is incremented by 1. We find a vector $\overrightarrow{a}^{(s+1)}$ of polynomial coefficients which minimizes the functional $\Phi^{(s)}\left(\overrightarrow{a}\right)$:

$$
\overrightarrow{\mathbf{a}}^{(\mathbf{s}+\mathbf{1})} = \underset{\overrightarrow{\mathbf{a}}}{\text{arg min}} \,\Phi^{(\mathbf{s})} \left(\overrightarrow{\mathbf{a}}\right).
$$

Thus,

$$\Phi^{(\mathbf{s})}\left(\overrightarrow{\mathbf{a}}^{(\mathbf{s}+1)}\right) = \min\_{\overrightarrow{\mathbf{a}}} \left\{ \Phi^{(\mathbf{s})}\left(\overrightarrow{\mathbf{a}}\right) \right\} \tag{59}$$

From the definition of the extremum of functional, it follows:

$$
\Phi^{(\mathbf{s})} \left( \overrightarrow{\mathbf{a}}^{(\mathbf{s}+1)} \right) \le \Phi^{(\mathbf{s})} \left( \overrightarrow{\mathbf{b}} \right). \tag{60}
$$

Taking into account Eq. (58), we have:

$$
\sigma\_{Y\_{\mathbf{L},\min}^{(s)}}^2 \ge (\mathbf{L} - \mathbf{1})^{-1} \Phi^{(s)} \left( \overrightarrow{\mathbf{a}}^{(s+1)} \right). \tag{61}
$$

The extremum condition (one of $n + 1$) for the functional $\Phi^{(s)}\left(\overrightarrow{a}\right)$ is as follows:

$$\frac{\partial}{\partial \mathbf{a}\_0} \Phi^{(\mathbf{s})} \left( \overrightarrow{\mathbf{a}} \right) \Big|\_{\overrightarrow{\mathbf{a}} = \overrightarrow{\mathbf{a}}^{(\mathbf{s}+1)}} = \mathbf{0}.$$

From here, we derive:

$$\sum\_{\mathbf{j=1}}^{N} \left( \mathbf{y}\_{\mathbf{j}} - \mathbf{P}\_{\mathbf{n}, \mathbf{j}} \left( \overrightarrow{\mathbf{a}}^{(s+1)} \right) \right) \cdot \mathbf{f} \mathbf{lag}\_{\mathbf{j}}^{(s)} = \mathbf{0}.$$

Taking into account the designation for $\hat{y}\_j^{(s+1)}$ given by Eq. (52), we can write this equality in the form:

$$\sum\_{j=1}^{N} \hat{\mathbf{y}}\_{\mathfrak{j}}^{(\mathfrak{s}+1)} \cdot \mathbf{f} \mathbf{lag}\_{\mathfrak{j}}^{(\mathfrak{s})} = \mathbf{0} \text{ or } \sum\_{\mathfrak{j} \in \{\mathfrak{j}\_1^{(\mathfrak{s})}, \dots, \mathfrak{j}\_L^{(\mathfrak{s})}\}} \hat{\mathbf{y}}\_{\mathfrak{j}}^{(\mathfrak{s}+1)} = \mathbf{0}. \tag{62}$$


Consider the set $Y\_L^{(s+1)} = \{\hat{y}\_{j\_1^{(s)}}^{(s+1)}, \dots, \hat{y}\_{j\_L^{(s)}}^{(s+1)}\}$. The mean and SD values for it are calculated using formulas (38) and (39). Taking into account Eq. (62), we get:

$$\mathbf{z}\_{\mathbf{y}\_{\mathbf{L}}^{(\mathfrak{s}+1)}} = \mathbf{L}^{-1} \sum\_{\mathbf{j} \in \{\mathfrak{j}\_1^{(\mathfrak{s})}, \dots, \mathfrak{j}\_{\mathbf{L}}^{(\mathfrak{s})}\}} \mathbf{\hat{y}}\_{\mathbf{j}}^{(\mathfrak{s}+1)} = \mathbf{0},$$

and

$$\begin{split} \sigma^{2}\_{\mathbf{Y}\_{\mathbf{L}}^{(s+1)}} &= (\mathbf{L} - \mathbf{1})^{-1} \sum\_{\mathbf{j} \in \{\mathbf{j}\_{1}^{(s)}, \dots, \mathbf{j}\_{L}^{(s)}\}} \left( \hat{\mathbf{y}}\_{\mathbf{j}}^{(s+1)} - \mathbf{z}\_{\mathbf{Y}\_{\mathbf{L}}^{(s+1)}} \right)^{2} = (\mathbf{L} - \mathbf{1})^{-1} \sum\_{\mathbf{j} \in \{\mathbf{j}\_{1}^{(s)}, \dots, \mathbf{j}\_{L}^{(s)}\}} \left( \hat{\mathbf{y}}\_{\mathbf{j}}^{(s+1)} \right)^{2} = \\ &= (\mathbf{L} - \mathbf{1})^{-1} \sum\_{\mathbf{j}=1}^{N} \left( \mathbf{y}\_{\mathbf{j}} - \mathbf{P}\_{\mathbf{n},\mathbf{j}} \left( \overrightarrow{\mathbf{a}}^{(s+1)} \right) \right)^{2} \cdot \mathbf{flag}\_{\mathbf{j}}^{(s)} = (\mathbf{L} - \mathbf{1})^{-1} \Phi^{(s)} \left( \overrightarrow{\mathbf{a}}^{(s+1)} \right). \end{split}$$

Thus,

$$
\sigma\_{Y\_{\mathbf{L}}^{(s+1)}}^2 = (\mathbf{L} - \mathbf{1})^{-1} \Phi^{(s)} \left( \stackrel{\rightarrow}{\mathbf{a}}^{(s+1)} \right). \tag{63}
$$

We transit to Step 3. In this step, for the sequence $\{\hat{y}\_j^{(s+1)}\}\_{j=1}^{N}$, we find a minimizing set of length $L$: $Y\_{L,\min}^{(s+1)} = \{\hat{y}\_{j\_1^{(s+1)}}^{(s+1)}, \dots, \hat{y}\_{j\_L^{(s+1)}}^{(s+1)}\}$. The square of the SD for this set, $\sigma^2\_{Y\_{L,\min}^{(s+1)}}$, does not exceed the same magnitude for any other set, particularly for

$$\mathbf{Y}\_{\mathbf{L}}^{(\mathbf{s}+\mathbf{1})} = \left\{ \hat{\mathbf{y}}\_{\mathbf{j}\_1^{(\mathbf{s})}}^{(\mathbf{s}+\mathbf{1})}, \dots, \hat{\mathbf{y}}\_{\mathbf{j}\_L^{(\mathbf{s})}}^{(\mathbf{s}+\mathbf{1})} \right\};$$

$$\sigma^{2}\_{\mathbf{Y}\_{\mathbf{L}}^{(\mathbf{s}+\mathbf{1})}} \ge \sigma^{2}\_{\mathbf{Y}\_{\mathbf{L},\min}^{(\mathbf{s}+\mathbf{1})}}. \tag{64}$$

Finally, from Eqs. (61) and (63) and the last inequality (64), we get

$$
\sigma\_{Y\_{\mathbf{L},\min}^{(s)}}^2 \ge (\mathbf{L} - \mathbf{1})^{-1} \Phi^{(s)} \left( \overrightarrow{\mathbf{a}}^{(s+1)} \right) \\
= \sigma\_{Y\_{\mathbf{L}}^{(s+1)}}^2 \ge \sigma\_{Y\_{\mathbf{L},\min}^{(s+1)}}^2
$$

that is, $\sigma^2\_{Y\_{L,\min}^{(s)}} \ge \sigma^2\_{Y\_{L,\min}^{(s+1)}}$, which proves Assertion 7.

### **8. Detection and repair of cycle slips in the Melbourne-Wübbena combination**

In this and the following section, we describe a cycle slip repair algorithm for observables represented in the form of the Melbourne-Wübbena combination, which is often used in modern GNSS measurement data processing programs. Loss of carrier phase lock by the receiver results in jumps in the code and phase measurement data. In the absence of jumps, as already discussed in Section 2, the values of the combination consist of measurement noise superimposed on an unknown constant value dependent on a specified satellite-receiver pair.

In the case of temporary loss of carrier phase lock by the receiver, jumps occur in the series of values representing the Melbourne-Wübbena combination. The procedure of detecting jumps and eliminating them from the values of the combination, called cycle slip repair, is one of the most important steps of GNSS data preprocessing. The main difficulty in detecting jumps is that neither the exact sizes of the jumps nor their epochs are known. A number of algorithms proposed for the detection of jumps are described, for example, in Refs. [3, 6, 8]. Although differing in detail, they are based on a common idea: comparison of the SDs of the time series of measurement data accumulated from one of the bounds of the time interval up to an arbitrary moment (epoch). If the difference of the SDs corresponding to two adjacent epochs exceeds a predetermined threshold value, then a jump is declared in one of these two epochs. A drawback of such algorithms is the frequent false detection of jumps during epochs containing rough measurements (outliers), since the values of outliers can exceed the size of a jump itself. On the other hand, an attempt to increase the threshold value leads to the opposite effect: an inability to recognize jumps that are small in magnitude.

Below, we propose a robust cycle slip repair algorithm that detects jumps and determines their sizes more reliably than similar known algorithms. The proposed algorithm is based on the search for so-called clusters consisting of epochs in which the values of the combination are grouped about corresponding predefined values. In addition, this algorithm implements the above-described method of cleaning data from outliers based on the search for the optimal solution. This method, combined with Springer's algorithm used in Ref. [3], allows for the reliable determination of multiple (cascade) jumps in the Melbourne-Wübbena combination.

The Melbourne-Wübbena combination $L\_{M-W}$ can be presented as the sum of three terms [6]:

$$\mathbf{L}\_{\mathbf{M}-\mathbf{W}} = \lambda\_{\mathbf{5}} \mathbf{n}\_{\mathbf{5}} + \mathbf{B} + \nu,\tag{65}$$

where $\lambda\_5$ is the formal wavelength (for all GPS satellites $\lambda\_5$ = 86.16 cm, and for GLONASS satellites $\lambda\_5$ = 84.0–84.36 cm); $n\_5$ is an unknown integer, the so-called wide-lane ambiguity [3]; *B* is the residual systematic error caused by instrumental delays relative to the specific receiver-satellite pair and, as assumed, not time-dependent; and *ν* is a random component. Both parts of the equality in (65) are expressed in meters. At the same time, the $L\_{M-W}$ combination is often expressed in cycles of the wavelength $\lambda\_5$. Let us divide both parts of Eq. (65) by $\lambda\_5$. We introduce the designations $y = L\_{M-W}/\lambda\_5$, $\beta = B/\lambda\_5$, and $\xi = \nu/\lambda\_5$ and derive for each of the epochs associated with indexes $j = 1, \dots, N$:

$$
y\_j = n\_{5,j} + \beta + \xi\_j. \tag{66}
$$

where $n\_{5,j}$ is an unknown piecewise-constant function of a discrete argument, about which it is only known that it takes integer values and can undergo integer jumps; $\beta$ is an unknown constant; and $\xi\_j$ is a random variable. The problem consists in developing an algorithm for automatic processing of data $y\_j$ of the form of Eq. (66) that makes it possible to effectively determine integer-valued jumps against the background of the noise component and outliers.
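A series of the form of Eq. (66) is easy to simulate for testing the algorithm below; the values of $\beta$ and the noise level used here are illustrative assumptions, not taken from the chapter.

```python
import random

def simulate_mw_series(N, jumps, beta=0.37, sigma=0.1, seed=1):
    """Generate a series of the form (66): y_j = n5_j + beta + xi_j.

    `jumps` maps an epoch index to an integer jump in the wide-lane
    ambiguity n5; beta and sigma are illustrative, not from the chapter.
    """
    rng = random.Random(seed)
    n5 = 0
    y = []
    for j in range(N):
        n5 += jumps.get(j, 0)            # piecewise-constant integer n5_j
        y.append(n5 + beta + rng.gauss(0.0, sigma))
    return y
```

For instance, `simulate_mw_series(100, {50: 3})` produces two plateaus whose means differ by the integer jump size 3, against Gaussian noise of SD 0.1 cycles.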

#### **9. Description of the proposed algorithm**

Let us present the proposed algorithm as the following sequence of steps.

Step 0: We introduce the parameter $\Delta$ by specifying its value as $\Delta = 2\sigma\_{\max}$. We mark the indexes of epochs with the function $\text{flag}\_j$ as nulls: $\text{flag}\_j = 0$, $j = 1, \dots, N$.

Arrange the array $y\_j$ in ascending order and renumber it by placing the indexes in ascending order. Denote the resulting array as $Y\_j$. For simplicity, we suppose that all $Y\_j$ are different; therefore, $Y\_1 < Y\_2 < \dots < Y\_N$. Denote the corresponding permutation of the values of the function $\text{flag}\_j$ as $\text{FLAG}\_j$.

**Figure 11.**
*Searching for the maximum of the density of values of the array $Y\_j$.*

Step 1: In this step, we look for the maximum density of the array values *y <sup>j</sup>*.

We consider the segments $[Y\_j, Y\_j + \Delta]$ of length $\Delta$ and look for the one that contains the largest number of values $Y\_i$. Here, only those indexes $i$ and $j$ from the range $1 \le i, j \le N$ are considered for which $\text{FLAG}\_i = \text{FLAG}\_j = 0$. The purpose of ordering the original array is to improve the effectiveness of the maximum-density search procedure. It can be shown that the number of comparisons required when searching for the above segment does not exceed $2N$ in the case of the ordered array. For an unordered array, the number of comparisons is evaluated as $N^2$.
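The maximum-density search on the ordered array can be sketched as a two-pointer scan, which realizes the roughly $2N$ comparison bound mentioned above; the function name and the 0/1 flag convention are illustrative.

```python
def max_density_segment(Y, delta, flags=None):
    """Step 1: in the ascending-sorted array Y, find the segment
    [Y[j], Y[j] + delta] containing the most values (two-pointer scan,
    about 2N comparisons). Only indexes with flags[i] == 0 participate.

    Returns (j_max, n_max): the index opening the segment and its count.
    """
    N = len(Y)
    active = [i for i in range(N) if flags is None or flags[i] == 0]
    best_j, best_n = -1, 0
    hi = 0
    for lo in range(len(active)):
        if hi < lo:
            hi = lo
        # advance the right pointer while values still fit in the segment
        while hi + 1 < len(active) and Y[active[hi + 1]] <= Y[active[lo]] + delta:
            hi += 1
        if hi - lo + 1 > best_n:
            best_j, best_n = active[lo], hi - lo + 1
    return best_j, best_n
```

Because `hi` never moves backward, each pointer traverses the array once, giving the linear comparison count claimed for the ordered case.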

Step 2: Let $[Y\_{J\_{\max}}, Y\_{J\_{\max}} + \Delta]$ be the segment found in the previous step, containing $N\_{\max}$ values of $Y\_j$ (see **Figure 11**). We calculate the mean $m$ of these numbers $Y\_j$:

$$m = N\_{\text{max}}^{-1} \cdot \sum\_{j=J\_{\text{max}}}^{J\_{\text{max}} + N\_{\text{max}} - 1} Y\_j,\tag{67}$$

$$\mathbf{Y}\_{j} \in \left[ \mathbf{Y}\_{J\_{\text{max}}}, \mathbf{Y}\_{J\_{\text{max}}} + \boldsymbol{\Delta} \right]. \tag{68}$$

Step 3: Cluster searching. The values $y\_j$ that are clustered about the mean $m$ found in the previous step can pertain to epochs scattered along the time axis in chaotic order. The task of this step is to find accumulations of such points, so-called clusters, which we define as follows:

**Definition 3**. *We designate as an* $(m, \Delta)$ *cluster the set of points (epochs) of the time axis whose indexes $j$ belong to a segment* $[k, l]$ $(1 \le k < l \le N)$ *and for which the following requirement is satisfied:*


$$\left|\mathbf{y}\_{j} - \mathbf{m}\right| \leq \Delta. \tag{69}$$


This definition is illustrated in **Figure 12**.

It is understood that for specified values of $m$ and $\Delta$, there might be one, several, or no $(m, \Delta)$ clusters. Searching for clusters is performed by sequentially checking the satisfaction of inequality (69) for all $j = 1, \dots, N$ for which $\text{flag}\_j = 0$. Note that inequality (69) is satisfied for at least $N\_{\max}$ values of $j$. In fact, from expressions (67) and (68), we derive the inequalities:

$$Y\_{J\_{\text{max}}} \le m \le Y\_{J\_{\text{max}}} + \Delta,\tag{70}$$

and since the arrays *y <sup>j</sup>* and *Y <sup>j</sup>* differ only by permutation, it follows from Eq. (68) that the inequalities

$$Y\_{J\_{\text{max}}} \le y\_j \le Y\_{J\_{\text{max}}} + \Delta. \tag{71}$$

are fulfilled for $N\_{\max}$ indexes $j$. Inequality (69) is also satisfied for all these indexes, as follows from Eqs. (70) and (71).

If a cluster is found, we mark all its points with $\text{flag}\_j = 1$ and then repeat the cluster search procedure. If at least one cluster has been found at this step, we transfer to Step 1. If no cluster has been found, the search for clusters is complete, and we transfer to Step 4.

Step 4: If at least one cluster has been found, we transfer to Step 5; otherwise: (*a*) all points of the segment $[1, N]$ are marked as outliers and removed from further processing, and (*b*) the operation of the algorithm terminates.
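One pass of the cluster search (Step 3) for fixed *m* and Δ can be sketched as follows; per Definition 3 ($k < l$), runs of fewer than two epochs are discarded here, which is our reading of the definition rather than an explicit rule of the chapter.

```python
def find_clusters(y, m, delta, flags):
    """Find maximal runs [k, l] of consecutive epochs with flags[j] == 0
    and |y[j] - m| <= delta (Definition 3, Eq. 69).

    Returns a list of (k, l) pairs, 0-based and inclusive; runs with a
    single epoch are dropped, following the requirement k < l.
    """
    runs = []
    k = None
    for j, v in enumerate(y):
        if flags[j] == 0 and abs(v - m) <= delta:
            if k is None:
                k = j                    # open a new candidate cluster
        else:
            if k is not None:
                runs.append((k, j - 1))  # close the current run
                k = None
    if k is not None:
        runs.append((k, len(y) - 1))
    return [(k, l) for (k, l) in runs if l > k]
```

After each call, the points of the found clusters would be marked with flag 1 before repeating the search, as described in Step 3.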

Step 5: Search for individual jumps in clusters. Let us assume that *n* ≥ 1 clusters have been found:

1: $[k\_1, l\_1]$ is the $(m\_1, \Delta)$ cluster,

…

$n$: $[k\_n, l\_n]$ is the $(m\_n, \Delta)$ cluster.

In each of the clusters found, outliers and 1-size jumps are possible. This follows immediately from the inequalities in (69) and the preestablished value $\Delta = 1.2$.

**Figure 12.** *Epochs with indexes $k \le j \le l$ form an $(m, \Delta)$ cluster.*


Substep 5.1: To detect a 1-size jump in a cluster, we use a modified Springer algorithm (see Refs. [9, 10]) combined with the algorithm proposed in Section 3, which executes a search for the optimal solution with a minimum quantity of defective data.

Substep 5.2: We find all epochs $J\_p$ and values $\Delta n\_{5,J\_p} \triangleq n\_{5,J\_p} - n\_{5,J\_p - 1}$ ($p = 1, \dots, n$) of the jumps inside the clusters. The possible values of $\Delta n\_{5,J\_p}$ are $0, \pm 1$.

Substep 5.3: We repair the data by the value of each found jump, using the formula

$$y\_{j}^{(p)} = \begin{cases} y\_{j}^{(p-1)}, & j < J\_{p} \\ y\_{j}^{(p-1)} - \Delta n\_{5,J\_{p}}, & J\_{p} \le j \le N \end{cases}; \quad p = 1, \dots, n,\tag{72}$$

where $y\_j^{(0)} = y\_j$.

Substep 5.4: We rename: $y\_j = y\_j^{(n)}$.

Step 6: Marking of points outside clusters. All points outside of the found clusters (if any) are marked as outliers.

Step 7: Ordering of clusters. We renumber the clusters so that they are placed left to right on the time axis. For the ordered set of clusters $[k\_p, l\_p]$, $p = 1, \dots, n$, the conditions $1 \le k\_1 < l\_1 < k\_2 < l\_2 < \dots < k\_n < l\_n \le N$ are satisfied.

Step 8: Data screening within clusters and improving the mean values of $y\_j$.

Substep 8.1: In accordance with the algorithm proposed in Section 3, we screen each of the $n$ clusters for outliers.

Substep 8.2: For each of the clusters cleaned of outliers, we determine the modified mean values $m\_p^{*}$.

Step 9: Jumps between clusters. It follows from the description above that the remaining jumps in the data $y\_j$ are on the boundaries between clusters. If only one cluster is found, the algorithm terminates: no additional jumps are detected, and the data are cleaned of outliers. If more than one cluster is found ($n > 1$), then the epochs $j\_1, \dots, j\_{n-1}$ of the jumps are the coordinates of the left-hand boundaries of the clusters, beginning from the second: $j\_1 = k\_2, \dots, j\_{n-1} = k\_n$. The values of the jumps $\Delta n\_{5,j\_p}$ are found in this case as the difference of the mean values of adjacent clusters, rounded to the nearest integer:

$$
\Delta n\_{5,j\_p} = \text{NINT}\left(m\_{p+1}^\* - m\_p^\*\right), p = 1, \ldots, n-1. \tag{73}
$$

Step 10: Data repair. We remove the jumps between clusters using a formula analogous to (72):

$$y\_{j}^{(p)} = \begin{cases} y\_{j}^{(p-1)}, & j < j\_{p} \\ y\_{j}^{(p-1)} - \Delta n\_{5,j\_{p}}, & j\_{p} \le j \le N \end{cases}; \quad p = 1, \dots, n - 1,\tag{74}$$

where $y\_j^{(0)} = y\_j$. We rename: $y\_j = y\_j^{(n-1)}$. The algorithm is terminated.
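The repair formulas (72) and (74) share the same pattern — subtract the integer jump size from every epoch at or after the jump — which can be sketched as a single helper; the function name and the `(epoch, size)` pair convention are illustrative.

```python
def repair_jumps(y, jumps):
    """Apply the repair of Eqs. (72)/(74): for each detected jump at epoch
    J_p with integer size dn_p, subtract dn_p from all values y_j, j >= J_p.

    `jumps` is a list of (J_p, dn_p) pairs with 0-based epoch indexes.
    """
    y = list(y)                     # work on a copy, y^(0) = y
    for J, dn in jumps:
        for j in range(J, len(y)):  # epochs at or after the jump
            y[j] -= dn
    return y
```

For cascade jumps, the pairs are applied in sequence, exactly as the superscript $p$ in (72) and (74) chains the partial series $y\_j^{(p)}$.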

### **10. Numerical calculations using algorithms presented in Sections 8 and 9**

We present here the results of testing the proposed algorithm using real data obtained by the JOZ2 station, which is part of the IGS network [2]. These data are included in the distribution set of the software installation package [3]. Testing was carried out for data obtained from the GPS satellite with number PRN = 13 for 2010, day of year 207. **Figure 13** shows the Melbourne-Wübbena combination values in the 107 min time interval (*N* = 215). The number *j* of time epochs, counted from the beginning of the 24-h day with an interval of 30 seconds, is plotted along the horizontal axis. The combination values $y\_j$, expressed in cycles of the wavelength $\lambda\_5$, are plotted along the vertical axis.

**Figure 13.** *Values of the Melbourne-Wübbena combination for the JOZ2 station (*PRN *= 13 for 2010, day of year = 207).*

**Figure 14.**
*(a) Deviations of the values of the Melbourne-Wübbena combination from the mean after detection and elimination of jumps from the data using the algorithm from Ref. [3]. (b) Deviations of the values of the Melbourne-Wübbena combination from the mean after detection and elimination of jumps from the data using the algorithm proposed in Section 9.*

**Figure 14a** and **b** present the values of the deviations from the mean value *z* of the data after the cycle slip repair procedure, using the algorithm applied in Ref. [3] (see **Figure 14a**) and the proposed algorithm (**Figure 14b**). The values $y\_j - z$ in cycles of $\lambda\_5$ are plotted along the vertical axes, and the number *j* of epochs is plotted along the horizontal axis. Epochs in which the measurement data were rejected are marked by light circles. In the first case (see **Figure 14a**), 111 of the conducted measurements, or 51% of the total number of data, were discarded. In the second case (see **Figure 14b**), 29 measurements were rejected (13%), which is 38 percentage points fewer than in the previous computation. The epochs of detected jumps are marked by daggers.

### **11. Conclusion**

This chapter presents several effective and stable algorithms for processing data received from GNSS receivers. These data form the basis of almost all engineering applications in the fields of computational geodynamics, navigation, and cadastral surveying, as well as of numerous fundamental research works. The accuracy of the results obtained is significantly influenced by the quality of the data used in the calculations. In particular, the presence of rough measurements (outliers) in the observation data can significantly reduce the accuracy of the calculations. One of the tasks at the preliminary stage of data processing is the reliable detection and removal of rough measurements from the series of measured data with a minimum amount of rejected data. The so-called optimal solution, introduced in this chapter, makes it possible to detect and eliminate outliers from observed data while minimizing the number of rejected measurements. In addition, it is assumed that the data may contain a trend as an unknown function of time. The strategy for determining the trend depends on the physical process in question, under the assumption that the trend is a continuous function of time. The efficiency of the search is strongly influenced by the choice of the function class in which the trend is sought. In this chapter, we considered the class of power polynomials, but in some cases this choice may not lead to the expected result; a class of trigonometric functions, for example, may be required to find a suitable trend. The automatic search for the best functional class, together with a strategy for effectively finding an unknown trend against the background of random noise and outliers, is a complex task for future research.

*Satellite Systems - Design, Modeling, Simulation and Analysis*
