144 Machine Learning - Advanced Techniques and Emerging Applications

• JBD and KLMvG outperformed the probabilistic neural network (PNN) and kNN with respect to accuracy and average error distance, indicating that the proposed combination scheme is more effective in the sensitive environments of WLAN-based positioning systems.

2. Related work

Global navigation satellite systems (GNSS) such as GPS, GLONASS (Russia's counterpart to GPS), and GALILEO work well in outdoor environments, but their accuracy can decrease significantly indoors because of factors such as penetration loss, refraction, multipath propagation, and absorption. Therefore, it is important to develop a system that can work in indoor environments with high accuracy. To this end, many techniques have been proposed for indoor positioning systems (IPS) in the last decade. In model-based techniques, the location is estimated based on a geometrical model, such as the log-distance path loss (LDPL) model, in which a semi-statistical function is built on the relationship between the RF propagation function and the RSS value. Several approaches that trade off accuracy against cost have been proposed, such as time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), and multidimensional scaling (MDS). MDS is a set of statistical techniques used to visualize information in order to find similarities/dissimilarities in the data. The matrix in MDS begins with item-item dissimilarities, and AP-AP distances are determined by a radio attenuation model [9].

Fingerprinting-based techniques depend on matching algorithms such as k-nearest neighbors (kNN), as used in RADAR [14], which is one of the first Wi-Fi signal-strength-based IPS and is considered the basis of WLAN fingerprinting IPS. Many kNN variants have been proposed that differ in how they measure similarity/dissimilarity, usually with the Manhattan or Euclidean distance, as in [11–18]. Ref. [19] proposed a new version of kNN that is more efficient than the probabilistic methods, neural networks, and traditional kNN, as it relies on the decision tree of the training phase and uses the average of the reference point (RP) measurements instead of needing the entire dataset to estimate the object's location. Ref. [20] evaluated a modified deterministic kNN technique with Mahalanobis, Manhattan, and Euclidean distances and found the Manhattan distance to be the most accurate.

Recently, the use of probabilistic distribution measurements in IPS applications has increased. The authors in [21] pioneered the use of probabilistic distribution measurements in IPS and proposed a probabilistic framework that uses a Bayesian network to estimate the location. In [22], the authors used a modified probabilistic neural network (MPNN) to estimate the coordinates of the object and found that it outperformed the triangulation method. In [23], a kernel method was proposed to estimate the object's location using a histogram of the RSSI at the unknown location. In [24], the probability density function (PDF) was estimated using the Kullback-Leibler divergence (KLD) framework for composite hypothesis testing between the fingerprinting database and the test point, whereas in [25], the authors assumed that the RSSI distribution was multivariate Gaussian and used the KLD to estimate the impacts of the RPs on the test point in order to estimate the probability of the closest one and to identify the coordinates of the test point.

3. Overall structure of indoor positioning system

We begin with a typical WLAN scenario in which a person carries a smartphone with WLAN access and takes RSS measurements from different APs within the College of Engineering and Applied Sciences (CEAS) at Western Michigan University (WMU). It is commonly assumed that the RSSI from multiple APs is distributed as a multimodal signal, as noted in [16]. However, in our study, the recorded signal-to-noise ratio for a single device varied significantly at any one location, with the values differing by as much as 10 dB. Specifically, the signal-to-noise values were recorded for 35 min during rush hour for a single AP at the same location.

There are many parameters that can affect the shape of the signal, such as reflection, diffraction, and pedestrian traffic. In this study, we sought to find a scenario that would lead to a better distribution of the Wi-Fi signal. During the offline phase, a realistic scenario was created that took into account the variation of the signal. However, because the effects of the body of the person holding the phone as well as pedestrian traffic can change the variation of the signal, a recording of the RSS was taken in four directions (45°, 135°, 225°, and 315°) to reduce these variations. At each RP, a raw set of RSS data were collected as a time sample from the APs in the area of interest, denoted as $\{q_{i,j}^{(\circ)}(\tau),\ \tau = 1, \dots, t,\ t = 100\}$, where $t$ represents the number of time samples and $(\circ)$ is the orientation direction. Next, the average and covariance matrix of the RSS were obtained from the four different directions and ten scans used to create the fingerprinting database, known as the Radio Map, represented by $\mathbf{Q}^{(\circ)}$ [28]:

$$\mathbf{Q}^{(\circ)} = \begin{pmatrix} q_{1,1}^{(\circ)} & q_{1,2}^{(\circ)} & \cdots & q_{1,N}^{(\circ)} \\ q_{2,1}^{(\circ)} & q_{2,2}^{(\circ)} & \cdots & q_{2,N}^{(\circ)} \\ \vdots & \vdots & \ddots & \vdots \\ q_{L,1}^{(\circ)} & q_{L,2}^{(\circ)} & \cdots & q_{L,N}^{(\circ)} \end{pmatrix} \tag{1}$$

where $q_{i,j}^{(\circ)} = \frac{1}{t}\sum_{\tau=1}^{t} q_{i,j}^{(\circ)}(\tau)$ with $t = 10$ samples randomly chosen from the 100 time samples. This allowed us to obtain the average of the RSS samples over time for different APs, $i = 1, 2, \dots, L$, $j = 1, 2, \dots, N$, where $N$ represents the number of RPs and $L$ is the number of APs. The variance vector of each RP can be defined as

$$
\Delta_{j}^{(\circ)} = \left[\Delta_{1,j}^{(\circ)}, \Delta_{2,j}^{(\circ)}, \Delta_{3,j}^{(\circ)}, \dots, \Delta_{L,j}^{(\circ)}\right] \tag{2}
$$
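As an illustration, the offline averaging behind the Radio Map (the mean entries of Eq. (1) and the variances of Eqs. (2) and (3)) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `build_radio_map` and `fingerprint`, the nested-list data layout, and the synthetic Gaussian readings are hypothetical and chosen only for the example. A single orientation is processed; the same routine would be repeated per direction.

```python
import random
import statistics


def build_radio_map(samples, t=10, seed=0):
    """Build per-(AP, RP) means and variances for one orientation.

    samples[i][j] is the list of raw RSS readings (dBm) of AP i at RP j
    (an assumed layout). Returns (Q, Delta) with the same L x N shape.
    """
    rng = random.Random(seed)
    Q, Delta = [], []
    for ap_rows in samples:
        q_row, d_row = [], []
        for readings in ap_rows:
            chosen = rng.sample(readings, t)            # t of the recorded time samples
            q_row.append(statistics.fmean(chosen))      # time average, entry of Eq. (1)
            d_row.append(statistics.variance(chosen))   # 1/(t-1) sample variance, Eq. (3)
        Q.append(q_row)
        Delta.append(d_row)
    return Q, Delta


def fingerprint(Q, Delta, j):
    """Return the mean vector and variance vector (Eq. (2)) stored for RP j."""
    return [row[j] for row in Q], [row[j] for row in Delta]


if __name__ == "__main__":
    # Synthetic data (assumption): L = 2 APs, N = 3 RPs, 100 readings each.
    rng = random.Random(1)
    L, N = 2, 3
    samples = [[[rng.gauss(-60 - 5 * i - 3 * j, 2.0) for _ in range(100)]
                for j in range(N)] for i in range(L)]
    Q, Delta = build_radio_map(samples)
    q_j, delta_j = fingerprint(Q, Delta, j=0)
    print("mean vector for RP 0:", q_j)
    print("variance vector for RP 0:", delta_j)
```

Storing the variance next to the mean is what allows each RP to be treated later as a distribution rather than a single point.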

Each element of the variance vector is given by

$$\Delta_{i,j}^{(\circ)} = \frac{1}{t-1} \sum_{\tau=1}^{t} \left( q_{i,j}^{(\circ)}(\tau) - q_{i,j}^{(\circ)} \right)^2 \tag{3}$$

where $\Delta_{i,j}^{(\circ)}$ is the variance for AP $i$ at RP $j$ with orientation $(\circ)$; thus, the database table of the Radio Map is $(x_j, y_j, q_j^{(\circ)}, \Delta_j^{(\circ)})$, with $q_j^{(\circ)}$ defined as

$$q_{j}^{(\circ)} = \left[ q_{1,j}^{(\circ)}, q_{2,j}^{(\circ)}, q_{3,j}^{(\circ)}, \dots, q_{L,j}^{(\circ)} \right] \tag{4}$$

During the online phase, the RSS measurement is denoted as

$$p_r = \left[p_{1,r}, p_{2,r}, \dots, p_{L,r}\right] \tag{5}$$

Here, σ is the kernel smoothing factor. The probability will be equal to 1 if p = q, and the output will decrease when the difference between p and q becomes larger.

Algorithm 1. The Kullback-Leibler multivariate Gaussian positioning method.

1. During the offline phase, RSS measurements are taken at different known locations, and 10 scans with 10-second time delays are used to generate the Radio Map.

• A database for each RP is set using RSS measurements from different locations.

2. During the online phase, RSS measurements are taken at unknown locations of the smartphone.

• The RSS measurements from the APs of smartphones from unknown locations are set in the same way as the database of the offline phase with respect to the similar media access control (MAC) address.

3. During the online phase, the following steps are performed:

• The minimum KLMvG is estimated using Eq. 8.

• The previous step is repeated for different APs until the minimum distance is obtained.

4. The maximum outputs are transferred to the output layer.

Machine Learning Algorithm for Wireless Indoor Localization

http://dx.doi.org/10.5772/intechopen.74754

5. Bregman divergence algorithm formulation

In recent times, approaches that measure the distortion in classes have become more common, instead of depending on a single distance. Indeed, the analysis of distortion is being used in many applications of machine learning, computational geometry, and IPS. Using the Bregman divergence to measure similarity/dissimilarity has recently become attractive because it is a meta-algorithm that encapsulates both the information-theoretic relative entropy and the geometric Euclidean distance [30]. The Bregman distance $D_\varphi$ between two data points in a convex space, $p = (p_1, \dots, p_d)$ and $q = (q_1, \dots, q_d)$, that is associated with $\varphi$ (defined as a strictly convex and differentiable function) can be defined as

$$D_\varphi(p, q) = \varphi(p) - \varphi(q) - \langle \nabla\varphi(q),\, p - q \rangle \tag{9}$$

where $\langle \cdot, \cdot \rangle$ denotes the dot product:

$$\langle p, q \rangle = \sum_{i=1}^{d} p_i q_i = p^{T} q \tag{10}$$

and $\nabla\varphi(p)$ denotes the gradient operator:

$$\nabla\varphi(p) = \left[ \frac{\partial \varphi}{\partial p_1}, \dots, \frac{\partial \varphi}{\partial p_d} \right]^{T} \tag{11}$$

The Bregman divergence unifies the statistical KLD with the squared Euclidean distance by defining the distortion measurement in classes:

• The squared Euclidean distance is obtained from the Bregman divergence by considering the convex function as $\varphi(p) = \sum_{i=1}^{d} p_i^2 = \langle p, p \rangle$, which is the parabolic potential function in Figure 2.
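The Bregman divergence of Eq. (9) can be checked numerically with a short Python sketch. It uses the standard form in which the gradient is evaluated at the second argument q, and the helper names (`bregman`, `sq_norm`, `neg_entropy`, and their gradients) are hypothetical. The negative-entropy generator $\varphi(p) = \sum_i p_i \log p_i$ is an assumption added here to illustrate the link to the KLD; it requires strictly positive inputs, such as normalized RSS histograms, not raw dBm values.

```python
import math


def bregman(phi, grad_phi, p, q):
    """D_phi(p, q) = phi(p) - phi(q) - <grad phi(q), p - q>."""
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_phi(q), p, q))
    return phi(p) - phi(q) - inner


# Generator phi(p) = <p, p>: the divergence is the squared Euclidean distance.
def sq_norm(p):
    return sum(x * x for x in p)


def sq_norm_grad(p):
    return [2.0 * x for x in p]


# Generator phi(p) = sum p_i log p_i (assumed here): for vectors that each
# sum to 1, the divergence equals the Kullback-Leibler divergence KL(p || q).
def neg_entropy(p):
    return sum(x * math.log(x) for x in p)


def neg_entropy_grad(p):
    return [math.log(x) + 1.0 for x in p]


if __name__ == "__main__":
    p = [0.2, 0.3, 0.5]
    q = [0.4, 0.4, 0.2]
    print("squared Euclidean:", bregman(sq_norm, sq_norm_grad, p, q))
    print("KL(p || q):", bregman(neg_entropy, neg_entropy_grad, p, q))
```

Passing the generator and its gradient as arguments keeps one divergence routine for both cases, which mirrors the unifying role of the Bregman divergence described in the text.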
