*2.1.2.8 K-Nearest Neighbour (K-NN)*

K-Nearest Neighbour is one of the simplest machine learning algorithms and is based on the supervised learning technique. The K-NN algorithm assumes that the new case and the existing cases are comparable, and it places the new instance in the category that is most similar to the existing categories. After all the existing data have been stored, a new data point is classified by the K-NN algorithm on the basis of similarity. This means that, using the K-NN method, fresh data can be quickly and accurately sorted into a suitable category. Although the K-NN approach is most frequently employed for classification problems, it can also be utilised for regression. Since K-NN is a non-parametric technique, it makes no assumptions about the underlying data [18].

K-NN is also known as a lazy learner algorithm because it does not learn from the training dataset immediately; instead, it stores the dataset and only acts on it when classifying data. During the training phase the KNN method simply saves the information, and when it receives new data it assigns it to the category that the new data most closely resembles.
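The lazy-learning behaviour can be illustrated with a minimal sketch using scikit-learn's KNeighborsClassifier; the library choice and the toy readings are illustrative assumptions and are not taken from this chapter.

```python
# Minimal sketch of K-NN as a "lazy learner" (illustrative data, not from the chapter).
from sklearn.neighbors import KNeighborsClassifier

# Toy training data: two-feature readings with known labels.
X_train = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y_train = ["low", "low", "high", "high"]

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)           # "training" only stores the dataset
print(knn.predict([[0.85, 0.75]]))  # distances are computed now, at query time
```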

The K-NN algorithm is excellent for WSN query processing jobs because of its simplicity.

The following algorithm can be used to describe how the K-NN works (a code sketch of these steps is given after Step 6):

**Step 1:** Decide on the number of neighbours, K;

**Step 2**: Calculate the Euclidean distance (or Hamming distance) of the K number of neighbours. The Euclidean distance, which we have already examined in geometry, is the distance between two points;

E.g.: Let there be two points $A(x_1, y_1)$ and $B(x_2, y_2)$. The Euclidean distance between them can be calculated as: $ED = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$.

**Step 3:** Based on the determined Euclidean distance, select the K closest neighbours;

**Step 4:** Count the number of data points in each category among these K neighbours;

**Step 5:** Assign the fresh data points to the category where the neighbour count is highest;

**Step 6:** Model is complete.
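The sketch below follows these steps from scratch: it computes the Euclidean distance from the new point to every stored point, selects the K closest neighbours, and assigns the category with the highest neighbour count. The function name and the toy data are illustrative assumptions, not part of the chapter.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, new_point, k=3):
    """Classify new_point by majority vote among its k nearest neighbours."""
    # Step 2: Euclidean distance from the new point to every stored point.
    distances = [
        (math.dist(new_point, p), label)
        for p, label in zip(train_points, train_labels)
    ]
    # Step 3: select the K closest neighbours.
    nearest = sorted(distances, key=lambda d: d[0])[:k]
    # Steps 4-5: count categories among the neighbours and pick the most common.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Example usage with toy points.
points = [(1, 1), (1, 2), (5, 5), (6, 5)]
labels = ["A", "A", "B", "B"]
print(knn_classify(points, labels, (5.5, 5.0), k=3))  # -> "B"
```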

#### **Some pros of using the KNN algorithm:**

1. It is simple to implement;

2. It is robust to noisy training data;

3. It can be more effective if the training data is large.

#### **Some cons of using the KNN algorithm:**

1. The value of K must always be determined, and sometimes that can be difficult;

2. The computation cost is high because the distance between the new data point and every training sample must be calculated.

#### **2.2 Unsupervised learning**

In supervised machine learning, models are trained on labelled data under the supervision of the training dataset. However, there are many instances in which labelled data is unavailable and hidden patterns must be identified in the given dataset. Unsupervised learning strategies are therefore needed to handle these kinds of problems in machine learning. Unsupervised learning is a subcategory of machine learning in which models are trained on unlabelled datasets and are free to operate on the data without being checked by a human observer.

Unlike supervised learning, unsupervised learning has input data but no corresponding output data, so it cannot be used to solve a regression or classification problem directly. The objectives of unsupervised learning are to find the underlying structure of a dataset, group the data according to similarities, and represent the dataset in a compressed format [19].

The following are a few key arguments for the significance of unsupervised learning:


As shown in **Figure 13**, the input data is unlabelled, which means that neither its category nor any associated outputs are provided. The machine learning model is trained on this unlabelled input data: it first analyses the raw data to identify any hidden patterns before applying the appropriate algorithm. The unsupervised learning algorithm has two subtypes: clustering and association.
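As a minimal sketch of the clustering subtype, the example below groups unlabelled points without any supervision. The use of scikit-learn's KMeans and the toy data are illustrative assumptions; the chapter only names clustering as a subtype.

```python
# Minimal unsupervised clustering sketch (illustrative data, not from the chapter).
from sklearn.cluster import KMeans

# Unlabelled data: no categories or outputs are provided.
X = [[0.1, 0.2], [0.15, 0.1], [0.9, 0.85], [0.95, 0.8], [0.5, 0.55]]

model = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = model.fit_predict(X)  # the model discovers the groups itself
print(cluster_ids)                  # cluster labels are inferred, not given
```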
