*3.1.2.2 Crop prediction*

Crop prediction is the process of predicting the crop best suited to location-specific cultivation, together with an appropriate fertilizer, without degrading soil quality, in support of sustainable agriculture. The five steps followed in this module are shown in **Figure 11**.


#### 1.**Dataset Collection**

The collected dataset contains the soil parameters: moisture, temperature, humidity, soil type, NPK value, crop name and fertilizer name. The data from the IoT-enabled sensors is stored in matrix form. The readings gathered from different locations of the agricultural land form a 200 × 9 grid, i.e., 200 rows and 9 columns in total. The first 3 columns contain data in string format (i.e., NPK values) and the remaining 6 columns contain numerical data.

*Organic Farming for Sustainable Agriculture Using Water and Soil Nutrients DOI: http://dx.doi.org/10.5772/intechopen.100319*

**Figure 11.**

*Data flow for crop prediction.*

#### 2.**Data pre-processing**

The information gathered from the IoT-enabled sensors is non-uniform and needs pre-processing. During data pre-processing, all string data is converted into numerical data: each string feature is converted into dummy variables, which increases the number of columns. The data is then cleaned to remove null values. The dataset contains both categorical data and continuous numeric data, and this mix causes problems for the algorithm. Standard feature scaling is therefore applied to all of the data to bring it onto a common scale.
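The pre-processing steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation used in this work. The records, column names (`soil_type`, `moisture`, `temperature`) and helper functions are hypothetical stand-ins for the real sensor data.

```python
from statistics import mean, pstdev

# Hypothetical toy records; column names and values are illustrative only.
rows = [
    {"soil_type": "clay",  "moisture": 30.0, "temperature": 25.0},
    {"soil_type": "sandy", "moisture": None, "temperature": 31.0},
    {"soil_type": "loamy", "moisture": 42.0, "temperature": 28.0},
    {"soil_type": "clay",  "moisture": 36.0, "temperature": 22.0},
    {"soil_type": "sandy", "moisture": 33.0, "temperature": 29.0},
]

def one_hot(records, column):
    """Replace a string column with one 0/1 dummy column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        new = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new[f"{column}_{c}"] = 1.0 if r[column] == c else 0.0
        encoded.append(new)
    return encoded

def drop_nulls(records):
    """Remove every record that contains a missing (None) value."""
    return [r for r in records if all(v is not None for v in r.values())]

def standardize(records):
    """Scale each column to zero mean and unit standard deviation."""
    scaled = [dict(r) for r in records]
    for col in records[0]:
        values = [r[col] for r in records]
        mu, sigma = mean(values), pstdev(values)
        sigma = sigma or 1.0  # constant column: avoid division by zero
        for r in scaled:
            r[col] = (r[col] - mu) / sigma
    return scaled

encoded = one_hot(rows, "soil_type")   # 3 columns become 5
cleaned = drop_nulls(encoded)          # the row with a missing reading is removed
scaled = standardize(cleaned)          # all columns share a common scale
print(len(rows[0]), "->", len(scaled[0]), "columns;", len(scaled), "rows kept")
```

Note how the single `soil_type` column expands into one dummy column per category, which is the increase in column count mentioned in the text.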

#### 3.**Data Splitting**

The crop prediction module requires a supervised classifier such as a support vector machine (SVM). To train the supervised classifier, the available dataset must be split into a training dataset and a testing dataset. In the proposed work the dataset is divided in an 80:20 ratio: the algorithm is trained using the training data and tested using the test data to find the accuracy.
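As a rough sketch of the 80:20 split, the snippet below shuffles the row indices with a fixed seed and holds out the last 20% for testing. The function name `train_test_split_80_20`, the seed, and the placeholder data are assumptions made for illustration. Only the 200-row size and the 80:20 ratio come from the text.

```python
import random

def train_test_split_80_20(samples, labels, seed=0):
    """Shuffle reproducibly, then use 80% for training and 20% for testing."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = (len(indices) * 80) // 100
    train_idx, test_idx = indices[:cut], indices[cut:]
    X_train = [samples[i] for i in train_idx]
    X_test = [samples[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

# Placeholder data with 200 rows, standing in for the pre-processed dataset.
X = [[float(i), float(i % 7)] for i in range(200)]
y = [i % 3 for i in range(200)]

X_train, X_test, y_train, y_test = train_test_split_80_20(X, y)
print(len(X_train), len(X_test))  # 160 40
```

Shuffling before the cut keeps the test set from being just the last rows collected, and fixing the seed makes the reported accuracy reproducible.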

#### 4.**Fitting algorithm**

Training is carried out by loading the cpdata.csv file, separating the features from the labels, and applying the fit module, i.e., the SVM (support vector machine) algorithm. SVM is used for classification and follows the supervised learning approach. It works better when the dataset consists of the main features and labels. As SVM is primarily used for binary classification, the binary classifier searches for the hyperplane that separates the positive and negative samples. A multi-class SVM is used when the classification has to be made among many classes. Various strategies are available in which a multiclass classifier is built by combining several binary classifiers, and it is then used to train an SVM classifier at the root node of the decision tree using the soil data.
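To illustrate how a multiclass classifier can be built from several binary SVMs, the sketch below uses the one-vs-rest strategy with linear SVMs trained by batch subgradient descent on the hinge loss. This is only one of the possible strategies and is not necessarily the one used in this work. The contents of cpdata.csv are not shown here, so the feature values, crop names, and hyperparameters (`lam`, `lr`, `epochs`) are assumptions.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_binary_svm(X, y, lam=0.01, lr=0.05, epochs=2000):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent.

    y holds +1 / -1 labels. Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def train_one_vs_rest(X, labels):
    """Train one binary SVM per class: that class (+1) against all others (-1)."""
    models = {}
    for cls in sorted(set(labels)):
        y = [1 if lab == cls else -1 for lab in labels]
        models[cls] = train_binary_svm(X, y)
    return models

def predict(models, x):
    """Return the class whose binary SVM gives the highest decision score."""
    return max(models, key=lambda cls: dot(models[cls][0], x) + models[cls][1])

# Hypothetical, already-scaled features and crop labels (illustrative only).
X = [[-2.0, -1.0], [-2.2, -0.8], [-1.8, -1.2],
     [2.0, -1.0], [2.2, -0.8], [1.8, -1.2],
     [0.0, 2.0], [0.2, 2.2], [-0.2, 1.8]]
labels = ["rice"] * 3 + ["maize"] * 3 + ["wheat"] * 3

models = train_one_vs_rest(X, labels)
print([predict(models, x) for x in X])
```

Each binary classifier learns a hyperplane separating one crop from the rest, and the final prediction is the crop with the highest score.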
