**3.3 Model parameters**

The proposed model has two configurable parameters: the migration threshold and the DC population size. The DC population size was set to 10 artificial DCs. The impact of migration threshold selection on the DC population is still an open research question; as noted in [9], a high migration threshold degrades DCA performance. Migration threshold selection was performed by analyzing the input signals of the tested datasets; currently, this process must be adjusted depending on the dataset and the selected features. The migration threshold was drawn from a uniform distribution over the closed real interval [0, 0.001]. This range was chosen so that at least one cell migrates per iteration, which avoids oversampling during antigen signature generation in the detection phase. The classification phase was performed by building a Decision Tree (DT) with pruning using the *fitctree* MATLAB model builder.
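As a rough illustration of this setup, the per-cell threshold draw and the migration condition can be sketched as follows. This is a minimal Python sketch rather than the chapter's MATLAB implementation; the names `init_thresholds` and `migrated`, and the CSM accumulation values, are illustrative assumptions, not taken from the chapter.

```python
import random

DC_POPULATION = 10       # population fixed at 10 artificial DCs
LOW, HIGH = 0.0, 0.001   # closed interval for the uniform threshold draw

def init_thresholds(n=DC_POPULATION, seed=42):
    """Draw one migration threshold per DC from U[0, 0.001]."""
    rng = random.Random(seed)
    return [rng.uniform(LOW, HIGH) for _ in range(n)]

def migrated(csm_values, thresholds):
    """A DC migrates once its accumulated CSM signal reaches its threshold."""
    return [csm >= t for csm, t in zip(csm_values, thresholds)]

thresholds = init_thresholds()
# With thresholds this small, almost any positive CSM accumulation triggers
# migration, matching the stated goal of at least one migrated cell per iteration.
flags = migrated([0.002] * DC_POPULATION, thresholds)
assert any(flags)
```

Because the interval's upper bound is so small, even modest signal accumulation guarantees migration, which is the design intent stated above.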

The parameters used to build the DT are detailed in **Table 4**. The Decision Tree contains only two response classes, namely "Normal" and "Anomalous". Antigen categories are defined as combinations of categorical features from each dataset: *flag* and *attack category* for the NSL-KDD dataset, and *protocol*, *service*, and *attack category* for the UNSW-NB15 dataset. An antigen category is labeled Anomalous if it corresponds to an attack of any kind. This process aims to increase antigen signature diversity, since providing only two antigen categories to the DCA detection phase would produce a two-observation classification task; it also reduces the performance penalty for misclassification. One predictor is used as input for the classifier, namely *Kα*, as detailed in Eq. (6). An exact search is used for predictor splits, and the misclassification cost is set to one. No maximum depth is set for the training process. For each split node, a maximum of 10 category levels is set, so as not to increase computational complexity considerably. Leaf merging is also performed: all leaves originating from the same parent are merged if their total risk value is greater than or equal to the parent's risk value. The minimum number of branch nodes is set to 10. Prior probabilities are set as empirical, i.e., class probabilities are obtained from class frequencies in the class labels.
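The antigen-category construction and binary labeling described above can be sketched as follows. This is a hedged Python illustration only; the chapter's implementation uses MATLAB's *fitctree*, and the record field names (`flag`, `attack_category`, `protocol`, `service`) are assumed placeholders for the datasets' categorical columns.

```python
def antigen_category(record, dataset):
    """Combine categorical features into an antigen category string.

    NSL-KDD uses flag + attack category; UNSW-NB15 uses
    protocol + service + attack category, as described in the text.
    """
    if dataset == "NSL-KDD":
        parts = (record["flag"], record["attack_category"])
    else:  # UNSW-NB15
        parts = (record["protocol"], record["service"], record["attack_category"])
    return "|".join(parts)

def binary_label(record):
    """A record is Anomalous if it corresponds to an attack of any kind."""
    return "Normal" if record["attack_category"] == "normal" else "Anomalous"

rec = {"flag": "SF", "attack_category": "dos"}
assert antigen_category(rec, "NSL-KDD") == "SF|dos"
assert binary_label(rec) == "Anomalous"
```

Combining several categorical features into one category string is what increases antigen signature diversity beyond the two output classes.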


**Table 4.** *DT model parameters.*


**Table 5.** *Experimental results.*

**3.4 Numerical results**

The performance of the tested model is summarized in **Table 5**; the testing performance for each dataset is highlighted in bold. The model was trained using the full training set of each dataset, namely the UNSW-NB15 training set and KDDTrain+. Classification performance was then evaluated on the UNSW-NB15 testing set and KDDTest+, and the testing performance is analyzed and used for comparisons. Precision indicates the proportion of correctly classified anomalies: the UNSW-NB15 dataset achieved 95.01% precision, while the NSL-KDD dataset achieved 88.91%. Specificity (or true negative rate) was 94.24% for the UNSW-NB15 dataset, whereas the NSL-KDD dataset achieved 87.11%.

Additionally, the UNSW-NB15 and NSL-KDD datasets showed 99.98% and 94.85% sensitivity, respectively. Higher sensitivity indicates that the algorithm excels at identifying anomalies, whereas higher specificity denotes that normal behavior is correctly identified. Accuracy indicates the overall proportion of correct assessments: the UNSW-NB15 dataset achieved 97.25%, whereas the NSL-KDD dataset achieved 93.28%. Computation time was measured in seconds, with training and testing times aggregated to give the total algorithm runtime. The UNSW-NB15 training time was 183.95 seconds and the testing time 66.95 seconds, whereas the NSL-KDD training time was 126.42 seconds and the testing time 19.78 seconds.
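The four metrics reported above follow the standard confusion-matrix definitions, which can be sketched as follows (the counts used here are illustrative only, not those underlying Table 5):

```python
def metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "precision":   tp / (tp + fp),   # proportion of predicted anomalies that are real
        "sensitivity": tp / (tp + fn),   # true positive rate: anomalies identified
        "specificity": tn / (tn + fp),   # true negative rate: normal traffic identified
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),  # overall correct assessments
    }

m = metrics(tp=90, fp=5, tn=95, fn=10)
assert abs(m["sensitivity"] - 0.9) < 1e-9
assert abs(m["specificity"] - 0.95) < 1e-9
```

Under these definitions, the high sensitivity reported for UNSW-NB15 (99.98%) means almost no anomalies were missed (very few false negatives), while its specificity (94.24%) reflects the false-positive rate on normal traffic.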

Contemporary models based on the DCA are compared in **Table 6**; the proposed method's results are highlighted in bold. The proposed model surpassed the other approaches, achieving 97.25% accuracy. The stochastic DCA [5] was tested on the UNSW-NB15 dataset in [21]; the two proposals included in the comparison achieved between 60.4 and 78.04% accuracy. The deterministic DCA [8] achieved 90.14% accuracy, and the fuzzy inference DCA [21] achieved 89.30% accuracy. The deterministic DCA without signal categorization achieved the second-best result, with 90.23% accuracy on the UNSW-NB15 dataset. For the NSL-KDD dataset, model accuracy was compared with two other models. The deterministic DCA with antigen multiplication [34] achieved the best result with 98.6% accuracy, whereas the same model without antigen multiplication achieved 96.1% accuracy. The proposed approach achieved 93.28% accuracy.

**Table 6.** *DCA accuracy comparison.*

*Network Intrusion Detection Using Dendritic Cells and Danger Theory. DOI: http://dx.doi.org/10.5772/intechopen.99973*

The accuracy of contemporary methods for binary classification is presented in **Table 7**. Accuracies for the NSL-KDD and UNSW-NB15 datasets were compared, with the proposed model's results highlighted in bold. A comparison with state-of-the-art machine learning-based models was performed. The best accuracy for the NSL-KDD dataset was obtained by a K-Nearest Neighbors classifier [35], with 94.92%. The second-best result was achieved by the proposed model, with 93.28% accuracy, followed by a deep-learning Long Short-Term Memory model [36] with 86.99%. Other compared methods include a Random Forest classifier [36] with 85.44% accuracy and an Artificial Neural Network with 85.31% accuracy.
