Anomaly Detection vs Intrusion Detection

#### **Chapter 5**

## Anomaly Detection through Adaptive DASO Optimization Techniques

*Surendra Bhosale, Achala Deshmukh, Bhushan Deore and Parag Bhosale*

#### **Abstract**

An intrusion detection system (IDS) detects and prevents network attacks. Because of the complicated network environment, an IDS merges a large number of samples into a small number of normal samples, which leaves too few samples for identification and training and leads to a high false detection rate. External malicious attacks also damage conventional IDS, which affects network activity. Adaptive Dolphin Atom Search Optimization (adaptive DASO) is proposed to overcome these problems. The work therefore aims to create an adaptive optimization-based network intrusion detection system that tunes the classifier for accurate prediction. The model selects features and detects intrusions. In the feature selection module, mutual information selects the features for further processing. A deep recurrent neural network (deep RNN) detects intrusions, and the novel adaptive DASO technique trains the deep RNN. Adaptive DASO combines the DASO algorithm with adaptive concepts, where DASO is the integration of dolphin echolocation (DE) with atom search optimization (ASO). Thus, intrusions are detected using the adaptive DASO-based deep RNN. The developed adaptive DASO approach attains better detection performance in terms of specificity, accuracy, and sensitivity.

**Keywords:** intrusion detection system (IDS), recurrent neural network (RNN), machine learning, anomaly detection, optimization algorithm

#### **1. Introduction**

Anomaly detection is the study of abnormal behavior in data, events, or experimental observations. AI plays an important role across a wide range of applications, including statistical analysis, video surveillance, computer vision, medical image analysis, neuroscience studies, financial fraud detection, law enforcement, and cyber security.

Data analytics based on novel ML algorithms opens a new field of research into complex data processing solutions that can handle huge datasets. Today, statisticians, programmers, engineers from multiple disciplines, and medical professionals are brought onto a common platform to reach concrete and fast solutions.

Real-time monitoring and dynamic data processing aim to find the outliers in a dataset, that is, the observations that do not conform to its normal behavior. Cybercrimes are increasing day by day, the shortcomings of security protocols and algorithms are being exposed, and internet-based businesses worldwide are affected [1].

Anomaly detection can be either supervised or unsupervised, depending on whether the dataset contains labeled or unlabeled data points. An anomaly detection system first trains a machine learning algorithm on a dataset to learn the normal patterns of behavior; it can then identify the data points that deviate significantly from that normal behavior.
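As a minimal illustration of the "learn normal behavior, then flag deviations" idea, the sketch below fits a mean and standard deviation on values assumed to be normal and flags any point more than three standard deviations away. The function names, the sample values, and the threshold of 3 are illustrative assumptions and are not part of the method developed in this chapter.

```python
import statistics

def fit_normal_profile(normal_values):
    """Learn a simple profile (mean, standard deviation) from data assumed normal."""
    return statistics.mean(normal_values), statistics.stdev(normal_values)

def is_anomaly(value, profile, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mean, std = profile
    return abs(value - mean) > threshold * std

# Hypothetical sensor readings observed under normal operation.
normal = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7]
profile = fit_normal_profile(normal)

print(is_anomaly(10.1, profile))  # a reading close to the learned profile
print(is_anomaly(25.0, profile))  # a reading far from the learned profile
```

A learned classifier such as the deep RNN discussed later replaces this simple statistical profile, but the two-phase structure (fit on normal data, then score new points) is the same.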

The role of AI in anomaly detection is therefore crucial. It opens new avenues of research in data analytics, and the demand for accurate and efficient anomaly detection algorithms will increase substantially [2].

Anomaly identification is important for cleaning up data, making sure it is secure, and building strong AI systems. This talk will talk about recent work in our group on (a) benchmarking existing algorithms, (b) building a theoretical understanding of how they work, (c) explaining anomaly "alarms" to a data analyst, and (d) re-ranking potential anomalies in response to analyst feedback. The talk will then talk about two applications: (a) identifying and diagnosing sensor failures in weather networks and (b) open category detection in supervised learning.

Anomaly detection is the process of finding events or trends that do not fit with what would normally happen. This is important for tasks such as predictive maintenance, but it can be hard to do through inspection alone. Anomaly detection methods that use machine learning and deep learning can find patterns in time series or image data that would be hard to find any other way, for example unusual readings in sensor data from hardware.

#### **2. Major research issues and areas in anomaly detection**

In spite of several types of research carried out for anomaly detection, network security is encountered with several challenges some of them are as follows [3]: the system developed for anomaly detection in disabled many security features and automatic backup settings, erases stored data, and opens associations to get commands from a remote PC.

The multivariate statistical intrusion detection schemes in have difficulties in estimating distributions for high-dimensional data. The problems rely on the fact that it does not identify undesirable behavior, and thus the FAR can be high. The problems in include the reliance on a well-defined security policy, which may be absent, and its inability to detect intrusions that have not yet been made to be known to the IDS [4]. Some of the common challenges encountered by the network IDS are listed below:


#### **3. Recent novel contributions in anomaly detection area**

Research work over the last decade falls into four broad areas, *viz.* auto encoder-based techniques, novel machine learning-based techniques (supervised and unsupervised), hybrid techniques in IDS, and deep learning-based approaches.

#### **3.1 Auto encoder-based techniques**

This section reviews the auto encoder-based techniques used in intrusion detection systems.

Shone et al. [1] introduced a novel deep learning model for enabling NIDS operation within modern networks. The method combined shallow and deep learning and was able to analyze a broad array of network traffic correctly. Specifically, it combined the power of the nonsymmetric deep auto encoder (NDAE) with the accuracy and speed of the random forest algorithm. The method was evaluated in practice and attained promising outcomes: it provided high levels of precision, recall, and accuracy and required less training time.

Mighan et al. [5] established a novel scalable IDS based on deep learning, focusing mainly on network-based IDS. The method used Apache Spark, a big data processing tool, for quick identification of malicious traffic. The system used a stacked auto encoder (SAE) network followed by an SVM classifier, with the SAE performing the underlying feature extraction. The methodology had four stages: data preprocessing, decision making, latent feature extraction, and attack classification. The data preprocessing stage was responsible for preparing the data for feature extraction.

Andresini et al. [6] established auto encoder-based deep metric learning for network intrusion detection; the proposed strategy evaluates the flow-based characteristics of network traffic data. The intrusion detection model was learned with a deep metric learning approach that combined triplet networks with auto encoders. In the training stage, two distinct auto encoders were trained on historical normal network flows and on attacks, respectively. A triplet network was then trained to learn the embedding of the feature vector representation of the network flows.

#### **3.2 Novel machine learning-based techniques**

This section reviews the novel machine learning-based techniques used in intrusion detection.

Kaja et al. [7] proposed a new two-stage intelligent IDS for detecting malicious attacks and protecting systems from them. The method performed four preprocessing steps. First, the variance of the feature values was calculated to measure the variation among the features in the dataset. Second, correlated features were estimated and eliminated to avoid overfitting. Third, least square regression error (LSQE) was used to maximize dimensionality reduction and minimize the similarity between features. Finally, the maximal information compression index (MICI) was used to analyze the relationships between features.

Jin et al. [8] implemented a Bayes system based on the light gradient boosting machine (light GBM) and a parallel intrusion detection mechanism. The developed IDS, called Swift IDS, was able both to analyze the enormous traffic data of high-speed networks in a timely manner and to maintain reasonable detection performance. Swift IDS achieved these goals through two approaches. In one approach, light GBM was adopted as the intrusion detection algorithm to handle the enormous traffic data.

Sarker et al. [9] presented a machine learning-based security model named the intrusion detection tree (IntruDTree). In this approach, the security features were first ranked according to their significance in modeling. A tree-based comprehensive intrusion detection model was then constructed from the selected significant features. After the entire tree had been built from the security data, test data was used to validate the model.

Injadat et al. [10] developed a multistage optimized machine learning framework for network intrusion detection. The effect of oversampling methods on the training sample size of the models was studied first, and the smallest suitable training size for successful intrusion detection was determined. The article proposes a multistage optimized machine learning-based NIDS framework that reduces computational complexity while maintaining its detection performance. The data preprocessing stage includes Z-score normalization of the data and minority class oversampling with the synthetic minority oversampling technique (SMOTE).

Bertoli et al. [11] presented an end-to-end framework for machine learning-based NIDS. The paper described the AB-TRAP architecture, which enabled the use of updated network traffic and considered operational concerns to enable the complete deployment of the solution. AB-TRAP was a framework with five steps, comprising the creation of the attack dataset, implementation of the models, the bonafide dataset, training of machine learning models, and performance evaluation of the realized model after deployment.

#### **3.3 Hybrid techniques in IDS**

This section reviews the hybrid learning-based techniques used in intrusion detection.

#### *Anomaly Detection through Adaptive DASO Optimization Techniques DOI: http://dx.doi.org/10.5772/intechopen.112421*

Jiang et al. [12] established a network intrusion detection algorithm that combined hybrid sampling with a deep hierarchical network to improve detection accuracy. The hybrid sampling had two parts. First, the one side selection (OSS) algorithm was used to eliminate noise samples in the majority class. Next, SMOTE was employed to generate minority class samples in order to reduce the class imbalance. The imbalanced data was thereby converted into balanced data for classification. A deep hierarchical network was then constructed to handle the complexity of the data features; it trained a CNN together with a bi-directional long short-term memory (Bi-LSTM) network to learn the spatial and temporal features of network traffic data.

Cavusoglu [13] implemented a new hybrid approach for intrusion detection using machine learning methods. The developed IDS used a combination of feature selection and machine learning methods to offer high-performance intrusion detection across several types of attacks. The first step of the technique was data preprocessing, after which the size of the dataset was reduced using various feature selection algorithms. Two novel methods were introduced for feature selection. A layered architecture was then built by choosing suitable machine learning algorithms according to the type of attack. The approach achieved low false positive rates and high accuracy for every type of attack.

#### **3.4 Deep learning-based approaches**

This section reviews the deep learning-based techniques used in intrusion detection. It is further divided into three parts as follows:

#### *3.4.1 CNN-based techniques*

This section reviews the CNN-based methods used in intrusion detection.

Tang et al. [14] developed a deep learning method for network intrusion detection in software-defined networking (SDN). In the proposed framework, the NIDS module was implemented in the controller. The SDN controller monitored all OpenFlow switches and requested all network information when required; the NIDS module took advantage of this global network view to detect intrusions. After a fixed time window, the controller sent a request message to all OpenFlow switches to request the network information.

Wu et al. [15] introduced a novel intrusion detection model for large-scale networks using a CNN. The CNN was used to select traffic features from the raw dataset automatically, and the cost function weight coefficient of each class was set according to its number of samples to address the problem of the imbalanced dataset. Standard datasets were used to assess the performance of the developed CNN model. The raw traffic vector format was converted into image format to reduce the computational cost. The method reduced the false alarm rate and the computational cost and improved accuracy.

#### *3.4.2 DNN-based method*

This section reviews the DNN-based methods used in intrusion detection.

Vinayakumar et al. [16] presented a deep learning approach for an intelligent IDS. The method used a deep neural network (DNN) to design an effective and flexible IDS for detecting and classifying unforeseen and unknown cyber-attacks. The abstract, high-dimensional feature representation of the IDS data is learned by passing it through several hidden layers. The approach adopted a multilayer perceptron (MLP) model, a type of feed-forward neural network (FFN) with three or more layers: one input layer, one or more hidden layers, and one output layer, where every layer has a number of units, or neurons.

Gao et al. [17] explored an adaptive ensemble machine learning model for intrusion detection, in which the advantages of each algorithm for different types of data were integrated and optimal results were achieved through ensemble learning. The benefit of ensemble learning is that it combines the predictions of several base estimators to improve generalizability and robustness over a single estimator. Several common algorithms, such as DNN, decision tree, and random forest, were used to train the model. In addition, the multi-tree and adaptive voting algorithms were developed to improve the intrusion detection results. Comparison with various existing methods showed that this method was superior to many earlier research results and had good application prospects.

#### *3.4.3 Other techniques*

This section reviews the other deep learning-based techniques used in intrusion detection.

Otoum et al. [18] developed deep learning-based intrusion detection for the monitoring of critical infrastructures through sensor networks. The main aim of the research was to examine the potential of deep learning as a substitute for IDS based on robust machine learning. A restricted Boltzmann-based clustered IDS (RBC-IDS) model was presented as a deep learning solution for detecting intrusions in critical network applications based on wireless sensor networks.

Yang et al. [19] introduced a joint wireless network intrusion detection model based on deep learning. The approach used a deep belief network (DBN) as the feature extraction layer and a support vector machine (SVM) as the classification layer. In the DBN layer, the error backpropagation algorithm and the contrastive divergence algorithm were used to reduce the dimensionality of the data and extract features, which helped the SVM improve its ability to classify high-dimensional data. In contrast to the earlier forward propagation, the approach adjusted the weights of the multiple restricted Boltzmann machines (RBMs) with the backpropagation algorithm. The detection model was trained and combined with the SVM to further improve intrusion detection. The approach effectively improved the classification performance of the DBN, and its precision, recall, and accuracy were superior to those of other methods.

Wu and Guo [20] introduced a hierarchical deep neural network for network intrusion detection named LuNet, which was made up of multiple levels of combined recurrent and convolutional sub-nets. At each level, the input data was learned by both RNN and CNN nets. The granularity of learning became increasingly detailed as learning progressed from the first level to the last. In this way, the synergy of RNN and CNN was used effectively for both temporal and spatial feature extraction. Supported by an in-depth analysis and discussion of the LuNet design, the method attained high learning efficiency.


Khan et al. [21] presented a novel two-stage deep learning (TSDL) model, based on a stacked auto-encoder with a soft-max classifier, for effective network intrusion detection. The model involves two decision stages and differs from earlier models in that it comprises two stages of feature representation. The first stage learned a feature representation for classifying network traffic as normal or abnormal with a probability score value.

Sohi et al. [22] presented a recurrent neural network-based IDS, named RNNIDS, to capture complex patterns in attacks and generate similar ones. As the first step of the approach, the generation of mutants of a malware was demonstrated using a new method. The approach relies on the fact that an RNN can learn and extract an unknown pattern. On this basis, new and unseen sequences were generated.

Zeng et al. [23] proposed a deep learning-based framework for encrypted network traffic classification and intrusion detection. The developed architecture was named Deep-Full-Range (DFR) and used three deep learning algorithms: CNN, LSTM, and SAE. The CNN learned features of the raw traffic from the spatial range, the LSTM learned features from the time-related range, and the SAE extracted features from the coding characteristics.

#### **4. Proposed adaptive dolphin atom search optimization-based DRNN for network intrusion detection system**

A main challenge in network security is the development of efficient and robust NIDS. Despite significant developments in NIDS technology, the majority of solutions still operate with less capable signature-based techniques rather than anomaly detection approaches. The current situation has reached a point where reliance on such techniques leads to ineffective and inaccurate analysis. These challenges motivate the creation of a widely accepted anomaly detection (AD) technique capable of overcoming the limitations caused by the ongoing changes in modern networks. A NIDS is composed of data gathering, attribute extraction, attribute selection, intrusion detection, and report generation. Every component of an IDS has its own function and impact, which often go unnoticed. Three major limitations exist in IDS, and the contribution of this ID system relates to them. The first is the enormous quantity of network data, which can be handled by developing techniques that evaluate the data efficiently. The second is the granularity and depth of monitoring required to boost effectiveness. The third is the number of distinct protocols and the enormous quantity of data communicated through traditional networks, which introduce high levels of intricacy and complexity. This increases the difficulty of accurately assessing the scope for potential exploitation or zero-day attacks [24].

Machine learning (ML) techniques are widely adopted for recognizing distinct categories of attacks, and they help the system administrator take the respective measures for preventing intrusions. Nevertheless, the majority of conventional ML techniques belong to shallow learning, which cannot successfully handle the problem of enormous intrusion data [25]. These limitations arise from the characteristics of real systems in the application environment. In multi-class classification, accuracy diminishes as the dataset grows. Additionally, shallow learning is unsuited to the knowledge-based analysis and forecasting requirements of high-dimensional learning with enormous data. In contrast, deep learners are able to extract better representations from the data for generating better learning models. Consequently, IDS has experienced rapid improvement after falling into a relatively slow period [26]. However, the majority of traditional ML techniques belong to shallow learning and regularly emphasize feature engineering and selection [27]. Deep learning (DL) techniques have improved rapidly in the recent period, which gives an opportunity for developing new IDS. The recurrent neural network (RNN) plays a significant role in the advancement of DL techniques in the domains of language processing, translation, image description, human behavior identification, and semantic understanding [6]. DL approaches have the potential to identify better representations from the data for creating much better models, which motivates the use of the RNN here [28].

The main aim of the research is to detect the intrusions present in the network using an adaptive DASO-based deep RNN. Initially, the input data is fed into the feature selection module, where the relevant features are selected using mutual information. The selected features are then forwarded to the ID module, where detection is performed by the deep RNN. The deep RNN is trained using the adaptive DASO algorithm to predict whether the input corresponds to an intrusion or not.

*Proposed model:* The main contribution of the research is the development of an adaptive DASO-based deep RNN for intrusion detection. Adaptive DASO is used to train the DRNN to predict whether the network data is intrusive or not.

#### **4.1 Developed adaptive DASO-based DRNN for NIDS**

NIDS is an efficient method for protecting computer networks from malicious threats and attacks. Different ID methods have been adopted to predict the behavior of malicious activities, but accurate detection of the intrusions present in a network system remains a major challenge. To deal with this challenge, an effective optimization-based method, named adaptive DASO-based DRNN, is developed for identifying intrusion behavior in the network. The developed scheme carries out ID in two stages: feature selection and intrusion detection. Initially, the input data is gathered from the ID dataset and forwarded to the feature selection step, where the relevant features are selected using mutual information. Once the suitable features are selected, the intrusions are detected using the DRNN, whose weights are trained using the developed adaptive DASO algorithm for predicting malicious behavior. The adaptive DASO algorithm is designed by adding an adaptive concept to the integration of DE and ASO. **Figure 1** shows the schematic illustration of the developed model.

#### **4.2 Get the input data**

Let the database be *F*, containing *x* network intrusion data items *D*, which is depicted as,

$$F = \left\{ D_1, D_2, \dots, D_p, \dots, D_x \right\} \tag{1}$$

where, *D* depicts the intrusion data, *F* indicates the database, and *Dp* denotes the intrusion data located at the *pth* index. From the intrusion dataset, the input intrusion data *Dp* is considered and is passed to the feature selection module for performing the ID process.

**Figure 1.** *Schematic illustration of developed model for ID.*

#### **4.3 Selection of features through mutual information**

The input data *Dp* is gathered from the database and fed to the feature selection stage, where the important features are extracted to reduce the dimensionality of the data. Here, the feature selection module is modeled using the theory of mutual information [29], which is employed to overcome the curse of dimensionality in the prediction of malicious behavior. The motive of feature selection is to extract the relevant features suitable for identifying the behavior of intrusions. Mutual information establishes the relation between the class label and the features, which are sampled simultaneously, for predicting the relevant features. According to information theory [30], the mutual information between two variables is zero if and only if the two variables are statistically independent. The mutual information between a feature *S* and the class label *T* is evaluated from their joint distribution as,

$$L(S, T) = \sum_{s \in S} \sum_{t \in T} L(s, t) \log \frac{L(s, t)}{L(s)\,L(t)} \tag{2}$$

$L(S)$ and $L(T)$ depict the marginal distributions of *S* and *T*, formed through marginalization. Here, *S* shows the features, and *T* shows the class labels. Finally, the features chosen by the mutual information theory are represented as *S* and expressed by,

$$S = \{S_1, S_2, \dots, S_m, \dots, S_n\} \tag{3}$$

Here, $n$ indicates the total count of features, and $S_m$ depicts the $m$th feature. The output attained from the mutual information theory can be either normal or abnormal behavior, which is considered based on a threshold value for choosing the features [31]. The relevant features selected using mutual information are denoted as *S* with the dimension of $[1 \times n]$. Furthermore, the features chosen by feature selection are fed to the input of the deep RNN for performing the ID process.
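The following is a minimal sketch of threshold-based feature selection with Eq. (2), assuming discrete feature values and empirical probabilities estimated from counts. The function names, the toy columns, and the threshold of 0.1 are hypothetical choices for illustration; the chapter does not specify how the threshold is set.

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """Empirical mutual information between a discrete feature and the class label."""
    n = len(labels)
    joint = Counter(zip(feature, labels))   # counts for L(s, t)
    count_s = Counter(feature)              # counts for L(s)
    count_t = Counter(labels)               # counts for L(t)
    mi = 0.0
    for (s, t), count in joint.items():
        p_st = count / n
        mi += p_st * math.log(p_st / ((count_s[s] / n) * (count_t[t] / n)))
    return mi

def select_features(columns, labels, threshold):
    """Keep the features whose mutual information with the label exceeds the threshold."""
    return [name for name, values in columns.items()
            if mutual_information(values, labels) > threshold]

# Toy data: 0 = normal, 1 = intrusion (hypothetical values).
labels = [0, 0, 1, 1, 0, 1, 0, 1]
columns = {
    "flag":  [0, 0, 1, 1, 0, 1, 0, 1],  # tracks the label exactly
    "noise": [0, 1, 0, 1, 0, 1, 1, 0],  # statistically independent of the label
}
selected = select_features(columns, labels, threshold=0.1)
print(selected)
```

The independent column scores zero, in line with the property that mutual information vanishes exactly when the two variables are independent, so only the informative column survives the threshold.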

#### **4.4 ID using developed adaptive model**

NIDS is performed using the developed adaptive model. The developed adaptive DASO is constructed by combining DE and ASO with an adaptive concept. The deep RNN classifier takes the features *S* obtained from the feature selection module as input and performs the intrusion detection process through the hidden layers of the neural network. Furthermore, the developed adaptive DASO is employed for training the weights of the classifier to achieve optimal performance [32].

#### *4.4.1 Structure of DRNN*

The deep RNN works on the features produced by the feature selection module. It has three levels: the input layer, the hidden layer, and the output layer. In the network design, the input layer is at the top, the output layer is at the bottom, and the hidden layer is in the middle. The output pattern of one layer is fed into the next layer, and so on. The recurrent connection is made only between hidden layers. The deep RNN classifier is advantageous because it takes less time to learn the data. The system design of the deep RNN is depicted in **Figure 2**.

The DRNN classifier is organized by taking the input vector of the $i$th layer at the $j$th time as $S^{(i,j)} = \left\{ S_1^{(i,j)}, S_2^{(i,j)}, S_3^{(i,j)}, \dots, S_h^{(i,j)}, \dots, S_n^{(i,j)} \right\}$ and the output vector of the $i$th layer at the $j$th time as $R^{(i,j)} = \left\{ R_1^{(i,j)}, R_2^{(i,j)}, R_3^{(i,j)}, \dots, R_h^{(i,j)}, \dots, R_n^{(i,j)} \right\}$, respectively. Here, $h$ represents an arbitrary unit number of the $i$th layer, and $n$ specifies the total count of units of the $i$th layer. Moreover, the arbitrary unit number and the total number of units of the $(i-1)$th layer are indicated as $x$ and $q$, respectively. The elements of the input vector are expressed as,

$$S_h^{(i,j)} = \sum_{x=1}^{q} r_{hx}^{(i)} R_x^{(i-1,j)} + \sum_{h'=1}^{n} u_{hh'}^{(i)} R_{h'}^{(i,j-1)} \tag{4}$$

where, $r_{hx}^{(i)}$ and $u_{hh'}^{(i)}$ are the elements of $G^{(i)}$ and $g^{(i)}$. An arbitrary unit number of the $i$th layer is represented as $h'$. The elements of the output vector of the $i$th layer are expressed as,

$$R_h^{(i,j)} = \chi^{(i)}\left(S_h^{(i,j)}\right) \tag{5}$$


**Figure 2.** *System design of deep RNN classifier.*

where, $\chi^{(i)}$ depicts the activation function. Written in vector form, the output of the $i$th layer becomes,

$$R^{(i,j)} = \chi^{(i)}\left(G^{(i)} R^{(i-1,j)} + g^{(i)} R^{(i,j-1)}\right) \tag{6}$$

Here, $R^{(i,j)}$ indicates the output of the classifier.
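The recurrence in Eqs. (4) to (6) can be sketched for a single recurrent layer as follows. This is a plain-Python illustration under stated assumptions: `tanh` is assumed as the activation function, the previous-time output is initialized to zero, and the weight values and helper names are hypothetical.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def layer_step(G, g, below, previous, activation=math.tanh):
    """One application of Eq. (6): output of layer i at time j.

    below    -- output of layer i-1 at time j      (length q)
    previous -- output of layer i   at time j-1    (length n)
    """
    feedforward = matvec(G, below)      # first sum of Eq. (4)
    recurrent = matvec(g, previous)     # second sum of Eq. (4)
    return [activation(a + b) for a, b in zip(feedforward, recurrent)]

def run_layer(G, g, inputs):
    """Run the layer over a sequence, starting from an all-zero previous output."""
    state = [0.0] * len(G)
    outputs = []
    for below in inputs:
        state = layer_step(G, g, below, state)
        outputs.append(state)
    return outputs

# Hypothetical weights: n = 2 units in this layer, q = 3 units in the layer below.
G = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
g = [[0.1, 0.0], [0.0, 0.1]]
sequence = [[1.0, 0.0, 0.5], [0.2, 0.4, 0.6]]
outputs = run_layer(G, g, sequence)
print(outputs)
```

Stacking several such layers, with each layer's output sequence used as the `inputs` of the next, gives the deep structure; the entries of `G` and `g` are the quantities that the optimizer tunes.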

#### *4.4.2 Training process of Deep RNN using adaptive DASO algorithm*

The DRNN classifier is trained using the developed adaptive DASO, which is formed by adding an adaptive concept to the integration of DE and ASO. The ASO method is based on the attraction and repulsion behavior of atoms: every atom interacts with other atoms through attraction and repels premature and over-concentrated atoms through repulsion. The DE method, on the other hand, is introduced for enhancing the security of detection using the behavior of dolphins. DE investigates the large space of candidate solutions and is performed until the global solution is achieved. The combination of DE and ASO, called the DASO technique, offers the best solution to optimization problems; however, it consumes more computational time. Hence, in this research an adaptive concept is added to the DASO method to reduce the computational time.

*Solution encoding:* The developed optimization is employed to estimate the optimal solution and reduce the error rate of the NIDS based on the fitness measure. The implementation steps of the developed adaptive model are summarized below:

#### *4.4.2.1 Population initialization*

Let *υ* be the number of atoms; the position of the *d*th atom is depicted as,

$$I\_d = \begin{bmatrix} I\_d^1, \dots, I\_d^d \end{bmatrix}; d = [\mathbf{1}, \dots, l] \tag{7}$$

where *I<sub>d</sub><sup>a</sup>* denotes the *a*th position component of the *d*th atom.

#### *4.4.2.2 Fitness function*

The fitness function is evaluated by estimating the difference between the predicted output and the classifier output, and the solution with the lowest error value is selected as the best solution. It is expressed as,

$$
\sigma\_d = \frac{1}{\mu} \left[ \sum\_{s=1}^{\mu} \left( \varepsilon\_s - R\_s^{(i,j)} \right) \right] \tag{8}
$$

where *σ<sub>d</sub>* indicates the fitness value of the *d*th atom, *R<sub>s</sub><sup>(i,j)</sup>* depicts the classifier output, and *ε<sub>s</sub>* denotes the predicted output.
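
As a minimal sketch, the fitness of Eq. (8) can be computed as below. The sample values are hypothetical, and *μ* is assumed to be the number of samples. The optional `absolute` flag, which averages error magnitudes so that positive and negative errors cannot cancel, is an assumption and is not part of the chapter.

```python
def fitness(expected, outputs, absolute=False):
    """Average difference between expected and classifier outputs, as in Eq. (8).

    With absolute=True the error magnitudes are averaged instead
    (an assumed variant, not stated in the text).
    """
    if len(expected) != len(outputs):
        raise ValueError("expected and outputs must have the same length")
    diffs = [e - r for e, r in zip(expected, outputs)]
    if absolute:
        diffs = [abs(d) for d in diffs]
    return sum(diffs) / len(diffs)

expected = [1.0, 0.0, 1.0, 1.0]  # hypothetical target values (1 = intrusion)
outputs = [0.9, 0.2, 0.6, 1.0]   # hypothetical classifier outputs
signed_error = fitness(expected, outputs)
absolute_error = fitness(expected, outputs, absolute=True)
```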

#### *4.4.2.3 Mass computation*

The atom mass is estimated using the fitness function, and the mass of the *d*th atom at the *f*th iteration is specified as,

$$M\_d(f) = \frac{e\_d(f)}{\sum\_{s=1}^{l} e\_s(f)} \tag{9}$$

where *M<sub>d</sub>*(*f*) indicates the mass, and the term *e<sub>d</sub>*(*f*) is expressed as,

$$e\_d(f) = \exp\left(-\frac{\sigma\_d - \sigma\_{\text{best}}}{\sigma\_{\text{worst}} - \sigma\_{\text{best}}}\right) \tag{10}$$

where the terms *σ<sub>best</sub>* and *σ<sub>worst</sub>* specify the best and worst fitness values, which are given by,

$$
\sigma\_{\text{best}} = \min\_{d=1,\dots,l} \sigma\_d \tag{11}
$$

$$
\sigma\_{\text{worst}} = \max\_{d=1,\dots,l} \sigma\_d \tag{12}
$$
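
The mass computation can be sketched as follows, assuming the exponential normalization used in the original ASO formulation [26] and a minimization problem in which the lowest fitness is the best. The handling of the case where all atoms have equal fitness is an assumed convention.

```python
import math

def atom_masses(fitness_values):
    """Normalized atom masses from fitness values (lower fitness is better)."""
    best = min(fitness_values)    # best fitness, Eq. (11)
    worst = max(fitness_values)   # worst fitness, Eq. (12)
    spread = worst - best
    if spread == 0:
        # All atoms are equally fit: share the mass evenly (assumed convention).
        return [1.0 / len(fitness_values)] * len(fitness_values)
    raw = [math.exp(-(s - best) / spread) for s in fitness_values]
    total = sum(raw)
    return [e / total for e in raw]  # normalization, Eq. (9)

masses = atom_masses([0.10, 0.25, 0.40])  # hypothetical fitness values
```

Under these assumptions the best atom receives the largest mass and the masses sum to one, so heavier atoms correspond to better candidate solutions.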


#### *4.4.2.4 Evaluate neighbor*

Exploration in the initial iterations is enhanced by selecting *N* neighbors, chosen according to the fitness values of the interacting atoms. The expression for *N* is depicted as,

$$N(f) = l - (l - 2)\sqrt{\frac{f}{\alpha}}\tag{13}$$

where *α* denotes the maximum iteration.
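
A short sketch of the neighbor schedule in Eq. (13) is given below, with the denominator taken as the maximum number of iterations. Rounding down to an integer and clamping to a minimum of two neighbors are assumptions; the chapter gives only the real-valued expression.

```python
import math

def neighbour_count(iteration, max_iterations, num_atoms):
    """Number of neighbours N(f) from Eq. (13), rounded down to an integer."""
    value = num_atoms - (num_atoms - 2) * math.sqrt(iteration / max_iterations)
    # Rounding down and the lower bound of 2 are assumed conventions.
    return max(2, math.floor(value))

# Hypothetical setting: 50 atoms, 100 iterations.
schedule = [neighbour_count(f, 100, 50) for f in (1, 25, 100)]
```

The count shrinks from almost the whole population to two neighbors, so the search moves gradually from exploration to exploitation.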

#### *4.4.2.5 Calculate the total force and constraint force*

The total force is the sum of the force components exerted on the *d*th atom by its neighboring atoms, and it is given by,

$$Q\_d^a(f) = \sum\_{s \in N\_{\text{best}}} rand\_s Q\_{ds}^a(f) \tag{14}$$

where *Q<sub>d</sub><sup>a</sup>*(*f*) indicates the force, and *rand<sub>s</sub>* specifies a random number between 0 and 1. Every atom in the population is also attracted toward the best atom, and the resulting constraint force on the *d*th atom is expressed as,

$$\lambda\_d^a(f) = H(f) \left( I\_{\text{best}}^a(f) - I\_d^a(f) \right) \tag{15}$$

where *H*(*f*) indicates the Lagrangian multiplier.

#### *4.4.2.6 Estimate the acceleration*

The acceleration of the *d*th atom at the *f*th iteration is calculated as,

$$A\_d^a(f) = \frac{Q\_d^a(f)}{M\_d^a(f)} + \frac{\lambda\_d^a(f)}{M\_d^a(f)}\tag{16}$$

where *Q<sub>d</sub><sup>a</sup>*(*f*) is the total force, *λ<sub>d</sub><sup>a</sup>*(*f*) is the constraint force, *M<sub>d</sub><sup>a</sup>*(*f*) indicates the mass, and *A<sub>d</sub><sup>a</sup>*(*f*) indicates the acceleration of the *d*th atom at the *f*th iteration.

#### *4.4.2.7 Renew the velocity*

The velocity of the *d*th atom at iteration *f* + 1 is expressed as,

$$V\_d^a(f+1) = rand\_d^a V\_d^a(f) + A\_d^a(f) \tag{17}$$

where *rand<sub>d</sub><sup>a</sup>* indicates a random number, and *A<sub>d</sub><sup>a</sup>*(*f*) specifies the acceleration.
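
Eqs. (16) and (17) can be combined into one per-component update, sketched below. The force values are hypothetical inputs, and the mass is treated as a single scalar per atom, which is an assumption made for simplicity.

```python
import random

def update_velocity(velocity, total_force, constraint_force, mass, rng=random):
    """Acceleration (Eq. 16) followed by the velocity update (Eq. 17)."""
    new_velocity = []
    for v, q, lam in zip(velocity, total_force, constraint_force):
        acceleration = q / mass + lam / mass                  # Eq. (16)
        new_velocity.append(rng.random() * v + acceleration)  # Eq. (17)
    return new_velocity

# Hypothetical two-dimensional atom that starts at rest.
velocity = update_velocity(
    velocity=[0.0, 0.0],
    total_force=[0.2, -0.4],
    constraint_force=[0.1, 0.1],
    mass=0.5,
)
```

Because the random factor only scales the previous velocity, an atom that starts at rest moves with exactly the computed acceleration in the first step.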

#### *4.4.2.8 Update the atom location*

The final update equation of the DASO algorithm is given as follows.

$$
\begin{aligned}
I\_d(f+1) = {} & \frac{\omega\_{2d} M\_d(f)}{\omega\_{2d} M\_d(f) - Z e^{-\frac{20f}{\alpha}}} \Bigg[ I\_d(f) + rand\_d V\_d(f) \\
& - \psi \left(1 - \frac{f-1}{\alpha}\right)^3 e^{-\frac{20f}{\alpha}} \sum\_{l \in N\_{\text{best}}} \frac{rand\_l \left[ 2 \times \left(\varepsilon\_{ld}(f)\right)^{13} - \left(\varepsilon\_{ld}(f)\right)^{7} \right]}{M\_d(f)} \, \frac{I\_d(f) - I\_l(f)}{\left\| I\_d(f), I\_l(f) \right\|\_2} \\
& - Z e^{-\frac{20f}{\alpha}} \, \frac{I\_d(f) + W\_d(f) + \omega\_{1d} J\_d - \omega\_{1d} I\_d(f)}{\omega\_{2d} M\_d(f)} \Bigg]
\end{aligned} \tag{18}
$$

where *M<sub>d</sub>*(*f*) specifies the mass of the *d*th atom, *V<sub>d</sub>*(*f*) is the velocity, *Z* indicates the multiplier weight, *ψ* specifies the depth weight, *α* shows the maximum iteration, *W<sub>d</sub>* signifies the search space dimension, *J<sub>d</sub>* depicts the personal best solution, and *ω*<sub>1*d*</sub> and *ω*<sub>2*d*</sub> are random numbers that lie between 0 and 1.

In the update equation above, the term *ψ* is made adaptive for better intrusion detection performance. The expression for *ψ* is given by,

$$
\psi = \psi\_{\text{max}} - \frac{f(\psi\_{\text{max}} - \psi\_{\text{min}})}{\alpha} \tag{19}
$$

where *ψ* signifies the depth weight, which is made adaptive, *ψ*<sub>max</sub> and *ψ*<sub>min</sub> depict the predefined maximum and minimum values of *ψ*, and *α* signifies the maximum iteration. Algorithm 1 states the pseudocode of the developed adaptive model.
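
The adaptive depth weight of Eq. (19) is a linear schedule, sketched below. The bounds 50 and 10 are hypothetical values chosen only to show the behavior; the chapter does not report the values used.

```python
def adaptive_depth_weight(iteration, max_iterations, psi_max, psi_min):
    """Depth weight of Eq. (19): decreases linearly from psi_max to psi_min."""
    return psi_max - iteration * (psi_max - psi_min) / max_iterations

# Hypothetical bounds over 100 iterations.
start = adaptive_depth_weight(0, 100, psi_max=50.0, psi_min=10.0)
middle = adaptive_depth_weight(50, 100, psi_max=50.0, psi_min=10.0)
end = adaptive_depth_weight(100, 100, psi_max=50.0, psi_min=10.0)
```

A large weight early on favors wide exploration, and the smaller weight in later iterations favors fine adjustment around the best solutions found so far.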

#### *4.4.2.9 Re-compute the fitness*

The fitness value is recomputed using the objective function given in Eq. (8), and the solution with the best fitness value is selected as the optimal solution.

#### *4.4.2.10 Termination*

The above steps are repeated until the stopping criterion is reached. The pseudocode of the developed adaptive DASO-based deep RNN technique is specified in Algorithm 1.


**Algorithm 1.** Pseudocode of the developed adaptive model.



Including the adaptive concept with ASO and DE provides an enhanced optimal result and also reduces the computation time. The performance of intrusion detection is likewise enhanced by including the adaptive concept within the hybrid optimization algorithm.

#### **5. Results and discussion**

The results of the developed adaptive model are discussed in this section in terms of sensitivity, accuracy, and specificity.

#### **5.1 Experimental setup and dataset description**

The developed adaptive model is implemented in Python and evaluated using the NSL-KDD dataset [33] (dataset-1) and the BoT-IoT dataset [34] (dataset-2). Dataset-1 includes a reasonable amount of information for addressing the optimization problem. Dataset-2 comprises source files in different formats, such as CSV files, argus files, and pcap files, which are partitioned based on the kind of attack.

#### **5.2 Evaluation parameters**

The performance parameters utilized for the analysis of intrusion detection in the proposed adaptive model are sensitivity, accuracy, and specificity.

#### *5.2.1 Sensitivity*

The sensitivity is the ratio of true positives (TP) to the sum of true positives and false negatives (FN). It is expressed as,

$$Sensitivity = \frac{P\_T}{N\_F + P\_T} \tag{20}$$

#### *5.2.2 Accuracy*

The accuracy is the degree of closeness between the predicted and the original values, that is, the proportion of all samples that are classified correctly. It is expressed as,

$$Accuracy = \frac{N\_T + P\_T}{P\_F + N\_F + P\_T + N\_T} \tag{21}$$

#### *5.2.3 Specificity*

The specificity is the ratio of true negatives (TN) to the sum of false positives (FP) and true negatives. It is expressed as,

$$\text{Specificity} = \frac{N\_T}{N\_T + P\_F} \tag{22}$$

where *P<sub>T</sub>*, *P<sub>F</sub>*, *N<sub>T</sub>*, and *N<sub>F</sub>* represent the true positives, false positives, true negatives, and false negatives, respectively.
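
The three measures in Eqs. (20) to (22) can be computed directly from confusion-matrix counts, as in the following sketch. The counts are hypothetical and are not taken from the experiments in this chapter.

```python
def detection_metrics(tp, fp, tn, fn):
    """Sensitivity, accuracy, and specificity from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                # Eq. (20)
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # Eq. (21)
        "specificity": tn / (tn + fp),                # Eq. (22)
    }

# Hypothetical counts: 100 attack samples and 100 normal samples.
metrics = detection_metrics(tp=90, fp=20, tn=80, fn=10)
```

With these counts the sketch yields a sensitivity of 0.9, an accuracy of 0.85, and a specificity of 0.8. The function does not guard against empty classes, which would cause a division by zero.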

#### **5.3 Comparative methods**

The performance of the developed method is analyzed by comparing it with other state-of-the-art techniques, namely DBN [1], CNN [13], and DSAE [14].

#### **5.4 Comparative analysis**

This section compares the developed adaptive DASO-based DRNN with the existing methods using dataset-1 and dataset-2.

#### *5.4.1 Analysis using dataset-1*

**Figure 3a** shows the accuracy obtained for varying percentages of training data. With 60% training data, the accuracy of the developed adaptive model is 0.8856, while the accuracy of the existing methods, DBN, DSAE, CNN, and DASO-based DRNN, is 0.8290, 0.8224, 0.8056, and 0.860317, respectively. The adaptive DASO-based deep RNN thus improves on DBN, DSAE, CNN, and DASO-based deep RNN by 6.39354%, 7.1329%, 9.0376%, and 2.8613%, respectively.

**Figure 3b** shows the sensitivity obtained for varying percentages of training data. With 70% training data, the developed adaptive model has a sensitivity of 0.9849, while the existing methods, DBN, DSAE, CNN, and DASO-based deep RNN, have sensitivities of 0.9362, 0.9230, 0.89, and 0.9779, respectively. The adaptive DASO-based deep RNN thus improves on DBN, DSAE, CNN, and DASO-based deep RNN by 4.94251%, 6.2823%, 9.64145%, and 0.7154%, respectively.

**Figure 3c** shows the specificity obtained for varying percentages of training data. With 80% training data, the developed adaptive DASO-based deep RNN attains a specificity of 0.9754, while the existing methods, DBN, DSAE, CNN, and DASO-based deep RNN, attain specificities of 0.7394, 0.8969, 0.9174, and 0.9611, respectively. The developed adaptive DASO-based deep RNN thus improves on DBN, DSAE, CNN, and DASO-based deep RNN by 24.193%, 8.0476%, 5.9513%, and 1.4712%, respectively.

#### **Figure 3.**

*Comparative analysis using dataset-1, (a) accuracy, (b) sensitivity, and (c) specificity.*
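
The improvement percentages quoted in this analysis appear to be consistent with the relative difference (proposed − baseline) / proposed × 100. This formula is inferred from the reported numbers and is not stated in the chapter; small deviations from the quoted figures are expected because the inputs are rounded. A minimal sketch:

```python
def relative_improvement(proposed, baseline):
    """Improvement of the proposed score over a baseline, in percent.

    Uses the proposed score as the denominator, which is an assumption
    inferred from the reported figures.
    """
    return (proposed - baseline) / proposed * 100.0

# Accuracy at 60% training data on dataset-1: adaptive model vs. DBN.
accuracy_gain = relative_improvement(0.8856, 0.8290)

# Specificity at 80% training data on dataset-1: adaptive model vs. DBN.
specificity_gain = relative_improvement(0.9754, 0.7394)
```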

#### *5.4.2 Analysis using dataset-2*

**Figure 4a** shows the accuracy obtained for varying percentages of training data. With 60% training data, the accuracy of the adaptive model is 0.9767, while the accuracy of DBN, DSAE, CNN, and DASO-based DRNN is 0.9305, 0.9329, 0.9388, and 0.956087, respectively. Compared with DBN, DSAE, CNN, and DASO-based deep RNN, the developed adaptive model achieves a performance improvement of 4.7341%, 4.4860%, 3.8829%, and 2.1169%, respectively.

**Figure 4b** shows the sensitivity obtained for varying percentages of training data. With 70% training data, the developed adaptive DASO-based deep RNN attains a sensitivity of 0.9894, while the existing methods, DBN, DSAE, CNN, and DASO-based deep RNN, attain sensitivities of 0.9560, 0.9238, 0.9280, and 0.9821, respectively. Compared with DBN, DSAE, CNN, and DASO-based deep RNN, the developed adaptive DASO-based deep RNN achieves a performance improvement of 3.3776%, 6.6318%, 6.2063%, and 0.7416%, respectively.

#### **Figure 4.**

*Comparative analysis of dataset-2, (a) accuracy, (b) sensitivity, and (c) specificity.*

**Figure 4c** shows the specificity obtained for varying percentages of training data. With 80% training data, the developed adaptive model has a specificity of 0.8513, whereas the existing methods, DBN, DSAE, CNN, and DASO-based deep RNN, have specificities of 0.7370, 0.8178, 0.8041, and 0.8255, respectively. Compared with DBN, DSAE, CNN, and DASO-based deep RNN, the developed adaptive DASO-based deep RNN achieves a performance improvement of 13.4295%, 3.9380%, 5.539%, and 3.02870%, respectively.

#### **5.5 Comparative discussion**

**Table 1** compares the developed adaptive model with the existing methods. With dataset-1, the accuracy of the existing DBN, DSAE, CNN, and DASO-based deep RNN is 0.8479, 0.8245, 0.8094, and 0.9180, respectively, while the proposed adaptive model achieves a higher accuracy of 0.93679. With dataset-1, the DBN, DSAE, CNN, and DASO-based deep RNN attain sensitivities of 0.9364, 0.9281, 0.89, and 0.9788, respectively, whereas the proposed adaptive model attains a higher sensitivity of 0.9851. With dataset-2, the accuracy of the existing DBN, DSAE, CNN, and DASO-based DRNN is 0.9512, 0.9735, 0.9552, and 0.9822, respectively, while the accuracy of the proposed adaptive model is 0.9854. With dataset-2, the specificity of the existing DBN, DSAE, CNN, and DASO-based deep RNN is 0.7370, 0.8178, 0.8041, and 0.82557, respectively, while the specificity of the proposed adaptive model is 0.8513.



#### **Table 1.**

*Comparative discussion.*

#### **6. Conclusion**

In this paper, a novel network intrusion detection mechanism named adaptive DASO-based deep RNN is developed to predict abnormal behavior in the network. At first, the data are obtained from the database and sent to the feature selection module, which uses mutual information to select the relevant features based on a threshold value. Once the features are selected, they are forwarded to the IDS for predicting malicious behavior in the network. Malicious activity is detected by the developed DRNN, which is trained using the Adaptive DASO algorithm. The Adaptive DASO model is designed by integrating the adaptive concept, DE, and ASO. Although the combined DE and ASO algorithm provides better results, it consumes high computational time; thus, the adaptive concept is introduced into DASO to reduce the computational time. The algorithm predicts whether the behavior of the network is normal or abnormal. The weights are determined by the developed Adaptive DASO algorithm through the fitness function. In addition, the developed Adaptive DASO achieved the best performance on the evaluation metrics of accuracy, sensitivity, and specificity, with values of 0.9854, 0.99, and 0.8513, respectively, using dataset-2. In the future, the detection capacity of the IDS can be enhanced by using other optimization techniques.

#### **Author details**

Surendra Bhosale<sup>1</sup> \*, Achala Deshmukh<sup>2</sup> , Bhushan Deore1,3 and Parag Bhosale<sup>4</sup>


© 2023 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### **References**

[1] Shone N, Ngoc TN, Phai VD, Shi Q. A deep learning approach to network intrusion detection. IEEE Transactions on Emerging Topics in Computational Intelligence. 2018;**2**(1):41-50

[2] Yin C, Zhu Y, Fei J, He X. A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access. 2017;**5**:21954-21961

[3] Azad C, Jha VK. Fuzzy min–max neural network and particle swarm optimization based intrusion detection system. Microsystem Technologies. 2017;**23**(4):907-918

[4] Sohi SM, Seifert JP, Ganji F. RNNIDS: Enhancing network intrusion detection systems through deep learning. Computers and Security. 2020;**2020**:102151

[5] Mighan SN, Kahani M. A novel scalable intrusion detection system based on deep learning. International Journal of Information Security. 2021;**20**(3):387-403

[6] Andresini G, Appice A, Malerba D. Autoencoder-based deep metric learning for network intrusion detection. Information Sciences. 2021;**569**:706-727

[7] Kaja N, Shaout A, Ma D. An intelligent intrusion detection system. Applied Intelligence. 2019;**49**(9):3235-3247

[8] Jin D, Lu Y, Qin J, Cheng Z, Mao Z. SwiftIDS: Real-time intrusion detection system based on LightGBM and parallel intrusion detection mechanism. Computers & Security. 2020;**97**:101984

[9] Sarker IH, Abushark YB, Alsolami F, Khan AI. Intrudtree: A machine learning based cyber security intrusion detection model. Symmetry. 2020;**12**(5):754

[10] Injadat M, Moubayed A, Nassif AB, Shami A. Multi-stage optimized machine learning framework for network intrusion detection. IEEE Transactions on Network and Service Management. 2020;**18**(2):1803-1816

[11] Bertoli GDC, Júnior LAP, Saotome O, Dos Santos AL, Verri FAN, Marcondes CAC, et al. An end-to-end framework for machine learning-based network intrusion detection system. IEEE Access. 2021;**9**:106790-106805

[12] Jiang K, Wang W, Wang A, Wu H. Network intrusion detection combined hybrid sampling with deep hierarchical network. IEEE Access. 2020;**8**:32464-32476

[13] Çavuşoğlu Ü. A new hybrid approach for intrusion detection using machine learning methods. Applied Intelligence. 2019;**49**(7):2735-2761

[14] Tang TA, Mhamdi L, McLernon D, Zaidi SAR, Ghogho M. Deep learning approach for network intrusion detection in software defined networking. In: Proceedings of 2016 International Conference on Wireless Networks and Mobile Communications (WINCOM). Fez, Morocco: IEEE; 2016. pp. 258-263

[15] Wu K, Chen Z, Li W. A novel intrusion detection model for a massive network using convolutional neural networks. IEEE Access. 2018;**6**:50850-50859

[16] Vinayakumar R, Alazab M, Soman KP, Poornachandran P, Al-Nemrat A, Venkatraman S. Deep learning approach for intelligent intrusion detection system. IEEE Access. 2019;**7**:41525-41550

[17] Gao X, Shan C, Hu C, Niu Z, Liu Z. An adaptive ensemble machine learning model for intrusion detection. IEEE Access. 2019;**7**:82512-82521

[18] Otoum S, Kantarci B, Mouftah HT. On the feasibility of deep learning in sensor network intrusion detection. IEEE Networking Letters. 2019;**1**(2):68-71

[19] Yang H, Qin G, Ye L. Combined wireless network intrusion detection model based on deep learning. IEEE Access. 2019;**7**:82624-82632

[20] Wu P, Guo H. LuNET: A deep neural network for network intrusion detection. In: Proceedings of 2019 IEEE Symposium Series on Computational Intelligence (SSCI). Xiamen, China: IEEE; 2019. pp. 617-624

[21] Khan FA, Gumaei A, Derhab A, Hussain A. A novel two-stage deep learning model for efficient network intrusion detection. IEEE Access. 2019;**7**:30373-30385

[22] Sohi SM, Seifert JP, Ganji F. RNNIDS: Enhancing network intrusion detection systems through deep learning. Computers & Security. 2021;**102**:102151

[23] Zeng Y, Gu H, Wei W, Guo Y. Deepfull-range: A deep learning based network encrypted traffic classification and intrusion detection framework. IEEE Access. 2019;**7**:45182-45190

[24] Jouad M, Diouani S, Houmani H, Zaki A. Security challenges in intrusion detection. In: Proceedings of International Conference on Cloud Technologies and Applications (CloudTech). Marrakech, Morocco. 2015. pp. 1-11

[25] Borkar GM, Mahajan AR. A secure and trust based on-demand multipath routing scheme for self-organized mobile ad-hoc networks. Wireless Networks. 2017;**23**(8):2455-2472

[26] Zhao W, Wang L, Zhang Z. A novel atom search optimization for dispersion coefficient estimation in groundwater. Future Generation Computer Systems. 2019;**91**:601-610

[27] Inoue M, Inoue S, Nishida T. Deep recurrent neural network for mobile human activity recognition with high throughput. Artificial Life and Robotics. 2018;**23**(2):173-185

[28] Erik G. Entropy and Mutual Information. Amherst; 2013

[29] Wu K, Chen Z, Li W. A novel intrusion detection model for a massive network using convolutional neural networks. IEEE Access. 2018;**2018**:1-1

[30] Khan FA, Gumaei A, Derhab A, Hussain A. TSDL: A two stage deep learning model for efficient network intrusion detection. IEEE Access. 2019:1-1

[31] Dong B, Wang X. Comparison deep learning method to traditional methods using for network intrusion detection. In: Proceedings of 8th IEEE International Conference on Communication Software and Networks (ICCSN). Beijing, China. 2016. pp. 581-585

[32] Sangeetha S, Ramya R, Dharani MK, Sathya P. Signature based semantic intrusion detection system on cloud. Information Systems Design and Intelligent Applications. 2015;**2015**:657-666

[33] NSL-KDD Dataset. Available from: https://www.unb.ca/cic/datasets/nsl.html [Accessed: August 2022]

[34] BoT-IoT Dataset. Available from: https://research.unsw.edu.au/projects/BoT-IoT-dataset [Accessed: August 2022]

#### **Chapter 6**

### Anomaly Detection in Intrusion Detection Systems

*Siamak Parhizkari*

#### **Abstract**

Intrusion detection systems (IDS) play a critical role in network security by monitoring systems and network traffic to detect anomalies and attacks. This study explores the different types of IDS, including host-based and network-based, along with their deployment scenarios. A key focus is on incorporating anomaly detection techniques within IDS to identify novel and unknown threats that evade signature-based methods. Statistical approaches like outlier detection and machine learning techniques like neural networks are discussed for building effective anomaly detection models. Data collection and preprocessing techniques, including feature engineering, are examined. Both unsupervised techniques like clustering and density estimation and supervised methods like classification are covered. Evaluation datasets and performance metrics for assessing anomaly detection models are highlighted. Challenges like curse of dimensionality and concept drift are outlined. Emerging trends include integrating deep learning and explainable AI into anomaly detection. Overall, this comprehensive study examines the role of anomaly detection within IDS, delves into various techniques and algorithms, surveys evaluation practices, discusses limitations and challenges, and provides insights into future research directions to advance network security through improved anomaly detection capabilities.

**Keywords:** anomaly detection, intrusion detection systems (IDS), fraud detection, cybersecurity, abnormal patterns

#### **1. Introduction**

An intrusion detection system (IDS) is a security tool designed to monitor network or system activities to detect and respond to unauthorized or malicious activities. It serves as an additional layer of defense in a comprehensive cybersecurity strategy.

The primary goal of an IDS is to identify and alert security administrators about potential security incidents, such as unauthorized access attempts, malware infections, or suspicious network traffic patterns. By analyzing network packets, log files, system activities, and other relevant data, IDS can help detect and respond to security threats in real-time.

There are two main types of IDS:

Network-based intrusion detection systems (NIDS) [1–7]: NIDS monitors network traffic in real-time, analyzing packets to identify suspicious or malicious activity. It operates at the network layer and can detect threats such as port scanning, denial-of-service (DoS) attacks, and network intrusions. NIDS can be deployed as a standalone device or as part of a network security infrastructure.

Host-based intrusion detection systems (HIDS) [1–7]: HIDS monitors the activities occurring on individual hosts or endpoints, such as servers or workstations. It analyzes system logs, file integrity, and user activities to identify unauthorized access attempts, privilege escalations, or suspicious behavior at the host level. HIDS is particularly useful for detecting insider threats or malware infections that may bypass network-based defenses.

IDS employs different detection techniques to identify potential threats:

Signature-based detection [4]: This technique relies on a database of known attack signatures or patterns. IDS compares the incoming network traffic or system activities against these signatures to identify known attacks. While effective against known threats, signature-based detection may struggle with detecting new or zero-day attacks.

Anomaly-based detection [4–6, 8]: Anomaly detection involves establishing a baseline of normal behavior for a network or system and then identifying deviations from this baseline. It analyzes traffic patterns, system performance, user behavior, and other metrics to detect anomalies that could indicate a potential security breach.

When an IDS detects an intrusion or suspicious activity, it generates an alert or notification for security administrators. These alerts provide information about the nature of the incident, the affected system or network, and any additional details to aid in the response and mitigation process.

It is important to note that IDS is not a standalone solution but works in conjunction with other security measures like firewalls, antivirus software, and security policies. Additionally, intrusion prevention systems (IPS) are often used in conjunction with IDS to not only detect but also actively block or prevent detected threats.

In summary, Intrusion Detection Systems play a crucial role in identifying and responding to potential security incidents in real-time. By monitoring network and system activities, IDS helps organizations strengthen their overall security posture and minimize the potential impact of cyber threats.

#### **2. Anomaly detection techniques in IDS**

**Table 1** shows anomaly detection techniques with their pros and cons.

#### **2.1 Signature-based detection vs. anomaly detection**

Signature-based detection, also known as rule-based detection, relies on predefined signatures or patterns of known attacks to identify intrusions. However, signature-based detection has limitations as it can only detect known attacks for which signatures have been defined. New or unknown attacks can easily evade signature-based detection. Anomaly detection techniques, on the other hand, focus on identifying deviations from normal behavior, without relying on predefined signatures. This makes anomaly detection more effective in detecting unknown or novel attacks that do not have specific signatures [4, 6, 9]. **Figure 1** shows the concept of signature-based IDS.


#### **Table 1.**

*Anomaly detection techniques with pros and cons.*

#### **2.2 Statistical approaches for anomaly detection**

Statistical approaches are commonly employed for anomaly detection in IDS. These techniques involve the use of statistical methods to establish normal behavior baselines and detect deviations from these baselines. Outlier detection algorithms, such as the statistical outlier detection method or the Z-score method, are used to identify data points that significantly deviate from the expected behavior. Time series analysis techniques, such as autoregressive integrated moving average (ARIMA) models [10, 11], are used to detect anomalies in temporal data. Statistical modeling approaches, such as Gaussian mixture models or hidden Markov models, are utilized to capture the statistical characteristics of normal behavior and detect anomalies based on deviations from the learned models [4, 5].
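
The Z-score method mentioned above can be sketched as follows. The baseline values (for example, packets per second observed during normal operation) and the threshold of three standard deviations are hypothetical choices made for illustration.

```python
import statistics

def zscore_anomalies(baseline, observations, threshold=3.0):
    """Return the observations whose Z-score against the baseline exceeds the threshold."""
    mean = statistics.mean(baseline)
    std = statistics.stdev(baseline)  # sample standard deviation of normal behavior
    return [x for x in observations if abs(x - mean) / std > threshold]

# Hypothetical packet rates observed during normal operation.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
flagged = zscore_anomalies(baseline, [101, 150, 96])
```

Here the baseline has a mean of 100 and a sample standard deviation of 2, so only the observation of 150 is flagged. In practice the threshold trades missed attacks against false alarms and would need tuning for the monitored metric.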

#### **2.3 Machine learning approaches for anomaly detection**

Machine learning algorithms play a crucial role in anomaly detection for IDS. These algorithms can learn patterns and behaviors from historical data and apply that knowledge to detect anomalies in real-time. Clustering algorithms, such as k-means or DBSCAN, group similar instances together and flag instances that do not fit into any cluster as anomalies. Classification algorithms, such as support vector machines (SVM) or random forests, learn from labeled data to classify instances as normal or anomalous. Neural networks, including deep learning models like convolutional neural networks (CNN) [9] or recurrent neural networks (RNN) [9], can capture complex patterns and relationships to identify anomalies [12]. **Figure 2** shows machine learning approaches in IDS.

#### **Figure 1.** *Concept of signature based IDS [4].*

#### **2.4 Hybrid approaches**

Hybrid approaches combine both statistical and machine learning techniques to improve the accuracy and effectiveness of anomaly detection in IDS [4]. By leveraging the strengths of different approaches, hybrid models can provide enhanced detection capabilities. For example, a hybrid approach may use statistical techniques to establish baseline behavior and machine learning algorithms to classify instances as normal or anomalous. This combination allows for a more comprehensive and robust anomaly detection system.

### **3. Data collection and preprocessing in IDS**

Data sources for IDS [4, 12].

Intrusion detection systems (IDS) rely on various sources of data to detect anomalies and potential security breaches. Some common data sources used in IDS include:


*Anomaly Detection in Intrusion Detection Systems DOI: http://dx.doi.org/10.5772/intechopen.112733*

**Figure 2.** *Machine learning approaches in IDS [12].*

user authentication, or administrative actions. Analyzing audit trails can help to identify unauthorized actions, unusual user behavior, or privilege escalation attempts.

Data preprocessing techniques.

Data preprocessing [13, 14] plays a crucial role in preparing the data for effective anomaly detection in IDS. Several techniques are commonly used in the preprocessing stage, including:


normal and anomalous behavior. Feature selection can help to reduce computational complexity, improve detection accuracy, and eliminate redundant or irrelevant features.


#### **4. Unsupervised anomaly detection in IDS**

Unsupervised anomaly detection techniques in intrusion detection systems (IDS) aim to identify anomalies in data without relying on pre-labeled instances of normal and anomalous behavior [4, 9]. These techniques are particularly useful in scenarios where labeled training data is scarce or unavailable, making it challenging to train supervised models. Unsupervised anomaly detection methods utilize statistical, clustering, or density-based approaches to identify patterns that deviate from normal behavior. Here are some commonly used unsupervised anomaly detection techniques in IDS and **Figure 3** shows a summary of these techniques:

	- Gaussian distribution: The Gaussian distribution, also known as the normal distribution, is frequently used in statistical-based anomaly detection. It assumes that the normal behavior of the data follows a bell-shaped curve. Anomalies are identified as instances that fall outside a specified range or threshold based on the estimated mean and standard deviation of the data. Instances that lie in the tails of the distribution, beyond a certain number of standard deviations from the mean, are considered anomalies.
	- Mahalanobis distance: The Mahalanobis distance measures the distance between a data point and the center of a distribution, taking into account the correlation between variables. It accounts for the covariance structure of the data and is particularly useful when the variables are correlated. The Mahalanobis distance can be used to detect anomalies by comparing the distance of each data point to a threshold value. Points with a large Mahalanobis distance are considered anomalies.

**Figure 3.** *Summary of unsupervised techniques.*


Statistical-based techniques provide a solid foundation for detecting anomalies based on deviations from expected statistical behavior. However, it is important to note that these methods assume the data follows specific statistical distributions and may not be suitable for data with complex or non-parametric distributions. Additionally, choosing appropriate thresholds or significance levels is crucial and requires careful consideration and domain knowledge.
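
To make the role of correlation concrete, the following sketch computes the Mahalanobis distance for two-dimensional data using only the standard library. The sample values are hypothetical (two strongly correlated traffic features), and restricting the computation to two dimensions is a simplification so that the covariance matrix can be inverted by hand.

```python
import math

def mahalanobis_2d(samples, point):
    """Mahalanobis distance of a 2-D point from the mean of the samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean_x, point[1] - mean_y
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out for the 2x2 case.
    squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(squared, 0.0))

# Hypothetical normal behavior: the two features rise together.
normal = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8), (5.0, 5.0)]
along_trend = mahalanobis_2d(normal, (4.0, 4.0))    # follows the correlation
against_trend = mahalanobis_2d(normal, (4.0, 2.0))  # breaks the correlation
```

Both test points lie at the same Euclidean distance from the mean, but the point that breaks the correlation receives a much larger Mahalanobis distance, which is why it would be flagged under a threshold such as 3.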

• K-means clustering: K-means clustering is a popular technique that aims to partition the data into K clusters. The algorithm iteratively assigns data points to the nearest cluster centroid based on distance measures such as Euclidean distance. Anomalies are typically identified as instances that do not fit well into any cluster or are located far from the cluster centroids. However, K-means alone may not be sufficient for anomaly detection as it assumes that all clusters have similar sizes and shapes, which may not hold true for anomalous instances.

• Density-based spatial clustering of applications with noise (DBSCAN): DBSCAN is a density-based clustering algorithm that identifies clusters based on the density of instances [18]. It groups together instances that are close to each other and have a sufficient number of nearby neighbors. Anomalies are typically instances that do not have enough nearby neighbors to form a cluster and are considered noise points. DBSCAN can effectively identify clusters of different shapes and sizes, making it suitable for detecting anomalies that do not conform to regular cluster patterns.

• Ordering points to identify the clustering structure (OPTICS): OPTICS is an extension of DBSCAN that provides a hierarchical view of the clustering structure. It orders instances based on their density and identifies core points, reachability distances, and clusters. Anomalies are typically instances that have low density and are located in regions with sparse or no clusters. OPTICS allows for flexible parameterization, making it more adaptive to different datasets and providing a richer characterization of the data structure.

**Figure 4.** *Clustering [14].*

• Hierarchical clustering: Hierarchical clustering methods create a hierarchy of clusters by successively merging or splitting clusters based on their similarity. Agglomerative hierarchical clustering starts with each instance as a separate cluster and iteratively merges similar clusters until a single cluster is formed. Divisive hierarchical clustering starts with all instances in one cluster and iteratively splits the cluster into smaller clusters. Anomalies can be identified as instances that do not fit well into any cluster or do not conform to the hierarchical structure.

Clustering-based techniques offer flexibility in detecting anomalies by identifying instances that do not conform to regular cluster patterns. However, these techniques require careful consideration of parameters such as the number of clusters or density thresholds, and the interpretation of anomalies may depend on the dataset and the clustering algorithm used.
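To make the idea of noise points concrete, the following is a minimal DBSCAN sketch written from the general description above. It is a simplified illustration with a brute-force neighbor search; the toy `points` and the `eps` and `min_pts` values are assumptions chosen for the example, not recommended settings.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: one cluster label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force search; the point itself counts as its own neighbor.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels

# Two dense groups plus one isolated instance (hypothetical feature vectors).
points = [(1, 1), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
          (5, 5), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
          (9, 0.5)]
labels = dbscan(points, eps=0.5, min_pts=3)
anomalies = [p for p, label in zip(points, labels) if label == NOISE]
print(labels, anomalies)
```

The isolated instance never gathers `min_pts` neighbors within `eps`, so it stays labeled as noise and is reported as the anomaly. Changing either parameter changes which instances count as noise, which is the parameter sensitivity noted above.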

• Kernel density estimation (KDE): Kernel density estimation is a nonparametric technique used to estimate the underlying density distribution of the data. It places a kernel function on each data point and sums them to estimate the density at any given point. Anomalies are typically identified as instances with significantly lower density values compared to the majority of the data. The choice of kernel function and bandwidth parameter affects the smoothness and accuracy of density estimation.

• Local outlier factor (LOF): The local outlier factor measures the deviation of an instance's density compared to its neighboring instances. It calculates a local density for each data point based on the distances to its k nearest neighbors. Anomalies are identified as instances with significantly lower local densities compared to their neighbors. LOF takes into account the local density variations in the data, making it robust to varying densities and useful for detecting anomalies in clusters or regions of different densities.

• Distance-based techniques: Distance-based density estimation techniques measure the distances between instances and identify anomalies based on deviations from the expected distance distribution. For example, the nearest neighbor distance (NND) approach calculates the average distance to the k nearest neighbors for each instance. Anomalies are identified as instances with significantly larger or smaller distances compared to the majority of the data. Distance-based techniques are effective in identifying anomalies that exhibit unusual distance patterns.

• Density-based clustering [18]: Density-based clustering algorithms, such as DBSCAN, can also be used for anomaly detection. These algorithms identify clusters based on the density of instances and label as anomalies the instances that do not belong to any cluster. Anomalies are typically located in regions with low density or as individual points far from the clusters.

Density-based techniques provide flexibility in detecting anomalies by focusing on regions of low density or deviations from expected distance patterns. These techniques are effective in identifying anomalies that do not conform to regular density distributions or exhibit unusual distance patterns. However, careful parameter selection, such as the neighborhood size or density thresholds, is important to ensure accurate anomaly detection.
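As a small illustration of density-based scoring, the sketch below implements a one-dimensional Gaussian kernel density estimate and flags instances whose estimated density is low. The `durations` sample, the `bandwidth`, and the `DENSITY_FLOOR` cut-off are all assumed values for demonstration only.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a function that estimates the density at x from a 1-D sample."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # One Gaussian kernel per data point, summed and normalized.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in sample)

    return density

# Hypothetical connection durations in seconds; 6.0 lies far from the rest.
durations = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.15, 6.0]
density = gaussian_kde(durations, bandwidth=0.3)

DENSITY_FLOOR = 0.3  # assumed threshold below which an instance is flagged
anomalies = [d for d in durations if density(d) < DENSITY_FLOOR]
print(anomalies)
```

A larger bandwidth smooths the estimate and raises the density around isolated points, so the bandwidth and the threshold have to be tuned together.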

• Autoencoder-based anomaly detection: Autoencoders are neural network models that are trained to reconstruct their input data. They consist of an encoder that compresses the input data into a lower-dimensional representation and a decoder that reconstructs the data from this representation. During training, autoencoders learn to minimize the reconstruction error by capturing the patterns and regularities in the data. Anomalies are identified as instances that result in high reconstruction errors, indicating deviations from the learned normal behavior.

• Variational autoencoders (VAEs): Variational autoencoders are a type of generative model that learns a low-dimensional representation of the data and generates new samples by sampling from this learned representation. VAEs consist of an encoder that learns the parameters of a probability distribution in the latent space and a decoder that generates samples from this distribution. Anomalies can be identified based on the reconstruction error or by measuring the dissimilarity between the original data and the generated samples. VAEs can capture the underlying distribution of the data and detect anomalies that deviate significantly from this distribution.

**Figure 5.** *Structure of autoencoders [12].*

• Generative adversarial networks (GANs) [21]: Generative adversarial networks are another type of generative model that consists of a generator network and a discriminator network. The generator network learns to generate realistic samples that resemble the normal behavior of the data, while the discriminator network learns to distinguish between real and generated samples. Anomalies can be identified as instances that are not well captured by the generator network or are classified as fake by the discriminator network. GANs can learn complex data distributions and detect anomalies that differ significantly from the learned distribution.

Reconstruction-based techniques offer the advantage of learning the normal behavior of the data and identifying anomalies based on deviations from this learned behavior. They can capture complex patterns and variations in the data, making them effective for detecting anomalies that do not conform to specific statistical or density distributions. However, these techniques require a representative dataset of normal behavior for training the models and may be sensitive to the choice of model architecture and training parameters.
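The reconstruction-error principle can be shown without a neural network. The sketch below deliberately swaps the autoencoder for a much simpler linear stand-in: it encodes each 2-D instance as its coordinate along the leading principal direction of the normal training data and decodes it back. This is not an autoencoder implementation; the training data, test points, and `ERROR_THRESHOLD` are assumptions for illustration.

```python
import math

def fit_linear_codec(train):
    """Fit the mean and leading principal direction of 2-D training data."""
    n = len(train)
    mx = sum(p[0] for p in train) / n
    my = sum(p[1] for p in train) / n
    sxx = sum((p[0] - mx) ** 2 for p in train) / n
    syy = sum((p[1] - my) ** 2 for p in train) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in train) / n
    # Power iteration for the leading eigenvector of the covariance matrix.
    # Assumes the start vector (1, 0) is not orthogonal to that eigenvector.
    vx, vy = 1.0, 0.0
    for _ in range(100):
        wx = sxx * vx + sxy * vy
        wy = sxy * vx + syy * vy
        norm = math.hypot(wx, wy)
        vx, vy = wx / norm, wy / norm
    return (mx, my), (vx, vy)

def reconstruction_error(point, mean, direction):
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    code = dx * direction[0] + dy * direction[1]       # "encoder": 2-D -> 1-D
    rx, ry = code * direction[0], code * direction[1]  # "decoder": 1-D -> 2-D
    return math.hypot(dx - rx, dy - ry)

# Hypothetical normal behavior: the second feature is roughly twice the first.
normal_train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.0), (5, 9.8), (6, 12.1)]
mean, direction = fit_linear_codec(normal_train)

ERROR_THRESHOLD = 0.5  # assumed cut-off on the reconstruction error
err_normal = reconstruction_error((3.5, 7.0), mean, direction)
err_anomalous = reconstruction_error((3.5, 2.0), mean, direction)
print(err_normal > ERROR_THRESHOLD, err_anomalous > ERROR_THRESHOLD)
```

An instance that follows the learned relationship is reconstructed almost exactly, while one that violates it leaves a large residual. A trained autoencoder applies the same test with a nonlinear encoder and decoder.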

### **5. Supervised anomaly detection in IDS**

Supervised anomaly detection in intrusion detection systems (IDS) involves training a model on labeled data, where both normal and anomalous instances are explicitly identified [4, 9]. The model learns the patterns and characteristics of normal behavior during the training phase and can subsequently classify new instances as either normal or anomalous based on the learned knowledge. Here are some commonly used techniques for supervised anomaly detection in IDS:


in **Figure 7**, and deep autoencoders, have shown promising results in supervised anomaly detection. These models can learn complex representations of the input data, capture intricate patterns, and generalize well to unseen instances. Deep learning approaches require large amounts of labeled data and can be computationally intensive but can achieve high accuracy in detecting anomalies in IDS.

• Training phase: In the training phase, One-class SVM learns a decision boundary that encloses the majority of the training data points, representing the normal class. The goal is to find a hyperplane that maximally separates the normal data instances from the origin or the center of the feature space.

**Figure 6.** *Structure of convolutional neural network [12].*

**Figure 7.** *Structure of recurrent neural network [12].*

*Anomaly Detection in Intrusion Detection Systems DOI: http://dx.doi.org/10.5772/intechopen.112733*


One-class SVM offers several advantages for anomaly detection:


However, One-class SVM also has certain limitations:


Supervised anomaly detection in IDS offers the advantage of explicitly labeled data for training and can achieve high detection accuracy. However, it relies on the availability of accurately labeled training data and may face challenges when dealing with evolving or previously unseen anomalies. Moreover, supervised approaches may not capture novel or unknown anomalies that were not present in the training data.
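As a minimal illustration of learning from labeled instances, the sketch below uses a k-nearest-neighbor classifier. The choice of k-NN, the two-feature toy records, and the labels `"normal"` and `"attack"` are assumptions made for this example and are not taken from the techniques discussed in this section.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled neighbors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled records: (normalized feature vector, label).
labeled = [
    ((0.10, 0.20), "normal"), ((0.20, 0.10), "normal"),
    ((0.15, 0.25), "normal"), ((0.30, 0.20), "normal"),
    ((0.90, 0.80), "attack"), ((0.80, 0.95), "attack"),
    ((0.85, 0.90), "attack"),
]

print(knn_predict(labeled, (0.2, 0.2)))
print(knn_predict(labeled, (0.9, 0.9)))
```

The classifier can only return labels it has seen, so an attack whose features resemble normal traffic is assigned to the normal class. This is the limitation with novel anomalies noted above.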

#### **6. Evaluation and performance metrics in IDS**

Evaluation datasets play a crucial role in assessing the performance of anomaly detection techniques in intrusion detection systems (IDS). These datasets are used to evaluate how well a detection technique can accurately classify instances as normal or anomalous. Several datasets have been widely used in the field of IDS for evaluation purposes. Here are some commonly used datasets:


Performance metrics are used to quantitatively measure the effectiveness of anomaly detection techniques in IDS. These metrics provide insights into the model's accuracy, precision, recall, and overall performance. Here are some commonly used performance metrics:

1. Accuracy [4, 9, 25, 26]: Accuracy measures the overall correctness of the model's predictions. It calculates the ratio of correctly classified instances to the total number of instances. However, accuracy can be misleading in imbalanced datasets where anomalies are rare.

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

2. Precision [4, 9, 25, 26]: Precision measures the proportion of correctly identified anomalies among all instances classified as anomalies. It focuses on the correctness of positive predictions, indicating the model's ability to avoid false positives.

$$Precision = \frac{TP}{TP + FP} \tag{2}$$

3. Recall [4, 9, 25, 26]: Recall, also known as sensitivity or true positive rate, measures the proportion of actual anomalies that are correctly identified by the model. It represents the model's ability to detect anomalies and avoid false negatives.

$$Recall = \frac{TP}{TP + FN} \tag{3}$$

4. F1-score [4, 9, 25, 26]: The F1-score is the harmonic mean of precision and recall, providing a balanced measure of the model's performance. It considers both false positives and false negatives and is especially useful when dealing with imbalanced datasets.

$$F1\text{-}score = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \tag{4}$$

5. ROC curve and AUC [4, 9]: The receiver operating characteristic (ROC) curve illustrates the trade-off between the true positive rate (TPR) and the false positive rate (FPR) at different classification thresholds. The area under the curve (AUC) summarizes the performance of the model across all possible thresholds. A higher AUC indicates better discrimination between normal and anomalous instances.

These performance metrics help in evaluating the accuracy, effectiveness, and reliability of anomaly detection techniques in IDS.
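The metrics above can be computed directly from the confusion counts. The sketch below assumes labels encoded as 1 for anomalous and 0 for normal, takes the F1-score as the harmonic mean 2PR/(P + R), and computes the AUC through its rank interpretation, the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal instance. The example labels and scores are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN with 1 = anomaly (positive) and 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

def auc(y_true, scores):
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Imbalanced toy data: 5 anomalies among 100 instances.
y_true = [1, 1, 1, 1, 1] + [0] * 95
y_pred = [1, 1, 0, 0, 0] + [1] + [0] * 94   # 2 hits, 3 misses, 1 false alarm

accuracy, precision, recall, f1 = metrics(*confusion_counts(y_true, y_pred))
print(accuracy, precision, recall, f1)

toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(toy_auc)
```

In this toy case the accuracy is 0.96 although only two of the five anomalies are detected (recall 0.4), which illustrates why accuracy alone can mislead on imbalanced data.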

### **7. Challenges in anomaly detection for IDS**

1. Curse of dimensionality [27]: The curse of dimensionality refers to the phenomenon where the effectiveness of certain algorithms and techniques deteriorates as the dimensionality of the data increases. In the context of intrusion detection systems (IDS), the curse of dimensionality poses a significant challenge for anomaly detection.

Anomaly detection in IDS often involves analyzing high-dimensional data, such as network traffic logs, system logs, or audit trails. Each data instance is typically represented by a large number of features or attributes that describe various aspects of the network or system behavior. However, as the number of features increases, the available data becomes increasingly sparse in the high-dimensional space.

The curse of dimensionality has several implications for anomaly detection in IDS:


lower dimensionality. Anomaly detection algorithms may require more computational resources and time to analyze and classify instances accurately, affecting the real-time performance of IDS.

To mitigate the curse of dimensionality in IDS, various techniques can be employed:


Overall, addressing the curse of dimensionality in IDS requires careful consideration of data representation, feature selection, and dimensionality reduction techniques. By reducing the dimensionality of the data and focusing on relevant features, anomaly detection algorithms can be more effective in accurately identifying anomalies in high-dimensional data.
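One effect of high dimensionality, distance concentration, can be observed with a short experiment. The sketch below draws uniformly random points (an assumption made purely for illustration, not a model of real traffic) and measures the relative contrast between the farthest and nearest neighbor of a query point. The function name `relative_contrast` and its parameters are hypothetical.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(max distance - min distance) / min distance from a random query point."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = [rng.random() for _ in range(dim)]
    dists = [math.dist(query, p) for p in points]
    return (max(dists) - min(dists)) / min(dists)

for dim in (2, 20, 500):
    print(dim, round(relative_contrast(dim), 3))
```

As the dimension grows, the nearest and farthest points end up at nearly the same distance from the query, so distance-based and density-based anomaly scores lose their ability to separate instances. This is one reason feature selection and dimensionality reduction help.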

2. Concept drift [29, 30]: Concept drift refers to the phenomenon where the underlying data distribution, which defines what is considered normal or anomalous, changes over time. In the context of intrusion detection systems (IDS), concept drift poses a significant challenge for anomaly detection.

In IDS, anomaly detection models are trained on historical data to learn patterns of normal behavior and identify deviations from those patterns as anomalies. However, the characteristics of network traffic and system behavior can evolve over time due to various factors such as changes in network infrastructure, software updates, and emerging attack techniques. As a result, the learned model may become outdated and less effective in detecting new types of anomalies.

Concept drift in IDS can occur in different forms:

• Gradual concept drift: In gradual concept drift, the change in the underlying data distribution is relatively slow and progressive. The statistical properties of the data gradually shift over time, leading to a gradual degradation in the performance of the anomaly detection model. This type of concept drift requires continuous monitoring and adaptation of the model to maintain its effectiveness.

• Sudden concept drift: In sudden concept drift, the change in the underlying data distribution occurs abruptly and unpredictably. This can happen due to sudden changes in network conditions, system configurations, or the introduction of new attack techniques. Sudden concept drift poses a significant challenge as the model needs to quickly adapt to the new data distribution to accurately detect anomalies.

Addressing concept drift in IDS is essential to maintain the effectiveness of anomaly detection over time. Several techniques can be employed:


It is important to note that concept drift detection and adaptation in IDS is an ongoing research area, and the development of effective techniques to handle concept drift remains an active research topic.
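A very simple way to monitor for drift is to compare a recent window of a monitored statistic against a reference window. The sketch below is a heuristic illustration under strong assumptions (a single numeric feature, a fixed reference window with non-zero variance, and a z-score threshold). The function `detect_drift`, its parameters, and the synthetic stream are hypothetical and do not represent a specific published method.

```python
import math
from statistics import mean, stdev

def detect_drift(stream, window=20, z_threshold=3.0):
    """Return the first index at which the mean of the most recent window
    departs from the reference window, or None if no drift is seen."""
    reference = stream[:window]
    ref_mean = mean(reference)
    # Standard error of a window mean; assumes the reference is not constant.
    std_err = stdev(reference) / math.sqrt(window)
    for end in range(2 * window, len(stream) + 1):
        recent = stream[end - window:end]
        z = abs(mean(recent) - ref_mean) / std_err
        if z > z_threshold:
            return end - 1
    return None

# Synthetic feature values: stable around 10.2, then a sudden shift at index 60.
stable = [10 + (i % 5) * 0.1 for i in range(60)]
shifted = [12 + (i % 5) * 0.1 for i in range(40)]

no_drift = detect_drift(stable)
drift_index = detect_drift(stable + shifted)
print(no_drift, drift_index)
```

A sudden shift is picked up shortly after it begins, whereas gradual drift would need a longer window or a lower threshold. Once drift is flagged, the detection model would typically be retrained or updated on recent data.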

3. Adversarial attacks [31–33]: Adversarial attacks in IDS refer to deliberate attempts by adversaries to exploit vulnerabilities in the system and manipulate its behavior in order to evade detection or cause misclassification of normal or malicious activities. These attacks are specifically designed to target the anomaly detection capabilities of IDS and can have serious consequences for the security of the network.

There are different types of adversarial attacks that can be launched against IDS:


Addressing adversarial attacks in IDS is a challenging task. Some strategies and techniques that can help to mitigate the impact of these attacks include:


It is worth noting that adversarial attacks and defense mechanisms in IDS are evolving research areas, and new attack techniques and defense strategies are continuously being developed.

#### **8. Emerging trends and future directions**


#### **9. Conclusion**

This chapter has provided an overview of anomaly detection techniques in intrusion detection systems (IDS). We discussed the two main types of IDS, signature-based detection and anomaly detection, and highlighted the advantages of using anomaly detection techniques over signature-based approaches. We explored various anomaly detection techniques, including statistical-based techniques, clustering-based techniques, density-based techniques, reconstruction-based techniques, and one-class support vector machines (SVM).

We also discussed the importance of data collection and preprocessing in IDS, emphasizing the relevance of different data sources and the need for effective preprocessing techniques to enhance anomaly detection accuracy. Furthermore, we covered the evaluation and performance metrics used to assess the effectiveness of anomaly detection techniques, including commonly used evaluation datasets and performance metrics such as accuracy, precision, recall, F1-score, ROC curve, and AUC.

We highlighted the challenges faced in anomaly detection for IDS, such as the curse of dimensionality, concept drift, and adversarial attacks. These challenges require ongoing research and development efforts to improve the accuracy and resilience of anomaly detection techniques. Additionally, we discussed emerging trends and future directions in the field, including the integration of deep learning techniques, the use of explainable AI, and the exploration of real-time and streaming anomaly detection methods.

In conclusion, anomaly detection techniques play a crucial role in IDS for enhancing network security by identifying potential threats and attacks. However, there are ongoing challenges and opportunities for further research and development. By addressing these challenges and embracing emerging trends, we can advance the field of anomaly detection in IDS and improve the detection and prevention of sophisticated and unknown attacks, ultimately enhancing the overall security of network systems.

#### **Author details**

Siamak Parhizkari Islamic Azad University, Iran

\*Address all correspondence to: parhizkari.siamak@live.com

© 2023 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Kumar KN, Sukumaran S. A survey on network intrusion detection system techniques. International Journal of Advanced Technology and Engineering Exploration. 2018;**5**(47):385-393

[2] Modi C, Patel D, Borisaniya B, Patel H, Patel A, Rajarajan M. A survey of intrusion detection techniques in cloud. Journal of Network and Computer Applications. 2013;**36**(1):42-57

[3] Liu M, Xue Z, Xu X, Zhong C, Chen J. Host-based intrusion detection system with system calls: Review and future trends. ACM Computing Surveys (CSUR). 2018;**51**(5):1-36

[4] Khraisat A, Gondal I, Vamplew P, Kamruzzaman J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity. 2019; **2**(1):1-22

[5] Jyothsna V, Prasad R, Prasad KM. A review of anomaly based intrusion detection systems. International Journal of Computer Applications. 2011;**28**(7): 26-35

[6] Gangwar A, Sahu S. A survey on anomaly and signature based intrusion detection system (IDS). International Journal of Engineering Research and Applications. 2014;**4**(4):67-72

[7] Jmila H, Khedher MI. Adversarial machine learning for network intrusion detection: A comparative study. Computer Networks. 2022;**214**:109073

[8] Zamani M, Movahedi M. Machine Learning Techniques for Intrusion Detection. 2013. 11 p. Available from: arxiv.org [Revised in 2015]

[9] Kocher G, Kumar G. Machine learning and deep learning methods for intrusion detection systems: Recent developments and challenges. Soft Computing. 2021;**25**(15):9731-9763

[10] Yaacob AH, Tan IK, Chien SF, Tan HK. Arima based network anomaly detection. In: 2nd International Conference on Communication Software and Networks, 2010, Singapore. Singapore: IEEE; 2010. pp. 205-209

[11] Shirani P, Azgomi MA, Alrabaee S. A method for intrusion detection in web services based on time series. In: 28th IEEE Canadian Conference on Electrical and Computer Engineering, CCECE (CCECE). Halifax, Canada: IEEE; 2015. pp. 836-841

[12] Liu H, Lang B. Machine learning and deep learning methods for intrusion detection systems: A survey. Applied Sciences. 2019;**9**(20):4396

[13] Davis JJ, Clark AJ. Data preprocessing for anomaly based network intrusion detection: A review. Computers & Security. 2011;**30**(6–7): 353-375

[14] Alasadi SA, Bhaya WS. Review of data preprocessing techniques in data mining. Journal of Engineering and Applied Sciences. 2017;**12**(16):4102-4107

[15] Haq NF, Onik AR, Hridoy MAK, Rafni M, Shah FM, Farid DM. Application of machine learning approaches in intrusion detection system: A survey. IJARAI-International Journal of Advanced Research in Artificial Intelligence. 2015;**4**(3):9-18

[16] Salih AA, Abdulazeez AM. Evaluation of classification algorithms for intrusion detection system: A review. Journal of Soft Computing and Data Mining. 2021;**2**(1):31-40

[17] Aburomman AA, Reaz MBI. Survey of learning methods in intrusion detection systems. In: 2016 International Conference on Advances in Electrical, Electronic and Systems Engineering (ICAEES). Putrajaya, Malaysia: IEEE; 2016

[18] Bohara B, Bhuyan J, Wu F, Ding J. A survey on the use of data clustering for intrusion detection system in cybersecurity. International Journal of Network Security & Its Applications. 2020;**12**(1):1

[19] Wicaksana AK, Cahyani DE. Modification of a density-based spatial clustering algorithm for applications with noise for data reduction in intrusion detection systems. International Journal of Fuzzy Logic and Intelligent Systems. 2021;**21**(2):189-203

[20] Xu Y-X, Pang M, Feng J, Ting KM, Jiang Y, Zhou Z-H. Reconstruction-based anomaly detection with completely random forest. In: SIAM International Conference on Data Mining (SDM21) April 29 - May 1, 2021, Virtual Conference. Philadelphia, PA, USA: SIAM; 2021

[21] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. Communications of the ACM. 2020;**63**(11):139-144

[22] Mahfouz AM, Abuhussein A, Venugopal D, Shiva SG. Network intrusion detection model using one-class support vector machine. In: Advances in Machine Learning and Computational Intelligence: Proceedings of ICMLCI 2019. Singapore: Springer Nature; 2021

[23] Panigrahi R, Borah S. A detailed analysis of CICIDS2017 dataset for designing intrusion detection systems. International Journal of Engineering & Technology. 2018;**7**(3.24):479-482

[24] Stiawan D, Idris MYB, Bamhdi AM, Budiarto R. CICIDS-2017 dataset feature analysis with information gain for anomaly detection. IEEE Access. 2020;**8**: 132911-132921

[25] Wang G, Hao J, Ma J, Huang L. A new approach to intrusion detection using artificial neural networks and fuzzy clustering. Expert Systems With Applications. 2010;**37**(9):6225-6232

[26] Parhizkari S, Menhaj MB, Sajedin A. A Cognitive Based Intrusion Detection System. 2020. 19 p. Available from: arxiv.org [Revised in 2022]

[27] Verleysen M, François D. The curse of dimensionality in data mining and time series prediction. In: Computational Intelligence and Bioinspired Systems: 8th International Work-Conference on Artificial Neural Networks, IWANN 2005, Vilanova i la Geltrú, Barcelona, Spain, June 8–10, 2005 Proceedings 8. Barcelona, Spain: Springer; 2005

[28] Aljanabi M, Ismail MA, Ali AH. Intrusion detection systems, issues, challenges, and needs. International Journal of Computational Intelligence Systems. 2021;**14**(1):560-571

[29] Brownlee J. Concept drift 2023. Available from: https://machinelearningmastery.com/gentle-introduction-concept-drift-machine-learning/

[30] Castillo D. What is concept drift 2023. Available from: https://www.seldon.io/machine-learning-concept-drift

[31] Mbow M, Sakurai K, Koide H. Advances in adversarial attacks and defenses in intrusion detection system: A survey. In: Science of Cyber Security - SciSec 2022 Workshops: AI-CryptoSec, TA-BC-NFT, and MathSci-Qsafe 2022, Matsue, Japan, August 10–12, 2022, Revised Selected Papers. Matsue, Japan: Springer; 2023

[32] Zizzo G, Hankin C, Maffeis S, Jones K. Adversarial attacks on time-series intrusion detection for industrial control systems. In: 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) 29 Dec 2020 - 01 Jan 2021. Guangzhou, China: IEEE; 2020. ISBN: 978-0-7381-4380-4

[33] Alotaibi A, Rassam MA. Adversarial machine learning attacks against intrusion detection systems: A survey on strategies and defense. Future Internet. 2023;**15**(2):62

[34] Yehuda Y. New Trends in AI and Machine Learning for Anomaly Detection 2023. Available from: https://www.rad.com/blog/new-trends-ai-and-machine-learning-anomaly-detection

[35] Zehra S, Faseeha U, Syed HJ, Samad F, Ibrahim AO, Abulfaraj AW, et al. Machine learning-based anomaly detection in NFV: A comprehensive survey. Sensors. 2023;**23**(11):5340

Section 3
