#### **3.6 MLAPT: detection of APT attacks using machine learning and correlation analysis**

APT detection research mainly relies on analysing the malware payloads used in the different phases of an APT attack. This kind of approach results in a high false-positive rate when malware is deployed in multiple stages. To address this issue, Ghafir et al. proposed MLAPT, a model that detects multi-stage APT malware using machine learning and correlation analysis [19]. The MLAPT system is divided into three modules: 1) the threat detection module, 2) the alert correlation module and 3) the prediction module. Network traffic is first passed to the threat detection module, in which the authors built several sub-modules to detect the individual stages of a multi-stage attack. The alerts output by the threat detection module are passed to the alert correlation module, which filters redundant alerts and clusters the remaining ones based on a correlation time interval. Its correlation indexing sub-module then determines, from the alert correlation score, whether a given scenario is a full APT scenario or a sub-APT scenario. The prediction module takes the sub-APT scenarios and predicts the probability of each becoming a full APT scenario. Based on that prediction, alerts are escalated to the network security team so that it can break the APT kill chain. The novelty of this research lies in detecting APTs across all life cycle phases. In addition, the MLAPT system monitors and detects APT attacks in real time with a reported accuracy of 81%.
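The alert correlation stage described above can be illustrated with a minimal Python sketch. It is not the MLAPT implementation: the `Alert` fields, the step labels in `STEPS`, the definition of a redundant alert (an exact duplicate), the per-host clustering and the scoring rule (fraction of expected steps observed) are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical step labels; the real system defines its own detection sub-modules.
STEPS = ("delivery", "exploitation", "c2", "exfiltration")


@dataclass(frozen=True)
class Alert:
    timestamp: float  # seconds since an arbitrary epoch
    host: str         # internal host the alert refers to
    step: str         # APT step reported by a detection sub-module


def filter_redundant(alerts):
    """Drop exact duplicates and return the rest ordered by time."""
    seen = set()
    kept = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if alert not in seen:
            seen.add(alert)
            kept.append(alert)
    return kept


def cluster_by_window(alerts, window):
    """Group each host's alerts into clusters spanning at most `window` seconds."""
    by_host = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        by_host[alert.host].append(alert)
    clusters = []
    for host_alerts in by_host.values():
        current = [host_alerts[0]]
        for alert in host_alerts[1:]:
            if alert.timestamp - current[0].timestamp <= window:
                current.append(alert)
            else:
                clusters.append(current)
                current = [alert]
        clusters.append(current)
    return clusters


def label_scenario(cluster):
    """Score a cluster by the share of expected steps it covers (assumed rule)."""
    steps_seen = {a.step for a in cluster} & set(STEPS)
    score = len(steps_seen) / len(STEPS)
    if len(steps_seen) == len(STEPS):
        return score, "full"
    if len(steps_seen) >= 2:
        return score, "sub"
    return score, "isolated"
```

In this sketch a "sub" cluster is what a prediction stage would take as input, estimating how likely the missing steps are to follow.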

#### **3.7 Detection of APT attacks using fractal dimensions**

Detecting APT network patterns is a complex task because APT traffic mimics the behaviour of regular TCP traffic. APT malware opens and closes TCP connections to its C2C servers like any legitimate connection and transfers only minimal data in order to stay under the radar. Single-scale analysis does not capture the complexity of this kind of APT traffic, which lowers detection accuracy. Researchers also found that current supervised ML models use Euclidean-based error minimization, which results in a high false-positive rate when detecting complex APT traffic. To address these issues, Sana Siddiqui et al. proposed an APT detection model that uses multi-fractal analysis to extract the hidden information in TCP connections [20]. The authors first took 30% of the dataset as labelled data and computed the prior correlation fractal dimension values for the normal and APT data points. Both computed values are loaded into memory before the remaining 70% of the dataset, which is unlabelled, is processed. Each point in the remaining 70% is added in turn to both the normal and the APT labelled datasets, and the posterior fractal dimension values are calculated. The absolute difference between the prior and posterior values is then calculated for both the normal and the APT samples to determine the cluster closest to the data point. If fd\_anom (the absolute difference between prior and posterior for the APT sample) ≤ fd\_norm (the absolute difference between prior and posterior for the normal sample), the data point is classified as an APT sample; otherwise it is classified as normal. According to the experimental observations, fractal dimension based ML models perform better in terms of accuracy (94.42%) than Euclidean-based ML models.
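The decision rule above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: it assumes the correlation fractal dimension is estimated as the least-squares slope of log C(r) against log r, where C(r) is the fraction of point pairs no further apart than r, and the function names and the choice of `radii` are hypothetical.

```python
import math
from itertools import combinations


def correlation_dimension(points, radii):
    """Estimate the correlation fractal dimension of a point set.

    C(r) is the fraction of point pairs whose distance is at most r;
    the estimate is the least-squares slope of log C(r) versus log r.
    """
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    xs, ys = [], []
    for r in radii:
        c = sum(d <= r for d in dists) / len(dists)
        if c > 0:  # log(0) is undefined, so skip empty radii
            xs.append(math.log(r))
            ys.append(math.log(c))
    if len(xs) < 2:
        raise ValueError("need at least two radii with non-empty neighbourhoods")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify(point, normal, apt, radii):
    """Label a point by which labelled set it perturbs least."""
    prior_norm = correlation_dimension(normal, radii)
    prior_anom = correlation_dimension(apt, radii)
    post_norm = correlation_dimension(list(normal) + [point], radii)
    post_anom = correlation_dimension(list(apt) + [point], radii)
    fd_norm = abs(post_norm - prior_norm)
    fd_anom = abs(post_anom - prior_anom)
    return "apt" if fd_anom <= fd_norm else "normal"
```

The priors depend only on the labelled sets, so in practice they would be computed once and cached, which matches the description of loading both values into memory before the unlabelled points are processed.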

*DMAPT: Study of Data Mining and Machine Learning Techniques in Advanced Persistent Threat… DOI: http://dx.doi.org/10.5772/intechopen.99291*

#### **3.8 APT detection using context-based detection framework**

Paul Giura et al. proposed a conceptual framework known as the Attack Pyramid for APT detection [21]. In this approach, the goal of the attack (data exfiltration in most cases) is identified and placed at the top of the pyramid. Furthermore, the model identifies the planes, such as the user plane, application plane, network plane and physical plane, in which attacks are most likely to occur. Using the proposed approach, one can identify correlations between events across different planes. In general, an APT attack spans multiple planes as the attack life cycle progresses, so it is possible to identify attack contexts that span multiple attack planes. Events from different sources, i.e. VPN logs, firewall logs, IDS logs, authentication logs and system event logs, are passed as data sources to the detection engine. From these logs, the context of an attack is identified using correlation rules. In the next step, suspicious activities are identified by matching the attack contexts against a signature database. This model requires the signatures to be updated at regular intervals in order to identify new attack contexts in real-time scenarios.
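The matching of cross-plane attack contexts against signatures can be sketched as follows. This is a hypothetical illustration, not the framework's implementation: the `Event` fields, the event names, the single entry in `SIGNATURES` and the rule that a context is one target's time-ordered events are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float
    target: str  # user or asset the event relates to
    plane: str   # "user", "application", "network" or "physical"
    kind: str    # normalised event type extracted from a log source


# Hypothetical signature database: ordered (plane, kind) steps per attack context.
SIGNATURES = {
    "exfiltration-via-vpn": (
        ("user", "failed_login_burst"),
        ("network", "vpn_login_new_location"),
        ("application", "bulk_file_read"),
        ("network", "large_outbound_transfer"),
    ),
}


def build_contexts(events):
    """Correlate events into one time-ordered context per target."""
    contexts = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        contexts[event.target].append(event)
    return contexts


def contains_in_order(pattern, sequence):
    """True if every step of `pattern` occurs in `sequence` in the same order."""
    remaining = iter(sequence)
    return all(step in remaining for step in pattern)


def match_contexts(events, signatures=SIGNATURES, min_planes=2):
    """Return (target, signature name) pairs for contexts spanning several planes."""
    matches = []
    for target, context in build_contexts(events).items():
        if len({e.plane for e in context}) < min_planes:
            continue
        observed = [(e.plane, e.kind) for e in context]
        for name, pattern in signatures.items():
            if contains_in_order(pattern, observed):
                matches.append((target, name))
    return matches
```

Because detection depends entirely on the contents of `SIGNATURES`, a context that matches no stored pattern goes unreported, which is why the signature database has to be refreshed regularly.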

#### **3.9 APT detection system based on API log data mining**

Chun-I Fan et al. [22] proposed a generalised approach to APT detection using system call log data. The model is built on the principles of dynamic malware analysis, with API call (system call) events passed through a detection engine. The novelty of this work lies in how the API calls are handled. Modern APT malware often creates child processes or injects code into a new process to evade detection. The authors created a program named "TraceHook" that monitors all such code injection activities. TraceHook outputs the API call counts for the executable samples (benign or malware), and a machine learning classifier is built on top of these API count values. The proposed model monitors only six important DLLs and can be combined with other APT detection models to build a robust APT detection engine.
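Turning API call logs into count features for a classifier might look like the sketch below. Everything concrete in it is an assumption: the log line format `<pid> <dll>!<api>`, the three placeholder entries in `MONITORED_DLLS` (the source does not list the six DLLs the authors monitor) and the nearest-centroid classifier, which stands in for whatever model the authors trained.

```python
import math
from collections import Counter

# Placeholder list only; not the six DLLs used in the original work.
MONITORED_DLLS = ("kernel32.dll", "ntdll.dll", "ws2_32.dll")


def count_api_calls(log_lines):
    """Count calls per API name across all processes of one sample.

    Assumed line format: '<pid> <dll>!<api>'. Counting across PIDs means
    calls made by child or injected processes are included.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2 or "!" not in parts[1]:
            continue  # ignore malformed lines
        dll, api = parts[1].split("!", 1)
        if dll.lower() in MONITORED_DLLS:
            counts[api] += 1
    return counts


def to_vector(counts, vocabulary):
    """Project the counts onto a fixed, ordered list of API names."""
    return [counts.get(api, 0) for api in vocabulary]


def centroid(vectors):
    """Component-wise mean of equally sized feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest_centroid(vector, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))
```

A fixed `vocabulary` keeps the feature vectors comparable across samples, so the same vectors could be passed to any other classifier.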

#### **3.10 Ensemble models for C2C communication detection**

Identifying and stopping a particular life cycle event can break the full APT cycle and considerably reduce the damage. Based on this idea, researchers have proposed various methods to stop malicious C2C communication. Modern malware communicates with its C2C server with the help of Domain Generation Algorithms (DGAs). A DGA creates a dynamic list of domain names, of which only a few are active, each for a limited amount of time. As a result, the malware contacts a different C2C domain name for every successful communication. This practice helps the malware to avoid detection by traditional antivirus software, firewalls and other network scanning software. Anand et al. [23] proposed a classification technique to detect character-based DGAs, in which domain names are constructed by concatenating characters in a pseudo-random manner, for example wqzdsqtuxsbht.com. In this method, the authors extracted various lexical features, such as n-grams and character frequencies, together with statistical features to build an ensemble classifier. The proposed model detects character-based DGA domain names with a reported accuracy of 97%. Charan et al. [24] proposed a similar technique to detect word-based DGA domain names, which are constructed by concatenating two or three dictionary words, for example crossmentioncare.com. In their model, the authors consider lexical, statistical and network-based features to build an ensemble classifier. Combining the two models makes it possible to detect C2C communication that relies on either kind of DGA.
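The kind of lexical and statistical features mentioned above can be illustrated with a short Python sketch. The feature set here (label length, Shannon entropy, vowel and digit ratios, distinct bigram ratio) is an assumed example, not the feature list of either cited model, and the function names are hypothetical.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def char_ngrams(text, n):
    """Return the overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def shannon_entropy(text):
    """Shannon entropy, in bits per character, of the character distribution."""
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def lexical_features(domain):
    """Compute simple lexical features from the first label of a domain name."""
    label = domain.lower().split(".")[0]  # assumes the first label is the generated part
    bigrams = char_ngrams(label, 2)
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        "vowel_ratio": sum(ch in VOWELS for ch in label) / len(label),
        "digit_ratio": sum(ch.isdigit() for ch in label) / len(label),
        "distinct_bigram_ratio": len(set(bigrams)) / len(bigrams) if bigrams else 0.0,
    }
```

For the two example domains in the text, the vowel ratio alone already separates them: wqzdsqtuxsbht contains one vowel in 13 characters, while crossmentioncare contains six in 16. Word-based DGAs look far more like natural language, which is why the second model adds network-based features.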
