#### **3.4 A study on cyber threat prediction based on intrusion event for APT attack detection**

Yong-Ho Kim et al. [17] proposed a theoretical model for APT detection that uses intrusion detection system (IDS) logs as its data source. Correlation rules between system events are identified from the IDS logs and used to build an attack graph; these correlations in turn help to predict future attacks. In the initial phase, the intrusion detection logs are collected and the corresponding intrusion events are extracted. The extracted events are passed to different functional blocks, each corresponding to a particular detection activity. One functional block identifies single-directional (host to C2C) and bi-directional (host to C2C and C2C to host) communication activities. Another block identifies repetitive intrusion events and combines them into a single event to reduce processing time and resource usage. A correlation analysis block identifies the context of the intrusion detection events and creates sequential rules based on the 5W1H principle (When, Where, Why, Who, What and How). Finally, the prediction engine considers the attack scenario and tries to predict one or more events that can occur after a single intrusion event. This module uses data mining measures such as support and confidence to select the most likely results. Processing time is one of the practical problems with this model, as some of the functional blocks take a long time to process events. Another important limitation is that the outcome of the model depends directly on the rules configured in the intrusion detection system.
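The merging of repetitive events and the support/confidence-based prediction described above can be illustrated with a minimal sketch. It is not the implementation from [17]: the event names, the grouping of events into sessions, and the exact definitions of support and confidence (computed here over "event a is directly followed by event b" rules) are assumptions made for illustration.

```python
from collections import Counter
from itertools import groupby


def collapse_repeats(events):
    """Merge runs of identical consecutive events into a single event."""
    return [key for key, _ in groupby(events)]


def mine_next_event_rules(sessions, min_support=0.0, min_confidence=0.0):
    """Mine rules 'a -> b' meaning 'b directly follows a'.

    Assumed definitions (not taken from the original paper):
      support(a -> b)    = sessions containing a followed by b / all sessions
      confidence(a -> b) = sessions containing a followed by b / sessions containing a
    """
    total = len(sessions)
    pair_count = Counter()
    event_count = Counter()
    for events in sessions:
        seq = collapse_repeats(events)
        # Count each event and each adjacent pair at most once per session.
        event_count.update(set(seq))
        pair_count.update(set(zip(seq, seq[1:])))

    rules = {}
    for (a, b), count in pair_count.items():
        support = count / total
        confidence = count / event_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules


def predict_next(event, rules):
    """Return candidate follow-up events, highest confidence first."""
    candidates = [(conf, b) for (a, b), (_, conf) in rules.items() if a == event]
    return [b for conf, b in sorted(candidates, reverse=True)]


# Hypothetical intrusion events grouped per host/session.
sessions = [
    ["scan", "scan", "exploit", "c2_beacon"],
    ["scan", "exploit", "c2_beacon", "exfil"],
    ["scan", "login_fail"],
]
rules = mine_next_event_rules(sessions)
print(predict_next("scan", rules))
```

In this toy data set the rule `exploit -> c2_beacon` holds in every session that contains `exploit`, so it receives the highest confidence. Raising `min_support` or `min_confidence` prunes weaker rules, at the cost of missing rarer attack paths.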

#### **3.5 APT detection using long short term memory neural networks**

Charan et al. [18] proposed an APT detection engine that takes SIEM event logs as input and uses LSTM neural networks to detect successful APT espionage. The authors use Splunk SIEM logs as the data source and stream the data to the Hadoop framework, where it is processed to obtain the event code for every activity. Based on the phases of the APT life cycle, the authors list the possible event codes and the sequences of codes that lead to successful APT espionage. The core of this work is identifying event codes that occur in a particular sequence, which requires memorising the event codes of previous states. In the proposed model, an LSTM (a variant of the RNN) is therefore used as the classifier, because it mitigates the vanishing gradient problem and can retain the event codes of previous states to confirm the presence of an APT attack. However, this model may suffer from a high false-positive rate when smart malware techniques are employed in crafting the APT attack.
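To make the role of the LSTM's memory concrete, the following is a minimal sketch of a single LSTM cell applied step by step to a sequence of event codes. It is not the model from [18]: the event codes `E1` to `E3`, the one-hot encoding, the hidden size, and the random (untrained) weights are all assumptions for illustration, and a real detector would train these weights and add a classification layer on top of the final hidden state.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(input_size, hidden_size, seed=0):
    """Random, untrained weights for the four LSTM gates (illustrative only)."""
    rng = random.Random(seed)
    params = {}
    for gate in ("forget", "input", "output", "candidate"):
        weights = [
            [rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
            for _ in range(hidden_size)
        ]
        bias = [0.0] * hidden_size
        params[gate] = (weights, bias)
    return params


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step on plain Python lists."""
    z = x + h_prev  # concatenation [x; h_prev]

    def affine(gate):
        weights, bias = params[gate]
        return [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weights, bias)]

    f = [sigmoid(a) for a in affine("forget")]       # how much old memory to keep
    i = [sigmoid(a) for a in affine("input")]        # how much new information to write
    o = [sigmoid(a) for a in affine("output")]       # how much memory to expose
    g = [math.tanh(a) for a in affine("candidate")]  # candidate memory content

    # The cell state is updated additively, which is what lets gradients
    # survive over long sequences.
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


def one_hot(code, vocab):
    return [1.0 if code == v else 0.0 for v in vocab]


def encode_sequence(codes, vocab, params):
    """Run the cell over an event-code sequence and return the final hidden state."""
    hidden_size = len(params["forget"][1])
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for code in codes:
        h, c = lstm_step(one_hot(code, vocab), h, c, params)
    return h


# Hypothetical event codes; real SIEM event codes would replace these.
vocab = ["E1", "E2", "E3"]
params = init_params(len(vocab), hidden_size=4)
print(encode_sequence(["E1", "E2", "E3"], vocab, params))
```

Because the hidden and cell states are carried from one step to the next, the final state depends on the order of the event codes and not only on which codes occurred, which is the property a sequence classifier needs here.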
