## **2.1 DeepAPT: APT attribution using deep neural network and transfer learning**

APT attribution is challenging for the security community for several reasons. Chiefly, state-sponsored APTs are developed under the supervision of different units and ship with built-in anti-VM and anti-debugging techniques that obfuscate their payloads, which makes feature extraction extremely difficult for most security firms. In addition, APT malware is highly targeted, so very few samples are available for analysis. To address these issues, Rosenberg et al. proposed an APT attribution technique based on a Deep Neural Network (DNN) classifier [9]. The authors used 3200 malware samples for training the DNN classifier, 400 samples for validation and 1000 samples for testing. All APT malware samples were executed in a Cuckoo sandbox environment, and the generated reports were used as raw input for training the classifier, since a DNN can learn high-level features on its own from raw inputs. To train the DNN more effectively, the authors removed the 50,000 most frequent words from the input features of all Cuckoo reports, so the model relies on the less common words in the reports to perform APT attribution. The DNN is a 10-layer, fully connected network (50,000 neurons at the input layer and 2,000 in the first hidden layer) with ReLU activation functions. The final trained model achieved 98.6% accuracy on the test data. The authors also applied transfer learning to the trained DNN by removing and retraining the top-layer neurons; after transfer learning, the model still achieved 97.8% accuracy.
A t-distributed stochastic neighbour embedding (t-SNE) projection, used to reduce the 500-dimensional space to 2 dimensions, shows that the trained model separates the different APT malware groups, as shown in **Figure 3**.
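The report preprocessing described above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`tokenize`, `build_vocabulary`, `encode`), the whitespace tokenisation, the use of per-report frequency, the binary presence encoding and the toy reports are all assumptions made for illustration, and `n_drop` stands in for the 50,000-word cut-off.

```python
from collections import Counter


def tokenize(report_text):
    """Split a raw sandbox report into lowercase whitespace-delimited words."""
    return report_text.lower().split()


def build_vocabulary(reports, n_drop):
    """Return a sorted vocabulary without the n_drop most frequent words.

    Assumption: frequency is the number of reports that contain the word.
    """
    doc_freq = Counter()
    for text in reports:
        doc_freq.update(set(tokenize(text)))
    # Rank by descending frequency, then alphabetically so ties are stable.
    ranked = sorted(doc_freq, key=lambda w: (-doc_freq[w], w))
    dropped = set(ranked[:n_drop])
    return sorted(w for w in doc_freq if w not in dropped)


def encode(report_text, vocabulary):
    """Encode one report as a binary vector: 1 if the word occurs, else 0."""
    present = set(tokenize(report_text))
    return [1 if word in present else 0 for word in vocabulary]


# Toy stand-ins for sandbox reports (hypothetical content).
reports = [
    "createfile regsetvalue connect evil.example",
    "createfile regsetvalue writeprocessmemory",
    "createfile connect mutex_abc",
]

vocab = build_vocabulary(reports, n_drop=1)  # drops only "createfile"
vectors = [encode(r, vocab) for r in reports]

for report, vector in zip(reports, vectors):
    print(vector, "<-", report)
```

Each resulting vector has one entry per vocabulary word, which corresponds to one input neuron of the fully connected network; at the scale reported in the text, a sparse representation would likely be preferable to plain Python lists.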

**Figure 3.** *2-dimensional visualisation of APT families using t-SNE algorithm [9].*

*DMAPT: Study of Data Mining and Machine Learning Techniques in Advanced Persistent Threat… DOI: http://dx.doi.org/10.5772/intechopen.99291*

## **2.2 APT attribution based on threat intelligence reports**

Most APT attribution techniques rely heavily on analysing the malware samples used in a particular campaign. The key disadvantage of this strategy is that the same malware samples can be used in several operations, and in some situations APT groups simply buy malware from the dark web to suit their requirements. ML models built solely on malware samples may therefore not attribute APTs reliably. To address this issue, Perry et al. proposed a method named NO-DOUBT (Novel Document Based Attribution), which builds models on threat intelligence reports with the help of Natural Language Processing (NLP) techniques [10]. The authors collected 249 threat intelligence reports on 12 different APT actors and treated APT attack attribution as a multi-class text classification problem. The proposed model consists of two main phases, as shown in **Figure 4**. In the training phase, the labelled reports are transformed into a vector representation with the help of word embeddings. To generate this representation, the authors propose the SMOBI (Smoothed Binary Vector) algorithm, which computes cosine similarities between the words in the labelled data set and the word embeddings to form a large n × m matrix. This vector representation and the labels are given to an ensemble XGBoost classifier to construct a known actor model. In the deployment phase, new (unlabelled) test reports are converted to the same vector representation and given to the known actor model, which outputs probability predictions for the known classes. These probability predictions are then passed to a New Actor Model, a binary classifier that decides whether the report belongs to a known APT actor or to a new, unknown actor, to make the final prediction.
Although this model struggles to detect the Deep Panda and APT29 actors, SMOBI-based APT attribution outperforms previous text-based APT attribution models (unigrams + bigrams and tf-idf) in terms of accuracy, precision and recall.
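A minimal sketch of a smoothed binary vector is given below. It reflects one plausible reading of the description above rather than the authors' exact algorithm: for each vocabulary term, the entry is assumed to be the highest cosine similarity between that term and any word in the report, so an exact match yields 1 and a related word yields a value below 1. The function names, the two-dimensional toy embeddings and the example reports are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def smoothed_binary_vector(report_words, vocabulary, embeddings):
    """Build one row of the report-by-vocabulary matrix.

    Assumption: each entry is the maximum cosine similarity between the
    vocabulary term and any report word that has an embedding. Every
    vocabulary term is assumed to have an embedding.
    """
    known = [w for w in report_words if w in embeddings]
    row = []
    for term in vocabulary:
        sims = [cosine(embeddings[term], embeddings[w]) for w in known]
        row.append(max(sims, default=0.0))
    return row


# Toy 2-dimensional word embeddings (hypothetical values).
embeddings = {
    "backdoor": [1.0, 0.0],
    "trojan": [0.8, 0.6],
    "phishing": [0.0, 1.0],
}
vocabulary = ["backdoor", "phishing"]

# Two tokenised toy reports: one row per report, one column per term.
reports = [["trojan"], ["backdoor", "phishing"]]
matrix = [smoothed_binary_vector(r, vocabulary, embeddings) for r in reports]

for row in matrix:
    print([round(x, 3) for x in row])
```

Unlike a plain binary bag-of-words, the first report receives non-zero entries for both terms even though it contains neither of them, because "trojan" lies close to them in the embedding space. Rows of this kind, together with the actor labels, would then be passed to a classifier such as XGBoost.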

**Figure 4.** *NO-DOUBT method for APT attribution [10].*
