**4. Experimental setup**

Our experimental setup comprises a real X-ray security imagery dataset and one constructed using the TIP-based synthetic compositing approach outlined in Section 3.1. Both are evaluated within a common CNN training environment using the CNN architecture outlined in Section 3.2.

Dbf3*Real* dataset: The Durham Dataset Full Three-class (Dbf3) images are generated using a Smiths Detection dual-energy X-ray scanner (**Figure 1**). The dataset consists of 7603 images in total, divided into three classes of prohibited item. In this experiment we use a subset of the dataset covering three types of metallic prohibited item, {*Firearm, Firearm Parts, Knives*}. Across these three classes, we incorporate 3192 images of firearms, 1204 images of firearm parts, and 3207 images of knives, all within cluttered and complex X-ray security imagery.

Dbf3*SC* dataset: The Synthetically Composited (SC) dataset is generated using the TIP approach of Section 3.1. We use 3366 benign X-ray security images, generated by a Smiths Detection X-ray scanner, and 123 individual prohibited objects from three classes, {*Firearm, Firearm Parts, Knives*}. The prohibited items are composited into the benign images to create a synthetically composited X-ray security imagery dataset containing the same number of images as *Dbf3Real*. Exemplar images from *Dbf3Real* (**Figure 4A**) and *Dbf3SC* (**Figure 4B**) show that the synthetically composited images are visually realistic and challenging to distinguish from the real images.

Dbf3*Real*+*SC* dataset: A subset of *Dbf3Real* and a subset of *Dbf3SC* are combined to create this dataset. Real and synthetic images are used in equal numbers (50% each), so that the combined dataset is the same size as *Dbf3Real*.
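One way to assemble such an equally mixed dataset is sketched below. This is a minimal illustration, not the procedure used in the experiments: the function name `build_mixed_dataset`, the uniform random sampling, the fixed seed, and the choice to give the synthetic source the one extra image when the target size is odd are all assumptions. The source does not state whether the subsets were drawn per class.

```python
import random


def build_mixed_dataset(real_items, sc_items, total_size, seed=0):
    """Draw about half of total_size from each source, without replacement.

    Hypothetical helper: uniform sampling and the odd-size tie-break are
    assumptions made for illustration only.
    """
    rng = random.Random(seed)
    n_real = total_size // 2
    n_sc = total_size - n_real  # the SC source absorbs the extra item if odd
    mixed = rng.sample(real_items, n_real) + rng.sample(sc_items, n_sc)
    rng.shuffle(mixed)
    return mixed


# Placeholder identifiers standing in for image files.
real = [("real", i) for i in range(7603)]
sc = [("sc", i) for i in range(7603)]
mixed = build_mixed_dataset(real, sc, total_size=7603)
```

Because 7603 is odd, an exact 50% share is not possible; the sketch resolves this by letting one source contribute a single additional image.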

#### **Figure 4.**

*Visual comparison of real (A) and SC (B) X-ray security imagery of prohibited items.*

The CNN architectures (Section 3.2) are trained on a GTX 1080Ti GPU and optimised by Stochastic Gradient Descent (SGD) with a weight decay of 0.0001, a learning rate of 0.01 and termination at a maximum of 180k epochs. ResNet50 and ResNet101 are chosen as network backbones to operate within the detection framework of [28]. We split each dataset into training (60%), validation (20%) and test (20%) sets so that each split has a similar class distribution. All CNN architectures are initialised with ImageNet [5] pretrained weights for their respective models [29].
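A 60/20/20 split that keeps a similar class distribution in each part can be obtained by splitting every class separately, as in the following sketch. It is an illustration under stated assumptions rather than the authors' tooling: the function `stratified_split`, the rounding rule, and the seed are hypothetical, and the labels are generated from the per-class counts reported for *Dbf3Real*.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, val, test) index lists with similar class proportions.

    Hypothetical helper: each class is shuffled and cut independently, and
    the test set receives whatever remains after rounding.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test


# Labels built from the class counts reported for Dbf3Real.
labels = ["Firearm"] * 3192 + ["Firearm Parts"] * 1204 + ["Knives"] * 3207
train, val, test = stratified_split(labels)
```

Splitting per class matters here because the classes are imbalanced: firearm parts make up roughly 16% of the images, so a purely random split could leave that class under-represented in the validation or test set.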
