**1. Introduction**

To ensure transport and border security, X-ray security screening is commonplace at public transport and border installations such as airports and railway and metro stations. However, X-ray imagery is cluttered and complex (**Figure 1**): tightly packed items within baggage make it challenging and time-consuming to identify the presence of prohibited items. With such prohibited items occurring only rarely in practice, previous studies cite time constraints as a major factor in the performance limitations of human operators for this screening task [1, 2].

#### **Figure 1.**

*Exemplar X-ray security baggage images with prohibited objects (red box): (A) Firearm, (B) Firearm Parts and (C) Knife.*

Whilst the task is challenging for a human operator, a reliable automatic prohibited item detection system may help improve the performance and throughput of such screening processes [3]. Contemporary X-ray security scanners already implement material discrimination via dual-energy, multiple-view X-ray imagery to enable threat material detection [4]. This use of dual-energy X-ray gives rise to the false-colour mapped appearance of X-ray security imagery (e.g., metals, alloys or hard plastics are shown in blue while less dense objects are shown in green/orange; see **Figure 1**).

Convolutional Neural Network (CNN) based methods have proven effective in detecting a wide range of object classes within this context [5–8]. However, the performance of such object detection approaches is heavily reliant on the availability of a substantial volume of labelled X-ray imagery. Unfortunately, X-ray imagery datasets suitable for training CNN architectures are scarce, and those that are available are restricted in size and item coverage (e.g., GDXray [9], SIXray [10]).

It is commonly challenging to collect sufficient X-ray imagery containing examples of prohibited items with large variations in pose, scale and item construction. To overcome this challenge, contemporary data augmentation schemes such as image translation, rotation, flipping and re-scaling are applied to enlarge otherwise limited training datasets [5]. However, the resulting augmented dataset still lacks diversity in terms of prohibited item variation and inter-occlusion emplacement within complex and cluttered X-ray security imagery. This motivates the use of synthetically composed imagery, which readily and efficiently introduces more variability in pose, scale and prohibited item usage.
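For illustration only, the conventional augmentation operations named above (translation, rotation, flipping and re-scaling) can be sketched with NumPy as follows. This is a minimal sketch and not the pipeline used in [5] or in this work: the function names `augment` and `rescale_nearest`, the `max_shift` parameter, the restriction to 90-degree rotations, the wrap-around translation and the nearest-neighbour re-scaling are all assumptions made to keep the example short.

```python
import numpy as np


def augment(image: np.ndarray, rng: np.random.Generator, max_shift: int = 8) -> np.ndarray:
    """Randomly flip, rotate (multiples of 90 degrees) and translate an H x W x C image.

    Hypothetical helper for illustration. Odd rotation counts swap H and W
    for non-square inputs.
    """
    out = image
    if rng.random() < 0.5:
        out = np.flip(out, axis=1)  # horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)), axes=(0, 1))
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # Wrap-around translation: pixels shifted off one edge re-enter at the other.
    out = np.roll(out, shift=(int(dy), int(dx)), axis=(0, 1))
    return np.ascontiguousarray(out)


def rescale_nearest(image: np.ndarray, scale: float) -> np.ndarray:
    """Re-scale an image by `scale` using nearest-neighbour sampling."""
    h, w = image.shape[:2]
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]
```

Every output of `augment` contains exactly the pixels of its input, merely rearranged. This makes the limitation discussed above concrete: such transforms cannot introduce a new prohibited item, a new item construction or a new occlusion pattern.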

In this work, we devise a Synthetically Composited (SC) data augmentation approach via the use of Threat Image Projection (TIP). TIP is an established process within operational aviation security for monitoring human operators: it uses a smaller collection of X-ray imagery comprising only isolated prohibited objects, which are superimposed onto more readily available benign X-ray security imagery. Here, this approach additionally facilitates the generation of synthetic yet realistic prohibited X-ray security imagery for the purpose of CNN training. Our key contributions are the following: (a) the synthesis of high-quality prohibited images from benign X-ray imagery using a documented TIP approach and (b) an extended comparative evaluation of how real and synthetically generated X-ray imagery impacts the performance of prohibited object detection and classification using CNN architectures.
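As a rough illustration of the superimposition step that TIP relies on, the sketch below blends an isolated prohibited-item patch into a benign image. It is not the TIP procedure used in this work. It assumes a simple multiplicative model, motivated by the fact that X-ray transmission through stacked materials compounds multiplicatively, and it assumes both images are floating-point arrays in [0, 1] where 1.0 denotes no attenuation (white background). Applying this per channel to false-colour imagery is a further simplification. The function name `superimpose` and its parameters are hypothetical.

```python
import numpy as np


def superimpose(benign: np.ndarray, threat: np.ndarray, top: int, left: int):
    """Multiplicatively blend `threat` into `benign` at (top, left).

    Assumed convention: values in [0, 1], with 1.0 meaning no attenuation, so
    background pixels of the threat patch leave the benign image unchanged.
    Returns the composite image and the (x_min, y_min, x_max, y_max) box of
    the inserted item, usable as a detection label.
    """
    h, w = threat.shape[:2]
    if top < 0 or left < 0 or top + h > benign.shape[0] or left + w > benign.shape[1]:
        raise ValueError("threat patch does not fit inside the benign image")
    out = benign.astype(np.float64, copy=True)
    out[top:top + h, left:left + w] *= threat  # attenuation compounds multiplicatively
    return out, (left, top, left + w, top + h)


# Usage: insert a uniformly attenuating 2 x 2 patch into a 6 x 6 benign image.
benign = np.full((6, 6), 0.8)
threat = np.full((2, 2), 0.5)
composite, box = superimpose(benign, threat, top=1, left=2)
```

Because the insertion position is chosen by the caller, the bounding-box annotation comes for free with every composite, which is part of what makes synthetic composition attractive for generating CNN training data.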
