**2. Background**

Inspection analysis can be classified into one of the following categories [8]: *structural quality*, which searches for unnecessary components or missing required parts; *surface quality*, where object surfaces are inspected for wear, scratches, cracks, and other defects; *dimensional quality*, where the dimensions of the objects are checked against given tolerances; and *operational quality*, which evaluates the correctness of the quality inspection processes.

To date, different methods have been proposed for inspecting the welding process online [9]. They are designed for diverse defect types and differ in the data processed during the evaluation. Among the sensing technologies employed in the literature, optical detectors [10], acoustic measurements [11], and vision analysis [12] are the most widely used. For classification applications, artificial neural networks [13–15] and fuzzy inference systems [16, 17] are usually preferred thanks to the wide range of problems and the diversity of defects they can cope with, as in the classification of steel strip defects [18, 19].

However, the focus of these works is on defect classification and not on defect detection. Therefore, they cannot cope with feature understanding problems such as discriminating between good samples and defective ones. A different approach is proposed by Ak et al. [20], where X-ray images are used to detect defects in metal castings.

Recent literature offers plenty of research addressing the problem of welding localization, either employing off-the-shelf DL architectures or introducing slight modifications to the tail of popular networks. These approaches are mostly based on the R-CNN [21], Faster R-CNN [22], and YOLO [23] architectures. The reason behind their adoption is that these architectures usually require little fine-tuning to localize welding areas and spots efficiently. Such efficiency is strictly related to the presence of plain metal surfaces surrounding the weld, which enable simple and accurate segmentation of the feature under inspection. This is the case for resistance spot welding (RSW) processes, typically employed to connect metal sheets at a low cost and in a short time.

Concerning detection approaches, early methods based on traditional computer vision techniques [24] require hand-crafted features and complex threshold settings to adapt to environmental conditions. In contrast, approaches based on deep learning increase the robustness of the detection, coping with environmental noise and the sensitivity of the welding processes.
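The fragility of hand-tuned thresholds can be illustrated with a toy sketch. It is not taken from any of the cited works: the synthetic image, the pixel intensities, and the threshold value are all hypothetical, chosen only to show how a fixed threshold that works under one illumination can miss the same feature when the scene gets darker.

```python
def make_image(background, spot, size=8):
    """Build a toy grayscale image: uniform background with a 2x2 bright spot."""
    image = [[background] * size for _ in range(size)]
    for r in range(3, 5):
        for c in range(3, 5):
            image[r][c] = spot
    return image


def count_above(image, threshold):
    """Count pixels brighter than a fixed, hand-tuned threshold."""
    return sum(1 for row in image for px in row if px > threshold)


THRESHOLD = 128  # hypothetical value, tuned for the well-lit case only

well_lit = make_image(background=60, spot=200)
dim = make_image(background=30, spot=100)  # same scene at half the brightness

detected_well_lit = count_above(well_lit, THRESHOLD)
detected_dim = count_above(dim, THRESHOLD)

# The spot is found in the well-lit image but missed entirely in the dim one.
print(detected_well_lit, detected_dim)
```

In this toy setting the threshold would have to be retuned for every lighting condition, which is the kind of manual adaptation that learned detectors aim to avoid.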

The majority of approaches for welding spot localization are built upon the above-mentioned architectures. Fast R-CNN [25] computes the regions of interest (ROIs) on the shared feature map, thus improving upon the R-CNN architecture. Faster R-CNN integrates convolutional layers for object classification, feature extraction, bounding box regression, and region proposals into a single network, further improving the detection performance but still not reaching real-time capabilities. Unlike the R-CNN family, which has a two-stage detection architecture, YOLO implements a regression network with a grid of bounding boxes and associated class probabilities, thus enabling real-time detection on recent hardware. In the race for timing performance, YOLOv2 [26] borrowed the anchor mechanism from SSD [27] and Faster R-CNN, which also enhanced the network *accuracy*. Focusing on small object detection (such as welding spots), YOLOv3 [28] builds upon a backbone network combined with a feature pyramid network (FPN) [29], improving multi-scale prediction. The efficiency of the detection depends on the selected backbone network. Common choices are VGG [30], ResNet [31], DenseNet [32], and MobileNet [33]. These architectures differ in computational complexity, number of parameters, and inference speed.
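The grid-based regression idea behind YOLO can be made concrete with a minimal sketch of how a ground-truth box is assigned to a grid cell. It follows the encoding commonly described for the original YOLO (center offsets relative to the responsible cell, width and height normalized by the image size). The function name, the 448 x 448 image, the 7 x 7 grid, and the box values are illustrative assumptions, not details of the works cited above.

```python
def encode_box_yolo_grid(cx, cy, w, h, img_w, img_h, grid_size):
    """Encode a pixel-space box (center cx, cy; size w, h) on an S x S grid."""
    if not (0 <= cx < img_w and 0 <= cy < img_h):
        raise ValueError("box center must lie inside the image")

    # Center position in grid units, e.g. 3.5 means halfway through column 3.
    gx = cx / img_w * grid_size
    gy = cy / img_h * grid_size

    # The cell containing the center is responsible for predicting the box.
    col = int(gx)
    row = int(gy)

    return {
        "row": row,
        "col": col,
        "x_offset": gx - col,  # in [0, 1), relative to the cell
        "y_offset": gy - row,  # in [0, 1), relative to the cell
        "w_norm": w / img_w,   # relative to the whole image
        "h_norm": h / img_h,
    }


# Hypothetical 32 x 32 px welding spot centered at (224, 112) in a 448 x 448 image.
target = encode_box_yolo_grid(
    cx=224, cy=112, w=32, h=32, img_w=448, img_h=448, grid_size=7
)
print(target)
```

Because every cell regresses its boxes in a single forward pass, no separate region proposal stage is needed, which is where the speed advantage over two-stage detectors comes from.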

Considering the reduced dimension of small spot welds, the low-resolution feature maps in the backbone and the size of the convolution strides could cause a loss of information. To address this issue, Dai et al. [34] introduce a modified MobileNetV3 [35] architecture, obtaining a good tradeoff between *accuracy* and *timing*.
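A short back-of-the-envelope calculation shows why strides matter for small objects. The sketch below assumes a 416 x 416 input, cumulative strides of 8, 16, and 32 (typical values for multi-scale detectors), and a hypothetical 16 px spot weld; none of these numbers come from the cited works.

```python
def feature_map_size(input_px, stride):
    """Spatial size of the feature map after downsampling by `stride`."""
    return input_px // stride


def footprint_in_cells(object_px, stride):
    """How many feature-map cells an object of `object_px` pixels spans."""
    return object_px / stride


INPUT_PX = 416          # assumed input resolution
SPOT_PX = 16            # hypothetical spot-weld diameter in pixels
STRIDES = (8, 16, 32)   # assumed cumulative strides of the detection scales

summary = {
    s: (feature_map_size(INPUT_PX, s), footprint_in_cells(SPOT_PX, s))
    for s in STRIDES
}

for stride, (fmap, cells) in summary.items():
    print(f"stride {stride:2d}: feature map {fmap}x{fmap}, spot spans {cells} cells")
```

Under these assumptions the spot covers two cells at stride 8 but only half a cell at stride 32, so the coarsest scale carries very little information about it.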

Focusing on the classification and detection of defects over the welding area or joint, off-the-shelf solutions are no longer efficient by themselves, and some issues need to be addressed to enable the use of DNNs. Clustering and segmentation become difficult because the features to be recognized are not easily separable. This chapter introduces some of the most common issues in employing DL for industrial quality inspection, discussing the practical case of detecting welding defects in diesel injector heads.
