**Abstract**

Robust perception is generally achieved through complex multimodal perception pipelines, but such methods are unsuitable for autonomous UAV deployment, given the restrictions of these platforms. This chapter describes the developments and experimental results obtained while building new deep learning (DL) solutions for industrial perception problems. An earlier solution, which combined camera, LiDAR, GPS, and IMU sensors to produce high-rate, accurate, and robust detection and positioning of pipes in industrial environments, is to be replaced by a single-camera, computationally lightweight perception technique based on a convolutional neural network (CNN). Developing DL solutions requires large image datasets with ground truth labels, so the previous multimodal technique is modified to capture and label datasets. The labeling method developed computes the labels automatically, when possible, for the images captured with the UAV platform. To validate the automated dataset generator, a dataset is produced and used to train a lightweight AlexNet-based fully convolutional network (FCN). To provide a comparison point, a weakened version of the multimodal approach (one that does not use prior data) is evaluated with the same DL-based metrics.

**Keywords:** deep learning, autonomous robotics, UAV, multimodal perception, computer vision

## **1. Introduction**

Robotics, as a commercial technology, became widespread some decades ago and, far from declining, has kept growing year after year with new contributions in all the related fields that it integrates. The introduction of new materials, sensors, actuators, software, communications, and use scenarios has turned robotics into a driving field that reaches into our everyday life. New robotic morphologies are the most striking aspect that society perceives (i.e., the first models of each type generally produce the largest impact), but the long-term success of robotics lies in its capability to automate productive processes. Manufacturers and developers know that the market is found not only in large-scale companies (mainly car manufacturers and electronics) but also in the small and medium-sized enterprises (SMEs) that provide solutions to tasks still performed manually. Robotics has also opened the door to new applications that did not exist some years ago and that are attractive to investors. These facts, together with lower equipment prices, better programming and communication tools, and new, fast-growing, user-friendly collaborative robotic frameworks, have pushed robotics technology to the cutting edge in many areas.

It is clear that industrial robotics leads the market worldwide, although social and gaming uses of robots have also increased sales. Nevertheless, the most promising scenario for the present and the short term is the use of robots in commercial applications beyond the plant floor. Emergency systems, inspection and maintenance of facilities of any kind, rescue, surveillance, agriculture, fishing, border patrolling, and many other applications (military uses aside) attract users and clients because robots increase the productivity of the different sectors; low prices and high profitability are the key factors.

There are many robot morphologies and types (surface, underwater, aerial, underground, legged, wheeled, tracked, etc.), but the authors want to draw attention to unmanned aerial vehicles (UAVs), which have several properties that make them attractive for a set of applications that cannot be carried out with any other type of robot. First, these autonomous robots can fly and can therefore reach areas that humans or other robots cannot. They are light, easy to move from one area to another, and can be adapted to any area, terrain, soil, building, or facility. The drawbacks are their fragility in adverse weather and an autonomy that is quite limited compared with unmanned surface vehicles (USVs).

UAVs have ushered in a new era of cheap, easy applications that were unthinkable until now. The authors would like to focus on their use in the maintenance and inspection of industrial facilities, and specifically in the inspection of pipes in big, complex factories (mainly gas and oil companies), where the manual inspection (and even the location and mapping) of pipes becomes an impossible task. Manned helicopters (with combustion engines) cannot fly close to pipes, let alone among a bundle of pipes. Scaffolding cannot be erected on complex, unstable, and fragile pipes to inspect them manually. Therefore, a complex problem can be solved by using UAVs to inspect pipes of different diameters, colors, textures, and conditions in hazardous factories. This problem is not new, and some solutions have been brought to an incipient market. Works such as [1, 2] propose creating a map of the pipe set by navigating among the pipes with odometry and inertial units [3]. Obstacle avoidance in a crowded 3D world of pipes becomes of great interest when planning a flight; in [4], some contributions are made in this direction, although the object accuracy is insufficient for the technology to be reliable. The work in [5] overcomes some of these problems by using a wide range of sensors (cameras, laser, barometer, and ultrasound), but this, together with a computationally inefficient software scheme, made the UAV too heavy and unreliable because of the excessive sensor fusion.

Many of the technical developments that have helped robotics grow have had a wider impact, especially those related to increasing computational power and parallelization levels. Faster processors, with tens of cores and additional multithreading capabilities, and modern GPUs (graphics processing units) have led to the emergence of GPGPU (general-purpose computing on GPU). This type of computing technique has led to huge advances in the artificial intelligence (AI) field, giving rise to the "deep learning" field. The deep learning (DL) field focuses on artificial neural networks (ANNs) with tens or hundreds of layers, exploiting the huge parallelization capabilities of modern GPUs. These GPUs offer computational cores (e.g., CUDA cores) that, compared one to one with a processor core, are less powerful and slower, but that are available in the hundreds or thousands. This has allowed the transition from shallow ANNs to deeper architectures and innovations such as several types of convolutional layers.

In this work, the authors present a novel approach to detect pipes in industrial environments based on fully convolutional networks (FCNs). These will be used to extract the apparent contour of the pipes, replacing most of the architecture developed in [6] and discussed in Section 2. To properly train these networks, a custom dataset relevant to the domain is required, so the authors captured a dataset and developed an automatic label generation procedure based on previous work. Two different state-of-the-art semantic segmentation approaches were trained and evaluated with the standard metrics to prove the validity of the whole approach. Thus, the following section discusses some generalities about the pipe detection and positioning problem, together with the authors' previous work on it [6], as it will be relevant later. The next section discusses the semantic segmentation problem as a way to extract the apparent contour, surveying both the classical methods considered for earlier works and state-of-the-art deep-learning-based methodologies. The fourth section describes how the automatic label generator using multimodal data was derived, along with some features of the process. The experimental section starts by discussing the metrics employed to validate the results and the particularities of the domain dataset generated, and then describes how an AlexNet FCN architecture was trained through transfer learning and the results achieved. To conclude, some discussion on the quality of the results and possible enhancements is introduced, considering which would be the best strategies to follow in continuing this research.

*Deep Learning-Based Detection of Pipes in Industrial Environments*
*DOI: http://dx.doi.org/10.5772/intechopen.93164*

## **2. Related work**

As discussed above, inspection and surveying are frequent problems to which UAV technologies are applied. The most common scenario is that of a hard-to-reach infrastructure that is visually inspected through different sensors onboard a piloted UAV. Some projects have proposed the introduction of higher-level perception and automation capacities, depending on the specific problem. In these cases, it is common to join state-of-the-art academic and industrial expertise to reach functional solutions.

In one of these projects, the specific challenge of accurately detecting and positioning a pipe in real time using only the hardware deployable in a small (per industry standards) UAV platform was considered (**Figure 1**), with several solutions studied and tested (including vision- and LiDAR-based techniques).

**Figure 1.**
*One of the UAVs used for the development of perception tasks in the AEROARMS project. Several sensors were deployed and processed with a set of SBCs (single-board computers), including a Velodyne LiDAR, two different cameras, an ultrasonic range-finder (height), and optical flow.*

In the case of LiDAR-based detection, finding a pipe is generally treated as a segmentation problem in the sensor space (using ℝ³ data collected as "*point clouds*"). There are many methods used for LiDAR detection, but the most successful are based on stochastic model fitting and registration, commonly RANSAC (random sample consensus [7]) or derived approaches [8, 9]. Three different data density levels were tested using the libraries available through ROS: using RANSAC over a map estimated by a SLAM technique, namely LOAM [10]; detecting the pipe
