## **1. Introduction**

Presently, two approaches are typically used to monitor the condition of pavements: manual distress surveys and automated condition surveys using specially equipped vehicles. Traditionally, in order to determine the serviceability of road pavements, designated pavement officers perform on-site inspections, either by walk-observe-record or by windshield (drive-by) inspection, so as to record the roughness, rutting and surface distresses [1, 2]. With the advancement of sensor technology, numerous automatic pavement evaluation systems have been proposed over the last two decades to aid in pavement condition inspection [3]. Currently, several off-the-shelf commercial systems are widely used by road maintenance agencies for detailed pavement distress evaluation and dedicated crack analysis. Among these, Fugro Roadware's ARAN, CSIRO's RoadCrack and Ramböll OPQ's PAVUE are leading examples: integrated, full-fledged pavement evaluation systems equipped with Global Positioning System (GPS)/Inertial Measurement Unit (IMU) sensors, a Light Detection and Ranging (LiDAR) system, high-definition video cameras, and special lighting illumination systems [2]. Nonetheless, pavement condition monitoring technology does not appear to have kept pace with other technological improvements over the past several years. Furthermore, these monitoring and evaluation approaches remain reactive rather than proactive in detecting distresses and damage, since they merely record distress that has already appeared, and most of these methods require either significant personnel time or costly equipment. Thus, these systems and techniques can only be used cost-effectively on a periodic and/or localized basis, and may not allow for continuous long-term monitoring and deployment at the network level, due to limitations in hardware and software development and costs.


For sustainable and cost-effective road infrastructure management, the road agencies charged with road maintenance and repairs should be able to continuously collect road condition data across their networks, with the objective of building and implementing pavement information and management systems (PIMS) using non-destructive techniques. However, as stated above, data collection for a whole network, such as an entire city or town, is expensive and time consuming if pursued through traditional surveys. Developments in sensor technology for digital image acquisition, and in computer technology for image data storage and processing, now allow local agencies to use digital image processing for pavement distress analyses. In order to overcome the cost limitations in pavement data collection, this chapter demonstrates the pervasive and 'smart' use of low-cost consumer-grade devices for the acquisition of roadway condition data. With such devices, no dedicated and expensive platforms or drivers are needed for automated data collection, making them suitable for road condition surveys in terms of long-term costs, implementation and operations.

Besides the data acquisition systems, there have also been advancements in data collection techniques (e.g., [4–7]) and automated data processing techniques [8–10] to enhance the automation of pavement condition monitoring. Because of the irregularities of pavement surfaces, in terms of both noise and topographic structure, research is still ongoing on the accurate detection, classification and quantification of cracks and potholes. In addition, the computational costs of automated pavement distress detection are high, and better approaches are still needed for evaluating automated crack measurement systems under varying conditions [11].

The commercially available state-of-the-art systems, which comprise a digital camera and laser-illumination module mounted on laser road-imaging vehicles, cost about \$150,000. The pavement-surface profiler laser sensors, which are commonly used for measuring road rutting depth or surface roughness, cost in the range of \$130,000–\$150,000. Comparatively, mobile pavement imaging techniques and manual inspection approaches cost \$88.5/mile and \$428.8/mile, respectively, and the cost of using multi-sensor hybrid systems can range from \$541/mile to \$933/mile [2]. For fully automated pavement mapping systems, the cost of the imaging sensors and operations defines the purchase price, which averages approximately \$697,152 [12]. This chapter presents an approach for customizing a low-cost imaging system, the Kinect v2.0 sensor, as a prototype for cost-effective pavement imaging, together with a data processing pipeline for pothole detection and extraction on asphalt pavements.
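To put these per-mile rates in perspective, the minimal sketch below tallies the survey cost of each approach over a hypothetical network. The 500-mile network length is an assumed figure for illustration; the rates are those quoted from [2].

```python
# Survey cost of each approach over a hypothetical network, using the
# per-mile rates quoted from [2]. The 500-mile network length is an
# illustrative assumption, not a figure from the source.
NETWORK_MILES = 500

RATES_PER_MILE = {
    "mobile pavement imaging": 88.5,
    "manual inspection": 428.8,
    "multi-sensor hybrid (low end)": 541.0,
    "multi-sensor hybrid (high end)": 933.0,
}

for method, rate in RATES_PER_MILE.items():
    print(f"{method:<30} ${rate * NETWORK_MILES:>12,.2f}")
```

Even at the low end, a single hybrid-system pass over such a network exceeds a quarter of a million dollars, which motivates the low-cost consumer-grade alternative explored in this chapter.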


## **2. Measurement principle of the Kinect v2.0 RGB-D sensor**

The Kinect v2.0 is the successor of the first-generation Kinect RGB-D camera, the Kinect v1.0, which is similar to the Xtion Pro Live RGB-D camera. The version 2.0 Kinect RGB-D camera consists of a color (RGB) camera, an IR illuminator or projector, and an IR camera (**Figure 1(a)**). While the RGB camera records color information in high definition (HD), the IR projector emits an infrared laser and the IR camera senses the reflected infrared light. The Kinect v2.0 field of view is 70.6° in the horizontal and 60° in the vertical, as depicted in **Figure 1(c)**. The values in the *z*-direction (depth values) are calculated using the Time of Flight (ToF) principle [16, 17], as shown in Eq. (1), and the *x* and *y* values are determined from the homogeneous image coordinates *u* and *v*, as in Eqs. (2) and (3) [18]. The RGB and IR images acquired with the Kinect v2.0 only partially overlap, because the RGB color camera has a wider horizontal field of view (FOV) and the IR camera has a larger vertical FOV [15].

$$z = h = \frac{c \cdot \Delta \varphi}{4\pi f} \tag{1}$$

$$x = \frac{u - C_x}{f_x} \tag{2}$$

$$y = \frac{v - C_y}{f_y} \tag{3}$$

where *z* is the depth measured in meters, Δ*φ* is the phase shift, *c* is the speed of light and *f* is the modulation frequency; *x* is the horizontal position, *u* is the horizontal image coordinate, *C<sub>x</sub>* is the optical center in the X-direction and *f<sub>x</sub>* is the focal length in the X-direction; *y* is the vertical position, *v* is the vertical image coordinate, *C<sub>y</sub>* is the optical center in the Y-direction and *f<sub>y</sub>* is the focal length in the Y-direction. In **Figure 1(b)**, **P** is the measured point on the object surface, E is the IR emitter, C is the IR sensor, and *h* (or *z*) is the unknown distance of the measured point from the sensor origin.
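As a minimal sketch of how Eqs. (1)–(3) combine, the snippet below converts a ToF phase shift into depth and back-projects a pixel into sensor coordinates. The modulation frequency, focal lengths and optical center are illustrative assumptions (the real sensor uses multiple modulation frequencies to resolve phase ambiguity, and actual intrinsics come from calibration); note that Eqs. (2) and (3) as written yield normalized coordinates, which are scaled by the depth *z* to obtain metric positions under the standard pinhole model.

```python
import math

# Illustrative constants; the modulation frequency and intrinsics below
# are assumptions for this sketch, not calibrated Kinect v2.0 values.
C = 299_792_458.0      # speed of light, m/s
F_MOD = 80e6           # assumed ToF modulation frequency, Hz
FX, FY = 365.0, 365.0  # assumed focal lengths, px
CX, CY = 256.0, 212.0  # assumed optical center, px (512x424 depth image)

def depth_from_phase(delta_phi: float) -> float:
    """Eq. (1): z = c * delta_phi / (4 * pi * f)."""
    return C * delta_phi / (4.0 * math.pi * F_MOD)

def backproject(u: float, v: float, z: float) -> tuple[float, float, float]:
    """Eqs. (2)-(3) give normalized coordinates; scaling by the
    depth z yields metric X, Y under the standard pinhole model."""
    x_norm = (u - CX) / FX
    y_norm = (v - CY) / FY
    return x_norm * z, y_norm * z, z

phi = math.pi / 2                 # example phase shift, rad
z = depth_from_phase(phi)         # about 0.47 m with these assumptions
print(backproject(300.0, 200.0, z))
```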

For the Kinect v1.0 RGB-D camera, the IR camera analyzes a fixed speckle pattern projected by the IR projector and computes depth values by triangulation. This pattern analysis is referred to as the structured light (SL) approach, whereby a memorized reference IR pattern stored in the RGB-D camera's computer architecture is compared against the pattern observed by the IR camera, and depth is derived by triangulating the disparity between the two.
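For contrast with the ToF principle, the triangulation step at the heart of the SL approach can be sketched as follows. The relation z = f·b/d is the standard structured-light/stereo depth model, not Kinect v1.0's proprietary implementation, and the focal length, baseline and disparity values are illustrative assumptions.

```python
# Structured-light triangulation sketch: depth from the pixel disparity
# between the memorized reference speckle pattern and the observed one.
# z = f * b / d is the standard triangulation relation; the constants
# here are illustrative, not Kinect v1.0's proprietary calibration.
F_PX = 580.0      # assumed IR camera focal length, px
BASELINE = 0.075  # assumed emitter-to-camera baseline, m

def depth_from_disparity(d_px: float) -> float:
    """Triangulated depth in meters for a disparity of d_px pixels."""
    return F_PX * BASELINE / d_px

print(depth_from_disparity(43.5))  # about 1.0 m with these assumptions
```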

#### **Figure 1.**


*(a) Kinect sensor v2.0 cameras; (b) and (c) principle of Time of Flight (ToF) phase measurement in Kinect v2.0, and (d) Kinect v2.0 and the field of view geometry [13, 14]. (e) Field of view (FoV) of Kinect v2.0 RGB and IR cameras [15].*
