#### **Chapter 5**

## Visibility-Based Technologies and Methodologies for Autonomous Driving

*Said Easa, Yang Ma, Ashraf Elshorbagy, Ahmed Shaker, Songnian Li and Shriniwas Arkatkar*

#### **Abstract**

The three main elements of autonomous vehicles (AV) are orientation, visibility, and decision. This chapter presents an overview of the implementation of visibility-based technologies and methodologies. The chapter first presents two fundamental aspects that are necessary for understanding the main contents. The first aspect is highway geometric design as it relates to sight distance and highway alignment. The second aspect is mathematical basics, including coordinate transformation and visual space segmentation. Details on the Light Detection and Ranging (Lidar) system, which represents the 'eye' of the AV, are presented. In particular, a new Lidar 3D mapping system that can be operated on different platforms and in different modes, supporting a new mapping scheme, is described. The visibility methodologies include two types. Infrastructure visibility mainly addresses high-precision maps and sight obstacle detection. Traffic visibility (vehicles, pedestrians, and cyclists) addresses identification of critical positions and visibility estimation. Then, an overview of the decision element (path planning and intelligent car-following) for the movement of the AV is presented. The chapter provides important information for researchers and therefore should help to advance road safety for autonomous vehicles.

**Keywords:** Lidar, traffic visibility, infrastructure visibility, high-precision maps, sight distance, highway design, technologies

#### **1. Introduction**

The autonomous vehicle (AV) will reduce human errors and is expected to lead to significant benefits in safety, mobility, and sustainability [1–5]. The technical feasibility of automated highways was demonstrated in San Diego, California, in 1997 [6, 7]. The technology is emerging around the world on both the passenger and freight sides. Autonomous vehicles have already started to appear on roads across the globe. Clearly, as the AV market expands, transportation professionals and researchers must address an array of challenges before the AV becomes a widespread reality. Several government and industry entities have deployed demonstrations and field tests of the technology. Centres for AV testing and education, products, and standards have also been established. Currently, researchers, scientists, and engineers are investing significant resources to develop supporting technologies.

The self-driving system involves three main elements: orientation, visibility, and decision, as shown in **Figure 1**. For each element, certain functions are performed with the aid of one or a combination of technologies. For AV orientation, the position of the vehicle is determined mainly using Global Navigation Satellite System (GNSS). To increase reliability and accuracy, this technology is supplemented with other data gathered from specific perception technologies, such as cameras and internal measurement devices (tachometers, altimeters, and gyroscopes).


The visibility of AV, sometimes referred to as perception, involves infrastructure detection (e.g. sight obstacles, road markings, and traffic control devices) and traffic detection (other vehicles, pedestrians, and cyclists). High-precision (HP) maps help the AV not only perceive the environment in its vicinity, but also plan for turns and intersections far beyond the sensors' horizons. The other main technology, which represents the 'eye' of the AV, is Light Detection and Ranging (Lidar). Other technologies used for infrastructure visibility include video cameras, which can detect traffic signals, read road signs, keep track of other vehicles, and record the presence of pedestrians and other obstacles. Radar sensors can determine the positions of other nearby vehicles, while ultrasonic sensors (normally mounted on wheels) can measure the position of close objects (e.g. curbs). Lidar sensors can also be used to identify road features, like lane markings. Three primary sensors (camera, radar, and Lidar) work together to provide the AV with visuals of its 3D environment and help detect the speed and distance of nearby objects. The visibility information is essential for the safe operation of the AV. For example, the information can be used to ensure that adequate sight distance (SD) is available and, if not, to take appropriate actions.

The decision of the AV is based on the data generated by the AV sensors and HP maps. Examples of the decisions made by the AV include routing, changing lanes, overtaking vehicles, stopping at traffic lights, and turning at an intersection. The decision is made by an on-board centralized artificial intelligence (AI) computer that is linked with cloud data and includes the needed algorithms and software.

For the AV to function effectively, it must communicate with the environment as part of a continuously updating, dynamic urban map [3]. This involves vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communications. Internet of Things (IoT) technology provides telematics data about the condition and performance of the AV, such as real-time location and speed, idling duration, fuel consumption, and the condition of its drivetrain.

**Figure 1.** *Elements of autonomous vehicles and their functions and technologies.*

This chapter mainly focuses on the visibility of AV and briefly addresses the decision element. Section 2 presents important visibility fundamentals related to highway geometric design and mathematical tools. Section 3 presents details on the Lidar system, which represents the 'eye' of the AV. This section also introduces a new Lidar system that has great potential for high precision mapping for autonomous vehicles. Section 4 describes the methodologies related to infrastructure visibility and traffic visibility. Section 5 presents some implementation aspects and Section 6 presents the conclusions.

#### **2. Visibility fundamentals**


#### **2.1 Highway sight distance**

In the geometric design guides of the American Association of State Highway and Transportation Officials and the Transportation Association of Canada [8–10], the available sight distance for human-driven vehicles is measured from the driver's eye height above the pavement surface. For autonomous vehicles, the driver's eye is the Lidar. The ability of the AV to 'see' ahead is critical for safe and efficient operation. Sufficient sight distance must be provided to allow the AV to stop, avoid obstacles on the roadway surface, overtake slow vehicles on two-lane highways, and make safe turns/crossings at intersections. For human-driven vehicles, the required sight distance is based on the driver's perception-reaction time (PRT), vehicle speed, and other factors. A PRT of 2.5 s is used for human-driven vehicles, whereas a value of 0.5 s has been assumed for the AV in the literature [11]. This smaller reaction time will result in shorter required sight distances for AV.

For autonomous vehicles, four basic types of SD are defined as follows:

• *Stopping Sight Distance (SSD)*: This is the distance traveled by the AV during system reaction time and the braking distance from the operating speed to stop.

• *Passing Sight Distance (PSD)*: This is the distance, including system reaction time, required by the AV on rural two-lane roads to allow the vehicle to pass a slower vehicle by using the opposing lane.

• *Decision Sight Distance (DSD)*: This is the distance that allows the AV to maneuver or change its operating speed or stop to avoid an obstacle on the roadway surface.

• *Intersection Sight Distance (ISD)*: This is the distance along a cross road with a right of way that must be clearly visible to the crossing AV so that it can decide and complete the maneuver without conflicting with the cross-road vehicles.
Since the AV response time is less than the driver's PRT, the required SD for autonomous vehicles would be shorter than that for human-driven vehicles. The impact of AV on highway geometric design has been explored in a preliminary manner by making simple assumptions regarding system reaction time and Lidar field of view [11]. The study focused on SSD, DSD, and vertical curves.
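As a rough numerical illustration of this point, the following sketch compares the required stopping sight distance for a human driver and an AV using the standard SSD relation (reaction distance plus braking distance). The 3.4 m/s² deceleration rate and the 0.5 s AV reaction time are assumptions taken from common design practice and the literature cited above, not values fixed by this chapter.

```python
# Sketch: required stopping sight distance (SSD) for different reaction times.
# SSD = v * t_r + v^2 / (2 * a), with v in m/s, t_r in s, a in m/s^2.
# Assumed values: deceleration a = 3.4 m/s^2 (typical design value),
# PRT = 2.5 s (human driver) and 0.5 s (AV, assumed in the literature).

def stopping_sight_distance(speed_kmh: float, reaction_s: float, decel: float = 3.4) -> float:
    """Return the required SSD (m) for a given operating speed and reaction time."""
    v = speed_kmh / 3.6              # convert km/h to m/s
    reaction_dist = v * reaction_s   # distance covered during system/driver reaction
    braking_dist = v ** 2 / (2 * decel)
    return reaction_dist + braking_dist

if __name__ == "__main__":
    for speed in (60, 80, 100):
        human = stopping_sight_distance(speed, 2.5)
        av = stopping_sight_distance(speed, 0.5)
        print(f"{speed} km/h: human SSD = {human:5.1f} m, AV SSD = {av:5.1f} m")
```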

The required SD for autonomous vehicles, Lidar height, and object height influence the design and evaluation of highway vertical, horizontal, and 3D alignments, as shown in **Figure 2**. The Lidar height above the pavement surface is critical in determining the available sight distance. For crest vertical curves (**Figure 2a**), since the required SD for autonomous vehicles is shorter, the required Lidar height for safe operation, *hL*, would be somewhat less than the design driver's eye height, *h1*. Thus, by placing the Lidar at or above *hL*, the AV can safely operate on existing highways without the need for modifying their design. For a sag vertical curve with an overpass (**Figure 2b**), where the truck driver's eye height controls the traditional curve design, the required Lidar height for safe operation would be somewhat larger than the design driver's eye height. Thus, by placing the Lidar at or below *hL*, the AV can safely operate on existing highways. For horizontal curves (**Figure 2c**), the Lidar height is not important for detecting horizontal obstacles, except when cut slopes are present.
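A minimal sketch of this crest-curve reasoning is given below. It uses the standard crest vertical curve sight distance relation for the case S < L, namely L = A·S² / [200·(√h1 + √h2)²], which is common design practice rather than a formula stated in this chapter; solving it for the sensor height shows how a shorter required SD translates into a lower admissible Lidar height. The curve parameters and object height in the example are arbitrary assumptions.

```python
# Sketch: available sight distance on a crest vertical curve vs. sensor height.
# Uses the standard crest-curve relation (case S < L):
#     L = A * S^2 / (200 * (sqrt(h1) + sqrt(h2))^2)
# with L = curve length (m), A = algebraic grade difference (%),
# h1 = eye/Lidar height (m), h2 = object height (m).  The relation and the
# example values below are assumptions for illustration, not chapter data.
from math import sqrt

def available_sight_distance(L: float, A: float, h1: float, h2: float) -> float:
    """Available SD (m) over a crest curve for the case S < L."""
    return sqrt(200.0 * L * (sqrt(h1) + sqrt(h2)) ** 2 / A)

def required_sensor_height(L: float, A: float, S_req: float, h2: float) -> float:
    """Minimum eye/Lidar height h1 (m) that provides the required SD on the curve."""
    root = S_req * sqrt(A / (200.0 * L)) - sqrt(h2)
    return max(root, 0.0) ** 2

if __name__ == "__main__":
    L, A, h2 = 120.0, 4.0, 0.15                              # example curve and object height
    print(available_sight_distance(L, A, h1=1.08, h2=h2))    # SD for a typical design eye height
    print(required_sensor_height(L, A, S_req=100.0, h2=h2))  # Lidar height for a shorter required SD
```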


For human-driven vehicles, sag vertical curves (**Figure 2d**) are designed based on the distance that the vehicle headlamp can illuminate ahead of the vehicle, where the angle of light beam upward from the vehicle plane is considered equal to 1°. What is interesting is that for AV the Lidar height is irrelevant to the operation on sag vertical curves. The reason is that Lidar detection does not require light and therefore an obstacle ahead of the autonomous vehicle can be detected under all light conditions. It is necessary, however, to ensure that the effective Lidar range is greater than the required SD.

Note that the geometry of SD shown in **Figure 2** represents individual horizontal or vertical alignments. For 3D alignment (combined horizontal and vertical curves), the AV can directly determine the available SD and compare it with the required SD. The sight line in this case may be obstructed by the pavement surface or by obstacles on the roadside, such as a building or a cut slope.

**Figure 2.** *Effect of Lidar height and sight distance on the operation of AV on different highway alignments. Note:* A *= algebraic difference in grade,* C *= lateral clearance,* hL *= Lidar height,* h2 *= object height,* SL *= effective Lidar range,* L *= vertical curve length,* R *= radius of horizontal curve,* S *= sight distance. (a) Crest vertical curve, (b) sag vertical curve with overpass, (c) horizontal curve, and (d) sag vertical curve.*

#### **2.2 Mathematical tools**

#### *2.2.1 Overview of visibility modeling*

Quantitative visibility estimation (VE) involves four basic components, as shown in **Figure 3**: (1) sight point, (2) target points (e.g. lane marking, traffic sign, and stalled vehicle), (3) line of sight (LOS), which connects the sight point with target points, and (4) obstacle data that obstruct the LOS (e.g. vegetation, barrier, and building). The purpose of VE is to determine whether the obstacles affect the visibility of the targets to the sight point in a mathematical manner.


**Figure 3.** *Basic components of quantitative visibility estimation.*

The accuracy of VE is closely associated with the precision of the obstacle data. In the past, due to a lack of effective and affordable techniques to collect dense and precise geospatial data, the obstacle data were conventionally represented by digital surface and terrain models generated from sparse geospatial points. However, it has been shown that neither model can handle objects with complex shapes, and both may yield inaccurate VE results in some cases [12–14].

Over the past five years, mobile laser scanning (MLS) data have been recognized as a reliable data source for conducting visibility-related analyses [15–18]. MLS point clouds enable a very accurate and precise representation of the real-world environment. In this case, a number of computerized indoor estimations on MLS point clouds can viably replace risky and time-consuming outdoor field measurements [16]. In addition, MLS data are also the main data source for producing HP maps, which are indispensable to AV. Therefore, the mathematical model for estimating visibility in autonomous driving is built on MLS data.

The general workflow of VE using MLS data is graphically presented in **Figure 4**. For a given eye position, both the target points and obstacle points undergo the same procedure (i.e. coordinate transformation and segmentation) and two respective depth maps of the same size are generated. The visibility estimation is achieved through comparing two 3D depth maps. The process of VE is described in detail next.

#### *2.2.2 Coordinate transformation*

To illustrate, consider VE at a single position. Let $S$ be the sight point, $F^{*}$ the forward direction, and **Φ** and **Ψ** the sets of target points and obstacle points, respectively. To reduce the computational complexity, a local coordinate frame is established first, as depicted in **Figure 5a**. Set $S$ as the origin. Then, the $Y'$- and $X'$-axes are set along $F^{*}$ and the horizontal vector normal to $F^{*}$, respectively. Finally, the $Z'$-axis is set along the upward direction perpendicular to the $X'SY'$ plane. The local coordinates of the points ($S$, **Φ**, and **Ψ**) are obtained via:

$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \mathbf{R}\begin{bmatrix} x - x_s \\ y - y_s \\ z - z_s \end{bmatrix}, \qquad \mathbf{R} = \begin{bmatrix} \cos\alpha\cos\beta & -\cos\beta\sin\alpha & \sin\beta \\ \sin\alpha & \cos\alpha & 0 \\ -\cos\alpha\sin\beta & \sin\alpha\sin\beta & \cos\beta \end{bmatrix} \tag{1}$$

where $\mathbf{R}$ = the rotation matrix, $(x_s, y_s, z_s)^T$ = geodetic coordinates of the sight point, $(x, y, z)^T$ and $(x', y', z')^T$ = geodetic and local coordinates of the points ($S$, **Φ**, and **Ψ**), respectively, and $\alpha$, $\beta$ = rotation angles around the $Z$- and $X$-axes, respectively.

**Figure 4.** *Workflow of VE using MLS data.*

**Figure 5.** *Coordinate transformation: (a) different coordinate systems and (b) visualization of points in different frames (generated with Matlab 2020b; the same for Figures 6, 7, and 8-10).*

To enable a more intuitive and efficient estimation of target visibility, the local Cartesian coordinates are further converted into spherical coordinates as follows,

$$\begin{cases} \varphi = \arctan \dfrac{y'}{x'}\\[4pt] \phi = \arctan \dfrac{z'}{\sqrt{x'^2 + y'^2}}\\[4pt] d = \sqrt{x'^2 + y'^2 + z'^2} \end{cases} \tag{2}$$


where $(\varphi, \phi, d)^T$ = azimuth, elevation, and radius of the spherical coordinates shown in **Figure 5a**.

Points in different coordinate frames are visualized in **Figure 5b**. The sight point marked in figure corresponds to *S* in **Figure 5a**. In the *φ*-*ϕ*-*d* space, *φ* and *ϕ* refer to the horizontal and vertical angles between a LOS and the forward direction, respectively, while *d* measures the depth from the sight to the target point. Conventionally, as shown in **Figure 5a**, a sight cone or sight pyramid is constructed around a given LOS in a local frame to detect whether the obstacle points therein touch that LOS [12, 17]. However, when the number of target points is very large, the practice of building sight pyramids and then searching for points therein in dense MLS point clouds is quite time-consuming. With the coordinate transformation, a sight pyramid in the *x*'-*y*'-*z*' space is equivalent to a pillar in the *φ*-*ϕ*-*d* space. In this case, it is reasonable to assume that when *δφ* and *δϕ* are small, only the point with the minimum *d* can be seen by an observer in a pillar. This process is more straightforward than detecting the closest point in a sight pyramid in the *x*'-*y*'-*z*' coordinate system.
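The two transformations above are straightforward to vectorize. The sketch below, written under the assumption that the points are supplied as an N×3 array of geodetic coordinates and that the rotation angles α and β are already known, applies Eq. (1) and then Eq. (2); the function and variable names are illustrative only.

```python
# Sketch: local-frame and spherical coordinates of MLS points (Eqs. (1)-(2)).
# Assumes 'points' is an (N, 3) array of geodetic coordinates, 'sight' is the
# sight point S, and alpha/beta are the rotation angles about the Z- and X-axes.
import numpy as np

def to_local_frame(points: np.ndarray, sight: np.ndarray,
                   alpha: float, beta: float) -> np.ndarray:
    """Eq. (1): translate to the sight point and rotate into the x'-y'-z' frame."""
    ca, sa, cb, sb = np.cos(alpha), np.sin(alpha), np.cos(beta), np.sin(beta)
    R = np.array([[ ca * cb, -cb * sa, sb],
                  [ sa,       ca,      0.0],
                  [-ca * sb,  sa * sb, cb]])
    return (points - sight) @ R.T

def to_spherical(local: np.ndarray) -> np.ndarray:
    """Eq. (2): azimuth, elevation, and depth d for each point."""
    x, y, z = local[:, 0], local[:, 1], local[:, 2]
    azimuth = np.arctan2(y, x)                  # arctan2 used as a robust form of arctan(y'/x')
    elevation = np.arctan2(z, np.hypot(x, y))
    depth = np.linalg.norm(local, axis=1)
    return np.column_stack([azimuth, elevation, depth])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.uniform(-50, 50, size=(1000, 3))   # synthetic obstacle points
    sph = to_spherical(to_local_frame(pts, sight=np.zeros(3), alpha=0.3, beta=0.0))
    print(sph[:3])
```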


**Figure 6.** *Comparison of two depth maps.*


**Figure 7.** *Sight obstacle detection: (a) principle and (b) results.*

#### *2.2.3 Visual space segmentation*

Based on the preceding assumption, the main task is to segment the points in the *φ*-*ϕ*-*d* space into different pillars. To reduce the computational complexity and thus improve the efficiency, a linear-index-based segmentation procedure was developed and applied to partition the points. The procedure assigns a new one-dimensional (1D) index to each point and can divide point cloud data into pillars or voxels. Taking the voxel generation for illustration, the segmentation procedure comprises three main steps as follows:

*Step 1*: Grid the original 3D data $(a, b, c)^T$ using the following formula:

$$\begin{cases} a_g = \left[\!\left[ \dfrac{a}{\delta_a} \right]\!\right] \cdot \delta_a \\[6pt] b_g = \left[\!\left[ \dfrac{b}{\delta_b} \right]\!\right] \cdot \delta_b \\[6pt] c_g = \left[\!\left[ \dfrac{c}{\delta_c} \right]\!\right] \cdot \delta_c \end{cases} \tag{3}$$

where ⟦·⟧ = a function that rounds the number to an integer, and $\delta_a$, $\delta_b$, $\delta_c$ = user-defined voxel dimensions along the $a$, $b$, and $c$ axes, respectively. **Figure 11a** shows the dense points that are converted to the gridded data.
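A compact sketch of this gridding step is shown below; it simply snaps each coordinate to the nearest multiple of the user-defined voxel dimension, with the voxel sizes chosen arbitrarily for illustration.

```python
# Sketch: Step 1 - snap raw 3D points to a regular grid (Eq. (3)).
# Each coordinate is rounded to the nearest multiple of the voxel dimension.
import numpy as np

def grid_points(points, delta):
    """Return the gridded coordinates (a_g, b_g, c_g) of an (N, 3) point array."""
    d = np.asarray(delta)
    return np.round(points / d) * d

if __name__ == "__main__":
    pts = np.array([[1.23, 4.56, 7.89], [1.21, 4.49, 7.95]])
    print(grid_points(pts, delta=(0.1, 0.1, 0.1)))   # grid with 0.1-unit voxels
```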

**Figure 8.** *Extraction of critical positions: (a) general process of identifying planar ground points and (b) critical positions and driving lines.*

**Figure 9.** *VE results at a single node.*

**Figure 10.** *Visualization of VE results: (a) VE process, (b) cumulative times of invisibility, and (c) visibility ratio.*

*Step 2*: Let $D_a$, $D_b$, and $D_c$ be the distances along the $a$, $b$, and $c$ axes from $(\min a_g, \min b_g, \max c_g)^T$ to $(\max a_g, \max b_g, \min c_g)^T$, respectively. Let $d_a$, $d_b$, and $d_c$ be the distances along the $a$, $b$, and $c$ axes from $(\min a_g, \min b_g, \max c_g)^T$ to a certain grid point $P$, respectively. A null $(D_b+1) \times (D_a+1) \times (D_c+1)$ cell matrix $\{\Theta\}$ and a numerical matrix $[\Lambda]$ of the same size are then constructed. The location of the grid point $P$ in $\{\Theta\}$ or $[\Lambda]$ is $(d_b+1, d_a+1, d_c+1)^T$. This location can be converted to a linear index $idx_{1D}$ as follows:

$$idx_{1D} = d_c \cdot (D_b + 1) \cdot (D_a + 1) + d_a \cdot (D_b + 1) + d_b + 1 \tag{4}$$


where $d_a \le D_a$, $d_b \le D_b$, $d_c \le D_c$, and $idx_{1D} \le (D_b + 1)\cdot(D_a + 1)\cdot(D_c + 1)$.

In a matrix, each element can be accessed through either its 3D location or the linear index $idx_{1D}$ [19–21]. Using Eq. (4), a 1D index is obtained for each point.

*Step 3*: Sort the points by increasing $idx_{1D}$; the points falling inside a voxel share the same $idx_{1D}$, as shown in **Figure 11a**. By computing the difference $\Delta idx_{1D}$ between two consecutive values of $idx_{1D}$, the points in a certain voxel are determined by detecting the positions where $\Delta idx_{1D}$ jumps. Suppose $I = \{I_1, I_2, \ldots, I_j, \ldots, I_m\}$, with $1 < m \le (D_b + 1)\cdot(D_a + 1)\cdot(D_c + 1) - 1$, is the set of indices at which $\Delta idx_{1D} > 0$. The $I_j$-th to $I_{j+1}$-th points are placed in the $idx_{1D,j}$-th element of $\{\Theta\}$. In the meantime, the $idx_{1D,j}$-th element of $[\Lambda]$ is set equal to 1. For the pillar generation, the 1D index is written as

$$idx_{1D} = d_a \cdot (D_b + 1) + d_b + 1 \tag{5}$$

Due to its high computational simplicity, detecting the jumps of $\Delta idx_{1D}$ to segment 3D data is faster than using exhaustive searching or the k-d tree neighbor-searching algorithm. Besides, a corresponding binary matrix is generated in the process of segmentation, which may aid in detecting either the indices of non-empty pillars or the connected components (CC) when necessary.
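The following sketch illustrates Steps 2-3 for the pillar case (Eq. (5)): it computes a 1D index per point, sorts the indices, and finds the jump positions that delimit the pillars. It assumes the azimuth/elevation values have already been obtained from Eq. (2); the names and resolutions are illustrative only.

```python
# Sketch: linear-index-based pillar segmentation (Steps 2-3, Eq. (5)).
# Points are binned by azimuth/elevation into pillars; sorting the 1D indices
# and detecting jumps groups the points pillar by pillar without a kd-tree.
import numpy as np

def segment_into_pillars(azimuth, elevation, depth, d_phi=np.radians(0.1)):
    """Return {linear_index: depths of the points in that pillar} for all non-empty pillars."""
    ia = np.floor((azimuth - azimuth.min()) / d_phi).astype(int)      # offset d_a per point
    ib = np.floor((elevation - elevation.min()) / d_phi).astype(int)  # offset d_b per point
    Db = ib.max()
    idx_1d = ia * (Db + 1) + ib + 1                                   # Eq. (5)

    order = np.argsort(idx_1d, kind="stable")
    idx_sorted, depth_sorted = idx_1d[order], depth[order]
    jumps = np.flatnonzero(np.diff(idx_sorted) > 0) + 1               # pillar boundaries
    starts = np.concatenate(([0], jumps))
    ends = np.concatenate((jumps, [idx_sorted.size]))
    return {idx_sorted[s]: depth_sorted[s:e] for s, e in zip(starts, ends)}

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    az = rng.uniform(-0.5, 0.5, 100_000)
    el = rng.uniform(-0.1, 0.1, 100_000)
    d = rng.uniform(5, 80, 100_000)
    pillars = segment_into_pillars(az, el, d)
    print(len(pillars), "non-empty pillars")
```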

**Figure 11b** shows the time performance of the linear-index-based segmentation procedure on a computer with an Intel® Core™ i5-6600 processor, 16 GB of RAM, and an Nvidia® GTX-960 GPU. When *δφ* and *δϕ* are both set to 0.1°, the time the segmentation procedure takes is linearly correlated with the number of points. Notably, it can process 2 million points within 1 second (s). Given the same dataset, the processing time of the indexing operation is negatively and exponentially correlated with the angular resolutions. However, the efficiency is still satisfactory, as it takes less than 1 s to handle 1.5 million points when both *δφ* and *δϕ* are set to 0.05°.

**Figure 11.** *Linear index-based segmentation: (a) process and (b) time performance.*

Using the procedure shown in **Figure 11a**, the target points **Φ** and obstacle points **Ψ** in the *φ*-*ϕ*-*d* space can be separately partitioned into numerous pillars, and thus two 3D depth maps are generated as shown in **Figure 6**. It is noteworthy that the starting and terminating points (see **Figure 11**) for calculating the linear indices of **Φ** and **Ψ** are identical (i.e. the size of the generated $[\Lambda]$ or $\{\Theta\}$ is the same). As such, there is a one-to-one correspondence between the pillars of depth maps 1 and 2. The binary matrix $[\Lambda]$ of depth map 2 can help determine the positions or indices of the pillars that have target points (i.e. $[\Lambda] = 1$). Then the obstacle points in the corresponding pillar of depth map 1 can be retrieved via the linear index and compared with the target points, as noted in **Figure 6**. Let $d_{min}$ be the distance of the closest obstacle point to the sight point. As previously mentioned in Section 2.2.2, when *δφ* and *δϕ* are small, the target points are invisible (marked in red in the figure) if their *d*-values exceed $d_{min}$; otherwise they are visible (marked in green in the figure).
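Building on the pillar segmentation above, a minimal sketch of the depth-map comparison is given below: for each pillar that contains target points, the target depths are compared with the minimum obstacle depth in the corresponding pillar. It reuses the hypothetical segment_into_pillars helper from the previous sketch and is only a simplified illustration of the comparison logic.

```python
# Sketch: per-pillar visibility test by comparing two depth maps.
# A target point is visible if no obstacle point in the same pillar is closer
# to the sight point than the target itself.
import numpy as np

def visibility_of_targets(target_pillars: dict, obstacle_pillars: dict) -> dict:
    """Map pillar index -> boolean array (True = visible) for the target depths."""
    result = {}
    for idx, target_depths in target_pillars.items():
        if idx in obstacle_pillars:
            d_min = obstacle_pillars[idx].min()        # closest obstacle in this pillar
            result[idx] = target_depths <= d_min       # targets beyond d_min are occluded
        else:
            result[idx] = np.ones_like(target_depths, dtype=bool)  # no obstacle at all
    return result

if __name__ == "__main__":
    targets = {7: np.array([30.0, 55.0])}              # two target points in pillar 7
    obstacles = {7: np.array([40.0, 60.0])}            # nearest obstacle at 40 m
    print(visibility_of_targets(targets, obstacles))   # {7: array([ True, False])}
```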

#### **3. Visibility equipment**

#### **3.1 Overview**

A growing number of diverse end users require 3D data, as many new outdoor and indoor application domains benefit from three-dimensional (3D) maps. The new application domains include AV, smart cities, asset management, augmented and virtual reality, as-built drawings, and even the gaming industry. Active and passive sensors are the main sensor types that can be used for scanning the surrounding environment and generating 3D data. Lidar scanners are the main active sensors, while optical cameras are the main passive sensors. Lidar sensors are considered the standard sensors used for 3D scanning and mapping. Unlike passive sensors, Lidar scanners do not require an external source of light to illuminate the target. Thus, they can scan day or night under all lighting conditions and with a higher resiliency in adverse weather conditions. The technological advances in Lidar scanners and their miniaturization are progressing rapidly, and the scanners can be deployed for several transportation application domains, such as AV, road furniture mapping, road condition assessment, and 3D visualization.

#### **3.2 Lidar operation**


A Lidar scanner is an active Remote Sensing Sensor (RSS) that uses light as the source of target illumination [22]. The Lidar unit emits a pulsed light beam or a continuous light wave that hits the object and reflects back to the sensor. The precise measurement of the range from the sensor to the target might follow one of two methods. The first method involves the accurate measurement of the Time of Flight, which is the time interval that has elapsed between the emission of a short (but intense) light pulse by the sensor and its return after being reflected from an object to the sensor. Then, the range is given by

$$R = v\,t/2 \tag{6}$$

where *R* = range (m), *v* = speed of light (m/s), and *t* = time interval measured from the emission to the reception of the light pulse (s), with *v* = 299,792,458 m/s.

The precision of the time measurement determines the precision of the range measurement, which is given by

$$\Delta R = t\,\Delta v/2 + v\,\Delta t/2 \tag{7}$$

where Δ*R* = range precision, Δ*v* = speed of light precision, and Δ*t* = time interval precision. This method is commonly used in most Lidar systems.
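As a quick numerical illustration of Eqs. (6) and (7) (with the speed of light treated as exact, so the Δ*v* term vanishes), the sketch below converts a round-trip time and a timing precision into a range and a range precision; the 3 ps timing figure is the surveying-grade value quoted in the next paragraph.

```python
# Sketch: time-of-flight range (Eq. (6)) and range precision (Eq. (7)).
# The speed of light is treated as exact, so the delta-v term is zero.
C = 299_792_458.0  # speed of light (m/s)

def tof_range(round_trip_time_s: float) -> float:
    """Eq. (6): range from the round-trip travel time of the pulse."""
    return C * round_trip_time_s / 2.0

def range_precision(timing_precision_s: float) -> float:
    """Eq. (7) with delta-v = 0: range precision from the timing precision."""
    return C * timing_precision_s / 2.0

if __name__ == "__main__":
    print(f"range for a 667 ns round trip: {tof_range(667e-9):.1f} m")          # ~100 m
    print(f"precision for 3 ps timing:     {range_precision(3e-12)*1000:.2f} mm")
```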

High-end (surveying-grade) Lidar scanners can measure single-pulse timing to about 3 picoseconds (ps), thus achieving a range resolution on the order of 1 mm. In the second method, which uses a continuous light wave, the range is obtained from the number of full wavelengths and the fractional wavelength of the returned signal:

$$R = (M\lambda + \Delta\lambda)/2 \tag{8}$$

where *M* = integer number of wavelengths (*λ*), and Δ*λ* = fractional part of the wavelength.

Generally, laser scanners survey the surrounding environment by steering the light beam through a mirror or a prism mechanism to cover one direction (e.g. vertical direction). To provide a sequence of profiles around the vertical axis of the laser unit and generate a 3D point cloud of the area around the laser unit, a controlled and measured motion in another direction (e.g. azimuth direction in the static terrestrial laser scanning case) is applied. Nevertheless, if the laser unit is mounted on a moving platform, the controlled and measured motion in the azimuth direction may be substituted by the platform movement.

Beam divergence is yet another factor that affects the 3D point cloud generation [22]. The light beam is collimated when emitted from the laser unit, but as the light beam propagates, the beam radius or diameter increases, and the increase is related to the distance the beam travels. Beam divergence is an angular measure that relates this increase to the distance it travels. The divergence affects the footprint that is measured by the beam. Thus, the measured distance represents a wider area on the target and, in turn, will decrease the specificity of the measured distance, as it will miss any position variation within the footprint. The effect is further highlighted in the case of long-range sensors, such as those used in Airborne Laser Scanning (ALS). This explains the narrow beam divergence needed in the sensors used for ALS, which is typically 0.5 mrad or less.
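For a rough sense of scale, the sketch below estimates the beam footprint diameter at a given range from the divergence angle, using the small-angle relation footprint ≈ range × divergence (the exit-aperture diameter is ignored); the numbers are illustrative assumptions rather than values from the chapter.

```python
# Sketch: laser beam footprint growth with range (small-angle approximation).
# footprint_diameter ~= range * divergence; the initial exit-aperture diameter
# is neglected for simplicity.

def footprint_diameter(range_m: float, divergence_mrad: float) -> float:
    """Approximate beam footprint diameter (m) at the given range."""
    return range_m * divergence_mrad * 1e-3

if __name__ == "__main__":
    # Assumed examples: an automotive Lidar at 100 m with 3 mrad divergence,
    # and an airborne scanner at 1000 m with 0.5 mrad divergence.
    print(f"100 m, 3 mrad:    {footprint_diameter(100, 3.0):.2f} m")
    print(f"1000 m, 0.5 mrad: {footprint_diameter(1000, 0.5):.2f} m")
```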

The Lidar sensor can be used in a mobile or static mode. In the mobile mode, the sensor is mounted on a moving platform, such as an AV, or included as part of a 3D mapping system for mobile mapping and 3D map generation. In the static mode, also known as roadside Lidar for traffic monitoring, the sensor is mounted at a fixed location at an intersection to measure real-time traffic volume and speed. In the static mode, the sensor can use power over Ethernet as a versatile way of providing power to the sensor. The sensed data can then be classified using AI and big-data algorithms as pedestrian/cyclist/vehicle objects, where a unique ID is then assigned to each classified object. The object can be continuously tracked using its position, direction, and speed. A complete solution that empowers smart traffic flow management can be designed by integrating the Lidar with perception software and IoT communications, thus improving mobility.

#### **3.3 Types of Lidar scanners**

Spinning multi-beam Lidar scanners, which are relatively recent, have been introduced to meet AV industry requirements. Unlike 2D laser scanners, which depend on the platform movement to cover the third dimension, these sensors have multiple beams, each with its own emitter-receiver pair. Each beam is oriented at a fixed vertical angle from the sensor origin [23]. The multi-beam mechanism spins mechanically around a spinning axis to cover a 360° horizontal field of view. The spinning frequency of the Velodyne Lidar sensors reaches 20 Hz, thus enabling a fast and rich 3D point cloud of the vehicle's environment and enhancing its 3D perception. Three examples of these sensors are shown in **Figure 12**.

**Figure 12.**
*Examples of spinning multi-beam laser sensors: (a) Velodyne VLP-32c, (b) Hesai Pandar64, and (c) Quanergy M8. Source: (a) www.velodyne.com, (b) www.hesaitech.com, and (c) www.quanergy.com.*
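To make the geometry of such a spinning scanner concrete, the sketch below converts one firing of a multi-beam sensor into Cartesian points from the measured ranges, the current azimuth of the spinning head, and the fixed vertical angle of each beam. The axis convention and the function name are assumptions made for illustration; real sensors also apply per-beam calibration offsets.

```python
import numpy as np

def beams_to_points(ranges, azimuth_deg, vertical_angles_deg):
    """Convert one firing of a spinning multi-beam scanner to 3D points.

    ranges:              (n_beams,) measured distances in metres
    azimuth_deg:         current rotation angle of the spinning head
    vertical_angles_deg: (n_beams,) fixed elevation angle of each beam
    Returns an (n_beams, 3) array of x, y, z coordinates in the sensor frame.
    """
    r = np.asarray(ranges, dtype=float)
    alpha = np.radians(azimuth_deg)
    omega = np.radians(np.asarray(vertical_angles_deg, dtype=float))
    x = r * np.cos(omega) * np.sin(alpha)   # horizontal component along the azimuth
    y = r * np.cos(omega) * np.cos(alpha)
    z = r * np.sin(omega)                   # height given by the fixed beam angle
    return np.column_stack((x, y, z))

# Example: a 4-beam firing at a 30-degree head rotation
pts = beams_to_points([10.2, 10.5, 11.0, 11.4], 30.0, [-5.0, -1.0, 3.0, 7.0])
```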

The solid-state flash Lidar is yet another Lidar technology. It illuminates large areas simultaneously and measures the reflected energy on a photonic phased array, in a manner analogous to the complementary metal-oxide-semiconductor sensor of a digital camera. Unlike the previously discussed measurement mechanisms, solid-state Lidar sensors have no moving parts, and their miniaturization allows on-chip lasers [23]. Three examples of these sensors are shown in **Figure 13**.

**Figure 13.**
*Examples of solid-state Lidar: (a) Velodyne Velarray, (b) Quanergy S3, and (c) Leddartech M16-LSR. Source: (a) www.velodyne.com, (b) www.quanergy.com, and (c) www.leddartech.com.*

It is worth noting that the current rapid advance in laser sensor technology is being driven by the AV application domain. The solid-state Lidar sensors are very promising as they are specifically designed for vehicle-environment grid occupancy detection and collision avoidance. As the technology advances, it is anticipated that solid-state Lidar will also have a substantially positive effect on the 3D mapping field. Two Livox Lidar sensors are available: the Mid-70, which has zero blind spot, and the Avia, which has a detection range of up to 450 m along with multiple scanning modes (repetitive and non-repetitive). These sensors better meet the needs of low-speed autonomous driving (Mid-70) and of such applications as topographic surveying and mapping and power line inspection (Avia). The sensors are shown in **Figure 14**.

**Figure 14.**
*Examples of the Livox Lidar sensors: (a) Livox Mid-70 and (b) Livox Avia. Source: www.livoxtech.com.*

#### **3.4 New Lidar system**

3D maps are a key infrastructure component needed for smart cities and smart transportation applications. Mobile mapping allows the generation of 3D maps for very large areas that would be cost-prohibitive to map with conventional equipment. Active sensors are the primary RSS in a number of mobile mapping systems. A mobile mapping system (MMS) is a moving platform (typically a vehicle) on which the Lidar-based system is mounted. This setup allows data for large areas to be captured quickly, in contrast to conventional terrestrial mapping. Mapping-grade MMS normally achieves sub-meter accuracy, while survey-grade MMS achieves cm-level accuracy, at typical costs of US\$400 k and US\$1 M, respectively.

Several factors hinder the use of current mobile mapping systems for many user segments, including the major investment, operational cost, difficulty of deployment, and required level of expertise. The design, development, and implementation of a new Lidar-based generic 3D mapping system has been carried out at the Department of Civil Engineering, Ryerson University [24, 25]. The developed system uses relatively low-cost, recently released RSS. The optimized selection of the sensors and the smart integration of the 3D mapping system components, in both hardware and software, allow a higher level of versatility, ease of deployment, and substantial cost reduction compared with commercial systems, while maintaining comparable accuracy. The system can be used on drones, on cars, or as a stationary device. The characteristics of the new system are shown in **Table 1**.



| Feature | Value |
|---|---|
| System accuracy | 5 cm @ 50 m (1σ) |
| Data rate | 300,000 pts./sec (single return); 600,000 pts./sec (dual return) |
| HFOV (VFOV) | 360° (+15° to -15°) |
| Power consumption | 19 W (autonomy 1.2 hr) |
| Operating temperature | -10°C to +60°C |
| Weight (battery included) | 1.5 kg |
| Height | 15 cm |
| Diameter | 10.3 cm |

**Table 1.**
*Features of new Lidar 3D mapping system.*

A sample of the 3D point cloud for a block at Ryerson University campus collected by the new system and colorized by height is shown in **Figure 15**. The figure clearly shows thin features such as overhead electricity wires, the traveled way, sidewalk, pedestrian crosswalk, traffic lights, and other vehicles.

**Figure 15.**
*3D point cloud of part of Ryerson University campus, showing street furniture (color-coded by height).*

The new Lidar-based system holds great potential for implementation in 3D map generation for AV. The drive behind the Lidar sensor technology advancement is to build a very low-cost, miniaturized sensor that can provide the AV with robust 3D perception of the environment. A number of different competing factors need to be optimized, including sensor characteristics such as range, range precision, horizontal and vertical fields of view, size, and power consumption. The main objective is to allow obstacle detection in a fast and reliable manner even at long ranges, thus ensuring safe vehicle operation.

#### **3.5 Potential applications**

The new Lidar sensors can benefit a multitude of application domains within the transportation sector. These sensors are used to create the 3D maps that serve as the infrastructure needed for smart cities and smart transportation applications. Moreover, Lidar sensors provide the AV with its ability to perceive the 3D environment.


In addition, laser scanners can measure the amount of energy reflected from the target after it is illuminated by the scanner's emitted pulse. The amount of reflected energy from each measured point constitutes the intensity as measured by the sensor. This measurement can prove valuable in a number of applications, such as asset management and lane marking extraction. The measured intensity depends on the target reflectivity, which governs the amount of reflected energy that can be detected by the sensor. As the target reflectivity decreases, the amount of reflected energy diminishes, weakening the signal returned to the sensor and eventually rendering the target undetectable. The range and incidence angle to the target also affect the measured intensity.

Pavement markings have a different reflectance than the asphalt, which allows the automated extraction of the markings from the Lidar-measured object intensities. This enables lane mapping and also serves as a tool that can be deployed for pavement marking assessment and evaluation. Street signs and furniture can be automatically extracted from the Lidar-measured 3D point cloud. This is done by fusing the measured intensities with machine learning algorithms that use feature geometry, which is also measured by Lidar sensors.
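As a rough illustration of intensity-based marking extraction, the sketch below keeps road-surface points whose return intensity is unusually high relative to the surrounding asphalt. The percentile threshold and the assumption that a road-surface mask is already available are simplifications; production pipelines add clustering and geometric checks.

```python
import numpy as np

def extract_markings(points, intensity, road_mask, percentile=95):
    """Pick candidate lane-marking points from a Lidar point cloud.

    points:    (N, 3) array of x, y, z coordinates
    intensity: (N,) per-point return intensity from the scanner
    road_mask: (N,) boolean mask of points already classified as road surface
    Returns the subset of road points whose intensity is unusually high,
    since retro-reflective paint returns more energy than bare asphalt.
    """
    road_int = intensity[road_mask]
    threshold = np.percentile(road_int, percentile)    # adaptive threshold on road points
    marking_mask = road_mask & (intensity >= threshold)
    return points[marking_mask]
```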

A sample of a 3D point cloud intensity showing different features is depicted in **Figure 16**. The numbers in the figure refer to objects as follows: lane markings (1), vehicles (2), road surface arrows (3), zebra crossings (4), bike lane (5), a building (6), and pedestrian (7). Some highlighted features from the Lidar 3D point cloud intensity are depicted in **Figure 16a**. As noted, the Lidar intensity data can be very useful in the automated extraction of street furniture and in pavement marking assessment and evaluation, which acts as an enabler for automated asset management. Note that a true-color 3D Lidar point cloud can be produced by the fusion of the Lidar 3D point cloud and optical imagery, as shown in **Figure 16b**.

The fusion of Lidar 3D point clouds and optical imagery can prove very useful in such applications as digital twins and 3D visualization with an enhanced visual appeal. In addition, the different modalities of Lidar data and optical imagery improve automated 3D point cloud classification, as the classification process uses object geometry coupled with reflectance. With the continued advances in Lidar sensor technologies, IoT, AI, and big-data algorithms, and with faster wireless communications technologies that support 5G cellular data networks (and even the 6G networks expected in the 2030s) with data transfer rates of 95 Gb/s, the dream of mass deployment of a level 5 advanced driver assistance system can become a reality.

**Figure 16.**
*Lidar 3D point cloud: (a) intensity and (b) true color.*



#### **4. Visibility methodologies**

#### **4.1 Infrastructure visibility**

#### *4.1.1 High-precision 3D maps*

High-precision maps have been identified as one of the key technologies for autonomous driving [26]. These maps need to be purpose-built, highly accurate, and of a great level of detail, and they must be updated in real-time. These maps, which are made specifically for autonomous driving, provide true-ground-absolute accuracy, normally at the centimeter level (5 cm or better), and contain details organized in multiple map layers, such as the base map, geometric map, semantic map, map priors, and real-time knowledge [27]. Technologies that can be leveraged for creating such maps include aerial imagery, aerial Lidar data, mobile Lidar mapping, and Unmanned Aerial Vehicles equipped with lightweight sensors. On the ground, HP maps are mainly created using vehicles equipped with high-tech instruments [26]. Efforts on HP mapping in 3D include Civil Maps, HERE, DeepMap, TomTom HP Maps, Mobileye, Uber Localization and Mapping, and Mapper, most of which follow crowdsourcing models.

In addition to supporting localization and navigation, HP 3D maps can be leveraged for lane network construction, dynamic object detection, prediction of the motion of vulnerable road users, and visibility analysis of the correlation between environmental visibility and navigational comfort under autonomous driving [28, 29]. The so-called self-healing mapping systems allow a detailed inventory of road features and roadside objects that are part of the road infrastructure, supporting real-time determination of infrastructure visibility.

To provide dynamic routing for any situation, a Geographic Information System (GIS) requires updated road network data, real-time traffic information, the vehicle's current location, and the destination. Real-time vehicle location is obtained from radar and camera sensors using localization techniques such as simultaneous localization and mapping (SLAM), which can localize the vehicle with high precision and map its exact location with respect to the surrounding environment. Localization can not only determine the vehicle location but also map landmarks to update HP maps. Depending on the sensors used, the algorithm can be Lidar-based, camera-based, or an integration of Lidar and camera.

Building and maintaining detailed HP maps in advance presents a very appealing solution for autonomous driving. However, existing technologies may at best provide near real-time mapping and may miss critical information, such as road markings and dynamic changes of the road infrastructure. Real-time SLAM provides a better solution for dynamically updating HP maps, locating vehicles, and mapping the surrounding environment.

#### *4.1.2 Sight obstacle detection*

The technique for estimating the visibility of target points described in Section 2.2 can be extended to detect sight obstacles that restrict infrastructure visibility (e.g. traffic signs). Specifically, as shown in **Figure 7a**, the visibility of the target points is determined by comparing their distances to the sight point with the minimum distance (i.e. *dmin*) of the closest obstacle point to the sight point in each pillar.


The sight obstacles can be detected in a similar way. The respective depth maps of the obstacle points and target infrastructure points are first generated following the steps presented in Section 2.2. Then, at the stage of comparing the depth maps, another parameter is calculated, as shown in **Figure 7a**. Let *dmax* be the maximum distance of the target points from the sight point. Obstacle points farther than *dmax* will not affect the visibility of the target points. In contrast, obstacle points whose *d*-values lie between *dmin* and *dmax* will obstruct the target infrastructure points whose *d*-values exceed *dmin*. The sight obstacle detection is completed when all pillars have been evaluated. **Figure 7b** shows the detected sight obstacles that affect the visibility of a traffic sign. In this case, when combined with techniques for automatically identifying traffic signs from MLS data, the visibility estimation procedure can help understand the visibility of traffic signs to the AV along a road corridor.
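A minimal sketch of this pillar-wise comparison is given below. It assumes that the target and obstacle points have already been assigned to pillars and that their distances (*d*-values) to the sight point have been computed as in Section 2.2; the function and variable names are illustrative.

```python
import numpy as np

def detect_sight_obstacles(target_d, target_pillar, obst_d, obst_pillar):
    """Flag obstructed targets and the obstacle points that block them, per pillar.

    target_d / obst_d:           distances of target / obstacle points to the sight point
    target_pillar / obst_pillar: integer pillar (angular bin) index of each point
    Returns boolean masks (target_blocked, obstacle_is_sight_obstacle).
    """
    target_blocked = np.zeros(len(target_d), dtype=bool)
    obstacle_flag = np.zeros(len(obst_d), dtype=bool)
    for p in np.unique(target_pillar):
        t_idx = np.where(target_pillar == p)[0]
        o_idx = np.where(obst_pillar == p)[0]
        if len(o_idx) == 0:
            continue                        # no potential obstacle in this pillar
        d_min = obst_d[o_idx].min()         # closest obstacle point to the sight point
        d_max = target_d[t_idx].max()       # farthest target point in this pillar
        # Target points beyond the closest obstacle are obstructed.
        target_blocked[t_idx] = target_d[t_idx] > d_min
        # Obstacle points between d_min and d_max are the detected sight obstacles.
        obstacle_flag[o_idx] = (obst_d[o_idx] >= d_min) & (obst_d[o_idx] <= d_max)
    return target_blocked, obstacle_flag
```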

#### **4.2 Traffic visibility**

#### *4.2.1 Overview*


Guaranteeing adequate traffic visibility is crucial to the safe operation of AV. If the sight lines to conflicting objects (e.g. vehicles, pedestrians, and cyclists) are obstructed by some obstacles (**Figure 17a**), the AV may not identify the objects at a safe distance, which may lead to a collision. It is quite challenging for an AV to predict where pedestrians or cyclists may be present in a complex road scene in real time [30]. In this case, it is meaningful to identify the positions where road users may exist in the real world using dense MLS points and to investigate the visibility of these locations to the AV in advance (see **Figure 17a** and **b**). Then, the georeferenced visibility information can be incorporated into HP maps, which may help the AV take proactive measures to reduce the collision risk at locations with unsatisfactory traffic visibility.

#### **Figure 17.**

*Traffic visibility estimation for AV: (a) reasons for estimating traffic visibility, (b) critical positions, and (c) general workflow of traffic visibility estimation.*

As shown in **Figure 17c**, the general process of traffic visibility analysis comprises two main components: (1) critical positions and (2) driving lines. The critical positions correspond to the target points in the VE process (see **Figure 4**). The driving lines in HP maps, which aid in navigating the AV, can help derive the sight points in this case. Traffic visibility estimation in autonomous driving thus involves estimating the visibility of critical positions along the pre-defined driving lines to the AV and generating the visibility map.


#### *4.2.2 Critical position identification*

In this example, the planar ground points are identified as the locations where vehicles, pedestrians, or cyclists may exist. A common workflow, shown in **Figure 8a**, is used to extract the critical positions. More specifically, the MLS data are first partitioned into a number of pillars with the linear-index based segmentation technique illustrated in **Figure 11**. Next, a pillar-wise filtering is performed in which the points more than *δ<sup>h</sup>* (user-defined, e.g., 0.2 m) higher than the lowest point are removed. The remaining points are considered the rough ground points. Then, the kd-tree data structure is applied to find the k-nearest neighbors of each data point. The neighboring points are used to derive the normal vector *ξ\** of each point [31]. Let γ be the angle between *ξ\** and the vertical direction (i.e. (0, 0, 1)*<sup>T</sup>*). The points with γ ≤ 5° are considered the horizontal and planar points.

Because we mainly focus on the locations where road users may exist and may conflict with the AV, a distance-based segmentation method is applied to remove some isolated planar point clusters. The point cluster with the largest size is taken as the 'critical positions'. MLS data of a 500 m long urban road section are used to illustrate the process of VE. The extracted critical positions using the procedure shown in **Figure 8a** are marked in red in **Figure 8b**. The critical positions cover the pavement surface, sidewalks, and some planar surfaces connected to the pavement. It is also recommended to use semantic segmentation techniques powered by deep learning to identify the critical positions more accurately.
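The workflow above can be sketched as follows, assuming the pillar indices from the linear-index segmentation are already available. The eigen-decomposition of the local covariance is one common way to obtain the normal vector *ξ\**; the parameter values mirror those quoted in the text but remain illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def find_planar_ground(points, pillar_ids, delta_h=0.2, k=15, max_angle_deg=5.0):
    """Extract horizontal, planar points as critical-position candidates.

    points:     (N, 3) MLS point cloud
    pillar_ids: (N,) integer pillar index from the linear-index segmentation
    """
    # 1. Pillar-wise height filter: keep points within delta_h of the lowest point.
    keep = np.zeros(len(points), dtype=bool)
    for p in np.unique(pillar_ids):
        idx = np.where(pillar_ids == p)[0]
        z_min = points[idx, 2].min()
        keep[idx] = points[idx, 2] <= z_min + delta_h
    ground = points[keep]

    # 2. Normal estimation from k nearest neighbours (eigenvector of the smallest
    #    eigenvalue of the neighbourhood covariance), then keep near-vertical normals.
    tree = cKDTree(ground)
    _, nbr = tree.query(ground, k=k)
    planar = np.zeros(len(ground), dtype=bool)
    for i in range(len(ground)):
        nbrs = ground[nbr[i]] - ground[nbr[i]].mean(axis=0)
        cov = nbrs.T @ nbrs
        w, v = np.linalg.eigh(cov)
        normal = v[:, 0]                                          # smallest-eigenvalue direction
        gamma = np.degrees(np.arccos(min(1.0, abs(normal[2]))))   # angle to (0, 0, 1)
        planar[i] = gamma <= max_angle_deg
    return ground[planar]
```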

#### *4.2.3 Visibility estimation*

The driving lines of the AV are also plotted in **Figure 8b**. Each driving line is composed of many consecutive nodes. At each node, suppose that the sight point overlaps with the node on the horizontal plane. However, the elevation of the sight point *hs* is adjustable to accommodate different heights of the Lidar sensors mounted on autonomous vehicles. Then, the VE procedure described previously is applied to estimate the visibility of the critical positions to the sight point. **Figure 9** presents the VE results at a single node. The visible and invisible critical positions are marked in green and red, respectively, while the potential obstacle points are marked in gray. In **Figure 9**, the range of *φ* is [-180°, 180°], where both the front and rear positions are estimated. Different limits can be set on the range of *φ* to simulate varied horizontal viewing angles. The VE procedure at a single node can be extended to estimate the locations where the sight triangle is clear at intersections [16]. Specifically, if there are invisible (red) regions (see **Figure 9**) inside the sight triangle, it means the sight triangle is not clear.

The VE procedure is executed along the driving lines node by node (see **Figure 10a**) to gain a better understanding of traffic visibility for AV. In this phase, the variables involved are *hs* = 1.6 m, *dview* = 100 m, *φview* = -60° to 60°, *ϕview* = -30° to 30°, δ*φ* = 0.1°, and δ*ϕ* = 0.1°. The users may also adjust *dview*, *φview*, etc. to investigate different situations.


The VE results are visualized in **Figure 10**. In this example, two types of quantitative information are derived based on the VE results at each node. **Figure 10b** maps the cumulative times of invisibility *Ninvis* at a point level. The initial *Ninvis* of each target point is zero. During the process of VE, *Ninvis* = *Ninvis* + 1 each time the target point is not seen by the ego vehicle. A large *Ninvis* means the target point is invisible to the AV many times. The magnitude of *Ninvis* is visualized with colors in **Figure 17b**, which may help to identify the possible locations of blind areas for the ego vehicles. As marked with rectangles, the visibility of these locations is poorer, which means that the AV may need to decelerate when approaching these locations to avoid potential right-angle collisions. Also, the blind-area results can be examined in conjunction with collision data in future studies to better understand road safety.

**Figure 10c** shows the variation of the visibility ratio *Vr* along the driving lines. The *Vr* measures the ratio of visible targets to all targets at a single position. A low *Vr* may indicate that a majority of target points are invisible to the AV. Because *Vr* can be integrated with the driving lines, the AV can know where the visibility ratio is relatively low based on its location. In that case, the ego vehicle can decelerate in advance at a location with very low *Vr* to reduce the probability of colliding with a running pedestrian from a blind area.
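A compact sketch of how *Ninvis* and *Vr* can be accumulated along the driving lines is shown below. The per-node visibility test is passed in as a callable standing in for the pillar-based VE procedure of Section 2.2; all names are illustrative.

```python
import numpy as np

def traffic_visibility_map(nodes, targets, is_visible):
    """Accumulate N_invis per target point and V_r per driving-line node.

    nodes:      (M, 3) sight-point positions along the driving line
    targets:    (N, 3) critical positions
    is_visible: callable (node, targets) -> (N,) boolean visibility mask,
                standing in for the visibility estimation procedure
    """
    n_invis = np.zeros(len(targets), dtype=int)   # cumulative times each target is missed
    v_ratio = np.zeros(len(nodes))                # visibility ratio per node
    for j, node in enumerate(nodes):
        vis = is_visible(node, targets)
        n_invis += ~vis                           # count misses point by point
        v_ratio[j] = vis.mean()                   # share of targets seen from this node
    return n_invis, v_ratio
```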

#### **5. Decision**


#### **5.1 Path planning**

So far, this chapter has addressed the visibility element of autonomous vehicles, which is its main focus. It is useful, however, to highlight the decision element (see **Figure 1**), which involves path planning. This element is considered to be the main challenge of autonomous vehicles. Path planning enables the autonomous vehicle to find the safest, most convenient, and most economic route between two points. One of the main modeling approaches used to define such a route is Model Predictive Control (MPC). There are many variants of MPC, but basically the model solves a finite-time constrained optimal control problem over a receding horizon using nonlinear optimization. An application of path planning and control for overtaking on two-lane highways using nonlinear MPC can be found in [32].
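For reference, a generic receding-horizon MPC problem of the kind referred to above can be written as follows; this is a textbook form with an assumed stage cost ℓ and dynamics f, not the specific overtaking formulation of [32]:

$$\min_{u_0,\dots,u_{N-1}} \; \sum_{k=0}^{N-1} \ell(x_k, u_k) + \ell_N(x_N) \quad \text{subject to} \quad x_{k+1} = f(x_k, u_k), \;\; x_0 = x(t), \;\; x_k \in \mathcal{X}, \;\; u_k \in \mathcal{U}$$

At each sampling instant, only the first control input is applied, the horizon is shifted forward, and the problem is solved again with the newly measured state.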

There are two approaches to path planning: hierarchical and parallel [33]. In the hierarchical approach, the autonomous vehicle completes its long-term mission in stages, which reduces the workload of motion planning. The input higher-level mission is decomposed into sub-missions that are then passed to the next level down. The hierarchical model helps to resolve many complicated problems, yet it might slow down the work of a vehicle's feedback control and complicate the performance of sophisticated maneuvers. In the parallel approach, the tasks are more independent and can proceed simultaneously. In this approach, each controller has dedicated sensors and actuation mechanisms. The advantages of this approach are that (1) the controllers run at high frequency, which makes them safe and stable, (2) a high level of smoothness and performance is achieved by the controllers, and (3) the approach is relatively inexpensive and does not require complicated motion planning devices. However, for some purposes, the hierarchical approach is more efficient. **Figure 18** illustrates the hierarchical approach, based on [33].

**Figure 18.**
*Hierarchical approach for path planning.*

Extensive research has been conducted on path planning for automated parking, which is now a reality in many cities [34–38]. Automated parking aims to eliminate the influence of human factors, improve the quality and accuracy of control, and reduce the maneuver time by optimizing the vehicle path in restricted parking zones. The vehicle-to-everything (V2X) technology allows communication between the AV and a building through data exchange to find an unoccupied space and generate the route to the destination. For large-sized trucks, the tasks involve predicting stable and safe passing on road curves and forecasting precise control for docking. Numerous approaches that consider different control strategies, sensory means, and prediction algorithms (e.g. geometric, fuzzy, and neural) for predicting the vehicle parking path have been developed.


Recently, bicycle kinematic models of vehicle motion have been used for path planning of automated parking [39]. Three basic types of vehicles were considered: a passenger car, a long-wheelbase truck, and articulated vehicles with and without steered semitrailer axes. The authors presented a system of differential equations in matrix form and expressions for linearizing the nonlinear motion equations, which increased the speed of finding the optimal solution. An original algorithm that considers numerous constraints was developed for determining the vehicle's permissible positions within the closed boundaries of the parking area, using nonlinear MPC to find the best trajectories.
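To illustrate the kind of kinematic model involved, the sketch below integrates the single-track (bicycle) kinematic model with a rear-axle reference point. The articulated tractor-semitrailer models of [39] add further states, so this is only a simplified stand-in with an assumed wheelbase and step size.

```python
import math

def bicycle_step(state, v, delta, L=2.8, dt=0.05):
    """One Euler step of the kinematic bicycle model (rear-axle reference point).

    state: (x, y, theta) position and heading in metres / radians
    v:     longitudinal speed (m/s); delta: front steering angle (rad)
    L:     wheelbase (m); dt: integration step (s)
    """
    x, y, theta = state
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += v * math.tan(delta) / L * dt   # yaw rate from steering geometry
    return (x, y, theta)

# Example: crawl forward at 1 m/s with a 10-degree steering angle
s = (0.0, 0.0, 0.0)
for _ in range(100):
    s = bicycle_step(s, v=1.0, delta=math.radians(10))
```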

**Figure 19** shows the kinematics of the curvilinear motion and the simulation results that validated the proposed model. Note that in this study, kinematic vehicle models were used instead of dynamic models. Kinematic models assume that no slip occurs between the wheels and the road. This assumption is reasonable for vehicles moving at low speeds, which is the case for parking. However, dynamic models should be used for path planning of autonomous vehicles on highways since they are more accurate.

**Figure 19.**
*Modeling and simulation of autonomous docking of tractor-semitrailer vehicle with semitrailer's steered axles: (a) kinematics of curvilinear motion and (b) simulation results [39].*

#### **5.2 Intelligent car-following**

Autonomous vehicles and connected automated vehicles (CAV) with advanced embedded technology can deliver safe and effective traffic movements [40]. However, there will be a transition period during which AV and human-driven vehicles share the common road space. Therefore, it will be extremely critical to organize the interactions that will likely be generated by that vehicle mix. Due to the anticipated asymmetric responses of these two types of vehicles, many combinations of vehicles are possible. These combinations may give rise to interactions due to longitudinal and transverse movements.

To ensure safe interactions, intelligent vehicles may include adaptive cruise control (ACC) and cooperative adaptive cruise control (CACC) systems. These systems mainly assist the acceleration control of the longitudinal movements. The systems control the acceleration based on the distance gap and the speed difference between the current vehicle and the vehicle ahead (the leader), where the vehicle accelerates and decelerates based on the speed changes of the leader. For the ACC system, the distance and speed are obtained using radar, Lidar, or video cameras. For the CACC system, V2V communications are used to share the acceleration, deceleration, braking capability, and vehicle positions [41]. This communication significantly shortens the time headway of the CACC vehicle (0.5 s) compared to the ACC vehicle (1.4 s).
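A minimal constant time-gap spacing policy of the kind used in ACC can be sketched as follows; the gains and limits are illustrative and are not the calibrated controllers discussed in [41]. For CACC, the desired time gap could be reduced toward 0.5 s thanks to V2V communication.

```python
def acc_accel(gap, v, v_lead, t_gap=1.4, s0=2.0, k_gap=0.23, k_speed=0.74,
              a_max=2.0, b_max=3.5):
    """Constant time-gap spacing policy for ACC longitudinal control.

    gap:   current distance to the leader (m)
    v:     own speed (m/s); v_lead: leader speed (m/s)
    t_gap: desired time gap (s); s0: standstill spacing (m)
    Returns a commanded acceleration, clipped to comfortable limits.
    """
    desired_gap = s0 + t_gap * v
    a = k_gap * (gap - desired_gap) + k_speed * (v_lead - v)  # spacing error + speed error
    return max(-b_max, min(a_max, a))
```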

To obtain deeper insights into car-following behavior, micro simulation studies were conducted to estimate the impacts of AV and CAV using a variety of assumptions [41]. Car-following models involving intelligent vehicles were developed to evaluate their traffic impacts on such aspects as capacity and level of service, traffic stability, travel time, and vehicle speed. Several studies estimated the energy, environmental, and safety impacts using surrogate safety measures, such as travel speed, time-to-collision, and post-encroachment-time.

New car-following models for AV and CAV were developed by modifying available traditional car-following models to mimic the intelligent vehicle characteristics. The Intelligent Driver Model (IDM) [42], a time-continuous car-following model for the simulation of freeway and urban traffic, is the most commonly used model for simulations of intelligent vehicles. The basic function of the model is given by

$$a_{\mathrm{IDM}}(s, v, \Delta v) = \frac{dv}{dt} = a\left[1 - \left(\frac{v}{v_o}\right)^{\delta} - \left(\frac{s^{*}(v, \Delta v)}{s}\right)^{2}\right] \tag{9}$$

$$s^{*}(v, \Delta v) = s_o + vT + \frac{v\,\Delta v}{2\sqrt{ab}} \tag{10}$$

where *s* = current distance gap to the leader (m), *so* = minimum distance gap (m), *v* = current speed (m/s), *vo* = desired safety speed (m/s), Δ*v* = speed difference between the current vehicle and the leader (m/s), δ = parameter that determines the magnitude of the acceleration decrease (usually set equal to 4), *T* = desired safety gap (s), *a* = maximum acceleration (m/s<sup>2</sup>), and *b* = comfortable deceleration rate (m/s<sup>2</sup>). Note that *s*\*(*v*, Δ*v*) is the desired distance gap (m).
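Eqs. (9) and (10) translate directly into a short function; the parameter values below are illustrative defaults and must be calibrated as discussed later.

```python
import math

def idm_accel(s, v, dv, v0=33.3, T=1.5, a=1.5, b=2.0, s0=2.0, delta=4):
    """Acceleration of the Intelligent Driver Model, Eqs. (9) and (10).

    s:  current distance gap to the leader (m); v: current speed (m/s)
    dv: speed difference between the current vehicle and the leader (m/s)
    Parameter values are illustrative and must be calibrated.
    """
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a * b))   # desired gap, Eq. (10)
    return a * (1.0 - (v / v0) ** delta - (s_star / s) ** 2)  # acceleration, Eq. (9)
```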

Another car-following model that is commonly used for conducting cooperative intelligent-vehicle simulations is the MICroscopic model for Simulation of Intelligent Cruise Control (MIXIC) [43]. Both IDM and MIXIC have been used as benchmark models for combined AV and human-driven vehicles at various market penetration rates ranging from 0 to 25% [44, 45]. However, detailed calibration procedures for the model parameters used in these models are not available due to the limited availability, or outright unavailability, of empirical data.



On these lines, substantiating vehicle's longitudinal motion in mixed traffic conditions is even more critical using available car-following model parameters. In the present context, the car-following models used in simulation packages need to be suitably modified. Further, most of these models are straightforward in their approach with a limited set of parameters. The simplification of available model parameters might not affect the simulation results to the extent the intelligent vehicles may impact the traffic streams in real-field experiments. For example, most car-following models tend to have fixed parameters and will average out driving characteristics. If such models are used for AV, they can reflect neither the driving style of the vehicles' actual drivers nor the contexts in which they drive.

In this direction, the intelligent car-following behavior models, which can concurrently consider all necessary parameters and adapt to specific field actions must successfully integrate possible mixed traffic scenarios. Hence, there is a strong need to modify the existing conventional models to handle the higher order of stochasticity due to mixed driving conditions. Alternatively, data-driven methodologies can be used to provide high-quality trajectory data for varying scenarios of automation. These data may be generated from experiments by creating test beds or even using driving simulators with human-machine interaction. For this purpose, the trajectory data available from simulation models may also be useful. Considering the recent highlights of AI, even intelligent car-following models can benefit from data learning methodologies. Clearly, the impact analysis of intelligent vehicles should consider the many uncertainties due to mixed traffic conditions. Although a few new models have been developed or modified to incorporate car-following and lane-changing attributes of intelligent vehicles, empirical data are warranted for model calibration.

#### **6. Conclusions**

This chapter has presented an overview of the visibility-based technologies and methodologies for autonomous driving. Based on this overview, the following comments are offered:


the autonomous vehicle develop some proactive speed control strategies at locations where the visibility is unsatisfactory.


#### **Acknowledgements**

of Intelligent Cruise Control (MIXIC) [43]. Both IDM and MIXIC have been used as the benchmark models for combined AV and human-driven vehicles for various market penetration rates ranging from 0 to 25% [44, 45]. However, detailed calibration procedures of model parameters used in these models are not available due

On these lines, substantiating vehicle's longitudinal motion in mixed traffic conditions is even more critical using available car-following model parameters. In the present context, the car-following models used in simulation packages need to be suitably modified. Further, most of these models are straightforward in their approach with a limited set of parameters. The simplification of available model parameters might not affect the simulation results to the extent the intelligent vehicles may impact the traffic streams in real-field experiments. For example, most car-following models tend to have fixed parameters and will average out driving characteristics. If such models are used for AV, they can reflect neither the driving style of the vehicles' actual drivers nor the contexts in which they drive. In this direction, the intelligent car-following behavior models, which can concurrently consider all necessary parameters and adapt to specific field actions must successfully integrate possible mixed traffic scenarios. Hence, there is a strong need

to modify the existing conventional models to handle the higher order of

stochasticity due to mixed driving conditions. Alternatively, data-driven methodologies can be used to provide high-quality trajectory data for varying scenarios of automation. These data may be generated from experiments by creating test beds or even using driving simulators with human-machine interaction. For this purpose, the trajectory data available from simulation models may also be useful. Considering the recent highlights of AI, even intelligent car-following models can benefit from data learning methodologies. Clearly, the impact analysis of intelligent vehicles should consider the many uncertainties due to mixed traffic conditions. Although a few new models have been developed or modified to incorporate car-following and lane-changing attributes of intelligent vehicles, empirical data are

This chapter has presented an overview of the visibility-based technologies and methodologies for autonomous driving. Based on this overview, the following com-

1.The Lidar height is a key parameter that affects the visibility of the road ahead and in turn sight distance that influences the design of horizontal and vertical alignment as well as intersections. To ensure safety of AV when operating with human-driven vehicles on existing highways, the Lidar height should generally be larger than the current design heights for passenger cars

2.The new Lidar system developed at Ryerson University holds great potential for generating 3D maps for AV. The system is cheap, allows AV a robust 3D perception of the environment, and allows obstacle detection in a fast and

3.The infrastructure and traffic visibilities can be estimated for autonomous vehicles based on a combination of MLS data and the driving lines of HP maps. The generated visibility map can be incorporated into the HP maps and help

to the limited or unavailability of empirical data.

*Self-Driving Vehicles and Enabling Technologies*

warranted for model calibration.

and less than that for trucks.

reliable manner even at long ranges.

**6. Conclusions**

ments are offered:

**122**

This chapter is financially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

#### **List of acronyms**

ACC adaptive cruise control
AI artificial intelligence
ALS airborne laser scanning
AV autonomous vehicles
CACC cooperative adaptive cruise control
CAV connected automated vehicles
DSD decision sight distance
GIS geographical information system
GNSS global navigation satellite system
HP high-precision
IDM intelligent driver model
IoT internet of things
ISD intersection sight distance
Lidar light detection and ranging
LOS line of sight
MLS mobile laser scanning
MSS mobile mapping system
PSD passing sight distance
RSS remote sensing sensor
SD sight distance
SLAM simultaneous localization and mapping
SSD stopping sight distance
VE visibility estimation
V2I vehicle-to-infrastructure
V2V vehicle-to-vehicle
1D one dimensional
3D three dimensional




#### **Author details**

Said Easa<sup>1</sup>\*, Yang Ma<sup>2</sup>, Ashraf Elshorbagy<sup>1</sup>, Ahmed Shaker<sup>1</sup>, Songnian Li<sup>1</sup> and Shriniwas Arkatkar<sup>3</sup>

1 Ryerson University, Toronto, Canada

2 Southeast University, Nanjing, China

3 Sardar Vallabhbhai National Institute of Technology, Surat, Gujarat, India

\*Address all correspondence to: seasa@ryerson.ca

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**


[1] Shladover S. Connected & automated vehicle systems: Review. *Intelligent Transp. Sys. J*, 22(3), 2018.

[2] Allied Market Research. *Autonomous Vehicle Market by Level of Automation, Component, and Application*. 2019. https://www.alliedmarketresearch.com/autonomous-vehicle-market.

[3] NCHRP *Connected & autonomous vehicles and transportation infrastructure readiness*. Project 20–24(111), TRB, Washington, DC, 2017.

[4] Mohamed A. Literature survey for autonomous vehicles: Sensor fusion, computer vision, system identification, and fault tolerance. *Inter. J. Auto. Control*, 12(4), 2018.

[5] Al-Qaysi Q, Easa S M, Ali N. Proposed Canadian automated highway system architecture: object-oriented approach. *Can. J. Civ. Eng.,* 30(6), 2003.

[6] Shladover S. Automated vehicles for highway operations (automated highway systems). IME *Proceedings, Part I: Journal of Systems and Control Engineering*, 219(1), 53-75, 2005.

[7] Easa S M. *Automated highways*. In Encyclopedia of Electrical and Electronics Engineering, John Wiley & Sons, New York, N.Y., 2007.

[8] American Association of State Highway and Transportation Officials. *A policy on geometric design of highways and streets*. AASHTO, Washington, DC, 2008.

[9] Easa S M. *Geometric design*. In Civil Engineering Handbook, W.F. Chen and J.Y. Liew eds., CRC Press, Boca Raton, FL, Chapter 63, 2002, 1-39.

[10] Transportation Association of Canada. *Geometric design guide for Canadian roads*. TAC, Ottawa, Ontario, 2017.

[11] Khoury J et al. Initial investigation of the effects of AV on geometric design. *Journal of Advanced Transportation*, Vol. 2019, 2019, 1–10.

[12] Ma Y, Zheng Y, Cheng J, Zhang Y, and Han W. A convolutional neural network method to improve efficiency and visualization in modeling driver's visual field on roads using MLS data. *Transportation Research Part C: Emerging Technologies*, 106, 2019, 317–344.

[13] Ma Y, Zheng Y, Cheng J, and Easa S M. Analysis of dynamic available passing sight distance near right-turn horizontal curves during overtaking using Lidar Data. *Can J Civ. Eng*., 2019.

[14] Ma Y, Zheng Y, Cheng J, and Easa S M. Real-time visualization method for estimating 3D highway sight distance using Lidar data. *J. Transp. Eng. - Part A: Systems*, 2019.

[15] Gargoum S A, El-Basyouny K, and Sabbagh J. Assessing Stopping and Passing Sight Distance on Highways Using Mobile Lidar Data. *J. Compt. Civ. Eng*., 32(4), 04018025, 2018.

[16] Jung J, Olsen M J, Hurwitz D S, Kashani A G, and Buker K. 3D virtual intersection sight distance analysis using Lidar data. *Transportation Research Part C: Emerging Technologies,* 86, 2018, 563–579.

[17] Shalkamy A, El-Basyouny K, and Xu H Y. Voxel-based methodology for automated 3D sight distance assessment on highways using mobile light detection and ranging data. *Transportation Research Record*, 2674(5), 587–599, 2020. doi:10.1177/0361198120917376.

[18] Zhang S X, Wang C, Lin L L, Wen C L, Yang C H, Zhang Z M, and Li J. Automated visual recognizability evaluation of traffic sign based on 3D Lidar point clouds. *Remote Sensing*, 11(12), 2019. doi:10.3390/rs11121453.

[19] MATLAB. 2020. sub2ind: Convert subscripts to linear indices. Retrieved from https://ww2.mathworks.cn/help/ matlab/ref/sub2ind.html?lang=en (accessed on November 1, 2020).

[20] OpenDSA. 2020. CS3 Data Structures and Algorithms. 2020, https://opendsa-server.cs.vt.edu/ODSA/ Books/CS3/html/LinearIndexing.html (accessed on November 1, 2020).

[21] Ma Y, Zheng Y, Easa S M, and Cheng J. Semi-automated framework for generating cycling lane centerlines on roads with roadside barriers from noisy MLS data. ISPRS J. Photo. Remote Sensing, 167, 2020, 396–417.

[22] Shan J, and Toth C K. *Topographic laser ranging and scanning: Principles and processing*. CRC Press, Boca Raton, FL, 2018.

[23] Elshorbagy A. *A Crosscutting Threemodes-of-operation Unique LiDAR-based 3D Mapping System, Generic Framework Architecture, Uncertainty Predictive Model and SfM Augmentation*. Doctoral Dissertation, Department of Civil Engineering, Ryerson University, Toronto, Canada, 2020.

[24] Shaker A and Elshorbagy A. Systems and methods for multi-sensor mapping. Application Number 62/ 889,845–24440-P58819US00, 2019.

[25] Shaker A and Elshorbagy A. Systems and methods for multi-sensor mapping using a single device that can operate in multiple modes. PCT Patent Application No. PCT/CA2020/051133, 2019.

[26] Seif H G and Hu X. Autonomous driving in the iCity—HD maps as a key challenge of the automotive industry. *Engineering,* 2(2), 2016, 159–162.

[27] Vardhan H. HD Maps: New age maps powering autonomous vehicles. Geospatial World, 2017. https://www.geospatialworld.net/article/hd-maps-autonomous-vehicles (accessed on November 1, 2020).


[28] Chou F C, Lin T H, Cui H, Radosavljevic V, Nguyen T, Huang T K, and Djuric N. Predicting motion of vulnerable road users using high-definition maps and efficient convnets. 2019, arXiv preprint arXiv:1906.08469.

[29] Morales Y, Even J, Kallakuri N, Ikeda T, Shinozawa K, Kondo T, and Hagita N. Visibility analysis for autonomous vehicle comfortable navigation. *IEEE International Conference on Robotics and Automation*, 2014, 2197–2202.

[30] Ahmed S. Pedestrian/cyclist detection and intent estimation for AV: A survey. *App. Sc*., 9, 2019.

[31] Yang B S, Liu Y, Dong Z, Liang F X, Li B J, and Peng X Y. 3D local feature BKD to extract road information from mobile laser scanning point clouds. *Journal of Photogrammetry and Remote Sensing*, 130, 2017, 329–343, doi: 10.1016/j.isprsjprs.2017.06.007.

[32] Easa S M and Diachuk M. Optimal speed plan for overtaking of autonomous vehicles on two-lane highways. *J. Infrastructures,* 5(44), 2020, 1–25.

[33] Ryabchuk P. How Does Path Planning for Autonomous Vehicles Work? 2020, https://dzone.com/users/ 3246906/paulryabchuk.html (accessed on November 1, 2020).

[34] Lee H, Chun J, and Jeon K. Autonomous back-in parking based on occupancy grid map and EKF SLAM with W-band radar. *Proc., International Conference on Radar*, Brisbane, Australia, 2018, 1–4.

[35] Lin L and Zhu J J. Path planning for autonomous car parking. In Proceedings of the ASME Dynamic Systems and Control Conference, Vol. 3, Atlanta, GA, USA, 2018.


[36] Kiss D and Tevesz G. Autonomous path planning for road vehicles in narrow environments: An efficient continuous curvature approach. *J. Adv. Transp*., 2017.

[37] Wang Y, Jha D K, and Akemi Y. A two-stage RRT path planner for automated parking. Proc., *13th IEEE Conference on Automation Science and Engineering*, Xi'an, China, 2017, 496–502.

[38] Ballinas E, Montiel O, Castillo O, Rubio Y, and Aguilar L T. Automatic parallel parking algorithm for a car-like robot using fuzzy PD+1. *Control. Eng. Lett.*, 26, 2018, 447–454.

[39] Diachuk M, Easa S M, and Bannis J. Path and control planning for autonomous vehicles in restricted space and low speed. *J. Infrastructures,* 5(4), 2020, 1–27.

[40] Greer H, Fraser L, Hicks D, Mercer M, and Thompson K. *Intelligent transportation systems benefits, costs, and lessons learned*. U.S. Dept. of Transportation, ITS Joint Program Office, 2018.

[41] Wooseok D, Omid M R, Luis M-M. Simulation-based connected and automated vehicle models on highway sections: A Literature review. *Journal of Advanced Transportation*, 2019. https:// doi.org/10.1155/2019/9343705.

[42] Treiber M, Hennecke A, and Helbing D. Congested traffic states in empirical observations and microscopic simulations. *Physical Review E: Statistical, Nonlinear, and Soft Matter Physics*, 62(2), 1805–1824, 2000.

[43] Van Arem B, Van Driel C J, and Visser R. The impact of cooperative adaptive cruise control on traffic-flow characteristics. *IEEE Transactions on Intelligent Transportation Systems*, 7(4), 2006, 429–436.

[44] Kesting A, Treiber M, Schönhof M, and Helbing D. Adaptive cruise control design for active congestion avoidance. *Transportation Research Part C: Emerging Technologies*, 16(6), 2008, 668–683.

[45] Talebpour A and Mahmassani H S. Influence of connected and autonomous vehicles on traffic flow stability and throughput. *Transportation Research Part C: Emerging Technologies*, 71, 2016, 143–163.


#### **Chapter 6**

## Selected Issues and Constraints of Image Matching in Terrain-Aided Navigation: A Comparative Study

*Piotr Turek, Stanisław Grzywiński and Witold Bużantowicz*

#### **Abstract**

The sensitivity of global navigation satellite systems to disruptions precludes their use in conditions of armed conflict with an opponent possessing comparable technical capabilities. In military unmanned aerial vehicles (UAVs), the aim is to obtain navigational data for determining the location and correcting flight routes by means of other types of navigational systems. To correct the position of a UAV relative to a given trajectory, systems that associate reference terrain maps with image information can be used. Over the last dozen or so years, new, effective algorithms for matching digital images have been developed. Their reported performance is based on images that are fragments taken from source files, so qualitatively identical counterparts exist in the reference images. However, the differences between the reference image stored in the memory of the navigation system and the image recorded by the sensor can be significant. In this paper, modern methods of image registration and matching for UAV position refinement are compared, and the adaptation of the available methods to the operating conditions of the UAV navigation system is discussed.

**Keywords:** digital image processing, image matching, terrain-aided navigation, unmanned aerial vehicle, cruise missile

#### **1. Introduction**

Global navigation satellite systems are widely used in both civil and military technology areas. The advantage of such systems is very high accuracy in determining the coordinates; however, the possibility of easy interference precludes their use in conditions of armed conflict with an opponent equipped with comparable technical capabilities. In the case of military autonomous unmanned aerial vehicles (UAVs), in particular cruise missiles (CM), the aim is therefore to determine navigation data for specifying the position and correcting the flight paths by means of other types of navigation and self-guidance systems.

Such systems are usually based on inertial navigation systems (INS) which use accelerometers, angular rate gyroscopes and magnetometers to provide relatively accurate tracking of an object's position and orientation in space. However, they are exposed to drift and systematic errors of sensors, hence the divergence between the actual and the measured position of the object is constantly increasing with time. This results in a significant navigational error.

Therefore, two types of systems designed to correct the position of an object in relation to a given trajectory are normally used in the solutions of the UAV/CM navigation and self-guidance systems. The first group contains systems whose task is to determine the position on the basis of data obtained from radio altimeters, related to reference height maps. Such systems include, for example: TERCOM (terrain contour matching), used in Tomahawk cruise missiles, SITAN (Sandia inertial terrain-aided navigation), using terrain gradients as input for the modified extended Kalman filter (EKF) estimating the position of the object, and VATAN (Viterbi-algorithm terrain-aided navigation), a version of the system based on the Viterbi algorithm and characterised – in relation to the SITAN system – with lower mean square error of position estimation [1–5]. The main disadvantage of these solutions is the active operation of measuring devices, which reveals the position of the object in space and eliminates the advantages associated with the use of a passive (and therefore undetectable) inertial navigation system. The second group consists of systems associating reference terrain maps with image information obtained by means of visible, residual or infrared light cameras [6, 7]. Such systems include the American DSMAC (digital scene matching area correlator), also used in Tomahawk missiles [8, 9], and its Russian counterpart used in Kalibr (aka Club) missiles. Their advantage is both the accuracy of positioning and the secrecy (understood as passivity) of operation.

Due to the dynamic development of UAVs/CMs equipped with navigation systems operating independently of satellite systems and a number of problems associated with the implementation of the discussed issue, the assessment on the sensitivity of the selected methods to environmental conditions and constraints in the measurement systems, which often negatively affect the results obtained, has been carried out. The essence of the work is to consider issues related to the processing of image information obtained from optical sensors carried by UAV/CM and its association with terrain reference images. In particular, issues of the correctness of image data matching and the limitations of the possibilities of their similarities' assessment are considered. The article compares modern image matching methods assuming real conditions for obtaining information. The main goal set by the authors is to verify selected algorithms, identify the key aspects determining the effectiveness of their operation and indicate potential directions of their development.


#### **2. State of the art**

The operation of classic object identification algorithms, indicating the similarities between the recorded and reference images (the so-called *patterns*), is mainly based on the use of correlation methods. These algorithms, although effectively implemented in solutions to typical technical problems, are insufficiently effective in the case of topographic navigation. It is related to, inter alia, the limitations and conditions in the measurement system, environmental conditions and characteristics of the detected objects, which have a strong negative impact on the obtained correlation results. This disqualifies the possibility of their direct use in the tasks of matching reference terrain maps with the acquired image information.

A particularly significant obstacle is the fact that the sensory elements of navigation systems installed on UAV/CM record image data in various environmental and lighting conditions [10]. Frequently, reference data of high informative value, due to various conditions, constitute a pattern of little use or even lead to incorrect results. This is the case, for example, when the reconnaissance is

conducted in different weather conditions than those in which the UAV/CM mission takes place (**Figure 1**). Therefore, image feature matching becomes a complex issue. The conditions related to the image recording parameters, e.g. variable view angle, maintaining scale or using various types of sensors, turn out to be equally important.

**Figure 1.** *Images of the same fragment of the Earth's surface taken under different weather and lighting conditions.*

Image matching methods began to be strongly developed with the dissemination of digital image in technology. Initially, the classical Fourier and correlation methods were used. However, these methods did not allow for successful multimodal, multi-temporal, and multi-perspective matching of different images. The taxonomy of the classical methods used in the image matching process was presented in the early 1990s [11]. The image feature space, considered as a source of information necessary for image matching, was defined and local variations in the image were identified as the greatest difficulty in the matching process. In the 21st century, further development of methods based on the features of the image continued [12]. It should be emphasised that most image matching methods based on image features include four consecutive basic steps: feature detection, feature matching, model estimation of mutual image transformation and final transformation of the analysed image. These methods became an alternative to the correlation and Fourier methods. For over a dozen years, new, effective algorithms for processing and matching digital images have been developed, using statistical methods based on matching local features in images [11–14], cf. **Figure 2**. Their authors point to the greater invariance of the proposed algorithms to perspective distortions, rotation, translation, scaling and lighting changes. Given their high reliability under static conditions, as well as their low sensitivity to changes in the optical system's position, including translation, orientation and scale, it is justified to conduct studies in order to verify their usefulness and effectiveness. The paper focuses on modern image matching algorithms, which can potentially be used in topographic navigation issues. It should be stressed that the problem is completely different in the indicated context. This is due to the fact that although the matched images represent the same area of the terrain, the manner and time of the recording differ significantly from each other. This is not a typical application of these algorithms, hence a limited effectiveness of their operation can be expected.

**Figure 2.** *Classification of the selected methods of image feature matching.*

The common feature of all methods is the use of the so-called *scale space* described in [15], allowing the decimation of image data and examination of similarities between images of different scales. A significant step in the development of image matching methods based on local features was the development of the Scale-Invariant Feature Transform (SIFT) algorithm [16]. In this algorithm, the characteristics are selected locally and their position does not change while the image is scaled. Their indication is done by determining the local extremes of the function *D*(x̂), that is the difference between the results of convolution of the image *I*(*x*, *y*) with Gaussian functions *G*(*x*, *y*, *σ*) with different values of the scale parameter *σ*:

$$D(\hat{\mathbf{x}}) = D + \frac{1}{2} \frac{\partial D^T}{\partial \mathbf{x}} \hat{\mathbf{x}} \tag{1}$$

where


$$D = D(x, y, \sigma) = \left[ G_{\sigma_1}(x, y) - G_{\sigma_2}(x, y) \right] \star I(x, y) = \frac{1}{\sqrt{2\pi}} \left[ \frac{1}{\sigma_1} e^{-\frac{x^2 + y^2}{2\sigma_1^2}} - \frac{1}{\sigma_2} e^{-\frac{x^2 + y^2}{2\sigma_2^2}} \right] \star I(x, y) \tag{2}$$

A more numerically efficient version of the SIFT algorithm, called Speeded-Up Robust Features (SURF) is based on the so-called *integral images* [17]. Both methods use the basic processing steps described in [12]. Additionally, in order to ensure the effectiveness of feature detection in images of different resolutions, a scale space, consisting of octaves which represent the series of responses of a convolutional filter with a variable size, was introduced.

Simply put, the detection of the characteristic point is based on the use of the determinant of a Hessian matrix det(*H*). In the case of SURF, the second-order derivatives of the Gaussian function *G* approximated by the box filters *Bxx*, *Bxy*, *Byy*, and the integral image are also used [18]. The Hessian matrix in these methods takes the form

$$H = \begin{bmatrix} L\_{\text{xx}}(\mathbf{x}, \mathbf{y}, \sigma) & L\_{\text{xy}}(\mathbf{x}, \mathbf{y}, \sigma) \\\\ L\_{\text{xy}}(\mathbf{x}, \mathbf{y}, \sigma) & L\_{\text{yy}}(\mathbf{x}, \mathbf{y}, \sigma) \end{bmatrix} \tag{3}$$


where


$$\begin{aligned} L\_{\text{xx}}(\mathbf{x}, \boldsymbol{y}, \sigma) &= \frac{\partial^2}{\partial \mathbf{x}^2} G(\mathbf{x}, \boldsymbol{y}, \sigma) \star I(\mathbf{x}, \boldsymbol{y}) \approx B\_{\text{xx}} \star I(\mathbf{x}, \boldsymbol{y}) \\ L\_{\text{xy}}(\mathbf{x}, \boldsymbol{y}, \sigma) &= \frac{\partial^2}{\partial \mathbf{x} \partial \boldsymbol{y}} G(\mathbf{x}, \boldsymbol{y}, \sigma) \star I(\mathbf{x}, \boldsymbol{y}) \approx B\_{\text{xy}} \star I(\mathbf{x}, \boldsymbol{y}) \\ L\_{\text{yy}}(\mathbf{x}, \boldsymbol{y}, \sigma) &= \frac{\partial^2}{\partial \mathbf{y}^2} G(\mathbf{x}, \boldsymbol{y}, \sigma) \star I(\mathbf{x}, \boldsymbol{y}) \approx B\_{\text{yy}} \star I(\mathbf{x}, \boldsymbol{y}) \end{aligned} \tag{4}$$

The determinant of the Hessian matrix after approximation using box filters and the Frobenius norm is given as

$$\det(H) \approx B_{xx} B_{yy} - \left(\frac{9}{10} B_{xy}\right)^2 \tag{5}$$

After detecting the local extremes of det(*H*), similarly to *D*(x̂), the location of characteristic points representing local features, called *blob-like* for the SIFT and SURF methods, is determined. In this step, the SIFT method also rejects features whose contrast is lower than the assumed threshold *t*, by checking whether ∣*D*(x̂)∣ < *t*, as well as points lying on isolated edges. The latter is done by comparing the quotient of the squared trace of the Hessian matrix *H* and its determinant with a bound defined by the curvature coefficient *r*:

$$\frac{\text{tr}(H)^2}{\det(H)} < \frac{(r+1)^2}{r} \tag{6}$$

In 2011, an alternative method to SIFT and SURF, Oriented FAST and Rotated BRIEF (ORB), was proposed [19]. The method was based on the modified Features from Accelerated Segment Test (FAST) detector [20, 21], enabling corner and edge detection, and a modified Binary Robust Independent Elementary Features (BRIEF) descriptor [22]. This approach involves changing the scale of the image on the basis of blurring with an increasing value of Gaussian filter. Despite the noise reduction and enhancing the uniformity of areas interpreted by human beings as unique (e.g. surface of the lake, wall of a building, shape of a vehicle, etc.), it causes blurring of their edges. This often leads to the inability to indicate the boundaries between areas and to define characteristic points in their neighbourhood.

The solution to this problem was proposed in the KAZE method (Japanese for "wind") [23]. Unlike the SIFT and SURF methods, which use the Gaussian function causing isotropic diffusion of luminance to generalise the image, in the KAZE method the generalisation is based on nonlinear diffusion in consecutive octaves of the scale [24]. The anisotropic image blurring in this method depends on the local luminance distribution. Nonlinear diffusion can be presented in the following equation:

$$\frac{\partial I}{\partial t} = \text{div}[c(\mathbf{x}, \mathbf{y}, t)\nabla I] \tag{7}$$

The blur intensity can be adapted by the introduced conductivity function *c*, which is usually related to time. However, using the approach proposed in [15], parameter *t* is related to the image scale. Various forms of the conductivity function *c* were proposed in related works developing the use of nonlinear diffusion in the context of image filtration [24–26]. One of the functions used for nonlinear diffusion can be:

$$c = \exp\left(-\frac{\left|\nabla L\_{\sigma}\right|^2}{k^2}\right) \tag{8}$$


where ∇*Lσ* is the gradient of the Gaussian-blurred function of the original image *I* on the scale *σ*, and *k* is the contrast ratio.

This function allows the image to be blurred while maintaining the edges of structures. As a result, more features can be detected at different image scales. However, it involves the use of a gradient which, in the case of intense image disturbance (e.g. a shadow), may cause a distribution of diffusion in the image that is unfavourable for the subsequent detection of features.
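As a rough illustration, the following sketch performs one explicit step of such a nonlinear diffusion using the conductivity of Eq. (8); the step size `tau`, contrast ratio `k` and smoothing scale `sigma` are illustrative assumptions only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def nonlinear_diffusion_step(L, k=0.05, tau=0.2, sigma=1.0):
    """One explicit (forward Euler) update of dL/dt = div[c(x, y, t) * grad L], Eq. (7)."""
    L_sigma = gaussian_filter(L, sigma)            # Gaussian-blurred copy used for the gradient
    gy, gx = np.gradient(L_sigma)
    c = np.exp(-(gx ** 2 + gy ** 2) / k ** 2)      # conductivity function, Eq. (8)
    Ly, Lx = np.gradient(L)
    divergence = np.gradient(c * Lx, axis=1) + np.gradient(c * Ly, axis=0)
    return L + tau * divergence
```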

An important stage of the considered methods is the description of a characteristic point by means of a vector containing information about its surroundings. The SIFT method uses the luminance gradient, whereas the SURF method uses the image response to horizontally and vertically oriented Haar wavelets. In general, around the characteristic point, in an area with a radius dependent on the *σ* scale, a certain number of cells are created and the dominant values of the gradient or of the responses to Haar wavelets are determined. These are the basis for calculating the so-called *feature metrics*. Finally, the dominant orientation is established. Characteristic features in the SIFT method are determined by

$$\begin{aligned} m(\mathbf{x}, \boldsymbol{y}) &= \sqrt{\left[L\_{\sigma}(\mathbf{x} + \mathbf{1}, \boldsymbol{y}) - L\_{\sigma}(\mathbf{x} - \mathbf{1}, \boldsymbol{y})\right]^2 + \left[L\_{\sigma}(\mathbf{x}, \boldsymbol{y} + \mathbf{1}) - L\_{\sigma}(\mathbf{x}, \boldsymbol{y} - \mathbf{1})\right]^2} \\ \theta(\mathbf{x}, \boldsymbol{y}) &= \tan^{-1}\left[\frac{L\_{\sigma}(\mathbf{x}, \boldsymbol{y} + \mathbf{1}) - L\_{\sigma}(\mathbf{x}, \boldsymbol{y} - \mathbf{1})}{L\_{\sigma}(\mathbf{x} + \mathbf{1}, \boldsymbol{y}) - L\_{\sigma}(\mathbf{x} - \mathbf{1}, \boldsymbol{y})}\right] \end{aligned} \tag{9}$$

where *m*(*x*, *y*) is the gradient magnitude, *θ*(*x*, *y*) is the orientation, and *Lσ* is the blurred image discussed above.
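A small numpy sketch of Eq. (9), evaluated over a whole blurred image `L_sigma`, might look as follows; the arctangent is computed with `arctan2` for numerical safety and the image borders are simply left at zero.

```python
import numpy as np

def gradient_magnitude_orientation(L_sigma):
    """Central-difference gradient magnitude and orientation, Eq. (9)."""
    dx = np.zeros_like(L_sigma, dtype=float)
    dy = np.zeros_like(L_sigma, dtype=float)
    dx[:, 1:-1] = L_sigma[:, 2:] - L_sigma[:, :-2]   # L(x+1, y) - L(x-1, y)
    dy[1:-1, :] = L_sigma[2:, :] - L_sigma[:-2, :]   # L(x, y+1) - L(x, y-1)
    m = np.sqrt(dx ** 2 + dy ** 2)                   # gradient magnitude
    theta = np.arctan2(dy, dx)                       # orientation
    return m, theta
```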

On this basis, the SIFT method creates a gradient histogram that sums up the determined values in four cells. In analogous cells, the SURF method sums up the responses to Haar wavelets distributed along the radii in the neighbourhood of the point with an interval of *π*/3. In each SURF subregion, a vector **v** is determined:

$$\mathbf{v} = \begin{bmatrix} \sum d\_{\mathbf{x}} & \sum d\_{\mathbf{y}} & \sum |d\_{\mathbf{x}}| & \sum |d\_{\mathbf{y}}| \end{bmatrix} \tag{10}$$

where *dx* and *dy* are the characteristic point's neighbourhood responses to the horizontally and vertically oriented Haar wavelets, respectively.

In the KAZE method the procedure is similar to that of the SURF method, with the difference that first-order derivatives of the image function are used. The point description operation is performed for all levels in the adopted scale space, thereby creating a pyramid of vectors assigned to subsequent levels containing an increasingly generalised image.

The Maximally Stable Extremal Regions (MSER) method introduced in [27] has a different approach to the detection and description of local features. In this method, regions (shapes), referred to as *maximally stable*, are selected as the characteristics of the image. The image in this method is treated as a function *I* which transforms

$$I: \mathcal{D} \subset \mathbb{Z}^2 \to \mathbf{S} \tag{11}$$

where D is the domain of *I* and **S** is its set of values, usually **S** = {0, 1, … , 255}. Regions (areas, shapes) with a specific (typically average) luminance level can be determined in the image. Region *Q* is understood as a connected subset of pixels of the image, i.e. for all *p*, *q* ∈ *Q* there exist sequences *p*, *a*1, *a*2, … , *q* and *pAa*1, … , *aiAai+1*, … , *anAq*, where *A* ⊂ D × D is the neighbourhood relation and *aiAai+1* denotes the neighbourhood between pixels *ai* and *ai+1*. An extremal region is a region *Q* ⊂ D such that for all *p* ∈ *Q* and all *q* ∈ ∂*Q*: *I*(*p*) > *I*(*q*) (maximum intensity region) or *I*(*p*) < *I*(*q*) (minimum intensity region). The desired maximally stable extremal region (MSER) is the region *R* = *Qi*\* which, for the sequence of nested extremal regions *Q*1, … , *Qi−1*, *Qi* (i.e. *Qi* ⊂ *Qi+1*), gives a local minimum of *q*(*i*) = ∣*Qi+Δ*\\*Qi−Δ*∣/∣*Qi*∣ at *i*\*, where Δ ∈ **S** is the stability parameter (a luminance threshold). The procedure of determining MSER regions is repeated throughout the assumed *σ* scale space.

In the feature description stage, a vector using image moments is determined for each region. Based on the moments *m*00, *m*01, … , *m*20, the centre of gravity of each MSER region and the ellipse approximating the region are determined according to the procedure described in [28]. The ellipse equation is given as

$$\frac{\left[\mathbf{x} - \mathbf{x}\_{\text{g}} + \theta \left(\mathbf{y} - \mathbf{y}\_{\text{g}}\right)\right]^2}{a\_1 \left(\mathbf{1} + \theta^2\right)} + \frac{\left[\mathbf{y} - \mathbf{y}\_{\text{g}} + \theta \left(\mathbf{x} - \mathbf{x}\_{\text{g}}\right)\right]^2}{a\_2 \left(\mathbf{1} + \theta^2\right)} - \mathbf{1} = \mathbf{0} \tag{12}$$

The orientation *θ* and the size of the ellipse, defined by its *a*1 and *a*2 axes, allow the features of the region taken for comparison in the matching step to be described. The moment *m* of order (*p* + *q*) of the MSER region, used to determine the centre of gravity *C* = *I*(*xg*, *yg*) of the region, can be represented as follows:

$$m\_{pq} = \sum\_{\{\mathbf{x}, \mathbf{y}\} \in \mathbb{R}} \mathbf{x}^p \mathbf{y}^q \tag{13}$$
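As an illustration of Eqs. (12)–(13), the sketch below computes the centre of gravity of an MSER pixel set and an approximating ellipse from the second central moments (the standard covariance/eigenvalue construction). The chapter itself follows the procedure of [28], so the details here are an assumption used only for illustration.

```python
import numpy as np

def mser_ellipse(xs, ys):
    """xs, ys: pixel coordinates of one extremal region Q."""
    xg, yg = xs.mean(), ys.mean()                      # centre of gravity
    # second central moments (covariance of the pixel coordinates)
    mu20 = ((xs - xg) ** 2).mean()
    mu02 = ((ys - yg) ** 2).mean()
    mu11 = ((xs - xg) * (ys - yg)).mean()
    cov = np.array([[mu20, mu11], [mu11, mu02]])
    eigval, eigvec = np.linalg.eigh(cov)               # eigenvalues in ascending order
    a2, a1 = 2.0 * np.sqrt(np.maximum(eigval, 0.0))    # minor and major semi-axes
    theta = np.arctan2(eigvec[1, -1], eigvec[0, -1])   # orientation of the major axis
    return (xg, yg), (a1, a2), theta
```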

The use of moments and the centre of gravity is also a feature of the ORB method, which uses a machine learning approach for corner detection. After the corners are detected, the centre of gravity *C* is determined for each of them on the basis of the image moments, according to the formula

$$C = \left(\frac{m\_{10}}{m\_{00}}, \frac{m\_{01}}{m\_{00}}\right) \tag{14}$$

where


$$m\_{pq} = \sum \mathbf{x}^p \mathbf{y}^q I(\mathbf{x}, \mathbf{y}) \tag{15}$$

On the basis of the corner's position and centre of gravity, the orientation of the feature is determined as shown in the equation:

$$\theta(x, y) = \operatorname{atan2}(m\_{01}, m\_{10}) \tag{16}$$
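A minimal sketch of this intensity-centroid orientation, Eqs. (14)–(16), computed over a square patch centred on the detected corner is given below; ORB itself uses a circular patch, so the square window is a simplification for illustration.

```python
import numpy as np

def patch_orientation(patch):
    """Orientation of a feature from the intensity centroid of its patch."""
    ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
    xs = xs - patch.shape[1] // 2        # coordinates relative to the corner position
    ys = ys - patch.shape[0] // 2
    m01 = np.sum(ys * patch)             # Eq. (15) with (p, q) = (0, 1)
    m10 = np.sum(xs * patch)             # Eq. (15) with (p, q) = (1, 0)
    return np.arctan2(m01, m10)          # Eq. (16)
```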

The feature description step uses the assigned orientation to complete the binary BRIEF descriptor [22], subject to verifying that the point *Lσ*(*x*, *y*) belongs to the matrix **W***θ*. It is based on a simple comparison of the pixel luminance values in the neighbourhood of the feature:

$$\pi(L\_{\sigma}; \mathfrak{x}, \mathfrak{y}) = \begin{cases} 1 & \Leftrightarrow \quad L\_{\sigma}(\mathfrak{x}, \mathfrak{y}) < L\_{\sigma}(\mathfrak{x}\_{1}, \mathfrak{y}\_{1}) \\\\ 0 & \Leftrightarrow \quad L\_{\sigma}(\mathfrak{x}, \mathfrak{y}) \ge L\_{\sigma}(\mathfrak{x}\_{1}, \mathfrak{y}\_{1}) \end{cases} \tag{17}$$

The matrix **W***θ* is the product of the original matrix **W**, containing the locations of the points which are subject to the tests, and the rotation matrix based on the determined angles *θ*(*x*, *y*). In such a case the ORB vector describing the feature takes the form:

$$\mathbf{v}\_n = \sum\_{1 \le i \le n} 2^{i-1} \pi(L\_\sigma; \boldsymbol{x}\_i, \boldsymbol{y}\_i) \mid (\boldsymbol{x}\_i, \boldsymbol{y}\_i) \in \mathbf{W}\_\theta \tag{18}$$
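The binary test of Eq. (17) and the descriptor of Eq. (18) can be sketched as follows; here `W_theta` is assumed to be a sequence of already rotated test-point pairs given relative to the feature location (an illustrative data layout, not the layout used by any particular implementation).

```python
def orb_like_descriptor(L_sigma, cx, cy, W_theta):
    """Binary descriptor built from pairwise luminance comparisons.

    L_sigma: 2D image (indexable as L_sigma[row][col]); cx, cy: feature column and row.
    """
    bits = 0
    for i, ((x, y), (x1, y1)) in enumerate(W_theta):
        p  = L_sigma[int(cy + y)][int(cx + x)]       # L_sigma(x, y)
        p1 = L_sigma[int(cy + y1)][int(cx + x1)]     # L_sigma(x1, y1)
        bit = 1 if p < p1 else 0                     # binary test of Eq. (17)
        bits |= bit << i                             # weight 2^i matches 2^(i-1) for i = 1..n in Eq. (18)
    return bits
```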


The common element for the described methods is the stage of comparing the distinguished features detected on the reference and registered images. It is of fundamental importance in the field of absolute terrain position designation, because the location of the matched features is the source of determining the matrix of mutual image transformation. In this comparison, vectors describing the features in a given method, e.g. feature metric and its orientation, are taken into account.

The determination of the similarity between the feature description vectors **v***<sup>a</sup>* and **v***<sup>b</sup>* is based on various measures. The most commonly used are the distances defined as follows:

$$d\_1(\mathbf{v}\_a, \mathbf{v}\_b) = \sum |\mathbf{v}\_a - \mathbf{v}\_b| \text{ and } d\_2(\mathbf{v}\_a, \mathbf{v}\_b) = \sum \left(\mathbf{v}\_a - \mathbf{v}\_b\right)^2 \tag{19}$$

The third frequently used norm for binary vectors is the Hamming distance given as:

$$d\_3(\mathbf{v}\_a, \mathbf{v}\_b) = \sum \text{XOR}(\mathbf{v}\_a, \mathbf{v}\_b) \tag{20}$$

Another approach for matching two features is the nearest neighbour algorithm based on the ratio of the distances *d*<sup>1</sup> and *d*2. However, it should be remembered that the matching result for the described distances may vary, hence the importance of the features' detection and description step.
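A small sketch of the distance measures of Eqs. (19)–(20), together with a nearest-neighbour ratio test, is given below; the 0.8 ratio is a common illustrative choice, not a value prescribed by the chapter.

```python
import numpy as np

def d1(va, vb):
    return np.sum(np.abs(va - vb))            # Eq. (19), sum of absolute differences

def d2(va, vb):
    return np.sum((va - vb) ** 2)             # Eq. (19), sum of squared differences

def d3(va, vb):
    return np.sum(np.bitwise_xor(va, vb))     # Eq. (20), Hamming distance for 0/1 binary vectors

def ratio_test(distances, ratio=0.8):
    """Accept a match only when the best distance is clearly below the second best."""
    best, second = np.partition(distances, 1)[:2]
    return best < ratio * second
```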

The final step in all the discussed methods is the statistical verification of a set of matched local features. It happens that, as a result of the initial comparison of the vectors which describe the features, mismatches resulting from the acquisition conditions described above are indicated. Therefore, after the pre-processing step, additional criteria are applied to distinguish matches from mismatches, e.g. based on the Random Sample Consensus (RANSAC) method [29]. This method allows for the estimation of a mathematical model describing the location of local features in the image provided that most of the matched points fit into this model (with the assumed maximum error). Then those points that do not fit into the estimated model are discarded in the step of determining the image transformation matrix.

#### **3. Problem formulation**

The following set is considered:

$$\mathbf{I} = \{I\_i, i \in \mathbb{N}\} \tag{21}$$

Elements of **I** are two-dimensional discrete signals (digital images) and describe the same part of the Earth's surface, but recorded at different times and therefore under different environmental and lighting conditions. Image *Ij* chosen from the set **I** is treated as a reference signal, i.e. characterised by excellent structural similarity to itself. In order to compare any image *Ik* selected from the set **I** with the reference image *Ij*, the following similarity measures were used: mean square error *J*MSE and the related *J*PSNR (peak-signal-to-noise ratio) and *J*SSIM (structural similarity index measure).

The mean square error is determined by the formula:

$$J\_{\rm MSE}(I\_j, I\_k) = \frac{1}{\rm MN} \sum\_{\mathbf{x}=1}^{M} \sum\_{\mathbf{y}=1}^{N} \left[ I\_j(\mathbf{x}, \mathbf{y}) - I\_k(\mathbf{x}, \mathbf{y}) \right]^2 \tag{22}$$

in which *M* and *N* are the width and height of images in pixels. Index *J*PSNR is defined as:

$$J\_{\rm PSNR}(I\_j, I\_k) = 10 \log\_{10} \left\{ \frac{\ell^2}{J\_{\rm MSE}(I\_j, I\_k)} \right\} \tag{23}$$

where ℓ is the range of changes of the luminance value, while the index *J*SSIM can be described as:

$$J\_{\text{SSIM}}(I\_j, I\_k) = \frac{\left(2\mu\_{I\_j}\mu\_{I\_k} + \xi\_1\right)\left(2\sigma\_{I\_j l\_k} + \xi\_2\right)}{\left(\mu\_{I\_j}^2 + \mu\_{I\_k}^2 + \xi\_1\right)\left(\sigma\_{I\_j}^2 + \sigma\_{I\_k}^2 + \xi\_2\right)}\tag{24}$$

in which *μIj* and *μIk* are the mean luminance values of *Ij* and *Ik*, *σIj*² and *σIk*² are the variances of *Ij* and *Ik*, *σIjIk* is the covariance of the pair (*Ij*, *Ik*), and *ξ*1 = (0.01ℓ)² and *ξ*2 = (0.03ℓ)² are positive constants avoiding instability when the denominator of Eq. (24) is very close to zero [30–32].
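The three measures can be computed directly with numpy; the sketch below assumes equally sized single-channel images with luminance range ℓ (here 255) and evaluates the SSIM of Eq. (24) globally rather than over local windows.

```python
import numpy as np

def j_mse(Ij, Ik):
    return np.mean((Ij.astype(float) - Ik.astype(float)) ** 2)        # Eq. (22)

def j_psnr(Ij, Ik, ell=255.0):
    # identical images give an infinite PSNR, as in Table 1
    return 10.0 * np.log10(ell ** 2 / j_mse(Ij, Ik))                  # Eq. (23)

def j_ssim(Ij, Ik, ell=255.0):
    xi1, xi2 = (0.01 * ell) ** 2, (0.03 * ell) ** 2
    mu_j, mu_k = Ij.mean(), Ik.mean()
    var_j, var_k = Ij.var(), Ik.var()
    cov = np.mean((Ij - mu_j) * (Ik - mu_k))
    return ((2 * mu_j * mu_k + xi1) * (2 * cov + xi2)) / \
           ((mu_j ** 2 + mu_k ** 2 + xi1) * (var_j + var_k + xi2))    # Eq. (24)
```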

For initial conditions defined in this way, the best match of the subsequent elements of the set **I** in relation to the reference element *Ij* is sought, assuming that the similarity index measures of the examined pairs (*Ij*, *Ik*) are strongly unfavourable, i.e.

$$J\_{\rm PSNR}(I\_j, I\_k) \to \mathbf{0} \quad \text{and} \quad J\_{\rm SSIM}(I\_j, I\_k) \ll \mathbf{1} \tag{25}$$

The term *best match* is understood as defining certain vectors **v***j* and **v***k* of values which characterise the considered signals *Ij* and *Ik*, and then linking them in a way that makes it possible to state explicitly that the selected pair (*Ij*, *Ik*) describes the same fragment of the Earth's surface.

#### **4. Performance analysis**

In order to verify the sensitivity of the selected methods to limitations of the measurement system and to environmental changes, a number of studies taking into account the actual conditions of acquiring information were conducted. Due to the demanding nature of such studies, they were performed using computer simulation methods. The research was carried out in three stages. In the first stage, a detailed analysis of the test sets was completed using the similarity index values defined above; on the basis of these tests, special cases were selected and subjected to detailed analysis. In the second stage, the methods were compared with respect to the correctness of mutual matching of the analysed image sets. Finally, the influence of changes in the contrast of the acquired image on the number of detected features and the subsequent matching results was examined.


#### **4.1 Analysis of test set elements**

For the initial numerical tests, the test set **I** consisting of four elements was adopted, whereby *I*0 is treated as the reference. Each element of the set **I** is a 24-bit digital image with a size of 1080 × 1080 pixels, representing the same fragment of the Earth's surface, with a centrally located characteristic terrain object (**Figure 3**).

The object is located in a natural environment characteristic of tundra and is therefore distinguished by rocky ground with very low plant cover, dominated by mosses and lichens. Image *I*0 (reference) was taken in the autumn, and mostly brown colours, associated with the tundra soils and rock formations in this area, prevail in it. The image *I*1 shows the environment in spring–summer conditions, i.e. during the growing season. Images *I*2 and *I*3 were taken in winter, with snow cover, whereby in the case of *I*3 there is also strong cloud cover. The similarity index measures of the test set **I** elements, determined on the basis of Eqs. (22)–(24) with respect to the reference image *I*0, are presented in **Table 1** (columns 2–5).

**Figure 3.** *Test image set I: (a) reference image I0, (b)–(d) test images I1, I2, I3 (source: Google Earth).*

| | (I0, I0) | (I0, I1) | (I0, I2) | (I0, I3) | (I2, I2) | (I2, I3) |
|---|---|---|---|---|---|---|
| *J*MSE | 0 | 2.23E03 | 1.46E04 | 1.19E04 | 0 | 2.99E03 |
| *J*PSNR | ∞ | 14.65 | 6.47 | 7.36 | ∞ | 13.37 |
| *J*SSIM | 1 | 0.4373 | 0.0742 | 0.0755 | 1 | 0.3378 |

**Table 1.** *Similarity index measures for selected pairs of the set I.*

**Figure 4.** *Image pair (I0, I1) matching result for: (a) SURF, (b) KAZE, and (c) MSER method.*

**Figure 5.** *Image pair (I0, I1) matching result for ORB method.*
Based on the obtained results, it can be seen that the elements constituting the test set **I** were selected so that only one of them (*I*1) has a relatively high degree of similarity to the reference image. The remaining items (*I*2 and *I*3) have unfavourable similarity index measures *J*PSNR and *J*SSIM, which enables the assumptions of Eq. (25) to be met. Although subjectively the image *I*2 is more similar to the reference image *I*0, *I*3 is characterised by more favourable *J*PSNR and *J*SSIM index measures.

It should be noted that *I*2 and *I*3 were also carefully selected. Both images show a similar arrangement of snow cover, which is reflected in the determined similarity index values of the pair (*I*2, *I*3), cf. **Table 1**, columns 6 and 7. This is justified in practice: it may happen that the data from reconnaissance are accurate (e.g. they take into account the snow in the area concerned, the lack of leaves on the trees in late autumn or the high water level in spring), but strong fogging or rainfall make the image *I*3, obtained from UAV/CM recording systems several hours later, significantly different from the reference image (in this case *I*2).

#### **4.2 Comparison of the selected methods of image feature matching**

The test set **I** presented in SubSection 4.1 was examined in order to compare the effectiveness of image matching performed by algorithms using local features. The SURF, MSER, ORB and KAZE methods were taken into account. Image *I*0 is the reference pattern, and *I*1, *I*2, *I*3 are the matched images. In the algorithms, the parameter values proposed by their authors were used, with the exception of the features' similarity threshold, which was lowered to the level of 50% due to the large differences between individual elements of the set **I**. The best matching of individual features in the compared images was assumed, using the similarity measures proposed for these methods. The RANSAC method was used for the final correction of the matched features, for which an affine transformation model between images was adopted. In order to verify the effectiveness of the considered methods and the correctness of the adopted parameters in the last study, the reference pattern was replaced: it was assumed that *I*2 is the reference image and *I*3 is the matched image. The matching results of the individual test pairs of **I** are shown in **Figures 4**–**8** and in **Tables 2**–**5**.
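A hedged OpenCV sketch of one such comparison run is given below; the detector choice, the Lowe-style 0.5 distance ratio (standing in for the lowered 50% similarity threshold) and the RANSAC reprojection threshold are illustrative assumptions rather than the exact settings used in the study.

```python
import cv2
import numpy as np

def match_pair(img_ref, img_test):
    """Detect, describe, match and RANSAC-verify features between two single-channel 8-bit images."""
    detector = cv2.KAZE_create()                      # cv2.ORB_create() or cv2.SIFT_create() work analogously
    kp1, des1 = detector.detectAndCompute(img_ref, None)
    kp2, des2 = detector.detectAndCompute(img_test, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)              # use NORM_HAMMING for binary descriptors such as ORB
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < 0.5 * pair[1].distance:
            good.append(pair[0])
    if len(good) < 3:
        return [], None                               # not enough matches to fit an affine model
    src = np.float32([kp1[m.queryIdx].pt for m in good])
    dst = np.float32([kp2[m.trainIdx].pt for m in good])
    # statistical verification: affine model estimated with RANSAC, outliers discarded
    M, inliers = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC, ransacReprojThreshold=3.0)
    if inliers is None:
        return [], None
    verified = [m for m, keep in zip(good, inliers.ravel()) if keep]
    return verified, M
```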

**Figure 6.** *Image pair (I0, I2) matching result for: (a) SURF, (b) KAZE, and (c) MSER method.*

Analysis of the matching results has shown that the selected algorithms are not effective when the matched images, despite the same content, differ significantly, cf. pair (*I*0, *I*3). All methods were most effective in matching the pair (*I*0, *I*1). While the SURF and MSER methods indicated mismatches for the matching pairs (*I*0, *I*2) and (*I*0, *I*3), the ORB method did not (cf. **Table 3** and **Table 4**). The KAZE method correctly identified the fragment of the image on which the corresponding features of the pair (*I*0, *I*2) were located. When comparing the relatively similar pair (*I*2, *I*3), it appeared that all algorithms indicated mismatches or a lack thereof, with KAZE and MSER each indicating two correct matches (**Table 5**).

In general, the KAZE method proved to be the most effective, while the ORB method showed the least processing efficiency on the set **I**. Due to the lack of any matches for the pairs (*I*0, *I*2), (*I*0, *I*3) and (*I*2, *I*3), no graphical results are presented for the ORB method. A potential cause of the lack of matches for the pair (*I*2, *I*3) is a large contrast change, characteristic of the occurrence of acquisition interference, such as the fog visible in the image *I*3.


**Figure 7.** *Image pair (I0, I3) matching result for: (a) SURF, (b) KAZE, and (c) MSER method.*

**Figure 8.** *Image pair (I2, I3) matching result for: (a) SURF, (b) KAZE, and (c) MSER method.*

| | SURF | KAZE | MSER | ORB |
|---|---|---|---|---|
| Correct matches | 9 | 489 | 2 | 11 |
| Mismatches | 0 | 0 | 1 | 0 |
| Percentage of correct matches | 100% | 100% | 67% | 100% |

**Table 2.** *Image pair (I0, I1) matching results.*

| | SURF | KAZE | MSER | ORB |
|---|---|---|---|---|
| Correct matches | 0 | 3 | 2 | 0 |
| Mismatches | 8 | 1 | 1 | 0 |
| Percentage of correct matches | 0% | 75% | 67% | 0% |

**Table 3.** *Image pair (I0, I2) matching results.*

| | SURF | KAZE | MSER | ORB |
|---|---|---|---|---|
| Correct matches | 0 | 2 | 1 | 0 |
| Mismatches | 9 | 7 | 2 | 0 |
| Percentage of correct matches | 0% | 29% | 33% | 0% |

**Table 4.** *Image pair (I0, I3) matching results.*

| | SURF | KAZE | MSER | ORB |
|---|---|---|---|---|
| Correct matches | 0 | 6 | 0 | 0 |
| Mismatches | 2 | 0 | 1 | 0 |
| Percentage of correct matches | 0% | 100% | 0% | 0% |

**Table 5.** *Image pair (I2, I3) matching results.*

#### **4.3 Effect of contrast change on the number of the detected features**

The research focused on the analysis of the effect of contrast change on the number of features detected in the image. For this purpose, the contrast of the image *I*<sup>0</sup> had been gradually reduced until a uniform colour throughout the image was obtained. Afterwards, the transformed set was further analysed. SURF, MSER, ORB and KAZE methods were used again. **Figure 9** shows the cumulative results of this study.
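A short sketch of this procedure is given below, under the assumption that the contrast is lowered by linearly blending the image towards its mean luminance and that KAZE is the detector whose features are counted; both choices are illustrative only.

```python
import cv2
import numpy as np

def feature_count_vs_contrast(img, steps=10):
    """Count detected features as the contrast of a single-channel 8-bit image is reduced."""
    detector = cv2.KAZE_create()
    counts = []
    mean = img.mean()
    for alpha in np.linspace(1.0, 0.0, steps):
        # alpha = 1 keeps the original image, alpha = 0 yields a uniform image
        reduced = np.clip(mean + alpha * (img.astype(float) - mean), 0, 255).astype(np.uint8)
        keypoints = detector.detect(reduced, None)
        counts.append(len(keypoints))
    return counts
```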


**Figure 9.** *Effect of contrast change on the number of the detected image features.*

On the basis of the results obtained, it can be concluded that the number of features detected by the examined methods decreases as the image contrast is reduced, which results in a smaller statistical sample processed in each subsequent step of these methods. This may be the cause of the lower matching efficiency of the considered methods for images that differ significantly from each other.


#### **5. Conclusions and final remarks**

The results of the algorithms presented in the literature usually relate to images that are fragments of the source images, i.e. that have qualitatively identical counterparts in the reference images. In the analysed cases, the differences between the reference image stored in the memory of the navigation system and the image recorded by the sensor are significant. As a result, there are certain consequences that often prevent the image representing the same field object from being effectively matched. This is due to real environmental conditions and restrictions on obtaining information. The measurement system parameters and the quality of the images taken have a direct impact on the number of detected features. For example, the lack of complete information about the accuracy of the field object's image mapping makes it impossible to properly select the size of the filters. This results in the detection of objects that are completely irrelevant to the issue considered, such as bushes, leaves or grass blades, which are highly variable over time. Consequently, this has a significant impact on the performance of individual algorithms.

The study concluded that the use of statistical algorithms such as RANSAC improves the effectiveness of the selected methods. However, the results obtained strongly depend on the size of the set taken into consideration and the match/mismatch ratio. Therefore, in terrain image processing it is necessary to conduct an analysis of the informational characteristics of the examined objects and the conditions of acquisition. This allows for extracting characteristic points whose description does not change significantly due to atmospheric conditions.

The results of the simulation tests lead to the general conclusion that the methods considered are often insufficient to determine the coordinates of a UAV/CM flying under unfavourable environmental conditions. In the context of the implementations examined in this work, the greatest development potential lies with methods based on anisotropic diffusion, which showed the highest effectiveness in the simulation studies. Therefore, it seems justified to focus the research effort on further development of new image processing methods within the group of anisotropic diffusion methods. In particular, it is proposed to take the informative character of terrain images into account as determinants of the input parameters of the designed processing methods, to apply pre-processing methods aimed at decimation of the input data, their segmentation and the determination of the main components, and to extend the definition of the designed methods with additional criteria increasing the effectiveness of detection and image feature matching. The newly developed methods should be aimed at improving feature detection efficiency in terrain images and at selecting processing parameters that take into account environmental conditions as well as the limitations and conditions of the measurement system.

#### **Acknowledgements**

This work is financed by the National Centre of Research and Development of the Republic of Poland as part of the scientific research program for the defence and security named *Future Technologies for Defence – Young Scientist Contest* (Grant No. DOB-2P/03/06/2018).

#### **Conflict of interest**

The authors declare no conflict of interest.



#### **Author details**

Piotr Turek, Stanisław Grzywiński and Witold Bużantowicz\*

Military University of Technology, Warsaw, Poland

\*Address all correspondence to: witold.buzantowicz@wat.edu.pl

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Boozer DD, Fellerhoff JR. Terrain-Aided Navigation Test Results in the AFTI/F-16 Aircraft. Navigation – Journal of The Institute of Navigation. 1988;35(2):161–175. DOI: 10.1002/ j.2161-4296.1988.tb00949.x

[2] Enns R. Terrain-aided navigation using the Viterbi algorithm. Journal of Guidance, Control, and Dynamics. 1995; 18(6):1444–1449. DOI: 10.2514/3.21566

[3] Han Y, Wang B, Deng Z, Fu M. An improved TERCOM-based algorithm for gravity-aided navigation. IEEE Sensors Journal. 2016;16(8):2537–2544. DOI: 10.1109/JSEN.2016.2518686

[4] Hua Z, Xiulin H. A height-measuring algorithm applied to TERCOM radar altimeter. In: Proc. of the 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE); 20–22 August 2010; Chengdu (China). New York: IEEE, 2010. p. (V5–43)-(V5–46). DOI: 10.1109/ICACTE.2010.5579215

[5] Wei E, Dong C, Liu J, Tang S. An improved TERCOM algorithm for gravity-aided inertial navigation system. Journal of Geomatics. 2017;42(6):29–31. DOI: 10.14188/j.2095-6045.2016190

[6] Naimark L, Webb H, Wang T. Vision-Aided Navigation for Aerial Platforms. In: Proc. of the ION 2017 Pacific PNT Meeting; 1–4 May 2017; Honolulu (USA). Manassas: ION, 2017. p. 70–76. DOI: 10.33012/2017.15051

[7] Yang C, Vadlamani A, Soloviev A, Veth M, Taylor C. Feature matching error analysis and modeling for consistent estimation in vision-aided navigation. Navigation. 2018;65:609–628. DOI: 10.1002/navi.265

[8] Carr JR, Sobek JS. Digital Scene Matching Area Correlator (DSMAC). In: Proc. of the 24th Annual Technical Symposium, SPIE 0238, Image Processing for Missile Guidance; 23 December 1980; San Diego (USA). Bellingham: SPIE, 1980. DOI: 10.1117/12.959130

[9] Irani GB, Christ JP. Image processing for Tomahawk scene matching. Johns Hopkins APL Technical Digest. 1994;15(3):250–264.

[10] Turek P, Bużantowicz W. Image matching constraints in unmanned aerial vehicle terrain-aided navigation. In: Proc. of the 2nd Aviation and Space Congress; 18–20 September 2019; Cedzyna (Poland). p. 206–208.

[11] Brown LG. A survey of image registration techniques. ACM Computing Surveys. 1992;24(4):325–376. DOI: 10.1145/146370.146374

[12] Zitová B, Flusser J. Image registration methods: A survey. Image and Vision Computing. 2003;21(11):977–1000. DOI: 10.1016/S0262-8856(03)00137-9

[13] Bouchiha R, Besbes K. Automatic Remote-Sensing Image Registration Using SURF. International Journal of Computer Theory and Engineering. 2013;5(1):88–92. DOI: 10.7763/IJCTE.2013.V5.653

[14] Kashif M, Deserno TM, Haak D, Jonas S. Feature description with SIFT, SURF, BRIEF, BRISK, or FREAK? A general question answered for bone age assessment. Computers in Biology and Medicine. 2016;68:67–75. DOI: 10.1016/j.compbiomed.2015.11.006

[15] Lindeberg T. Scale-space theory: A basic tool for analysing structures at different scales. Journal of Applied Statistics. 1994;21(2):224–270. DOI: 10.1080/757582976

[16] Löwe DG. Distinctive Image Features from Scale-Invariant

#### **Section 4**

## Future Mobility
