Section 3 Smart City Essentials

#### **Chapter 4**

## Traffic State Prediction and Traffic Control Strategy for Intelligent Transportation Systems

*Shangbo Wang*

### **Abstract**

The recent development of V2V (Vehicle-to-Vehicle), V2I (Vehicle-to-Infrastructure), V2X (Vehicle-to-Everything) and vehicle automation technologies has enabled the concept of Connected and Automated Vehicles (CAVs) to be tested and explored in practice. Traffic state prediction and control are two key modules for CAV systems. Traffic state prediction is important for CAVs because adaptive decisions and control strategies, such as adjusting traffic signals, turning left or right, stopping or accelerating, and deciding vehicle motion, rely on the completeness and accuracy of traffic data. For a given traffic state and input action, the future traffic states can be predicted via data-driven approaches such as deep learning models. RL (Reinforcement Learning)-based approaches have gained the most popularity in developing optimum control and decision-making strategies because they can maximize the long-term reward in a complex system via interaction with the environment. However, RL techniques still have drawbacks, such as a slow convergence rate for high-dimensional states, which need to be overcome in future research. This chapter aims to provide a comprehensive survey of the state-of-the-art solutions for traffic state prediction and traffic control strategies.

**Keywords:** traffic state prediction, traffic control, deep learning, V2V, V2I, V2X, CAV, RL

#### **1. Introduction**

Connected and Automated Vehicles (CAVs) are currently an area of extensive research, and there are grounds to expect that the introduction of CAVs may revolutionize the whole transportation area [1]. There is no lack of predictions stating that CAVs will solve many of the problems experienced on roads today, such as congestion, traffic accidents and lost time [2].

Traffic state prediction and traffic control are two key modules in transportation systems with CAVs [3]. Traffic states such as flow, speed and congestion play vital roles in traffic management, public service and traffic control [4]. By predicting the evolution of the traffic state in a timely and accurate manner, decision-makers and traffic controllers can apply effective policies and control inputs to avoid traffic congestion ahead of time; thus ITS (Intelligent Transportation Systems), advanced traffic management systems and traveler information systems rely on real-time traffic state prediction. Traffic control can be divided into a decision-making module and a vehicle control module. The former optimizes mobility, safety and energy consumption by using the vehicle trajectory prediction results to calculate vehicle platoon sizes, speed, flow, density, traffic merging, diverging flow and traffic signals, while the latter handles vehicle path control, vehicle fleet control, and steering wheel, throttle, brake and other actuator control by using onboard units based on the control commands [3]. How to predict the future traffic state in a timely and accurate manner and how to deliver an effective traffic control strategy are fundamental issues in ITS.

Traffic state prediction approaches can be broadly divided into two categories: parametric and non-parametric approaches [5]. Parametric approaches utilize models that capture all the information about their predictions within a finite set of parameters. Popular parametric techniques include ARIMA (Autoregressive Integrated Moving Average) [6–9], linear regression [10] and Kalman Filter (KF) based methods [11], which are linear models and can achieve high accuracy when the traffic data have linear characteristics.

The ARIMA model is based on the assumption that future data will resemble the past, and it is widely used in the analysis of time series that can be made stationary by differencing. It is specified by three values that represent the order of the autoregressive part (*p*), the degree of differencing (*d*) and the order of the moving average part (*q*). The model order can be selected by Akaike's Information Criterion (AIC) combined with the likelihood of the historical data, while the model parameters can be estimated by maximizing the log-likelihood function. The extension of the ARIMA time series model into the spatial domain results in the STARIMA (Space–Time Autoregressive Integrated Moving Average) model, which can deliver more accurate traffic prediction results because of the spatio-temporal correlation of traffic data. To capture the spatial correlation, the STARIMA model adds a spatial matrix comprising the spatial adjacency and weight structure, and the number of spatial lags for the STAR and STMA models. The drawback of the ARIMA model for traffic prediction is the strong assumption that traffic data can become stationary by differencing, which is difficult to fulfil because of the non-stationary characteristics of traffic data.

By assuming a linear relationship between the input variables and the single output variable, the linear regression model aims to estimate the regression coefficients by using the historical traffic data. KF based methods allow a unified approach for the prediction of all processes that can be given a state space representation. Although EKF (Extended Kalman Filtering) can be used to deal with the non-linearity of traffic data, it is difficult to obtain an accurate approximation of most non-linear functions, and thus it can lead to relatively large errors.
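As a minimal sketch of the differencing idea behind ARIMA, the following Python code differences a series once and fits a single autoregressive coefficient, i.e., an ARIMA(1,1,0)-style model without a constant term. The least-squares fit is used here only as a simple stand-in for the log-likelihood maximization mentioned above, and the flow values and function names are illustrative assumptions, not data or code from the cited works.

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def fit_ar1(z):
    """Least-squares estimate of phi in z[t] = phi * z[t-1] + noise."""
    num = sum(a * b for a, b in zip(z, z[1:]))
    den = sum(a * a for a in z[:-1])
    return num / den if den else 0.0


def forecast_next(series):
    """One-step forecast: last value plus the predicted next difference."""
    z = difference(series, d=1)
    phi = fit_ar1(z)
    return series[-1] + phi * z[-1]


# Hypothetical 5-minute flow counts (vehicles per interval) with a linear trend.
flow = [100, 104, 108, 112, 116, 120]
next_flow = forecast_next(flow)
```

For the linearly increasing series in this example, the differenced series is constant, so the fitted coefficient is 1 and the forecast simply continues the trend. Real traffic data rarely behave this way, which is the stationarity limitation discussed above.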

Non-parametric models such as DL (Deep Learning) models outperform parametric models because of the stochastic, non-deterministic, non-linear and multidimensional characteristics of traffic data [5]. DL is a subset of machine learning (ML) that is based on the concept of the deep neural network (DNN), and it has been widely used for data classification, natural language processing (NLP) and object recognition [5]. The most popular DL models used for traffic state prediction include the Convolutional Neural Network (CNN) [12–14], Deep Belief Network (DBN) [15, 16], Recurrent Neural Network (RNN) [17–19] and Autoencoder (AE) [20]. CNN is useful for traffic prediction because of the two-dimensional characteristics of traffic data and its ability to extract spatial features. Each CNN neuron is connected only to a small subset of the input, which decreases the computational complexity of the training process. DBN is a stack of multiple RBMs (Restricted Boltzmann Machines), which can be used to estimate the probability distribution of the input traffic data. LSTM (Long Short-Term Memory) is a special type of RNN that can capture the temporal features of traffic data and can overcome the vanishing gradient problem of the standard RNN.

*Traffic State Prediction and Traffic Control Strategy for Intelligent Transportation… DOI: http://dx.doi.org/10.5772/intechopen.101675*

Traffic control strategies can be generally divided into classical methods and learning-based methods. Classical methods develop traffic controllers based on control theory or optimization-based techniques, which include the dynamic traffic assignment based nonlinear controller [21], the standard proportional-integral (PI) controller [22, 23], the robust PI controller [24], model-based predictive control (MPC) [25, 26], the linear quadratic controller [27], mixed-integer non-linear programming (MINLP) [28] and the multi-objective optimization based decision-making model [29]. Learning-based methods utilize artificial intelligence technologies to achieve decision-making and control for CAVs, and can be further divided into three categories: statistical learning-based methods, deep learning-based (DL) methods and reinforcement learning-based (RL) methods.

The RL-based method is currently one of the most commonly used learning-based techniques for traffic control and decision-making because RL can solve complex control problems by using the Markov decision process (MDP) to describe the interaction between the agent and the environment [4]. The most popular RL-based methods include Q-learning for adaptive traffic signal control [30, 31], multi-agent RL approaches [32–35] and the Nash Q-learning strategy [36]; many other RL-based approaches are also available in the literature. Q-learning based traffic signal control aims to minimize the average accumulated travel time by greedily selecting an action at each iteration. Multi-agent RL approaches are more commonly used in network signal optimization and can be generally divided into centralized RL and decentralized RL: the former considers the whole system as a single agent, while the latter distributes the global control to each agent. The Nash Q-learning strategy is a decentralized multi-agent RL strategy, which performs iterated updates based on assuming Nash equilibrium behavior over the current Q-values. It can be shown that traffic signal control using the Nash Q-learning strategy can converge to at least one Nash equilibrium for stationary control policies. However, Nash Q-learning is unable to achieve Pareto optimality without consideration of cooperation among different agents.
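The Q-learning update mentioned above can be sketched in tabular form as follows. This is a minimal illustration, not the signal controller of [30, 31]: the two queue states, the two signal actions, the reward (a negative waiting time) and the values of `alpha`, `gamma` and `epsilon` are all hypothetical.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Tabular Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy: usually the best-known action, sometimes a random one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return max(range(len(q[state])), key=lambda a: q[state][a])


# Hypothetical setup: states 0 = short queue, 1 = long queue;
# actions 0 = keep the current phase, 1 = switch the phase.
q_table = [[0.0, 0.0], [0.0, 0.0]]
# One observed transition: switching under a long queue cost 2 s of waiting.
q_update(q_table, state=1, action=1, reward=-2.0, next_state=0)
```

With `epsilon=0.0` the selection is purely greedy, as in the description above; a small positive `epsilon` is a common way to keep exploring actions whose value is still uncertain.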

This chapter provides a comprehensive survey of state-of-the-art traffic state prediction and traffic control techniques. It is organized as follows. In Section 2, we first introduce the fundamental structure and main characteristics of two important DL models, CNN and LSTM (Long Short-Term Memory), as well as their advantages in traffic state prediction; we then introduce how to realize hybrid traffic state prediction by combining the two models to achieve better accuracy. In Section 3, we detail RL fundamentals and introduce how RL can be applied in traffic control and decision-making, with a focus on multi-agent RL approaches and a discussion of their pros and cons. Section 4 summarizes the chapter.

#### **2. DL-based traffic state prediction approaches**

In this section, we first briefly review the concepts of machine learning and deep learning. Then, we introduce the architectures of two DL models, CNN and LSTM, which show good performance in processing high-dimensional and temporally correlated data. Finally, a hybrid model of CNN and LSTM is described, with the research interest lying in how to improve the prediction accuracy by incorporating spatio-temporal correlation.

#### **2.1 Overview of deep learning**

ML approaches are broadly classified into two categories, i.e., supervised learning and unsupervised learning [5]. Supervised learning requires the input data to be clearly labeled. It involves a function *y* = *f*(*x*) that maps input *x* to output *y* [5], and it aims to perform two tasks: regression and classification. In contrast, unsupervised learning aims to perform data clustering by extracting the data pattern. Commonly used supervised learning approaches include Random Forest (RF), Support Vector Machine (SVM), Bayesian methods, Artificial Neural Network (ANN), etc., whereas unsupervised learning approaches mainly include Autoencoder, Principal Component Analysis (PCA), Deep Belief Network (DBN), etc. To perform prediction tasks with supervised learning approaches, a model first needs to be trained on a training dataset with a certain number of samples. Then, new predictions can be obtained by inputting the feature vectors of new samples to the trained model. Cross-validation is usually used to validate the prediction performance: the whole dataset is first divided into *K* folds, and then, for each *k*-th fold, (i) a model is trained by using the other *K* − 1 folds as training data; (ii) the resulting model is validated on the *k*-th fold.
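The cross-validation procedure described above can be sketched as follows. The fold splitting is generic; the "model" that predicts the mean of the training targets is a hypothetical placeholder chosen only to keep the example self-contained, and all function names are assumptions of this sketch.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs; fold k is held out in round k."""
    if not 1 < n_folds <= n_samples:
        raise ValueError("need 1 < n_folds <= n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


def cross_validate(xs, ys, n_folds, fit, predict):
    """Return the mean squared validation error averaged over the folds."""
    errors = []
    for train, val in k_fold_indices(len(xs), n_folds):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        sq = [(predict(model, xs[i]) - ys[i]) ** 2 for i in val]
        errors.append(sum(sq) / len(sq))
    return sum(errors) / len(errors)


# Hypothetical placeholder model: always predict the mean training target.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model
```

Note that this sketch splits the samples in their stored order without shuffling. For time-ordered traffic data, whether a fold may contain samples that precede the training data is a design choice that should be made explicitly.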

DL is a branch of ML which aims to construct a computational model with multiple processing layers to support high-level data abstraction. It can automatically extract features from data, without human intervention, to explore hidden relationships among different attributes of the dataset [37]. The concepts of DL are inspired by the thinking process of the human brain. Hence, the majority of DL architectures use the framework of the Artificial Neural Network (ANN), which consists of input, hidden and output layers with nonlinear computational elements (neurons and processing units). The network depth (the number of layers) can be adjusted according to the feature dimensions and complexity of the data. The number of neurons at the input layer is equal to the number of independent variables, while the number of neurons at the output layer is equal to the number of dependent variables, which can be single or multiple. Neurons of two successive layers are connected by weights, which are updated while training the model. The neurons at each layer receive the output of the previous layer; a weighted summation over these inputs is passed to an activation function to generate the neuron output (**Figure 1**).

#### **Figure 1.**

*(left) ANN with one input layer, two hidden layers and one output layer,* **W***,* **V***, θ: Weighting matrices between IL and HL1, HL1 and HL2, HL2 and OL; (right) an illustration of an output generation.*


Let us take the four-layer ANN in **Figure 1** as an example. During the training process, the value of the *k*-th neuron at the output layer can be calculated by

$$o\_k = f\left(\sum\_{j=1}^{J} \theta\_{k,j} z\_j + b\_k\right) \tag{1}$$

$$z\_j = f\left(\sum\_{m=1}^{M} v\_{j,m} y\_m + b\_j\right) \tag{2}$$

$$y\_m = f\left(\sum\_{n=1}^{N} w\_{m,n} x\_n + b\_m\right) \tag{3}$$

where *f* is the activation function, *x*<sub>*n*</sub> is the independent variable at the *n*-th neuron of the input layer, *y*<sub>*m*</sub> and *z*<sub>*j*</sub> are the outputs of the *m*-th neuron of hidden layer 1 and the *j*-th neuron of hidden layer 2, *b*<sub>*k*</sub>, *b*<sub>*j*</sub> and *b*<sub>*m*</sub> are bias values, and *N*, *M* and *J* are the numbers of neurons at the input layer, hidden layer 1 and hidden layer 2, respectively. Then, the unknown parameters *W*, *V* and *θ* are adjusted to minimize a loss function such as the MSE (Mean Squared Error) by back propagation algorithms [38, 39].
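The layer-by-layer computation of Eqs. (1)–(3) can be sketched as a forward pass in plain Python. The sigmoid activation, the argument names and the `mse` helper are assumptions made for illustration; the chapter does not fix a particular activation function.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def dense(weights, bias, inputs, f=sigmoid):
    """One layer: f(sum_n weights[m][n] * inputs[n] + bias[m]) for each neuron m."""
    return [f(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, bias)]


def forward(x, W, b_m, V, b_j, theta, b_k):
    """Forward pass through input layer, two hidden layers and output layer."""
    y = dense(W, b_m, x)        # Eq. (3): input layer -> hidden layer 1
    z = dense(V, b_j, y)        # Eq. (2): hidden layer 1 -> hidden layer 2
    o = dense(theta, b_k, z)    # Eq. (1): hidden layer 2 -> output layer
    return o


def mse(predicted, target):
    """Mean squared error, the loss that training would minimize."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
```

Here `W` has *M* rows of *N* weights, `V` has *J* rows of *M* weights and `theta` has one row of *J* weights per output neuron. Back propagation, which adjusts these weights from the gradient of the loss, is not shown.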

#### **2.2 Fundamental structure of CNN and LSTM**

In this section, we examine two popular DL architectures, CNN and LSTM, which are widely used for multidimensional and time-sequential datasets. CNNs have been extensively applied in various fields, including traffic flow prediction [14, 40, 41], computer vision [42] and face recognition [43], while LSTMs are special kinds of RNNs that are mainly applied to temporal data processing, such as traffic state prediction [34, 44], speech processing [45] and NLP (Natural Language Processing) [46].

The significant difference between a fully connected ANN and a CNN is that CNN neurons are only connected to a small subset of the input, which decreases the total number of parameters in the network [47]. CNNs can extract important and distinctive features from multidimensional data by making use of filtering operations. A commonly used type of CNN, which is similar to the multi-layer perceptron (MLP), consists of numerous convolution layers followed by pooling layers and fully connected layers. The CNN structure is illustrated in **Figure 2**, where it consists of the input layer, convolution layer, pooling layer and fully connected layer. The convolution layer outputs a higher abstraction of the features. Each convolution layer uses several filters, each designed to have a distinct set of weights. The filters used by the convolution layer have smaller dimensions than the data. In the training phase, the filter weights are automatically determined according to the assigned task. The filters of each convolution layer are applied across the input by computing the sum of the products of input and filter, leading to one feature map per filter. Each feature map detects a distinct high-level feature, which is then processed by a pooling layer and a fully connected layer. The ReLU activation function is applied to remove all negative values in the feature map.
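The filtering, ReLU and pooling steps described above can be sketched for a single filter as follows. The filter weights would normally be learned; the max-pooling variant, the pooling size and the function names are assumptions of this sketch.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Replace every negative value in the feature map with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]
```

A 2 × 2 filter has only four weights regardless of the input size, which illustrates why a convolution layer needs far fewer parameters than a fully connected layer over the same input.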

The benefits of CNNs over other statistical learning methods and DL methods are discussed in [48].


CNNs and other kinds of ANNs such as MLP are not designed for sequences and time series data because they do not have a memory element. In such cases, RNNs can deliver more accurate results. RNNs are widely used in traffic state prediction because traffic data have spatiotemporal characteristics, whose temporal dependencies cannot be captured by CNNs or other memoryless ANNs. The RNN structure is illustrated in **Figure 3**: RNNs involve an internal memory element that memorizes the previous output. The current output *h*<sub>*t*</sub> is based not only on the present input *x*<sub>*t*</sub> but also on the previous output *h*<sub>*t*−1</sub>, which can be expressed by

$$h\_t = f\_h(W\_i x\_t + W\_r h\_{t-1} + b\_h) \tag{4}$$

$$y\_t = f\_t(W\_o h\_t + b\_y) \tag{5}$$

where *W*<sub>*i*</sub>, *W*<sub>*r*</sub> and *W*<sub>*o*</sub> are respectively the weighting matrices for the current input vector *x*<sub>*t*</sub>, the previous output vector *h*<sub>*t*−1</sub> and the current output vector *h*<sub>*t*</sub>, *b*<sub>*h*</sub> and *b*<sub>*y*</sub> are the bias vectors, and *f*<sub>*h*</sub> and *f*<sub>*t*</sub> are the activation functions.

**Figure 3.** *RNN structure.*
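The recurrence of Eqs. (4) and (5) can be sketched as follows. The choice of tanh for *f*<sub>*h*</sub> and of a linear *f*<sub>*t*</sub>, as well as the function names, are assumptions of this sketch.

```python
import math


def matvec(matrix, vector):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_step(x_t, h_prev, W_i, W_r, b_h):
    """Eq. (4) with f_h = tanh: h_t = tanh(W_i x_t + W_r h_{t-1} + b_h)."""
    a = matvec(W_i, x_t)
    r = matvec(W_r, h_prev)
    return [math.tanh(ai + ri + bi) for ai, ri, bi in zip(a, r, b_h)]


def rnn_output(h_t, W_o, b_y):
    """Eq. (5) with a linear f_t: y_t = W_o h_t + b_y."""
    return [v + b for v, b in zip(matvec(W_o, h_t), b_y)]


def run_rnn(xs, h0, W_i, W_r, b_h, W_o, b_y):
    """Process a sequence, carrying the hidden state from step to step."""
    h, ys = h0, []
    for x_t in xs:
        h = rnn_step(x_t, h, W_i, W_r, b_h)
        ys.append(rnn_output(h, W_o, b_y))
    return ys, h
```

Because `h` is fed back at every step, an input observed at one time instant still influences the outputs at later instants, which is the memory effect that a feed-forward network lacks.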


LSTM was first proposed in [49] to overcome the vanishing gradient problem of other RNNs. A typical LSTM network consists of an input layer, a recursive hidden layer and an output layer. In the recursive hidden layer, each neuron is made up of four structures: a forget gate, an input gate, an output gate and a memory block. The state of the memory cell reflects the features of the input, while the three gates can read, update and delete the features stored in the cell. The LSTM structure is illustrated in **Figure 4**.

The past information carried by the cell state *c*<sub>*t*−1</sub> can be regulated by the current input state *x*<sub>*t*</sub>, the previous output *h*<sub>*t*−1</sub> and the gate *σ*, which is usually composed of a sigmoid neural network layer and a pointwise multiplication operation. From **Figure 4**, the forget gate *f*<sub>*t*</sub>, input gate *i*<sub>*t*</sub>, cell state *c*<sub>*t*</sub>, output gate *o*<sub>*t*</sub>, and hidden state *h*<sub>*t*</sub> at the *t*-th time instant can be expressed by

$$\begin{aligned} \mathbf{f}\_{t} &= \sigma(\mathbf{W}\_{f}[\mathbf{h}\_{t-1}, \mathbf{x}\_{t}] + \mathbf{b}\_{f}) \\ \mathbf{i}\_{t} &= \sigma(\mathbf{W}\_{i}[\mathbf{h}\_{t-1}, \mathbf{x}\_{t}] + \mathbf{b}\_{i}) \\ \mathbf{c}\_{t} &= \mathbf{f}\_{t} \odot \mathbf{c}\_{t-1} + \mathbf{i}\_{t} \odot \tanh\left(\mathbf{W}\_{c}[\mathbf{h}\_{t-1}, \mathbf{x}\_{t}] + \mathbf{b}\_{c}\right) \\ \mathbf{o}\_{t} &= \sigma(\mathbf{W}\_{o}[\mathbf{h}\_{t-1}, \mathbf{x}\_{t}] + \mathbf{b}\_{o}) \\ \mathbf{h}\_{t} &= \mathbf{o}\_{t} \odot \tanh\left(\mathbf{c}\_{t}\right) \end{aligned} \tag{6}$$

where *W*<sub>∗</sub> and *b*<sub>∗</sub> are respectively the weighting matrix and bias vector, ∗ denotes the subscripts *f*, *i*, *c* and *o*, and ⊙ is point-wise multiplication. The forget gate decides which information needs attention and which can be ignored. The information from the current input *x*<sub>*t*</sub> and the previous hidden state *h*<sub>*t*−1</sub> is passed through the sigmoid function, which generates values between 0 and 1 that determine how much of each part of the old cell state is retained. The input gate updates the cell state by the following operations: first, values between 0 and 1 are generated by passing the current input *x*<sub>*t*</sub> and previous hidden state *h*<sub>*t*−1</sub> into the second sigmoid function. Then, the same information is passed through the tanh function to generate candidate values, which are regulated by the output of the first operation. Finally, the current cell state *c*<sub>*t*</sub> is updated by the weighted summation of the candidate values and the past cell state *c*<sub>*t*−1</sub>. The output gate determines the value of the next hidden state. First, the current input *x*<sub>*t*</sub> and previous hidden state *h*<sub>*t*−1</sub> are passed into the third sigmoid function. Then, the new cell state *c*<sub>*t*</sub> is passed through the tanh function, and the two results are multiplied point-wise. Based on the final value, the network decides which information the hidden state should carry.

**Figure 4.** *LSTM structure.*

Generally, LSTM can address the vanishing gradient problem that makes network training difficult for long-sequence temporal data. The long-term dependencies in the data can thus be learned to improve the prediction accuracy.
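A single LSTM step following Eq. (6) can be sketched in plain Python as below. The parameter layout (a dictionary mapping each gate name to a weight matrix and bias vector) is a hypothetical convention chosen for this example.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def affine(W, b, v):
    """Return W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One step of Eq. (6); params maps 'f', 'i', 'c', 'o' to (W, b) pairs."""
    hx = list(h_prev) + list(x_t)                        # [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(*params["f"], hx)]   # forget gate
    i = [sigmoid(a) for a in affine(*params["i"], hx)]   # input gate
    g = [math.tanh(a) for a in affine(*params["c"], hx)] # candidate values
    o = [sigmoid(a) for a in affine(*params["o"], hx)]   # output gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

Each weight matrix has one row per hidden unit and as many columns as the concatenated vector `[h_prev, x_t]`. With a strongly negative forget-gate bias, the forget gate is close to 0 and the previous cell state is almost entirely discarded, which illustrates how the gate "deletes" stored features.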

#### **2.3 Traffic state prediction with hybrid DL models**

Although CNN and LSTM have advantages in dealing with traffic data with spatiotemporal dependencies, the complex and non-linear nature of traffic data makes it hard to obtain accurate predictions with a single model [5]. Several studies have proposed that prediction accuracy can be improved by hybrid modeling, such as combining CNN and LSTM [50–53].

The spatial and temporal features can be fully extracted by hybrid models, in which the CNN is used to capture the spatial features of traffic data whereas the LSTM is used to extract the temporal features. Suppose that the traffic state data of *K* locations, {**s**<sub>*i*</sub>}, *i* = 1, ⋯, *K*, in the interval (*t* − *N*, *t* − 1) are used as inputs to predict the traffic states at (*t*, *t* + 1, ⋯, *t* + *h*). The real-time measured data can be arranged into a matrix:

$$\mathbf{S} = \begin{bmatrix} \mathbf{s}\_1 \\ \mathbf{s}\_2 \\ \vdots \\ \mathbf{s}\_K \end{bmatrix} = \begin{bmatrix} \mathbf{s}\_{1,t-N} & \mathbf{s}\_{1,t-N+1} & \cdots & \mathbf{s}\_{1,t-1} \\ \mathbf{s}\_{2,t-N} & \mathbf{s}\_{2,t-N+1} & \cdots & \mathbf{s}\_{2,t-1} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{s}\_{K,t-N} & \mathbf{s}\_{K,t-N+1} & \cdots & \mathbf{s}\_{K,t-1} \end{bmatrix} \tag{7}$$

Note that *s*<sub>*k*,*t*</sub> in the matrix can also be a vector, which may include flow, speed, position, etc. A traffic state such as traffic flow exhibits spatio-temporal characteristics, that is, the traffic state at each location at a certain time instant depends on that of neighboring locations at the current or different time instants.

There are mainly two hybridization manners. The first is to extract spatio-temporal features by concatenating CNN and LSTM, that is, each column of *S* is first input into a CNN model to obtain the high-level spatial feature map, which is then input into an LSTM model to capture the temporal features. The second is to parallelize the CNN and LSTM modeling processes by considering the extracted spatial and temporal features to be of the same importance, that is, the same traffic state data are input into the two models, and the final prediction is obtained by passing the outputs of the two models through an FC (Fully Connected) layer. The structures of the two hybridizations are illustrated in **Figure 5**.

**Figure 5.** *(left) concatenated hybrid model; (right) parallelized hybrid model.*

For concatenated hybrid models, the real-time measured data matrix *S* is firstly parallelized in the time domain; then, a one-dimensional CNN is used to capture high-level spatial features for each channel; finally, the high-level spatial feature map is input into LSTM models to generate the final prediction.

The high-level spatial feature map output by the one-dimensional CNN can be expressed by

$$\mathbf{X} = \begin{bmatrix} \mathbf{x}\_1 \\ \mathbf{x}\_2 \\ \vdots \\ \mathbf{x}\_L \end{bmatrix} = \begin{bmatrix} \mathbf{x}\_{1,t-N} & \mathbf{x}\_{1,t-N+1} & \cdots & \mathbf{x}\_{1,t-1} \\ \mathbf{x}\_{2,t-N} & \mathbf{x}\_{2,t-N+1} & \cdots & \mathbf{x}\_{2,t-1} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{x}\_{L,t-N} & \mathbf{x}\_{L,t-N+1} & \cdots & \mathbf{x}\_{L,t-1} \end{bmatrix} \tag{8}$$

where *x*<sub>*l*,*t*</sub> is the *l*-th high-level feature at the *t*-th time instant and can be obtained as follows

$$x\_{l,t} = f\_l(\mathbf{w}\_t \otimes \mathbf{S}\_t + \mathbf{b}\_t) \tag{9}$$

where **w**<sub>*t*</sub> is the one-dimensional filter with *K* − *L* + 1 coefficients, ⊗ is the convolution operation, **S**<sub>*t*</sub> is the *t*-th column of *S*, **b**<sub>*t*</sub> is the bias and *f*<sub>*l*</sub> denotes the *l*-th activation function. With *K* − *L* + 1 coefficients, a convolution without padding over the *K* entries of **S**<sub>*t*</sub> yields *L* outputs, one per row of *X*. For simplicity, only one convolution layer is displayed in Eq. (9). In practice, multiple convolution and pooling layers can be used according to the data size.
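The column-wise filtering of Eq. (9) can be sketched as follows. As a simplifying assumption, the same filter and bias are shared across all time instants and the sliding product is computed without flipping the filter (as is common in DL libraries); the tanh activation and the function names are also assumptions of this sketch.

```python
import math


def conv1d_valid(signal, kernel):
    """Sliding dot product without padding: len(signal) - len(kernel) + 1 outputs."""
    n_out = len(signal) - len(kernel) + 1
    return [sum(signal[l + k] * kernel[k] for k in range(len(kernel)))
            for l in range(n_out)]


def spatial_features(S, kernel, bias=0.0, f=math.tanh):
    """Filter every column S_t of the K x N matrix S along the location axis."""
    K, N = len(S), len(S[0])
    columns = [[S[k][t] for k in range(K)] for t in range(N)]
    feats = [[f(v + bias) for v in conv1d_valid(col, kernel)] for col in columns]
    L = len(feats[0])
    # Transpose so that X[l][t] follows the L x N layout of Eq. (8).
    return [[feats[t][l] for t in range(N)] for l in range(L)]
```

For example, with *K* = 4 locations and a filter of 2 coefficients, each column produces *L* = 3 features, so the feature map has 3 rows and as many columns as there are time instants.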

To extract the temporal features, the high-level spatial feature vector for single or multiple time instants will be selected for the input of each LSTM, which is denoted as

$$F = \begin{bmatrix} F\_0 & F\_1 & \cdots & F\_{N-1} \end{bmatrix} \tag{10}$$

where *F*<sub>*n*</sub> is the high-level spatial feature map for the *n*-th LSTM network, denoted as

$$\mathbf{F}\_n = \begin{bmatrix} x\_{1,t+n-N} & x\_{1,t+n-N+1} & \cdots & x\_{1,t+n-N+M-1} \\ x\_{2,t+n-N} & x\_{2,t+n-N+1} & \cdots & x\_{2,t+n-N+M-1} \\ \vdots & \vdots & \ddots & \vdots \\ x\_{L,t+n-N} & x\_{L,t+n-N+1} & \cdots & x\_{L,t+n-N+M-1} \end{bmatrix} \tag{11}$$

where *M* is the adjustable input window size. The spatio-temporal features output by the LSTM are denoted as *H* = [*H*<sub>0</sub> *H*<sub>1</sub> ⋯ *H*<sub>*N*−1</sub>], where *H*<sub>*n*</sub> is a *K* × *T* matrix and *T* is the adjustable output window size with *M* + *T* ≤ *N*. Combining **Figure 4** and Eq. (6), the generated spatio-temporal features are iteratively determined by

$$\begin{aligned} \mathbf{f}\_n &= \sigma\left(\mathbf{W}\_{f,F}\,\text{vec}(\mathbf{F}\_n) + \mathbf{W}\_{f,H}\,\text{vec}(\mathbf{H}\_{n-1}) + \mathbf{b}\_f\right) \\ \mathbf{i}\_n &= \sigma\left(\mathbf{W}\_{i,F}\,\text{vec}(\mathbf{F}\_n) + \mathbf{W}\_{i,H}\,\text{vec}(\mathbf{H}\_{n-1}) + \mathbf{b}\_i\right) \\ \mathbf{c}\_n &= \mathbf{f}\_n \odot \mathbf{c}\_{n-1} + \mathbf{i}\_n \odot \tanh\left(\mathbf{W}\_{c,F}\,\text{vec}(\mathbf{F}\_n) + \mathbf{W}\_{c,H}\,\text{vec}(\mathbf{H}\_{n-1}) + \mathbf{b}\_c\right) \\ \mathbf{o}\_n &= \sigma\left(\mathbf{W}\_{o,F}\,\text{vec}(\mathbf{F}\_n) + \mathbf{W}\_{o,H}\,\text{vec}(\mathbf{H}\_{n-1}) + \mathbf{b}\_o\right) \\ \mathbf{H}\_n &= \mathbf{o}\_n \odot \tanh\left(\mathbf{c}\_n\right) \end{aligned} \tag{12}$$

where *W*<sub>∗,*F*</sub> and *W*<sub>∗,*H*</sub> are respectively the weighting matrices for the current input high-level spatial feature matrix and the previous spatio-temporal feature matrix, vec(·) is used for vectorization because *F*<sub>*n*</sub> and *H*<sub>*n*−1</sub> have different sizes, and *σ* and tanh are respectively the sigmoid function and the hyperbolic tangent function with vector input.
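The windowing of Eq. (11) and the vec(·) operation can be sketched as follows. Column index 0 of the feature map is taken to correspond to time *t* − *N*, so window *n* covers the instants *t* + *n* − *N* to *t* + *n* − *N* + *M* − 1. The column-major ordering of `vec` and the restriction to windows that fit entirely inside the available columns are assumptions of this sketch.

```python
def window(X, n, M):
    """Columns n .. n + M - 1 of the L x N feature map X (0-based columns)."""
    return [row[n:n + M] for row in X]


def vec(matrix):
    """Column-major vectorization: stack the columns of the matrix."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]


def all_windows(X, M):
    """Every full-width window; there are N - M + 1 of them."""
    N = len(X[0])
    return [window(X, n, M) for n in range(N - M + 1)]
```

Each vectorized window can then be fed to an LSTM step as the current input, with the vectorized previous output playing the role of the previous hidden state.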

Concatenated hybrid models utilize a one-dimensional CNN to obtain a smaller range of spatial features. In addition, they do not contain a fully connected layer at the output of the LSTM models, and thus they have low learning complexity. However, the temporal features delivered by the LSTM are strongly correlated with the spatial features output by the CNN, which requires some special assumptions about the raw data.

For parallelized hybrid models, the historical data matrix *S* need not be parallelized in the time domain; rather, it is input into a CNN and an LSTM simultaneously to extract the high-level spatial and temporal feature maps independently. Then, the final prediction is generated by merging the high-level spatial and temporal features via a fully connected layer. The high-level spatial feature map can be obtained by filtering *S* via two-dimensional CNNs. Suppose that we utilize a CNN with *L* convolution layers, where the spatial filter for the *l*-th layer is denoted as *w*<sup>*l*</sup><sub>*i*,*j*</sub>, *i* = 0, 1, ⋯, *I* − 1; *j* = 0, 1, ⋯, *J* − 1, and *I* and *J* are the sizes of the spatial filter. Given the historical data matrix *S* in Eq. (7), the output of the *l*-th layer of the *n*-th CNN is obtained by

$$\begin{aligned} o\_{i,j,n}^{0,c} &= s\_{i,j+n} \\ o\_{i,j,n}^{l,c} &= \sigma\left(\sum\_{i'=0}^{I-1} \sum\_{j'=0}^{J-1} w\_{i',j'}^{l}\, o\_{i+i',j+j',n}^{l-1,c} + b\_{i,j}^{l}\right) \\ i &= 0, 1, \cdots, I-1; \; j = 0, 1, \cdots, J-1; \; n = 0, 1, \cdots, N-1 \end{aligned} \tag{13}$$

where *o*<sup>*l*,*c*</sup><sub>*i*,*j*,*n*</sub> represents the (*i*, *j*)-th output of the *l*-th layer of the *n*-th CNN, *σ* is the activation function and *b*<sup>*l*</sup><sub>*i*,*j*</sub> is the (*i*, *j*)-th bias of the *l*-th layer of the CNN.

An LSTM is utilized to obtain the high-level temporal feature map. The output of the *n*-th LSTM can be obtained by Eq. (12) with the input *S*<sub>*n*</sub>, which can be denoted as

$$\mathbf{S}\_n = \begin{bmatrix} \mathbf{s}\_{1,t-N+n} & \mathbf{s}\_{1,t-N+n+1} & \cdots & \mathbf{s}\_{1,t-N+n+M-1} \\ \mathbf{s}\_{2,t-N+n} & \mathbf{s}\_{2,t-N+n+1} & \cdots & \mathbf{s}\_{2,t-N+n+M-1} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{s}\_{K,t-N+n} & \mathbf{s}\_{K,t-N+n+1} & \cdots & \mathbf{s}\_{K,t-N+n+M-1} \end{bmatrix} \tag{14}$$

$$\mathbf{o}\_n^L = \text{vec}(\mathbf{H}\_n) = \begin{bmatrix} o\_{0,n}^L & o\_{1,n}^L & \cdots & o\_{KT-1,n}^L \end{bmatrix}^T, \quad n = 0, 1, \cdots, N-1$$

By applying a fully connected layer to the output of the *L*-th CNN layer *o*<sup>*L*,*c*</sup><sub>*n*</sub> and the output of the *n*-th LSTM *o*<sup>*L*</sup><sub>*n*</sub>, the *n*-th final prediction can be obtained by

$$\mathbf{o}\_n^F = \sigma\left(\mathbf{W}^F \begin{bmatrix} \mathbf{o}\_n^{L,c,T}, \mathbf{o}\_n^{L,T} \end{bmatrix}^T + \mathbf{b}^F\right) = \begin{bmatrix} o\_{0,n}^F & o\_{1,n}^F & \cdots & o\_{KT-1,n}^F \end{bmatrix}^T \tag{15}$$

where *W<sup>F</sup>* and *b<sup>F</sup>* are the weighting matrix and bias vector for the fully connected layer, respectively.

In parallelized hybrid models, the spatial and temporal feature maps are considered to be of the same importance, and thus are extracted independently. The fully connected layer merges the output of CNN and LSTM without any special assumptions about the high-level spatial and temporal features.
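The merging step of Eq. (15) amounts to concatenating the two feature vectors and applying one fully connected layer, which can be sketched as follows. The function name and the use of plain Python lists are assumptions of this sketch, and the weights would normally be learned jointly with the CNN and the LSTM.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def fuse(cnn_features, lstm_features, W_F, b_F):
    """Eq. (15): sigmoid(W^F [o^{L,c}; o^{L}] + b^F) on the concatenated features."""
    merged = list(cnn_features) + list(lstm_features)
    return [sigmoid(sum(w * v for w, v in zip(row, merged)) + b)
            for row, b in zip(W_F, b_F)]
```

Each row of `W_F` holds one weight per entry of the concatenated vector, so the layer can weight spatial and temporal features freely. Since the sigmoid output lies between 0 and 1, the target traffic states would need to be scaled to that range, which is a practical assumption not stated in the text.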

Traffic state has strong periodic features because people tend to repeat similar or identical behaviors in the same time period of different days or on the same day of different weeks; e.g., most people routinely go to work in the morning and go home in the evening during the peak hours [53], and most people routinely go shopping on weekends rather than weekdays. The periodic features can be used as supplementary information to predict the future traffic state. For short-term traffic state prediction, the real-time data only contain the data before the prediction time instant, but the historical data of previous days or weeks contain the full data of that period. This means that traffic state information after the inspected time instant on previous days or weeks can be utilized to predict the state at the inspected time instant. If parallelized hybrid models are used, the complete prediction structure should contain a CNN and an LSTM for the real-time data, and a CNN and a bidirectional LSTM for the historical data, which are connected by a fully connected layer.

The bidirectional LSTM is composed of two independent LSTMs, one forward and one backward, whose inputs are the time series before and after the inspected time instant, respectively. The final prediction of the bidirectional LSTM is obtained by concatenating the outputs of the forward and backward LSTMs. The structure of the bidirectional LSTM is depicted in **Figure 6**.

Suppose that, additionally, we have historical traffic state data *S<sub>d</sub>* on the *d*-th day, which is denoted as

$$\mathbf{S}\_{d} = \begin{bmatrix} s\_{t-N,d} & \cdots & s\_{t-1,d} & \cdots & s\_{t+N,d} \end{bmatrix} = \begin{bmatrix} \mathbf{s}\_{1,t-N,d} & \mathbf{s}\_{1,t-N+1,d} & \cdots & \mathbf{s}\_{1,t+N,d} \\ \mathbf{s}\_{2,t-N,d} & \mathbf{s}\_{2,t-N+1,d} & \cdots & \mathbf{s}\_{2,t+N,d} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{s}\_{K,t-N,d} & \mathbf{s}\_{K,t-N+1,d} & \cdots & \mathbf{s}\_{K,t+N,d} \end{bmatrix}, \quad d = 1, 2, \cdots, D \tag{16}$$

where *D* is the number of previous days. We assume each previous day has data at 2*N* + 1 time instants available for prediction. The input and output of the forward LSTM are given by Eqs. (14) and (15), and the input of the backward LSTM is given by

$$\mathbf{S}\_{n,d,BL} = \begin{bmatrix} \mathbf{s}\_{1,t+N-n-M+1,d} & \mathbf{s}\_{1,t+N-n-M+2,d} & \cdots & \mathbf{s}\_{1,t+N-n,d} \\ \mathbf{s}\_{2,t+N-n-M+1,d} & \mathbf{s}\_{2,t+N-n-M+2,d} & \cdots & \mathbf{s}\_{2,t+N-n,d} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{s}\_{K,t+N-n-M+1,d} & \mathbf{s}\_{K,t+N-n-M+2,d} & \cdots & \mathbf{s}\_{K,t+N-n,d} \end{bmatrix} \tag{17}$$

**Figure 6.** *Bidirectional LSTM structure.*

Using Eq. (12), the output of the *n*-th backward LSTM can be obtained by

$$\boldsymbol{o}\_{n}^{\rm BL} = \text{vec}\left(\boldsymbol{H}\_{n}^{\rm BL}\right) = \begin{bmatrix} \boldsymbol{o}\_{0,n}^{\rm BL} & \boldsymbol{o}\_{1,n}^{\rm BL} & \cdots & \boldsymbol{o}\_{KT-1,n}^{\rm BL} \end{bmatrix}^{T} \tag{18}$$

Then, the *n*-th temporal feature can be obtained by concatenating *o*<sub>*n*</sub><sup>L</sup> and *o*<sub>*n*</sub><sup>BL</sup>.
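The window indexing of Eq. (17) and the final concatenation step can be illustrated with a minimal sketch. It assumes that the historical data of one day are stored as a NumPy array of shape *K* × (2*N* + 1) whose columns run from *t* − *N* to *t* + *N*. The function names `backward_window` and `concat_temporal_feature` are hypothetical, and whether the window is fed to the backward LSTM in reversed order is left open because the text does not specify it.

```python
import numpy as np


def backward_window(S_d, n, M):
    """Return the K x M input of the n-th backward LSTM, following Eq. (17).

    S_d is assumed to have shape (K, 2N + 1); column 0 holds time instant
    t - N and the last column holds t + N.
    """
    K, width = S_d.shape
    end = width - 1 - n      # column that holds time instant t + N - n
    start = end - M + 1      # column that holds time instant t + N - n - M + 1
    if n < 0 or start < 0:
        raise ValueError("window falls outside the stored historical data")
    return S_d[:, start:end + 1]


def concat_temporal_feature(o_forward, o_backward):
    """Concatenate the forward and backward LSTM output vectors."""
    return np.concatenate([np.ravel(o_forward), np.ravel(o_backward)])


if __name__ == "__main__":
    S_d = np.arange(14).reshape(2, 7)   # K = 2 links, N = 3
    print(backward_window(S_d, n=1, M=3))
```

With *K* = 2, *N* = 3, *n* = 1 and *M* = 3, the window covers the columns for *t* + 0, *t* + 1 and *t* + 2, which is the part of the historical day that lies after the inspected time instant.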

#### **3. Reinforcement learning based traffic signal control**

An accurate and efficient traffic state prediction module provides continuous and precise traffic status and vehicle states based on past information. The main task of the traffic signal control module in ITS is to use the current and predicted traffic states to make optimum decisions in real time. The objectives of traffic signal control include minimizing the average waiting time at multiple intersections, reducing traffic congestion and maximizing network capacity. Real-time linear feedback control approaches and MPC (Model-based Predictive Control) have been designed specifically for traffic signal control systems to achieve these targets. The drawback of linear feedback control is that the controller requires the system to remain in the linear region at all times. Although MPC has advantages such as the ability to impose constraints, its main shortcoming is that it needs an accurate dynamic model, which is difficult to obtain for traffic control systems. Data-driven approaches such as DRL (Deep Reinforcement Learning) based traffic control have been widely proposed for ITS in recent years because RL can solve complex control problems and deep learning helps to approximate highly nonlinear functions from complex datasets. In this section, we first briefly review the fundamental principles of RL. Then, we focus on multi-agent DRL based traffic signal control techniques: decentralized multi-agent advantage actor-critic, which can converge to a local optimum and overcome the scalability issue by considering the non-stationarity of the MDP transition caused by policy updates in the neighborhood; and the Nash Q-learning strategy, which can converge to a Nash equilibrium by considering only the competition among agents.

#### **3.1 Overview of reinforcement learning**

Reinforcement Learning (RL) is a promising data-driven approach for decision-making and control in complex dynamic systems. RL is formally built on the Markov Decision Process (MDP), a general mathematical framework for sequential decision-making that consists of five elements [54]: a set of states, a set of actions, a transition function, a reward function and a discount factor.



RL aims to maximize a numerically defined reward by interacting with the environment, learning how to behave without any prior knowledge. In traffic signal control systems, RL is used to find the best control policy *π*<sup>∗</sup> that maximizes the expected cumulative reward *E*[*R<sub>t</sub>* | *s*, *π*] for each state *s*, where the cumulative discounted reward is

$$R\_t = \sum\_{i=0}^{T-1} \gamma^i r\_{t+i} \tag{19}$$

where *γ* is the discount factor that reflects the importance of future rewards and *r<sub>t</sub>* is the *t*-th instantaneous reward. Choosing a larger *γ* means that the agent's actions depend more strongly on future rewards, whereas a smaller *γ* results in actions that mostly care about *r<sub>t</sub>*.
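As a small worked illustration of Eq. (19), the sketch below computes the cumulative discounted reward for a finite reward sequence. The function name `discounted_return` and the example rewards are illustrative assumptions, not part of the chapter.

```python
def discounted_return(rewards, gamma):
    """Cumulative discounted reward of Eq. (19): sum_i gamma**i * r_{t+i}."""
    total = 0.0
    # Accumulate from the last reward backwards so that each step applies
    # one additional factor of gamma to everything that comes later.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    print(discounted_return(rewards, gamma=0.5))  # 1 + 0.5 + 0.25
    print(discounted_return(rewards, gamma=0.0))  # only the first reward counts
```

With *γ* = 0 only the instantaneous reward contributes, while values of *γ* closer to 1 give later rewards almost the same weight as the first one.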

RL can generally be classified into model-based RL, which knows or learns the transition model from state *s<sub>t</sub>* to *s*<sub>*t*+1</sub>, and model-free RL, which explores the environment without knowing or learning a transition model. Model-based RL emphasizes planning, that is, agents can keep track of all the routes in future time instants by predicting the next state and reward. Model-free RL can estimate the optimal policy without using or estimating the dynamics of the environment. In practice, model-free RL estimates either a value function or the policy by interacting with the environment and observing the responses, and can accordingly be classified into value-based RL and policy-based RL. Value-based RL is mostly used for control problems with a discrete state-action space. Q-learning and SARSA are the two main value-based RL techniques, in which the values of state-action pairs (*Q*-values) are stored in a *Q*-table and are learned via the recursive nature of the Bellman equation, utilizing the Markov property [54]:

$$Q^\pi(s\_t, a\_t) = E\_\pi\left[r\_t + \gamma Q^\pi(s\_{t+1}, \pi(s\_{t+1}))\right] \tag{20}$$

where *π*(*a*|*s*) is the probability that action *a* is selected given state *s*, and *Q<sup>π</sup>* is the expected cumulative reward obtained by taking action *a* in state *s*:

$$Q^\pi(s, a) = E[R\_t|s, a, \pi] \tag{21}$$

The stochasticity in Eq. (21) comes from the control policy *π* and the transition probability from *s<sub>t</sub>* to *s*<sub>*t*+1</sub>. Value-based RL updates *Q<sup>π</sup>* with a learning rate 0 < *α* < 1 by

$$Q^{\pi,\mu}(s\_t, a\_t) = Q^{\pi}(s\_t, a\_t) + \alpha \left(y\_t - Q^{\pi}(s\_t, a\_t)\right) \tag{22}$$

where *y<sub>t</sub>* is the Temporal Difference (TD) target for *Q<sup>π</sup>*(*s<sub>t</sub>*, *a<sub>t</sub>*) and can be determined by

$$\begin{aligned} y\_t^{\text{Q-learning}} &= R\_t^{(n)} + \gamma^n \max\_{a\_{t+n}} Q^\pi(s\_{t+n}, a\_{t+n}) \\ y\_t^{\text{SARSA}} &= R\_t^{(n)} + \gamma^n Q^\pi(s\_{t+n}, a\_{t+n}) \\ R\_t^{(n)} &= \sum\_{i=0}^{n-1} \gamma^i r\_{t+i} \end{aligned} \tag{23}$$

The learning rate *α* controls the speed at which *Q<sup>π</sup>*(*s<sub>t</sub>*, *a<sub>t</sub>*) is updated: a larger *α* allows a fast update but may cause oscillation over training epochs, while a smaller *α* tends to preserve the old *Q<sup>π</sup>*(*s<sub>t</sub>*, *a<sub>t</sub>*) and thus may take longer to train.
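The tabular update of Eqs. (22) and (23) can be sketched as follows. This is a minimal illustration under the assumption that the *Q*-table is a NumPy array indexed by integer state and action; the helper names `n_step_return`, `td_target` and `td_update` are hypothetical.

```python
import numpy as np


def n_step_return(rewards, gamma):
    """R_t^(n) in Eq. (23): discounted sum of the next n rewards."""
    return sum(gamma ** i * r for i, r in enumerate(rewards))


def td_target(rewards, gamma, Q, s_next, a_next=None):
    """TD target of Eq. (23).

    If a_next is None, the Q-learning target (max over next actions) is
    returned; otherwise the SARSA target for the action actually taken.
    """
    n = len(rewards)
    bootstrap = np.max(Q[s_next]) if a_next is None else Q[s_next, a_next]
    return n_step_return(rewards, gamma) + gamma ** n * bootstrap


def td_update(Q, s, a, y, alpha):
    """In-place update of Eq. (22): move Q(s, a) towards the target y."""
    Q[s, a] += alpha * (y - Q[s, a])
    return Q


if __name__ == "__main__":
    Q = np.zeros((2, 2))
    Q[1] = [1.0, 3.0]
    y_q = td_target([1.0], 0.9, Q, s_next=1)            # uses max_a Q(1, a)
    y_s = td_target([1.0], 0.9, Q, s_next=1, a_next=0)  # uses Q(1, 0)
    td_update(Q, s=0, a=0, y=y_q, alpha=0.5)
    print(y_q, y_s, Q[0, 0])
```

The only difference between the two targets is the bootstrap term: Q-learning takes the maximum over the next actions, whereas SARSA uses the value of the action selected by the current policy.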

#### *Intelligent Electronics and Circuits - Terahertz, ITS, and Beyond*

Value-based RL does not work well for continuous control problems with an infinite-dimensional action space or for high-dimensional problems, because it is difficult to explore all the states in a large, continuous space and store them in a table. In such cases, policy-based RL can provide better solutions than value-based RL. By treating the policy *π<sub>θ</sub>* as a probability distribution over state-action pairs parameterized by *θ*, the policy parameters *θ* are updated to maximize the objective function *J*(*θ*), which can be written as

$$J(\theta) = E\_{\pi\_{\theta}}[\mathbf{Q}^{\pi\_{\theta}}(s, a) | \theta] = \iint\_{s \in \mathcal{S}, a \in A} \pi\_{\theta}(a|s) \mathbf{Q}^{\pi\_{\theta}}(s, a) ds da \tag{24}$$

The optimum policy parameters *θ* are selected to maximize *J*(*θ*), with

$$\theta\_{\text{opt}} = \underset{\theta}{\text{argmax}}\, J(\theta) \tag{25}$$

Policy-based RL tries to select the optimum actions by using the gradient of the objective function with respect to *θ*, which can be written as

$$\begin{aligned} \nabla\_{\theta} J(\theta) &= \iint\_{s \in \mathcal{S}, a \in A} Q^{\pi\_{\theta}}(s, a) \nabla\_{\theta} \pi\_{\theta}(a|s)\, ds\, da \\ &= \iint\_{s \in \mathcal{S}, a \in A} \pi\_{\theta}(a|s) Q^{\pi\_{\theta}}(s, a) \nabla\_{\theta} \log \pi\_{\theta}(a|s)\, ds\, da \\ &= E\_{\pi\_{\theta}}\left[Q^{\pi\_{\theta}}(s, a) \nabla\_{\theta} \log \pi\_{\theta}(a|s)\right] \end{aligned} \tag{26}$$

where *Q<sup>π<sub>θ</sub></sup>*(*s*, *a*) cannot be determined directly. Thus, the Monte Carlo method is used to sample *Q<sup>π<sub>θ</sub></sup>*(*s*, *a*) from *M* trajectories and take the empirical average

$$\nabla\_{\theta} J(\theta) \approx \frac{1}{M} \sum\_{m=1}^{M} R\_{t\_m} \nabla\_{\theta} \log \pi\_{\theta}(a\_{t\_m}|s\_{t\_m}) = \nabla\_{\theta} \left[ \frac{1}{M} \sum\_{m=1}^{M} R\_{t\_m} \log \pi\_{\theta}(a\_{t\_m}|s\_{t\_m}) \right] \tag{27}$$

where *R*<sub>*t<sub>m</sub>*</sub> is given by Eq. (19). Since no analytical solution can be provided for Eq. (27), the optimum solution of Eq. (25) can be obtained by stochastic gradient ascent with

$$
\theta\_{t+1} = \theta\_t + \alpha R\_t \nabla\_\theta \log \pi\_\theta(a|s) \tag{28}
$$

where *R<sub>t</sub>* is an estimator of the mean of the reward function. Eq. (28) shows that the policy gradient method learns a stochastic policy by updating the parameters *θ* at each iteration. Thus, policy-based RL does not need to implement an explicit exploration-exploitation trade-off: a stochastic policy allows the agent to explore the state space without always taking the same action. However, policy-based RL typically converges to a local optimum rather than a global optimum.
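A single parameter update of the form of Eq. (28) can be sketched for a tabular softmax policy. The softmax parameterization is an assumption made here for illustration (the chapter does not fix a policy class), and the names `softmax`, `grad_log_softmax` and `policy_gradient_step` are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()


def grad_log_softmax(theta, s, a):
    """Gradient of log pi_theta(a|s) for a tabular softmax policy.

    theta[s] holds one logit per action; the gradient is non-zero only in
    row s, where it equals one_hot(a) - pi_theta(.|s).
    """
    grad = np.zeros_like(theta)
    grad[s] = -softmax(theta[s])
    grad[s, a] += 1.0
    return grad


def policy_gradient_step(theta, s, a, R, alpha):
    """One update theta + alpha * R * grad log pi_theta(a|s), as in Eq. (28)."""
    return theta + alpha * R * grad_log_softmax(theta, s, a)


if __name__ == "__main__":
    theta = np.zeros((1, 2))                       # one state, two actions
    theta = policy_gradient_step(theta, s=0, a=1, R=2.0, alpha=0.1)
    print(theta, softmax(theta[0]))
```

A positive return raises the probability of the sampled action and a negative return lowers it, while the policy stays stochastic, which is what provides exploration.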

Actor-critic RL combines the characteristics of policy-based and value-based methods: an actor controls the agent's behavior based on the policy, while a critic evaluates the taken action based on the value function. From Eq. (27), the objective function can be rewritten as

$$J(\theta) \approx \frac{1}{M} \sum\_{m=1}^{M} R\_{t\_m} \log \pi\_{\theta}(a\_{t\_m}|s\_{t\_m}) \tag{29}$$


The loss function for policy and value updating can be respectively defined as

$$\begin{split} L(\theta) &= -\frac{1}{M} \sum\_{m=1}^{M} R\_{t\_m} \log \pi\_{\theta}(a\_{t\_m}|s\_{t\_m}) \\ L(w) &= \frac{1}{M} \sum\_{m=1}^{M} \left(R\_{t\_m} - V\_w(s\_{t\_m})\right)^2 \end{split} \tag{30}$$

where *V<sub>w</sub>*(*s*<sub>*t<sub>m</sub>*</sub>) is the approximation function for *Q<sup>π<sub>θ</sub></sup>*(*s*<sub>*t<sub>m</sub>*</sub>, *a*<sub>*t<sub>m</sub>*</sub>) and *w* is its parameter. Actor-critic RL aims to iteratively find the optimum *θ* and *w* that minimize the loss functions in Eq. (30).

Recall that *R*<sub>*t<sub>m</sub>*</sub> is obtained by taking *T* samples from the stored minibatch with *R*<sub>*t<sub>m</sub>*</sub> = Σ<sub>*i*=0</sub><sup>*T*−1</sup> *γ<sup>i</sup>* *r*<sub>*t<sub>m</sub>*+*i*</sub>; thus, it may have a relatively large bias and variance. Advantage Actor-Critic (A2C) aims to solve the problem by learning the Advantage values *A*<sub>*t<sub>m</sub>*</sub> instead of *R*<sub>*t<sub>m</sub>*</sub>, which are defined by

$$A\_{t\_m} = R\_{t\_m} + \gamma^T V\_{w^-}(s\_{t\_m+T}) - V\_{w^-}(s\_{t\_m}) \tag{31}$$

where *w*<sup>−</sup> is the parameter of the approximation function at the last iteration. Then, the loss function for policy updating can be rewritten as

$$L(\theta) = -\frac{1}{M} \sum\_{m=1}^{M} A\_{t\_m} \log \pi\_{\theta}(a\_{t\_m}|s\_{t\_m}) \tag{32}$$

#### **3.2 Multi-agent deep reinforcement learning based traffic signal control**

A real traffic network consists of multiple signalized intersections, each of which can be considered as an agent. The states of the *i*-th agent (intersection), such as the total number of approaching vehicles, the position and speed of each vehicle, and the vehicle flow and density of the links, are determined not only by the *i*-th action (adjustment of the green-time proportion) but also by other agents' actions. Hence, traffic signal control can be modeled as a cooperative and competitive game, where the learning process must consider other agents' actions to reach a globally optimum solution. When multiple agents are present, the standard MDP is no longer suitable for describing the environment because actions from other agents can influence the state dynamics. In such a case, a Markov game can be defined by the tuple ⟨*N*, {*S<sub>i</sub>*}<sub>*i*∈*N*</sub>, {*A<sub>i</sub>*}<sub>*i*∈*N*</sub>, *T*, {*π*<sub>*θ<sub>i</sub>*</sub>}<sub>*i*∈*N*</sub>, {*r<sub>i</sub>*}<sub>*i*∈*N*</sub>, *γ*⟩, whose elements are the agents, the state space and action space of each agent, the transition function, the policy and reward of each agent, and the discount factor.


The centralized multi-agent RL considers the multi-agent systems as a single-agent system with joint state space *S* and action space *A* and thus it can achieve the global optimum. However, it is infeasible for large-scale traffic control systems because of the extremely high dimension of joint state-action space [32]. Decentralized Multi-Agent deep RL (MARL) can overcome the scalability issue by distributing the global control to each local agent, which is controlled based on the local observed states and communicated states and actions from other agents.

Suppose we have a multi-intersection traffic network, which can be modeled as a graph *G*(*V*, *E*). The size of the graph is the number of intersections, denoted by *N*, and *e<sub>ij</sub>* = 1 if vertices *v<sub>i</sub>* and *v<sub>j</sub>* are neighbors. The neighborhood of *v<sub>i</sub>*, denoted by *Ω<sub>i</sub>*, can be manually selected within a certain coverage limit, and thus the local region is defined by *L<sub>i</sub>* = *Ω<sub>i</sub>* ∪ *v<sub>i</sub>*. Thanks to the advancement in communication technologies for ITS, it is possible to share the instantaneous rewards {*r<sub>i</sub>*}<sub>*i*∈*N*</sub>, states {*s*<sub>*t*,*i*</sub>}<sub>*i*∈*N*</sub>, actions {*a*<sub>*t*,*i*</sub>}<sub>*i*∈*N*</sub> and policies {*π*<sub>*θ<sub>i</sub>*</sub>}<sub>*i*∈*N*</sub> among agents. The objective of MARL is to maximize the total reward function *Q<sup>π<sub>θ</sub></sup>* ≈ Σ<sub>*i*=1</sub><sup>*N*</sup> *Q*<sub>*i*</sub><sup>*π<sub>θ</sub>*</sup>, where *Q*<sub>*i*</sub><sup>*π<sub>θ</sub>*</sup> is the local reward of the *i*-th agent under the given global control policy and can be expressed by

$$\begin{aligned} Q\_{i}\left(s\_{t,i}, a\_{t,i}, \pi\_{\theta\_{i},t}, s\_{t,j=1,\cdots,N,j\neq i}, \pi\_{\theta\_{j},t,j=1,\cdots,N,j\neq i}\right) &= E\left[\sum\_{t=0}^{T-1} \gamma^{t} r\_{t,i} \,\Big|\, s\_{t}, \pi\_{\theta,t}\right] \\ &= E\left[\sum\_{t=0}^{T-1} \gamma^{t} r\_{t,i} \,\Big|\, \underbrace{s\_{t,i}, a\_{t,i}, \pi\_{\theta\_{i},t}}\_{\text{local state and policy}}, \underbrace{s\_{t,j=1,\cdots,N,j\neq i}, \pi\_{\theta\_{j},t,j=1,\cdots,N,j\neq i}}\_{\text{communicated states and policies}}\right] \\ &= E\left[r\_{t,i} + \gamma Q\_{i}\left(s\_{t+1,i}, \pi\_{\theta\_{i}}(s\_{t+1,i}), s\_{t+1,j=1,\cdots,N,j\neq i}, \pi\_{\theta\_{j},t+1,j=1,\cdots,N,j\neq i}\right)\right] \end{aligned} \tag{33}$$

where the global states and policies can be communicated from all other agents in the system as well as from the neighborhood *L<sub>i</sub>*. In multi-agent traffic signal control problems, the summation of the local rewards can only approximate *Q<sup>π<sub>θ</sub></sup>* because of the competition among different intersections, i.e., a decrease of the average vehicle waiting time at the *i*-th intersection may increase the waiting time at the neighboring intersections. Hence, the objective of decentralized MARL can be approximated as the maximization of each local reward function with consideration of the states and policies from the neighborhood, which can be expressed by

$$\theta\_{i}^{\*} = \underset{\theta\_{i}}{\operatorname{argmax}} \left[ Q\_{i}\left(s\_{t,i}, a\_{t,i}, \pi\_{\theta\_{i},t}, s\_{t,j=1,\cdots,N,j\neq i}, \pi\_{\theta\_{j},t,j=1,\cdots,N,j\neq i}\right) \right] \tag{34}$$
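The local region *L<sub>i</sub>* used above can be built from the adjacency matrix of the intersection graph. The sketch below assumes that the coverage limit is expressed as a maximum number of hops, which is only one possible choice since the text leaves the selection rule open; the function name `local_region` is hypothetical.

```python
from collections import deque


def local_region(adjacency, i, max_hops):
    """Return {j: d(i, j)} for every intersection j within max_hops of i.

    adjacency[u][v] is 1 if intersections u and v are neighbors, else 0.
    The result includes i itself at distance 0, mirroring L_i = Omega_i U v_i.
    """
    dist = {i: 0}
    queue = deque([i])
    while queue:                      # breadth-first search from vertex i
        u = queue.popleft()
        if dist[u] == max_hops:       # do not expand beyond the coverage limit
            continue
        for v, connected in enumerate(adjacency[u]):
            if connected and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


if __name__ == "__main__":
    # Four intersections on a line: 0 - 1 - 2 - 3
    adjacency = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print(local_region(adjacency, i=1, max_hops=1))
```

Returning the hop distance together with the membership is convenient because the same distances can later be used to weight the rewards of more distant intersections.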

We assume Eq. (33) has a continuous state-action space, and thus multi-agent A2C can be applied to search for the optimum policy parameters. From Eq. (31), the Advantage value for the *i*-th intersection can be expressed by

$$\begin{aligned} A\_{t\_m, i} &= R\_{t\_m, i} + \gamma^T V\_{w\_i^-}\left(s\_{t\_m + T} \,|\, \pi\_{\theta\_i^-}, \pi\_{\theta\_j^-, j = 1, \cdots, N, j \neq i}\right) - V\_{w\_i^-}\left(s\_{t\_m} \,|\, \pi\_{\theta\_i^-}, \pi\_{\theta\_j^-, j = 1, \cdots, N, j \neq i}\right) \\ &= R\_{t\_m, i} + \gamma^T V\_{w\_i^-}(s\_{t\_m + T, L\_i}) - V\_{w\_i^-}(s\_{t\_m, L\_i}) \end{aligned} \tag{35}$$


where *R*<sub>*t<sub>m</sub>*,*i*</sub> is the weighted sum of *T* samples from the minibatch, and *V*<sub>*w<sub>i</sub>*<sup>−</sup></sub>(*s*<sub>*t<sub>m</sub>*</sub> | *π*<sub>*θ<sub>i</sub>*<sup>−</sup></sub>, *π*<sub>*θ<sub>j</sub>*<sup>−</sup>, *j*=1,⋯,*N*, *j*≠*i*</sub>) is the approximation of the value function for the input sample *s*<sub>*t<sub>m</sub>*</sub> with the function parameter of the last iteration. The *T* + 1 samples are obtained from the same stationary policy [*π*<sub>*θ<sub>i</sub>*<sup>−</sup></sub>, *π*<sub>*θ<sub>j</sub>*<sup>−</sup>, *j*=1,⋯,*N*, *j*≠*i*</sub>], whose entries respectively represent the control policies for the *i*-th intersection and the neighboring intersections at the last iteration, and *s*<sub>*t<sub>m</sub>*,*L<sub>i</sub>*</sub> ≔ {*s*<sub>*t<sub>m</sub>*,*j*</sub>}<sub>*j*∈*L<sub>i</sub>*</sub> represents the states for all neighbors of the *i*-th intersection at *t<sub>m</sub>*. We assume that agents are synchronized, that is, the policy and value updates for all agents happen simultaneously at the end of each episode, and the delay for information exchange among agents is ignored. Therefore, within each episode, the dynamic system can be considered stationary although the trajectory of each agent is influenced by the multiple policies [*π*<sub>*θ<sub>i</sub>*<sup>−</sup></sub>, *π*<sub>*θ<sub>j</sub>*<sup>−</sup>, *j*=1,⋯,*N*, *j*≠*i*</sub>]. Then, the loss functions for policy and value updating can be obtained as

$$\begin{aligned} L(\theta\_i) &= -\frac{1}{M} \sum\_{m=1}^{M} A\_{t\_m, i} \log \pi\_{\theta\_i}(a\_{t\_m, i}|s\_{t\_m, L\_i}) \\ L(w\_i) &= \frac{1}{M} \sum\_{m=1}^{M} \left(R\_{t\_m, i} - V\_{w\_i}(s\_{t\_m, L\_i})\right)^2 \end{aligned} \tag{36}$$

If each agent follows Eqs. (35) and (36) in a decentralized manner, a local optimum policy *π*<sub>*θ<sub>i</sub>*<sup>∗</sup></sub> can be achieved if the other agents achieve the optimum policies *π*<sub>*θ<sub>j</sub>*<sup>∗</sup>, *j*=1,⋯,*N*, *j*≠*i*</sub> within the same episode. However, if *θ<sub>j</sub>*<sup>∗</sup>, *j* = 1, ⋯, *N*, *j* ≠ *i* cannot be achieved or *θ<sub>j</sub>*, *j* = 1, ⋯, *N*, *j* ≠ *i* are updated within the same episode, the policy gradient may be inconsistent across minibatches and thus the convergence to a local optimum cannot be guaranteed, since *A*<sub>*t<sub>m</sub>*,*i*</sub> is conditioned on the changing policies [*π*<sub>*θ<sub>i</sub>*</sub>, *π*<sub>*θ<sub>j</sub>*, *j*=1,⋯,*N*, *j*≠*i*</sub>].

In practice, the information exchange among multiple intersections may not be synchronized and the communication delay should be considered, which causes policies to change within the same episode and thus leads to non-stationarity. Several studies try to stabilize convergence and relieve non-stationarity. Tesauro proposes "Hyper-Q" learning, in which values of mixed strategies rather than base actions are learned and other agents' strategies are estimated from observed actions via Bayesian inference [55]. Foerster et al. include low-dimensional fingerprints, such as the *ε* of *ε*-greedy exploration and the number of updates, as additional inputs.

To relieve non-stationarity, the key is to keep the policies of neighboring agents fixed within one episode. We can apply a DNN to approximate the local policy *π*<sub>*θ<sub>i</sub>*,*t<sub>m</sub>*</sub>(·|*s*<sub>*t<sub>m</sub>*,*L<sub>i</sub>*</sub>) when the size of *L<sub>i</sub>* is relatively large. If we consider the latest sampled policies from the neighbors as additional input of the DNN, besides the current state from *L<sub>i</sub>*, the local policy can be rewritten as

$$\begin{aligned} \pi\_{\theta\_i, t\_m} &= \pi\_{\theta\_i} \left( \cdot \,|\, s\_{t\_m, L\_i}, \pi\_{\theta\_j^-, t\_m - 1, j = 1, \dots, |L\_i|, j \neq i} \right) \\ i &= 1, 2, \dots, N \end{aligned} \tag{37}$$

Then, the loss function for policy updating can be rewritten by

$$L(\theta\_i) = -\frac{1}{M} \sum\_{m=1}^{M} A\_{t\_m, i} \log \pi\_{\theta\_i} \left( a\_{t\_m, i} \,|\, s\_{t\_m, L\_i}, \pi\_{\theta\_j^-, t\_m - 1, j = 1, \dots, |L\_i|, j \neq i} \right) \tag{38}$$

Even if the policies from the neighbors are fixed and are considered as additional input, it is still difficult to approximate *A*<sub>*t<sub>m</sub>*,*i*</sub> given by Eq. (35), and thus convergence to a local optimum cannot be guaranteed. Recall that the total reward function can be approximated as the summation of local reward functions. Thus, a decomposable global reward with a spatial discount factor can be proposed to solve the problem:

$$\tilde{r}\_{t\_m,i} = \sum\_{j \in L\_i \mid d(i,j) = d} \alpha^d r\_{t\_m,j} \tag{39}$$

where *α* is the spatial discount factor with 0 ≤ *α* ≤ 1 and *d* is the distance between the *i*-th agent and the *j*-th agent. The spatial discount factor scales down the reward in spatial order to emphasize the role played by the policy of the local agent. Compared to sharing the same weights across agents, the spatial discount factor is more flexible for the trade-off between greedy control (*α* = 0) and cooperative control (*α* = 1), and is more relevant for estimating the advantage of the local policy. By applying the spatial discount factor to neighboring states, we have

$$
\tilde{s}\_{t\_m, L\_i} = s\_{t\_m, i} \cup \alpha \left[ s\_{t\_m, j} \right]\_{j \in L\_i} \tag{40}
$$
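The spatial discounting of Eqs. (39) and (40) can be sketched as follows. The sketch assumes that the distances *d*(*i*, *j*) within the local region are given as a dictionary and that each local state is a small numeric vector; the function names `spatial_reward` and `spatial_state` and the example values are hypothetical.

```python
import numpy as np


def spatial_reward(rewards, dist, alpha):
    """Eq. (39): sum of alpha**d(i, j) * r_j over all j in the local region.

    dist maps each j in L_i (including i itself at distance 0) to d(i, j).
    """
    return sum(alpha ** d * rewards[j] for j, d in dist.items())


def spatial_state(states, i, dist, alpha):
    """Eq. (40): own state followed by the alpha-scaled neighbor states."""
    neighbors = sorted(j for j in dist if j != i)
    parts = [np.asarray(states[i], dtype=float)]
    parts += [alpha * np.asarray(states[j], dtype=float) for j in neighbors]
    return np.concatenate(parts)


if __name__ == "__main__":
    dist = {0: 0, 1: 1, 2: 2}                  # distances from agent 0
    rewards = {0: -2.0, 1: -4.0, 2: -8.0}      # e.g. negative waiting times
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, spatial_reward(rewards, dist, alpha))
    print(spatial_state({0: [1, 2], 1: [4, 6]}, 0, {0: 0, 1: 1}, 0.5))
```

Setting *α* = 0 keeps only the local reward (greedy control), whereas *α* = 1 sums all rewards in the local region with equal weight (cooperative control).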

Then, the cumulative discounted reward can be obtained by

$$\hat{R}\_{t\_m,i} = \sum\_{k=0}^{T-1} \gamma^{k} \tilde{r}\_{t\_m+k,i} \tag{41}$$

and the local return and Advantage value *A*<sub>*t<sub>m</sub>*,*i*</sub> for the *i*-th agent can be expressed by

$$\begin{aligned} \tilde{R}\_{t\_m,i} &= \hat{R}\_{t\_m,i} + \gamma^T V\_{w\_i^-} \left( \tilde{s}\_{T,L\_i}, \pi\_{\theta\_j^-,T-1,j=1,\cdots,|L\_i|,j\neq i} \right) \\ A\_{t\_m,i} &= \tilde{R}\_{t\_m,i} - V\_{w\_i^-} \left( \tilde{s}\_{t\_m,L\_i}, \pi\_{\theta\_j^-,t\_m-1,j=1,\cdots,|L\_i|,j\neq i} \right) \end{aligned} \tag{42}$$

and Eq. (38) can be rewritten as

$$L(\theta\_i) = -\frac{1}{M} \sum\_{m=1}^{M} \begin{bmatrix} \tilde{R}\_{t\_m, i} - V\_{w\_i^-} \left( \tilde{s}\_{t\_m, L\_i}, \pi\_{\theta\_j^-, t\_m - 1, j = 1, \dots, |L\_i|, j \neq i} \right) \\ \times \log \pi\_{\theta\_i} \left( a\_{t\_m, i} |s\_{t\_m, L\_i}, \pi\_{\theta\_j^-, t\_m - 1, j = 1, \dots, |L\_i|, j \neq i} \right) \end{bmatrix} \tag{43}$$

The loss function for value updating can be expressed as

$$L(w\_i) = \frac{1}{M} \sum\_{m=1}^{M} \left(\tilde{R}\_{t\_m, i} - V\_{w\_i} \left(\tilde{s}\_{t\_m, L\_i}, \pi\_{\theta\_j^-, t\_m - 1, j = 1, \dots, |L\_i|, j \neq i} \right) \right)^2 \tag{44}$$

The decentralized MA2C can overcome the scalability issue and achieve a local optimum (Pareto optimality). How to achieve the global optimum with a decentralized approach when the global reward function is non-convex is a future research direction.

Compared to decentralized MA2C, Nash Q-learning does not consider cooperation among agents and thus it has lower computational complexity but can only achieve the Nash equilibrium. Nash Q-learning aims to find the optimal global control policy *πθ* by iteratively updating agents' actions to maximize their Q functions based on assuming Nash equilibrium behavior, that is:

$$\begin{aligned} Q\_{t+1,i}(s\_{t,i}, a\_{t,1}, \dots, a\_{t,N}) &= (1-\alpha) Q\_{t,i}(s\_{t,i}, a\_{t,1}, \dots, a\_{t,N}) \\ &\quad + \alpha \left[ r\_{t,i} + \gamma Q\_{t,i} \left( s\_{t+1,i}, a\_{t+1,1}^{\*}, \dots, a\_{t+1,N}^{\*} \,\middle|\, \pi^{\*}\_{\theta,t} \right) \right] \\ a\_{t,n} &\leftarrow a\_{t+1,n}^{\*}, \quad t \leftarrow t+1 \end{aligned} \tag{45}$$


**Figure 7.** *Comparison of different multi-agent RL methods for traffic signal control.*

where *Q*<sub>*t*,*i*</sub>(*s*<sub>*t*,*i*</sub>, *a*<sub>*t*,1</sub>, ⋯, *a*<sub>*t*,*N*</sub>) is the Nash Q function at the *t*-th iteration for the *i*-th agent and *π*<sup>∗</sup><sub>*θ*,*t*</sub> is the joint Nash equilibrium strategy at the *t*-th iteration. Under the joint Nash equilibrium strategy, the following relation should be fulfilled:

$$\begin{aligned} &Q\_{t,i} \left( s\_{t+1,i}, a^{\*}\_{t+1,1}, \dots, a^{\*}\_{t+1,N} \,\middle|\, \pi^{\*}\_{\theta,t+1} \right) \geq \\ &Q\_{t,i} \left( s\_{t+1,i}, a^{\*}\_{t+1,1}, \dots, a^{\*}\_{t+1,N} \,\middle|\, \pi^{\*}\_{\theta\_1,t+1}, \dots, \pi^{\*}\_{\theta\_{i-1},t+1}, \pi\_{\theta\_i,t+1}, \pi^{\*}\_{\theta\_{i+1},t+1}, \dots, \pi^{\*}\_{\theta\_N,t+1} \right) \\ &\text{for } i = 1,2,\dots,N \end{aligned} \tag{46}$$

Eqs. (45) and (46) show that at each iteration *t*, agent *i* observes its current state *s*<sub>*t*,*i*</sub> and takes the action that maximizes its Q function based on *s*<sub>*t*,*i*</sub> and the other agents' actions. The update of the *i*-th action causes the actions of agents −*i*, i.e., all agents excluding agent *i*, to be updated. The *t*-th joint Nash equilibrium strategy is not obtained until the joint Nash equilibrium action has converged.
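The update of Eq. (45) can be illustrated for two agents with a deliberately simplified sketch. It assumes that each agent keeps a Q-table over joint actions for the current state and that the stage game has a pure-strategy Nash equilibrium, which is not guaranteed in general (the general method works with mixed strategies). The function names `pure_nash` and `nash_q_update` are hypothetical.

```python
import itertools

import numpy as np


def pure_nash(Q1, Q2):
    """Return the first pure-strategy Nash equilibrium (a1, a2), or None.

    Q1[a1, a2] and Q2[a1, a2] are the Q values of agents 1 and 2 for the
    joint action (a1, a2). A joint action is an equilibrium if neither
    agent can increase its own Q value by deviating unilaterally.
    """
    for a1, a2 in itertools.product(range(Q1.shape[0]), range(Q1.shape[1])):
        best1 = Q1[a1, a2] >= Q1[:, a2].max()   # agent 1 cannot improve
        best2 = Q2[a1, a2] >= Q2[a1, :].max()   # agent 2 cannot improve
        if best1 and best2:
            return a1, a2
    return None


def nash_q_update(Q_i, joint_action, reward, nash_value, alpha, gamma):
    """Eq. (45) for one agent and one state, in place.

    nash_value is the agent's Q value of the next state evaluated at the
    joint Nash equilibrium action.
    """
    Q_i[joint_action] = ((1 - alpha) * Q_i[joint_action]
                         + alpha * (reward + gamma * nash_value))
    return Q_i


if __name__ == "__main__":
    Q1_next = np.array([[3.0, 0.0], [5.0, 1.0]])
    Q2_next = np.array([[3.0, 5.0], [0.0, 1.0]])
    eq = pure_nash(Q1_next, Q2_next)
    Q1 = np.zeros((2, 2))
    nash_q_update(Q1, (1, 1), reward=-2.0, nash_value=Q1_next[eq],
                  alpha=0.5, gamma=0.9)
    print(eq, Q1[1, 1])
```

In this example the equilibrium is worse for both agents than another joint action, which reflects the remark above that Nash Q-learning reaches a Nash equilibrium rather than a cooperative optimum.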

In traffic signal control applications, the state space *S* can include the number of vehicles, the positions, speeds and lane order of vehicles, the phase duration, etc., for the specific intersections; the action space *A* can be the phase switch, phase duration or phase percentage, etc.; and the reward *r* can be based on queue length, average waiting time, cumulative time delay, network capacity, etc. The objective of the multi-agent RL strategy is to minimize the accumulated average waiting time or queue length, etc.
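Because RL maximizes the reward while the stated objective is to minimize waiting time or queue length, one common convention is to use the negative of the measured quantity as the reward. The sketch below shows this under stated assumptions: the function name `intersection_reward`, the per-lane inputs and the weighting parameter `waiting_weight` are hypothetical and not taken from the chapter.

```python
def intersection_reward(queue_lengths, waiting_times, waiting_weight=0.0):
    """Reward for one intersection: negative weighted congestion measure.

    queue_lengths: number of queued vehicles per incoming lane.
    waiting_times: accumulated waiting time (s) per incoming lane.
    waiting_weight: assumed trade-off between the two measures.
    """
    congestion = sum(queue_lengths) + waiting_weight * sum(waiting_times)
    return -congestion


if __name__ == "__main__":
    # Four incoming lanes observed at one control step.
    print(intersection_reward([3, 0, 5, 2], [12.0, 0.0, 30.0, 8.0]))
    print(intersection_reward([3, 0, 5, 2], [12.0, 0.0, 30.0, 8.0],
                              waiting_weight=0.1))
```

With this sign convention, maximizing the cumulative discounted reward is equivalent to minimizing the discounted sum of the chosen congestion measure.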

By conducting a simulation in SUMO for a two-intersection case, we can observe in **Figure 7** that the centralized DQN outperforms the centralized Q-learning in terms of reward value (average waiting time in seconds) and convergence rate (the number of iterations). When the number of agents is small (two, in this case), the centralized methods allow the average waiting time to converge to a local optimum that is better than the Nash equilibrium delivered by Nash Q-learning. However, Nash Q-learning converges faster than the centralized methods.

#### **4. Summary**

In this chapter, we introduced deep learning-based traffic state prediction techniques, which can provide accurate future information for traffic control and decision making. The traffic state data exhibit a strong correlation in the spatial and temporal domains, which can be exploited by applying CNN and LSTM techniques to improve the prediction accuracy. CNN is used to capture high-level spatial features, while LSTM performs well on time-sequential data by extracting high-level temporal features. We first reviewed the fundamentals of deep learning and presented the architectures of CNN and LSTM. Then, we introduced how to combine these two models to form concatenated hybrid models and parallelized hybrid models. Finally, we proposed bidirectional LSTM models to enhance prediction performance by learning additional high-level temporal features from the historical data of previous days.

Furthermore, we introduced the decentralized multi-agent advantage actor-critic technique and Nash Q-learning for traffic signal control applications. We first briefly reviewed the fundamental principles of RL. Then, we focused on multi-agent DRL-based traffic signal control techniques such as decentralized multi-agent advantage actor-critic, which can converge to a local optimum and overcome the scalability issue by considering the non-stationarity of the MDP transition caused by policy updates in the neighborhood.

The main contributions of this chapter can be summarized as follows:


### **Author details**

Shangbo Wang Xi'an Jiaotong Liverpool University, Suzhou, China

\*Address all correspondence to: shangbo.wang@xjtlu.edu.cn

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Gora P, Rüb I. Traffic models for self-driving connected cars. Transportation Research Procedia. 2016;**14**:2207-2216

[2] Calvert S, Schakel WJ, Lint JWC. Will automated vehicles negatively impact traffic flow? Journal of Advanced Transportation. 2017;**2017**:8

[3] Yang Cheng YZ. Connected Automated Vehicle Highway (CAVH): A Vision and Development Report for Large Scale Automated Driving System (ADS) Deployment

[4] Liu Q, Li X, Yuan S, Li Z. Decision-making technology for autonomous vehicles: Learning-based methods, applications and future outlook. 2021

[5] Miglani A, Kumar N. Deep learning models for traffic flow prediction in autonomous vehicles: A review, solutions, and challenges. Vehicular Communications. 2019;**20**:100184

[6] Duan P, Mao G, Yue W, Wang S. A unified STARIMA based model for short-term traffic flow prediction. 21st International Conference on Intelligent Transportation Systems (ITSC). Maui, HI. 2018. pp. 1652-1657

[7] Napiah M, Kamaruddin I. ARIMA models for bus travel time prediction. Journal of the Institution of Engineers Malaysia. 2010;**71**

[8] Kumar SV, Vanajakshi L. Short-term traffic flow prediction using seasonal ARIMA model with limited input data. European Transport Research Review. 2015;**7**:21

[9] Williams BM, Durvasula PK, Brown DE. Urban freeway traffic flow prediction: Application of seasonal autoregressive integrated moving average and exponential smoothing models. Transportation Research Record. 1998;**1644**:132-141

[10] Ahn J, Ko E, Kim EY. Highway traffic flow prediction using support vector regression and Bayesian classifier. 2016 International Conference on Big Data and Smart Computing (BigComp). 2016. pp. 239-244. DOI: 10.1109/BIGCOMP.2016.7425919

[11] Kumar SV. Traffic flow prediction using Kalman filtering technique. Procedia Engineering. 2017;**187**:582-587

[12] Ranjan N, Bhandari S, Zhao HP, Kim H, Khan P. City-wide traffic congestion prediction based on CNN, LSTM and transpose CNN. IEEE Access. 2020;**8**:81606-81620

[13] Ma X, Dai Z, He Z, Ma J, Wang Y, Wang Y. Learning traffic as images: A deep convolutional neural network for large-scale transportation network speed prediction. Sensors. 2017;**17**:818

[14] Yang D, Li S, Peng Z, Wang P, Wang J, Yang H. MF-CNN: Traffic flow prediction using convolutional neural network and multi-features fusion. IEICE Transactions on Information and Systems. 2019;**E102.D**:1526-1536

[15] Bao X, Jiang D, Yang X, Wang H. An improved deep belief network for traffic prediction considering weather factors. Alexandria Engineering Journal. 2021; **60**:413-420

[16] Huang W, Song G, Hong H, Xie K. Deep architecture for traffic flow prediction: Deep belief networks with multitask learning. IEEE Transactions on Intelligent Transportation Systems. 2014;**15**:2191-2201

[17] Lu S, Zhang Q, Chen G, Seng D. A combined method for short-term traffic flow prediction based on recurrent neural network. Alexandria Engineering Journal. 2021;**60**:87-94

[18] Sadeghi-Niaraki A, Mirshafiei P, Shakeri M, Choi S-M. Short-term traffic flow prediction using the modified elman recurrent neural network optimized through a genetic algorithm. IEEE Access. 2020;**8**:217526-217540

[19] Tian Y, Pan L. Predicting short-term traffic flow by long short-term memory recurrent neural network. 2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity). 2015. pp. 153-158. DOI: 10.1109/SmartCity.2015.63

[20] Wang W, Bai Y, Yu C, Gu Y, Feng P, Wang X, et al. A network traffic flow prediction with deep learning approach for large-scale metropolitan area network. 2018 IEEE/IFIP Network Operations and Management Symposium. 2018. pp. 1-9

[21] Wang S, Li C, Yue W, Mao G. Network capacity maximization using route choice and signal control with multiple OD Pairs. In: IEEE Transactions on Intelligent Transportation Systems. Vol. 21. 2020. pp. 1595-1611

[22] Keyvan-Ekbatani M, Kouvelas A, Papamichail I, Papageorgiou M. Exploiting the fundamental diagram of urban networks for feedback-based gating. Transportation Research Part B: Methodological. 2012;**46**:1393-1403

[23] Elouni M, Rakha HA. Weather-tuned network perimeter control - A network fundamental diagram feedback controller approach. 2018 International Conference on Vehicle Technology and Intelligent Transport Systems. 2018. pp. 82-90

[24] Haddad J, Shraiber A. Robust perimeter control design for an urban region. Transportation Research Part B: Methodological. 2014;**68**:315-332

[25] Sirmatel II, Geroliminis N. Economic model predictive control of large-scale urban road networks via perimeter control and regional route guidance. IEEE Transactions on Intelligent Transportation Systems. 2018;**19**:1112-1121

[26] Kouvelas A, Saeedmanesh M, Geroliminis N. A linear formulation for model predictive perimeter traffic control in cities. IFAC-PapersOnLine. 2017;**50**:8543-8548

[27] Aboudolas K, Geroliminis N. Perimeter and boundary flow control in multi-reservoir heterogeneous networks. Transportation Research Part B: Methodological. 2013;**55**:265-281

[28] Mohebifard R, Hajbabaie A. Optimal network-level traffic signal control: A benders decomposition-based solution algorithm. Transportation Research Part B: Methodological. 2019; **121**:252-274

[29] Wu W, Wang Z-J, Chen X-M, Wang P, Li M-X, Ou Y-J-X, et al. A decision-making model for autonomous vehicles at urban intersections based on conflict resolution. Journal of Advanced Transportation. 2021;**2021**:8894563

[30] Lu S, Liu X, Dai S. Incremental multistep Q-learning for adaptive traffic signal control based on delay minimization strategy. 2008 7th World Congress on Intelligent Control and Automation. 2008. pp. 2854-2858. DOI: 10.1109/WCICA.2008.4593378

[31] Shoufeng L, Ximin L, Shiqiang D. Q-learning for adaptive traffic signal control based on delay minimization strategy. 2008 IEEE International Conference on Networking, Sensing and Control. 2008. pp. 687-691

[32] Chu T, Wang J, Codecà L, Li Z. Multi-agent deep reinforcement learning for large-scale traffic signal control. IEEE Transactions on Intelligent Transportation Systems. 2019

[33] Wang T, Cao J, Hussain A. Adaptive traffic signal control for large-scale scenario with cooperative group-based multi-agent reinforcement learning. Transportation Research Part C: Emerging Technologies. 2021;**125**:103046

[34] Wang X, Ke L, Qiao Z, Chai X. Large-scale traffic signal control using a novel multi-agent reinforcement learning. IEEE Transactions on Cybernetics. 2021;**51**(1):174-187

[35] Li Z, Xu C, Zhang G. A deep reinforcement learning approach for traffic signal control optimization. 2021

[36] Guo J, Harmati I. Evaluating semi-cooperative Nash/Stackelberg Q-learning for traffic routes plan in a single intersection. Control Engineering Practice. 2020;**102**:104525

[37] Shickel B, Tighe PJ, Bihorac A, Rashidi P. Deep EHR: A survey of recent advances on deep learning techniques for electronic health record (EHR) analysis. IEEE Journal of Biomedical and Health Informatics. 2018;**22**(5):1589-1604. DOI: 10.1109/JBHI.2017.2767063

[38] Lin C-T, Lee CSG. Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems. USA: Prentice-Hall, Inc.; 1996

[39] Chon T-S, Park Y-S, Kim J-M, Lee B-Y, Chung Y-J, Kim Y. Use of an artificial neural network to predict population dynamics of the forest–pest pine needle gall midge (Diptera: Cecidomyiidae). Environmental Entomology. 2000;**29**:1208-1215

[40] Fouladgar M, Parchami M, Elmasri R, Ghaderi A. Scalable deep traffic flow neural networks for urban traffic congestion prediction. 2017 International Joint Conference on Neural Networks (IJCNN). 2017. pp. 2251-2258

[41] Yu H, Wu Z, Wang S, Wang Y, Ma X. Spatiotemporal recurrent convolutional networks for traffic prediction in transportation networks. 2018 Proceedings of the 2nd International Conference on Computer and Data Analysis (ICCDA). 2018. pp. 28-35

[42] Fang W, Love PED, Luo H, Ding L. Computer vision for behaviour-based safety in construction: A review and future directions. Advanced Engineering Informatics. 2020;**43**:100980

[43] Li H-C, Deng Z-Y, Chiang H-H. Lightweight and resource-constrained learning network for face recognition with performance optimization. Sensors. 2020;**20**

[44] Hassannayebi E, Ren C, Chai C, Yin C, Ji H, Cheng X, et al. Short-term traffic flow prediction: A method of combined deep learnings. Journal of Advanced Transportation. 2021;**2021**: 9928073

[45] Dinler ÖB, Aydin N. An optimal feature parameter set based on gated recurrent unit recurrent neural networks for speech segment detection. Applied Sciences. 2020;**10**

[46] Jagannatha A, Yu H. Structured prediction models for RNN based sequence labeling in clinical text. Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2016. pp. 856-865

[47] Mohammadi M, Al-Fuqaha A, Sorour S, Guizani M. Deep learning for IoT big data and streaming analytics: A survey. IEEE Communication Surveys and Tutorials. 2018;**20**:2923-2960

[48] Alzubaidi L, Zhang J, Humaidi AJ, Al-Dujaili A, Duan Y, Al-Shamma O, et al. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. Journal of Big Data. 2021;**8**:53

[49] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation. 1997;**9**:1735-1780

[50] Duan Z, Yang Y, Zhang K, Ni Y, Bajgain S. Improved deep hybrid networks for urban traffic flow prediction using trajectory data. IEEE Access. 2018;**6**:31820-31827

[51] Liu Y, Zheng H, Feng X, Chen Z. Short-term traffic flow prediction with Conv-LSTM. 2017

[52] Sainath TN, Vinyals O, Senior A, Sak H. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks. 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2015. pp. 4580-4584

[53] Wu Y, Tan H. Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework. ArXiv abs/1612.01022, 2016

[54] Haydari A, Yilmaz Y. Deep reinforcement learning for intelligent transportation systems: A survey. CoRR. 2020;abs/2005.00935

[55] Tesauro G. Extending Q-learning to general adaptive multi-agent systems. 2003 Proceedings of the 16th International Conference on Neural Information Processing Systems (NIPS). 2003. pp. 871-878

#### **Chapter 5**

## Vehicle-To-Anything: The Trend of Internet of Vehicles in Future Smart Cities

*Mingbo Niu, Xiaoqiong Huang and Hucheng Wang*

#### **Abstract**

This chapter includes five parts: the concept of vehicle-to-anything (V2X); introductions to visible light communication (VLC), free-space optical communication (FSO), and terahertz (THz) technology; and a comparison of these technologies. The first part presents the concept of V2X. V2X is the fundamental technology of future smart cars, autonomous driving, and smart transportation systems. Vehicle-to-network (V2N), vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), and vehicle-to-people (V2P) are included in V2X. V2X will lead to a high degree of interconnection of vehicles. The concept of VLC is presented in the second part. Intelligent reflecting surface (IRS) for nano-optics and FSO communication is introduced in the third part. At the same time, IRS keeps pace with the phase in communication links. Prospects of THz in glamorous cities are introduced in the fourth part. These new technologies will lead to trends in the future. A comparison of optical communication technology and applications in V2X is described in the fifth part.

**Keywords:** Vehicle-to-anything, visible light communication, free-space optical communication, smart city, terahertz technology

#### **1. Part I: V2X Introduction**

#### **1.1 What is V2X?**

V2X refers to the realization of a full range of network connections among V2V, V2P, V2I, and V2N with the help of new advances in information and communication technology to improve the level of intelligence and the autonomous driving capabilities of the vehicle. **Figure 1** shows the components of V2X. On one hand, V2X will improve traffic efficiency; on the other hand, it will provide users with intelligent, comfortable, safe, energy-saving, and efficient integrated services. V2X will establish a new direction for the development of automotive technology by integrating global positioning systems, wireless communication, and remote sensing technologies [1]. At the same time, V2X will make manual driving and automatic driving compatible. In the automatic driving mode, the vehicle can select the driving route with the best road conditions by analyzing real-time traffic information. This driving mode can alleviate traffic jams. In addition, through the use of onboard sensors and cameras, the vehicle can perceive the surrounding environment and make rapid adjustments to achieve "zero traffic accidents." For example, if a pedestrian suddenly appears, the car will automatically slow down to a safe speed or stop [2].

**Figure 1.** *The components of V2X.*

The earliest application of V2X was shown on a Cadillac by General Motors in 2006. Since then, other automotive suppliers have begun to study this technology. However, the application of V2X was put on the agenda because of two traffic accidents that occurred in the United States. V2X is the key technology of the future intelligent transportation system (ITS). V2X makes communication between vehicles and base stations easier. A series of messages, such as real-time road conditions, traffic signals, and pedestrian information, can be obtained. These messages can improve driving safety, reduce congestion, and improve traffic efficiency. The purpose of V2X is to reduce accidents, alleviate traffic congestion, reduce environmental pollution, and provide additional information services.

#### **1.2 Motivation**

If a vehicle can act as the driver's second pair of "eyes," it can theoretically reduce the occurrence of traffic accidents caused by driver distraction or low visibility. V2X is the technology that turns the vehicle into the driver's eyes. V2X can detect animals that suddenly run onto the road before the driver notices them. More generally, V2X uses neighboring cars to see traffic signal indicators and alerts the driver when they would otherwise be hard to notice. Compared with the cameras or Lidar commonly used in autonomous driving [3], V2X can break through visual blind spots and see past obstructions to obtain traffic information. At the same time, V2X shares real-time driving status with other vehicles or facilities and decides the driving state of the vehicle immediately through analysis and decision algorithms. In addition, V2X is largely unaffected by extreme weather conditions, such as rain, fog, and strong light exposure. Therefore, V2X is being developed in transportation, especially in the field of autonomous driving.

#### **1.3 V2X development status**

In 2015, the US launched the ITS five-year plan with the theme "change the way in which society moves forward." The main technical goals of the plan are "to realize application of connected vehicles" and "to accelerate autonomous driving."

#### *Vehicle-To-Anything: The Trend of Internet of Vehicles in Future Smart Cities DOI: http://dx.doi.org/10.5772/intechopen.105043*

Six categories of projects are defined in the plan: accelerated deployment, connected vehicles, autonomous driving, emerging capabilities, interoperability, and enterprise data. Connected vehicles, autonomous driving, and emerging capabilities are the three paths of technological development, while interoperability and enterprise data are the cornerstones of ITS development. To promote the further development of V2V and to support subsequent US legislative decisions, the US Department of Transportation has led the "Safety Pilot Demonstration Deployment" project based on V2V and V2I. On the basis of the test and verification of the "Safety Pilot Demonstration Deployment" project, in 2014, the US National Highway Traffic Safety Administration announced the draft of the Vehicle-to-Vehicle Communication Advance Law, and launched the NPRM process in 2016 to mandate V2V communication for light-duty vehicles. The main content includes:


The period from 2022 to 2025 is the deployment and development period of C-V2X industrialization. After 2025, the rapid development of the C-V2X industry will gradually achieve national coverage of C-V2X, and China will build a nationwide multi-level data platform, achieve cross-industry data interconnection, and provide diversified travel services.

#### **1.4 V2N**

V2N refers to the connection of vehicle devices with the network. The network exchanges data with the vehicle, stores and processes the acquired data, and provides the various application services required by the vehicle. V2N communication is mainly used in vehicle navigation, remote vehicle monitoring, emergency rescue, and infotainment services.

#### **1.5 V2V**

V2V refers to communication between vehicles through onboard terminals. The vehicle-mounted terminal can obtain information, such as the speed, location, and driving conditions of surrounding vehicles, in real time. Vehicles can also form an interactive platform to exchange information, such as pictures and videos, in real time. V2V communication is mainly used for avoiding or reducing traffic accidents and for vehicle supervision and management [4].

V2V enables sensors to communicate with neighboring vehicles, and it is more accurate and energy-efficient than any onboard surround sensing system. On closer study, autonomous driving appears less as a solution for moving automated vehicles from A to B than as a network protocol that optimizes traffic parameters and allows all commuters to reach their destination quickly and safely. Wireless (over-the-air) upgrade is another basic autonomous driving function, enabled by V2N. Since autonomous driving is a life-critical application, its software must be kept up to date.

#### **1.6 V2I**

V2I refers to the communication between vehicle-mounted equipment and roadside infrastructure, such as traffic lights, traffic cameras, and roadside units. The roadside infrastructure can also obtain information about vehicles in nearby areas and release various real-time information. V2I communication is mainly used in real-time information services, vehicle monitoring and management, and nonstop toll collection. V2I sensors collect information about traffic and traffic light status, while radar equipment, cameras, and other road signals work as shared nodes to maximize infrastructure throughput. Even objects such as lane markings or road barriers will one day become "smart" and act as V2I communicators. For autonomous driving, this information is critical because the vehicle may rely on stationary object data for certain road events. Vehicles approaching a work area can be notified and slow down. A parking lot can announce the availability of a space the moment the previous occupant leaves [4].

In addition, the vehicle can collect traffic flow data and help the driver choose the best route. Thanks to real-time traffic updates, V2I can reduce fuel consumption. Pre-filtering through V2I and autonomous driving can increase traffic density, which could quadruple the current infrastructure capacity, keep road accidents close to zero, and increase traffic speed.

#### **1.7 V2P**

V2P means that vulnerable traffic groups, including pedestrians and cyclists, use user equipment to communicate with vehicle-mounted devices. V2P communication is mainly used to avoid or reduce traffic accidents. By organically linking "people, vehicles, infrastructure, network" and other elements, V2X can not only help vehicles obtain more information than a single vehicle can perceive and promote the innovation and application of autonomous driving, but also help build a more intelligent transportation system and promote the development of new models and businesses in automobiles and transportation services. V2P is of great significance for improving traffic efficiency, saving resources, reducing pollution, reducing accident rates, and improving traffic management [5]. **Figure 2** shows a typical V2X scenario.

**Figure 2.** *Typical V2X scenario.*

#### **2. Part II: Visible light communication**

#### **2.1 Introduction of VLC**

VLC refers to a type of communication that transmits data by modulating light waves in the visible spectrum (wavelength range from 380 nm to 750 nm). VLC is an emerging technology that realizes data communication by modulating the intensity of the light emitted by a light-emitting diode (LED). Generally speaking, any communication system using visible light can be called VLC. VLC transmits data in a subtle way without affecting the normal lighting environment [6]. VLC is an optical communication technology, that is, it uses light to transmit information. The spectrum sets the physical limits of light transmission; **Figure 3** shows the spectrum used by VLC [7].
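As a rough worked example of the wavelength range quoted above, the following sketch converts 380 nm and 750 nm to frequency using f = c/λ. The function name is illustrative only, and the calculation assumes propagation in vacuum; it is meant to give a sense of scale, not a channel bandwidth figure.

```python
# Speed of light in vacuum (exact SI value), in metres per second.
SPEED_OF_LIGHT = 299_792_458.0


def wavelength_nm_to_thz(wavelength_nm: float) -> float:
    """Convert a vacuum wavelength in nanometres to a frequency in terahertz."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return SPEED_OF_LIGHT / (wavelength_nm * 1e-9) / 1e12


# Edges of the visible band as quoted in the text.
red_edge_thz = wavelength_nm_to_thz(750.0)
violet_edge_thz = wavelength_nm_to_thz(380.0)
band_width_thz = violet_edge_thz - red_edge_thz

if __name__ == "__main__":
    print(f"visible band: about {red_edge_thz:.0f} THz to {violet_edge_thz:.0f} THz")
    print(f"width: about {band_width_thz:.0f} THz")
```

Under these assumptions the visible band spans a few hundred terahertz, which is the basis for the large-bandwidth argument made for VLC.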

Visible light communication usually uses LEDs as the transmitting medium. An LED generates light through electroluminescence in a semiconductor material. Due to their high energy efficiency, durability, and low cost, LED sales have doubled. LEDs have been widely used in various devices, such as smartphones, vehicles, video screens, and signs. The spread of VLC has brought many benefits to the industry. LED bulbs have become the main medium for visible light communication.

#### **2.2 Development of VLC in transportation system**

So far, ITS has relied on radio frequency (RF) communication. However, the last decade has seen a major shift in lighting technology. With the major breakthroughs in optical communication technology and the wide application of LEDs in indoor and outdoor lighting, VLC has become a feasible communication technology, making vehicular VLC (V-VLC) possible in ITS [8].

A major advantage of VLC is the use of existing infrastructure to provide communication services. Data and energy can be transmitted simultaneously through LED. That is, energy transmitted does not increase the cost [9].

One advantage of visible light over radio frequencies is the size of the frequency spectrum. The allocation of frequencies in the radio frequency band of the electromagnetic spectrum is greatly limited, regulated by each country, and coordinated through international telecommunications agencies. Light, however, is a different matter. The spectrum of visible light is completely free, which opens up different commercial and academic possibilities [9].

**Figure 3.** *The frequency band of visible light in the electromagnetic spectrum [7].*

Because of its propagation properties, light has a security advantage over radio waves. RF for vehicle-mounted networks propagates nondirectionally, covers relatively long communication distances, and can penetrate objects. RF has been well studied over the past few decades and the technology is quite mature. However, potential security attacks, such as jamming, eavesdropping, and man-in-the-middle attacks, raise concerns about its use in security-critical on-board networking applications. Light, on the other hand, does not behave this way. Light is highly directional, and its physical properties provide a more secure environment for communication systems [10].

Finally, one of the main advantages of light is the wave's high frequency, which allows for very high data rate communication. The large amount of bandwidth available in the visible spectrum allows for huge potential data rates. Currently, in terms of Wi-Fi, the highest data rates achieved in standard Wireless Gigabit are close to 1 Gbps [11]. Due to the high frequency of light waves, VLC research has yielded impressive results, with speeds of up to 100 Gbps [12].

#### **2.3 VLC modulation**

VLC uses intensity modulation with direct detection (IM/DD): it communicates data by modulating the LED intensity faster than the human eye can perceive. Compared with traditional RF, VLC offers superior speed, efficiency, and security at low cost. VLC fulfills a dual purpose of lighting and high-speed data communication. According to the characteristics of the different schemes, modulation in visible light communication is divided into single carrier modulation, multi-carrier modulation, and color gamut-based modulation.

#### *2.3.1 Single carrier modulation*

Single carrier modulation transmits all data signals on a single carrier. It avoids the high peak-to-average power ratio of multi-carrier systems, that is, the large ratio of maximum instantaneous power to average power. The technology is more mature and the system has higher stability. For point-to-multipoint communication systems, single carrier modulation makes frequency and time synchronization easier to design and improves system stability. A single carrier modulation system thus provides a point-to-multipoint wireless communication solution with high efficiency, high flexibility, and high stability.

Common single-carrier modulation schemes include on-off keying (OOK) and pulse position modulation (PPM). OOK is a simple form of amplitude shift keying. Because it is simple and easy to implement, it is widely used in application scenarios with low and medium data rate demands. Although recent research has achieved data transmission at 1250 Mbit/s over a distance of 1 m [13], this result cannot be widely applied because of the limited transmission distance.

PPM was developed as an alternative modulation technique to improve the anti-interference capability of information transmission [14]. Considering bit error rate performance, bandwidth requirements, optical power, and optical implementation complexity, PPM is a viable candidate for VLC. Compared to OOK, pulse position modulation suffers less from noise because the amplitude and width of the pulse are constant during modulation. In pulse position modulation, noise removal and separation are very easy. Due to the constant pulse amplitude and width, the power consumption is also very low compared with other modulation methods.
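The two single-carrier schemes described above can be sketched in a few lines. This is a minimal illustration, not an implementation from the cited works: the function names, the slot representation (a list of 0/1 LED states), and the choice of 2 bits per PPM symbol are all assumptions made for the example.

```python
from typing import List, Sequence


def ook_modulate(bits: Sequence[int]) -> List[int]:
    """On-off keying: one time slot per bit, LED on for 1 and off for 0."""
    return [1 if b else 0 for b in bits]


def ppm_modulate(bits: Sequence[int], bits_per_symbol: int = 2) -> List[int]:
    """M-ary PPM with M = 2**bits_per_symbol slots and exactly one pulse per symbol.

    The pulse position inside the symbol encodes the bit group's value.
    """
    if len(bits) % bits_per_symbol:
        raise ValueError("bit count must be a multiple of bits_per_symbol")
    slots_per_symbol = 1 << bits_per_symbol
    slots: List[int] = []
    for i in range(0, len(bits), bits_per_symbol):
        value = int("".join(str(b) for b in bits[i:i + bits_per_symbol]), 2)
        symbol = [0] * slots_per_symbol
        symbol[value] = 1
        slots.extend(symbol)
    return slots


def ppm_demodulate(slots: Sequence[float], bits_per_symbol: int = 2) -> List[int]:
    """Pick the strongest slot in each symbol, so small amplitude noise is tolerated."""
    slots_per_symbol = 1 << bits_per_symbol
    bits: List[int] = []
    for i in range(0, len(slots), slots_per_symbol):
        symbol = slots[i:i + slots_per_symbol]
        value = max(range(len(symbol)), key=lambda k: symbol[k])
        bits.extend(int(c) for c in format(value, f"0{bits_per_symbol}b"))
    return bits


if __name__ == "__main__":
    data = [1, 0, 0, 1, 1, 1]
    print("OOK:", ook_modulate(data))
    print("PPM:", ppm_modulate(data))
```

Because every PPM symbol carries exactly one pulse of fixed amplitude and width, the demodulator only has to locate the strongest slot, which is one way to read the noise-robustness argument in the text.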

In the past few decades, PPM has received extensive attention, and research has been extended to various forms, such as differential pulse position modulation (DPPM), digital pulse interval modulation, multi-pulse position modulation, overlapping pulse position modulation (OPPM), and pulse rate modulation. DPPM is a simple improvement of PPM: deleting all the "0" time slots that follow the "1" slot in each PPM symbol yields the corresponding DPPM signal. Compared to PPM, DPPM does not have strict symbol synchronization requirements and, more importantly, it provides higher power utilization and bandwidth utilization. However, the bit error rate in DPPM is higher than that in PPM [15]. The main disadvantage of this scheme is that the pulse width is very short; high-order *M*-ary PPM in VLC can improve the power and bandwidth efficiency [16]. Adjusting the PPM modulation index can improve system power by 1 dB to 2.5 dB by reducing the average bit error rate (BER) [16]. OPPM offers key advantages over other existing PPM schemes, such as greater sensitivity and smaller bandwidth expansion [17]. A priority decoding OPPM error correction scheme has also been proposed, which can significantly improve the system's BER without affecting the system bandwidth [14].
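The relationship between PPM and DPPM described above (drop the empty slots that follow the pulse) can be sketched as follows. The function names and the 2-bits-per-symbol default are assumptions for illustration; each bit group of value k is sent as k empty slots followed by one pulse.

```python
from typing import List, Sequence


def dppm_modulate(bits: Sequence[int], bits_per_symbol: int = 2) -> List[int]:
    """DPPM: a PPM symbol with the empty slots after the pulse removed.

    A bit group with value k becomes k zeros followed by a single 1,
    so symbols have variable length (k + 1 slots).
    """
    if len(bits) % bits_per_symbol:
        raise ValueError("bit count must be a multiple of bits_per_symbol")
    slots: List[int] = []
    for i in range(0, len(bits), bits_per_symbol):
        value = int("".join(str(b) for b in bits[i:i + bits_per_symbol]), 2)
        slots.extend([0] * value + [1])
    return slots


def dppm_demodulate(slots: Sequence[int], bits_per_symbol: int = 2) -> List[int]:
    """Count the empty slots before each pulse; the pulse itself ends the symbol."""
    bits: List[int] = []
    zeros = 0
    for slot in slots:
        if slot:
            bits.extend(int(c) for c in format(zeros, f"0{bits_per_symbol}b"))
            zeros = 0
        else:
            zeros += 1
    return bits


if __name__ == "__main__":
    data = [1, 0, 0, 1, 1, 1]
    tx = dppm_modulate(data)
    # The same data needs 3 * 4 = 12 slots in 4-ary PPM.
    print(len(tx), "DPPM slots:", tx)
```

In this sketch each pulse marks the end of a symbol, which is why no separate symbol clock is needed, while the shorter average symbol length reflects the improved bandwidth utilization mentioned in the text. A single missed or spurious pulse shifts the following symbol boundary, which is consistent with the higher bit error rate noted for DPPM.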

PPM still has some disadvantages, however. It requires synchronization between transmitter and receiver, which is not always possible; it needs dedicated channels; and, like pulse-amplitude modulation, it requires high transmission bandwidth and special equipment. In addition, single carrier modulation is commonly subject to inter-symbol interference (ISI) during high-speed data transmission, which means that new modulation techniques are required.

#### *2.3.2 Multi-carrier modulation*

With the increase in VLC network data rates, multi-carrier modulation (MCM) was developed to address ISI during high-speed data transmission. MCM divides the transmitted data stream into several components sent over different subchannels. Under ideal propagation conditions, the subchannels are usually orthogonal, and the number of subcarriers is chosen so that the bandwidth of each subchannel is lower than the coherence bandwidth of the channel; each subchannel then experiences approximately flat fading and is relatively immune to frequency-selective fading. Compared with incoherent modulation, MCM has lower energy efficiency and higher bandwidth efficiency. Common multi-carrier modulation schemes include subcarrier intensity modulation (SIM) and orthogonal frequency division multiplexing (OFDM), which offer high spectral efficiency and resilience to channel impairments. However, SIM is mostly used in the study of FSO, so here we focus on OFDM [18].

OFDM modulation solves the multi-user problem by dividing the parallel data stream into different narrowband channels at different frequencies. However, most VLC systems use IM/DD, which requires the electrical signal to be real and non-negative, so baseband OFDM cannot be applied directly. The improved OFDM schemes for VLC include direct current (DC) biased optical OFDM (DCO-OFDM) and asymmetrically clipped optical OFDM (ACO-OFDM). In the DCO-OFDM system, a DC offset is added to the normal OFDM symbol to reduce the signal distortion and noise caused by clipping negative values. In ACO-OFDM, only odd-indexed subcarriers are modulated, and the negative signal is clipped to zero during transmission. Compared with ACO-OFDM, DCO-OFDM has lower power efficiency but higher spectral efficiency. As the modulation order increases, the BER performance of ACO-OFDM is about 4.5 dB better than that of DCO-OFDM at a BER of 10<sup>-3</sup> [19]. In the case of small bias, the BER of ACO-OFDM with the 16 quadrature amplitude modulation (QAM) format is lower than that of ADO-OFDM with the 4 QAM format.
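The difference between DCO-OFDM and ACO-OFDM can be made concrete with a small sketch. It assumes a toy frame of N = 16 subcarriers, uses a naive O(N²) DFT to stay dependency-free, and introduces hypothetical helper names (`hermitian_frame`, `dco_ofdm`, `aco_ofdm`); it is an illustration of the principle, not the system evaluated in [19]. Hermitian symmetry makes the time signal real; DCO then adds a bias, while ACO loads only odd subcarriers and clips at zero.

```python
import cmath
from typing import List, Sequence


def idft(freq: Sequence[complex]) -> List[complex]:
    """Naive inverse DFT with 1/N scaling (adequate for a small illustrative N)."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def dft(time: Sequence[float]) -> List[complex]:
    """Naive forward DFT without scaling, so dft(idft(X)) returns X."""
    n = len(time)
    return [
        sum(time[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def data_carriers(n: int, odd_only: bool) -> List[int]:
    """Subcarriers 1 .. n/2 - 1 carry data; ACO uses only the odd ones."""
    return [k for k in range(1, n // 2) if not odd_only or k % 2 == 1]


def hermitian_frame(symbols: Sequence[complex], n: int, odd_only: bool) -> List[complex]:
    """Place symbols on the data carriers and mirror their conjugates (real output)."""
    carriers = data_carriers(n, odd_only)
    if len(symbols) != len(carriers):
        raise ValueError(f"expected {len(carriers)} symbols, got {len(symbols)}")
    frame = [0j] * n
    for k, s in zip(carriers, symbols):
        frame[k] = complex(s)
        frame[n - k] = complex(s).conjugate()
    return frame


def dco_ofdm(symbols: Sequence[complex], n: int, bias: float) -> List[float]:
    """DCO-OFDM: add a DC bias, then clip whatever is still negative."""
    x = [v.real + bias for v in idft(hermitian_frame(symbols, n, odd_only=False))]
    return [max(v, 0.0) for v in x]


def aco_ofdm(symbols: Sequence[complex], n: int) -> List[float]:
    """ACO-OFDM: odd subcarriers only, negative half clipped to zero, no bias."""
    x = [v.real for v in idft(hermitian_frame(symbols, n, odd_only=True))]
    return [max(v, 0.0) for v in x]


if __name__ == "__main__":
    qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    tx = aco_ofdm(qpsk, 16)
    rx = dft(tx)
    # Clipping halves the odd-subcarrier amplitudes; the distortion lands on even ones.
    print([2 * rx[k] for k in data_carriers(16, odd_only=True)])
```

In this toy frame DCO-OFDM carries 7 symbols per 16 samples against 4 for ACO-OFDM, which illustrates the spectral-efficiency advantage of DCO, while ACO avoids spending optical power on a bias.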

#### *2.3.3 Color-shift keying based on color gamut*

Color-shift keying (CSK) is a visible light communication intensity modulation scheme proposed in IEEE 802.15.7, which sends signals through the color intensities emitted by red, green, and blue (RGB) light-emitting diodes. CSK signal points can be represented by a combination of RGB intensities corresponding to the transmitted data. The flicker of the light source is reduced by keeping the total emission intensity constant. Due to its unique advantages in preventing flicker and light intensity fluctuation, research on CSK in vehicle-mounted VLC has attracted more and more attention in recent years.

An RGB LED consists of three LEDs in one package and produces white light through a combination of red, green, and blue outputs. Although more costly, an RGB LED can produce any perceived lighting color and can increase VLC data throughput because each color acts as a separate communication band. The perceived lighting color can be modified while achieving higher spectral efficiency.
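The constant-total-intensity idea behind CSK can be sketched as below. The four-point constellation is a hypothetical choice for illustration (three pure colors plus their equal mix) and is not the constellation defined in IEEE 802.15.7; the function names are likewise assumptions.

```python
from typing import Dict, List, Sequence, Tuple

RGB = Tuple[float, float, float]

# Hypothetical 4-point constellation: every point's intensities sum to 1,
# so the total emitted intensity does not depend on the data.
CONSTELLATION: Dict[Tuple[int, int], RGB] = {
    (0, 0): (1.0, 0.0, 0.0),
    (0, 1): (0.0, 1.0, 0.0),
    (1, 0): (0.0, 0.0, 1.0),
    (1, 1): (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0),
}


def csk_modulate(bits: Sequence[int], total_intensity: float = 1.0) -> List[RGB]:
    """Map each bit pair to an RGB intensity triple with a constant sum."""
    if len(bits) % 2:
        raise ValueError("bit count must be even")
    symbols: List[RGB] = []
    for i in range(0, len(bits), 2):
        r, g, b = CONSTELLATION[(bits[i], bits[i + 1])]
        symbols.append((r * total_intensity, g * total_intensity, b * total_intensity))
    return symbols


def csk_demodulate(samples: Sequence[RGB]) -> List[int]:
    """Normalise each received triple and choose the nearest constellation point."""
    bits: List[int] = []
    for sample in samples:
        total = sum(sample)
        if total <= 0:
            raise ValueError("received sample carries no light")
        chroma = [c / total for c in sample]
        nearest = min(
            CONSTELLATION,
            key=lambda key: sum((a - b) ** 2 for a, b in zip(chroma, CONSTELLATION[key])),
        )
        bits.extend(nearest)
    return bits


if __name__ == "__main__":
    tx = csk_modulate([0, 0, 1, 1, 1, 0, 0, 1], total_intensity=2.0)
    print([round(sum(t), 6) for t in tx])  # same total intensity for every symbol
    print(csk_demodulate(tx))
```

Because the receiver normalizes by the total received intensity before deciding, this sketch decodes on color ratio alone, which mirrors the point that the data is carried by color rather than by brightness.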

#### **3. Part III: Free-space optical communication**

#### **3.1 FSO introduction**

FSO, also known as optical wireless communication, uses the atmosphere between the transmitter and receiver as the propagation medium, and an FSO communication link is a line-of-sight (LOS) link. It can be used on various platforms, such as satellites, ships, airplanes, and other stationary or moving platforms in space and the atmosphere. Due to its unregulated spectrum, inherent security, high data rate, and wide bandwidth, FSO is considered a supplement to radio frequency communication [20]. Although optical communication has advantages that traditional communication links cannot match, its propagation medium cannot be controlled or adjusted, so optical communication systems are affected by atmospheric phenomena. The main challenges of FSO are narrow beamwidth, scattering of the transmitted signal, and scintillation: atmospheric turbulence causes the received signal intensity to fluctuate [21]. FSO works on the line-of-sight principle. For continuous data transmission, the LOS path followed by the light beam must be straight. FSO is a combination of wireless technology and optical technology. The main factor to consider is the optimal bandwidth of the light beam used to transmit information signals, such as audio and video. Free-space technology is an older technology used for lower data rate communications. RF is limited by its bandwidth. Li-Fi technology, which uses lighting to transmit data, was proposed in optical communication [22]. Wireless optical communication technology was developed by the National Aeronautics and Space Administration and used for military purposes with high-speed communication links [23].

In recent years, FSO links have received extensive attention for ground-to-satellite transmission and last-mile applications due to their high capacity and easy implementation. However, atmospheric turbulence can cause random fluctuations in the amplitude and phase of the received signal, which limits the application of FSO links. Multiple-input multiple-output (MIMO), adaptive optics, and fiber laser phased array (FLPA) are important ways to suppress the effects of atmospheric turbulence [24].

#### **3.2 Performance of FSO**

The performance of FSO should be analyzed in terms of internal and external parameters. Internal parameters are the specifications and ratings of the components used, including operating frequency, divergence, power consumption, and transmission angle. Lens capability and error rate are receiver-side parameters. External parameters include environmental factors, such as alignment, atmospheric attenuation, weather conditions, and scintillation.

FSO communication depends on weather conditions. If it is cloudy or visibility is low, the link may not be sufficient for effective communication. In an FSO system, the transmitter produces a narrow, straight beam of light, and the receiver must lie on a straight line with the transmitter to receive that beam over a strong communication link [25]. In an optical communication system, a straight beam with an initial diameter of 5–8 cm spreads to 1–5 m within 1 km.
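The beam figures quoted above (a 5–8 cm beam spreading to 1–5 m within 1 km) imply a divergence angle that can be estimated from simple geometry. The following is a minimal sketch, assuming the beam diameter grows linearly with distance and that the received spot is uniformly illuminated; the function names and the 10 cm receiver aperture in the usage note are illustrative assumptions, not values from the chapter.

```python
import math


def full_divergence_mrad(d_tx_m, d_rx_m, range_m):
    """Full-angle divergence (mrad) of a beam that grows linearly
    from diameter d_tx_m to diameter d_rx_m over range_m."""
    half_angle = math.atan((d_rx_m - d_tx_m) / (2.0 * range_m))
    return 2.0 * half_angle * 1e3


def geometric_loss_db(aperture_m, spot_m):
    """Geometric loss (dB) when a circular receiver aperture captures
    only part of a uniformly illuminated circular beam spot.
    Assumption: uniform illumination, perfect alignment."""
    if aperture_m >= spot_m:
        return 0.0  # the aperture collects the whole spot
    # Collected power scales with the area ratio, i.e. (diameter ratio)^2.
    return -20.0 * math.log10(aperture_m / spot_m)


if __name__ == "__main__":
    print(full_divergence_mrad(0.05, 1.0, 1000.0))  # 5 cm -> 1 m over 1 km
    print(geometric_loss_db(0.10, 1.0))             # hypothetical 10 cm aperture
```

For a 5 cm beam that spreads to 1 m over 1 km, the sketch gives a full divergence of about 0.95 mrad, and a hypothetical 10 cm receiver aperture would collect roughly 1% of the power (about 20 dB of geometric loss) before any atmospheric attenuation is considered.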

#### **3.3 FSO efficiency**

FSO technology is evolving rapidly. It increases the available signal bandwidth while transmitting data at high speeds. FSO technology is similar to fiber-optic communication; the only difference is the signal path. That is, the transmitter and receiver communicate wirelessly, without cables, which reduces costs and can be more efficient [26]. The efficiency of FSO mainly depends on external factors, namely the medium between the transmitter and the receiver. If visibility through the transmission medium is good, data transmission is high-speed with little loss. The data transmission speed of an LED can reach 100 Mbps, and various experiments have been carried out to increase the data rate.

In Ref. [24], a fiber laser phased array transmitter was introduced into an FSO communication system, and the BER and optical transmit power of two systems were compared, one using a single-aperture transmitter and the other an FLPA transmitter. Experiments show that the power budget gap is about 8–10 dB [24], which indicates that the FLPA transmitter provides a higher power budget. A new type of FSO switch capable of multicasting was proposed in Ref. [27]; the cost analysis of this switch shows that even if the cost of T-SE is 1.2 to 3.5 times that of a micro-electro-mechanical system (MEMS) mirror, its cost is lower than that of AD-based switches [27]. Ref. [16] uses avalanche photodiode (APD) and positive-intrinsic-negative (PIN) receivers, respectively, and considers a single-input multiple-output system under strong atmospheric turbulence, with *M*-ary PPM modulation and a gamma-gamma turbulence distribution. A comprehensive comparative evaluation of the two receiver types is then carried out. The experimental results show that the performance of the system can be improved by increasing the strength. In addition, we compared the main parameters of FSO and RF in order to distinguish the differences between them more intuitively, as shown in **Table 1**.

#### **Table 1.**

*The performance comparison between FSO and RF.*

#### **3.4 What is IRS?**

As a new invention, IRS is also called a smart wall, smart reflective light, passive smart mirror, smart reflective surface, or large smart surface. IRS is composed of a large number of passive, low-cost components. It is a low-carbon and environmentally friendly smart component that can effectively control the phase, frequency, amplitude, and even polarization of the incident signal, so that IRS can build a real-time and reconfigurable propagation environment. The signal coverage of each IRS is small, so IRSs are easy to deploy and do not interfere with each other. By increasing the number of reflective elements, the quality of the received signal can be significantly improved. IRS does not require a power supply, complex algorithms, or complex hardware, and it is easy to integrate into current wireless communication systems. These advantages make IRS a promising candidate for future wireless communication systems. IRS can adjust the signal reflection to change the wireless channel and thereby enhance communication performance, and it is used to realize the intelligent and reconfigurable wireless propagation environment of B5G/6G wireless communication systems. Generally speaking, IRS is a planar surface composed of a large number of passive reflection units, each of which can independently impose a controllable amplitude and/or phase change on the incident signal. By densely deploying IRS units in the wireless network and carefully coordinating the reflections of the IRS array, the signal propagation between the transmitter and the receiver can be flexibly reconfigured to achieve the desired realizations and distributions. This provides a new means to fundamentally address wireless channel fading and interference, and makes a leap in wireless communication capacity and reliability possible.
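The statement that more reflective elements improve the received signal can be illustrated with a toy model of per-element phase control. The sketch below is only an illustration under strong assumptions: each cascaded transmitter-IRS-receiver path has unit magnitude and a random phase, there is no direct path, and phase shifts are continuous and lossless. All function names are hypothetical.

```python
import cmath
import math
import random


def cascaded_channels(n, seed=0):
    """Unit-magnitude cascaded channels (Tx -> element -> Rx) with random phases.
    Assumption: equal path gains, which a real deployment would not have."""
    rng = random.Random(seed)
    return [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]


def aligned_phases(channels):
    """Phase shift per element that cancels the phase of its cascaded channel."""
    return [-cmath.phase(c) for c in channels]


def received_power(channels, phase_shifts):
    """Power of the coherent sum of all reflected paths."""
    total = sum(c * cmath.exp(1j * t) for c, t in zip(channels, phase_shifts))
    return abs(total) ** 2


if __name__ == "__main__":
    for n in (16, 64, 256):
        ch = cascaded_channels(n)
        unconfigured = received_power(ch, [0.0] * n)
        configured = received_power(ch, aligned_phases(ch))
        print(n, round(unconfigured, 1), round(configured, 1))
```

With aligned phases, all reflected paths add in phase, so the received power in this idealized model grows with the square of the number of elements, whereas an unconfigured surface yields a much smaller and random total.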

#### *3.4.1 Features of IRS*

1. **Passive.** IRS is composed of a large number of low-cost passive reflective components, which only reflect signals and do not need to transmit signals. Therefore, IRS is almost passive and ideally does not require any dedicated energy source.

2. **Programmable control.** IRS can control the scattering, reflection, and refraction characteristics of radio waves through software, thereby overcoming the negative effects of natural wireless propagation. IRS-assisted wireless communication can therefore intelligently control the wave-front of the incident signal, such as its phase, amplitude, frequency, and even polarization, without complicated decoding, encoding, and radio frequency processing operations.

3. **Good compatibility.** IRS can be integrated into existing communication networks by changing only the network protocol, without changing the hardware and software of the existing equipment. At the same time, IRS has a full-band response and can ideally work at any operating frequency.

4. **Easy to deploy.** IRS is small, lightweight, conformal in geometry, and thinner than the wavelength, so it is easy to install and remove. Therefore, IRS can be easily deployed on exterior walls of buildings, billboards, ceilings of factories and indoor spaces, and even people's clothes.

#### *3.4.2 Application of IRS in V2X*

In the future, IRS is expected to be widely deployed, for example on the outdoor walls of smart buildings, on drones, and on transportation equipment in smart cities. Self-driving vehicles can use IRS as an intermediate medium to realize free-space optical wireless transmission, convey various information to the vehicle quickly and accurately, and thus realize V2X. Integrating vehicles into the Internet of Things closely connects the vehicle and the Internet of Things. The lightweight, convenient, and flexible deployment characteristics of IRS enable it to play a major role in V2X.

Another promising direction is IRS-assisted RF sensing and positioning. The large aperture size of the IRS and its ability to shape the propagation environment can significantly enhance RF sensing capabilities. The channel can be changed to provide favorable conditions for RF sensing, which can then be performed with high precision. Encouraging results were reported in Ref. [28], and these results may have applications in energy-saving monitoring, assisted living, and remote health monitoring. However, how to optimize the configuration of the IRS to enhance RF sensing remains to be studied. The combination of radio frequency technology and IRS can also be applied in future V2X. Vehicles can use sensors to sense the signals of surrounding vehicles and traffic signs and provide effective traffic information for real-time V2X decision analysis. Combining RF and IRS can make the monitoring data more accurate.

#### **4. Part IV: Terahertz technology**

#### **4.1 Introduction to THz**

In the past few years, wireless data traffic has seen unprecedented growth. From 2016 to 2021, mobile data traffic was projected to increase sevenfold, while video traffic was projected to triple over the same period [29]. By 2022, wireless and mobile device traffic was projected to account for 71% of total traffic, and by 2030 wireless data rates are expected to be sufficient to compete with wired broadband [30]. The growing use of wireless communication has led researchers to explore suitable radio spectrum ranges to satisfy this demand. For this reason, the THz frequency band (0.1–10 THz) has begun to attract attention. With seamless data transmission, very large bandwidth, microsecond delay, and ultra-high-speed downloading, THz is expected to drive innovation in communication and change the way people communicate and access information.

The term terahertz was first used in the field of microwave science in the 1970s to describe the spectral frequency of interferometers, the coverage of diode detectors, and water laser resonance [31, 32]. Around 2000, terahertz radiation was referred to as submillimeter waves, with a frequency range between 100 GHz and 10 THz. However, the dividing line between submillimeter waves and the far infrared was not clearly identified [33, 34]. The concept of ultra-wideband communication using THz with non-line-of-sight (NLOS) signal components was first proposed as a powerful solution for extremely high data rates [35]. Since then, THz technology, especially THz communication technology, has captured the enthusiasm of the research community.

In fact, the rise of terahertz wireless communication started as early as 2000, when a 120 GHz wireless link based on photonic technology was introduced [36]. The 120 GHz system is the first commercial terahertz communication system, and its allocated bandwidth is 18 GHz. Data rates of 10 Gbps and 20 Gbps are achieved through OOK and QPSK modulation, respectively [37, 38]. The terahertz frequency band promises very high throughput, which can theoretically reach terabits per second (Tbps) when the carrier is extended to several terahertz [39]. This potential has attracted a wider research community. The joint efforts of active research teams are producing new designs, materials, and manufacturing methods, providing many opportunities for the development of terahertz technology. The potential benefits of the THz band are discussed in Ref. [33], and THz can also be applied to imaging and tomography [34]. A THz wave is an electromagnetic wave between microwave and infrared, with a wavelength of 0.03–3 mm and a frequency of 0.1–10 THz. THz waves not only propagate in straight lines like light waves but also have penetrating and absorptive properties similar to those of radio waves.
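The wavelength and frequency ranges quoted above are two views of the same band, related by λ = c / f. A minimal sketch of the conversion follows; the function names are illustrative, and the band limits are the 0.1–10 THz values given in the text.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # speed of light in vacuum


def wavelength_mm(freq_thz):
    """Free-space wavelength in millimetres for a frequency given in THz."""
    return SPEED_OF_LIGHT_M_S / (freq_thz * 1e12) * 1e3


def in_thz_band(freq_thz, low_thz=0.1, high_thz=10.0):
    """True if the frequency lies in the 0.1-10 THz band used in this chapter."""
    return low_thz <= freq_thz <= high_thz


if __name__ == "__main__":
    for f in (0.1, 0.12, 1.0, 10.0):
        print(f, "THz ->", round(wavelength_mm(f), 4), "mm", in_thz_band(f))
```

The band edges of 0.1 THz and 10 THz map to wavelengths of about 3 mm and 0.03 mm, which matches the range stated in the text; the 120 GHz link mentioned above sits just inside the lower edge.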

#### **4.2 Characteristics of THz radiation**


Most polar molecules, such as water molecules and ammonia molecules, have strong absorption of THz radiation. The spectral characteristics of THz can be analyzed to study material composition or perform product quality control.

#### **4.3 Problems with THz**

Most biological tissues are rich in water, and water absorbs THz radiation very strongly. This greatly reduces the sensitivity of THz imaging of biological samples, and THz cannot produce a clear image of samples that contain a lot of water, especially thick samples. This severely limits the application of THz imaging in biomedicine.

At present, the average power of THz waves generated by most femtosecond lasers is only on the order of nanowatts. Single-point detection can reach a signal-to-noise ratio of 100,000 or higher, but the signal-to-noise ratio of real-time two-dimensional imaging is very low. To obtain a high signal-to-noise ratio for imaging, a higher-energy source is required.

#### **4.4 Future research direction of THz**

#### *4.4.1 Terahertz ultra-massive MIMO*

The THz band can meet the demand for high data rates, but although it offers a huge bandwidth, it suffers from severe atmospheric loss. Therefore, high-gain directional antennas should be used for communication over distances of more than a few meters. In the terahertz band, antennas are small and can be packed densely in the same space. The ultra-massive MIMO (UM-MIMO) concept was proposed in Refs. [40, 41]. It relies on ultra-dense, frequency-tunable plasmonic nano-antenna arrays that are used for both transmitting and receiving, thereby increasing the communication distance and ultimately the achievable data rate at terahertz frequencies [42]. When a two-dimensional (planar) antenna array is used instead of a one-dimensional (linear) array, the radiated signal can be steered in both elevation and azimuth, which results in 3D or full-dimensional MIMO. The performance of UM-MIMO technology depends on two factors, namely the capabilities of plasmonic nano-antennas and the characteristics of the terahertz channel. Another important aspect is dynamic resource allocation, which can make full use of the UM-MIMO system and obtain maximum benefits through adaptive design schemes [43].

#### *4.4.2 Terahertz virtual reality perception through cellular networks*

Given the technical barriers of 5G communication, THz is expected to deliver breakthroughs in reliability and low latency. Video currently requires extremely high bandwidth. Therefore, the terahertz frequency band is sought as a technological supplement that provides high capacity and dense coverage to meet user needs. The terahertz cellular network will enable interactive, high-dynamic-range video with higher resolution and higher frame rate, which requires about 10 times the bit rate of 4K video. Terahertz transmission will help solve interference problems and provide additional data to support various instructions in video transmission. In addition, the terahertz band will become an enabler of 6-degree-of-freedom (6DoF) video, allowing users to move inside and interact with the environment. The results in Ref. [44] show that the impact of absorption on the terahertz link greatly limits the communication range of small base stations. This impact can be mitigated by densifying the network, in which case terahertz can provide a rate of up to 16.4 Gbps with a delay threshold of 30 ms.

#### *4.4.3 The application of THz technology in unmanned driving*

At present, 5G has been put into use worldwide, and the B5G system will be a supplement to the current 5G. Due to the low latency and high reliability of THz technology, THz can be applied to driverless vehicles. The main goals of the current B5G system are as follows:


#### **5. Part V: Prospect of V2X**

#### **5.1 Comparison of optical communication technology**

The applications of visible light communication, free-space optical communication, THz technology, and related technologies in V2X were described above. This section compares these technologies.

#### *5.1.1 Comparison of VLC and THz*

Communication through visible light is a promising energy-efficient technology that is attracting people from industry and academia to study its potential applications in different fields. VLC carries information by modulating light in the visible spectrum (390–750 nm) [45]. Recent advances in LED lighting have enabled unprecedented energy efficiency and lamp life, and LEDs can be pulsed at very high speeds without significant impact on the lighting output or the human eye. LEDs also have several attractive features, including low power consumption, small size, long life, low cost, and low heat radiation. Therefore, VLC can support many important services and applications, such as indoor positioning, human-computer interaction, device-to-device communication, vehicle networks, traffic lights, and advertising displays [46]. Despite these advantages, several challenges may hinder the effectiveness of VLC links. To achieve high data rates in a VLC link, a LoS channel should first be assumed, in which the transmitter and receiver are aligned within each other's field of view (FOV) to maximize channel gain. However, because the position and orientation of the receiver change continuously, the field of view of the receiver may not always be aligned with the transmitter. This misalignment leads to a significant drop in received optical power [7]. When an object or a person obstructs the line of sight, the optical power drops significantly, resulting in a severe drop in the data rate. As with infrared waves, ambient light interference significantly reduces the signal-to-noise ratio (SNR) of the received signal and degrades the communication quality [45]. Current research on visible light networks also concentrates on downlink traffic and does not consider how the uplink operates. Since a directional beam to the receiver must be maintained in VLC uplink communication, a significant throughput drop may occur when the mobile device is constantly moving or rotating. Therefore, other wireless technologies should be used to transmit uplink data [46].

In contrast to VLC, the THz band allows NLOS propagation as a supplement when LoS is not available [47]. In this case, the beam can be reflected to the receiver by strategically installed dielectric mirrors. Due to the low reflection loss of the dielectric mirror, the resulting path loss is sufficiently low. In fact, for a distance of up to 1 meter and a transmission power of 1 watt, the NLOS component of the terahertz link alone has a capacity of about 100 Gbps [48]. In addition, the terahertz frequency band is considered a candidate band for uplink communication, a capability that VLC lacks. Another situation in which terahertz is a valuable solution is when the lights must be turned off while network service is still needed. Because the VLC signal is constrained to be real and positive, the VLC system suffers a loss of spectral efficiency. In fact, compared with a traditional bipolar system, imposing Hermitian symmetry in a unipolar OFDM system causes a performance loss of 3 dB [49]. Both THz and VLC can be used as communication technologies to realize V2X in the future and to achieve a technological breakthrough in V2X.
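The Hermitian-symmetry constraint mentioned above can be made concrete with a small example. The sketch below is a simplified DC-biased optical OFDM symbol built with a plain inverse DFT from the standard library; it is an illustration, not the scheme of Ref. [49]. The per-symbol bias equal to the most negative sample is an idealizing assumption (practical systems use a fixed bias with clipping), and all function names are hypothetical.

```python
import cmath
import math


def hermitian_frame(data):
    """Place data symbols on subcarriers 1..len(data) and their complex
    conjugates on the mirrored subcarriers; DC and Nyquist bins stay empty."""
    n = 2 * (len(data) + 1)
    frame = [0j] * n
    for k, sym in enumerate(data, start=1):
        frame[k] = sym
        frame[n - k] = sym.conjugate()
    return frame


def idft(frame):
    """Direct inverse discrete Fourier transform (O(n^2), fine for a sketch)."""
    n = len(frame)
    return [
        sum(frame[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def dco_ofdm_symbol(data):
    """Return (non-negative real samples, largest residual imaginary part)."""
    samples = idft(hermitian_frame(data))
    real = [s.real for s in samples]
    bias = -min(real)  # assumption: ideal per-symbol DC bias, no clipping
    return [x + bias for x in real], max(abs(s.imag) for s in samples)


if __name__ == "__main__":
    qpsk = [1 + 1j, -1 + 1j, 1 - 1j]
    tx, imag_residual = dco_ofdm_symbol(qpsk)
    print(len(qpsk), "data symbols occupy", len(tx), "time samples")
    print("max imaginary residual:", imag_residual)
```

The example shows where the efficiency cost comes from: to obtain a real, non-negative waveform that can drive an LED, only about half of the subcarriers carry independent data, and extra optical power is spent on the DC bias.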

#### *5.1.2 Comparison of VLC and FSO*

VLC has become an attractive alternative to indoor RF communication to meet the growing demand for massive data services. In addition to providing a huge and unlicensed bandwidth to relieve the crowded radio spectrum, VLC has various other advantages, such as ease of use, no radiation, and no electromagnetic interference. FSO, in turn, is a line-of-sight technology that has attracted great attention as a high-bandwidth last-mile transmission technology. It is also a reasonable alternative to optical fiber because it requires less initial deployment effort [50] and can be installed in locations where deploying wired connections is challenging.

An indoor VLC system must be connected to a base station in order to communicate. The most economical solution for connecting an indoor VLC system to an outdoor base station is to use the power line. Various studies [51, 52] have therefore proposed integrating VLC with power line communication (PLC) as a backbone network. However, PLC channels suffer from multiple impairments, namely deep notches, high attenuation, and colored background noise, which limit the data rate. To provide better data rates and improve system performance, VLC should be supplemented by high-bandwidth FSO links to achieve high-data-rate indoor multimedia services [53]. The direct FSO/VLC heterogeneous interconnection with data aggregation and distribution has been demonstrated experimentally.

The combination of visible light and free-space optical communication technology can be used in V2X. Visible light can be used with traffic infrastructure, such as traffic lights, and the combination of VLC and FSO can achieve V2I. The combination of optical communication and smart vehicles will provide better services in the future.

#### *5.1.3 Comparison of FSO and THz*

FSO technology is an excellent candidate for high-performance secure communication due to its security, anti-interference capability, high beam directivity, flexibility, and energy efficiency. However, to date, the large-scale deployment of FSO communication systems has been held back by availability and reliability issues caused by scintillation on sunny days, low visibility on foggy days, Mie scattering effects, and high sensitivity to beam drift effects [54]. Due to the high directivity of the beam, FSO links are more difficult to intercept than RF systems. Nevertheless, an eavesdropper (Eve) can still apply beam splitting attacks at the transmitting end, and blocking attacks or beam divergence attacks at the receiving end. Judging from the number of recent papers related to physical layer security (PLS), PLS research on FSO communication systems seems to be gaining momentum, for example [55, 56]. Unfortunately, almost all PLS papers related to FSO links use direct detection and the wiretap channel approach introduced by Wyner [57]. Fog is the most unfavorable factor affecting FSO link reliability. In contrast, terahertz signals are less affected by these problems but are affected by other weather conditions, such as rain and snow. This shows that the two transmission media (FSO and THz) can operate in a complementary manner, depending on the prevailing weather and atmospheric conditions.

The comparisons above show that these technologies can each support future V2X research. Combining them, in pairs or in larger groups, can play a very important role in future work.

#### **5.2 Application scenarios of VLC**

V-VLC is a fairly novel technology, although experimental studies in real driving scenarios have shown its feasibility for vehicular networking. Current research on V-VLC mainly focuses on understanding and characterizing the V-VLC channel and on developing V-VLC prototypes. At the physical layer, V-VLC still has many unsolved problems, especially regarding its performance and channel model in transportation system channels. Research is now also turning to higher-layer protocols and to combining V-VLC with IEEE 802.11p, C-V2X, and other communication technologies, which can compensate for each other's shortcomings and improve the overall performance of applications [8].

In vehicular networking applications, V-VLC can be used alone or as part of a heterogeneous vehicular networking system. V-VLC can benefit the following specific applications:

**Cooperative sensing:** V-VLC can share perceptual data from onboard cameras with nearby vehicles, or collect sensor data to perceive the wider driving situation. Using headlights and taillights, high-throughput data can be transmitted to the vehicles in front and behind, respectively, to facilitate cooperative awareness applications.

**Information query:** V-VLC can be used for information query and publishing within its range. These applications use V2I- and I2V-based VLC communication to query and publish highly scalable, widely propagated information without strong latency and reliability requirements. In this way, V-VLC can be used to transmit information in parts of the network that are not covered by LED traffic lights, traffic signs, or road lighting.

**Intersection assistance:** Intersection assistance applications, such as intersection collision avoidance, improve intersection safety by providing coordination and warning means between vehicles rather than traditional methods, such as traffic lights. When vehicles face each other at an intersection, head-to-head V-VLC links can be used to communicate with vehicles on the opposite side of the intersection. In addition, LED-based traffic lights or other infrastructure elements can facilitate communication.

**Collision avoidance:** To avoid rear-end collisions in intelligent transportation systems, microwave radar and short-range radio communication have been proposed. However, these technologies are affected by radio frequency contention and changing weather conditions, and cannot achieve fully autonomous collision avoidance and safe queuing. VLC has the advantages of being safe for people, requiring no frequency allocation, offering large transmission capacity, and relying on mature white LED light sources. It can supplement existing automatic driving systems to achieve higher safety and driving efficiency, especially in automotive lighting and traffic light scenarios [58].

**Visible light localization system:** Visible light positioning (VLP) systems benefit from the favorable characteristics of visible light and its spectrum. Compared with traditional RF communication systems, the visible spectrum offers huge free bandwidth, facilitating high-speed data transmission and reducing costs for operators. LED-based VLP systems can be easily integrated into existing lighting infrastructure (street lights, parking lights, and traffic lights) for localization purposes, often without the need for rewiring beyond their basic lighting functions. In general, VLP systems can be used in any application that uses LEDs [59].

#### **5.3 Applications of FSO in V2X**

FSO can be used in future V2V systems, where highly collimated beams enable enhanced vehicle-to-everything applications. A low-rate control link running in parallel with a multi-Gbps FSO link is proposed in Ref. [60]. The former, the control link, is used to exchange sensor data about vehicle attitude dynamics in order to perform FSO beam tracking and provide an ultra-reliable, high-data-rate connection on the latter, the FSO link. The joint contribution of local and distributed processing guarantees continuous and precise pointing to fully support autonomous driving applications.

To counteract the adverse effects of the limited sampling frequency of onboard sensors and control link delays, the evolution of vehicle kinematics can be estimated by simultaneously predicting and fusing multiple inertial measurement unit (IMU) data, augmented by real-time information on vehicle position.
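The idea of predicting vehicle kinematics between sensor samples and then correcting with position information can be sketched in one dimension. This is only an illustrative toy under stated assumptions (constant acceleration over the prediction interval, simple averaging of IMU readings, and a fixed blending gain); it is not the estimator of Ref. [60], and all names and numbers are hypothetical.

```python
def fuse_imu(accels_mps2):
    """Combine several IMU acceleration readings by simple averaging.
    Assumption: equally trusted sensors with independent noise."""
    return sum(accels_mps2) / len(accels_mps2)


def predict_state(pos_m, vel_mps, accel_mps2, dt_s):
    """Propagate position and velocity over dt_s with constant acceleration,
    e.g. to bridge a control-link delay or the gap between sensor samples."""
    new_pos = pos_m + vel_mps * dt_s + 0.5 * accel_mps2 * dt_s * dt_s
    new_vel = vel_mps + accel_mps2 * dt_s
    return new_pos, new_vel


def correct_position(predicted_m, measured_m, gain=0.5):
    """Blend the prediction with a position fix (fixed gain, not a tuned filter)."""
    return predicted_m + gain * (measured_m - predicted_m)


if __name__ == "__main__":
    accel = fuse_imu([1.9, 2.0, 2.1])                  # three hypothetical IMUs
    pos, vel = predict_state(0.0, 10.0, accel, 0.1)    # 100 ms look-ahead
    print(correct_position(pos, measured_m=1.02), vel)
```

A practical system would replace the fixed gain with a properly tuned filter and work with full 3D attitude, but the structure of predicting forward and then correcting with a position fix is the same.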

#### **5.4 Application of THz in V2X**

When it comes to vehicle networks, there are several additional reasons to explore higher frequency bands that can support multi-Gbps and Tbps links. Firstly, when transmitting at such a high data rate, even if a user is mobile, the link appears static from a data point of view because the transmission is almost instantaneous. In short, although the system changes over time, it does so much more slowly than the data are transmitted, so during the transmission of a given frame the system appears to be static. In addition, even if the user's connection is intermittent, the amount of information that can be transmitted per connection may be huge (1 Tb/s). In addition, by moving to a higher carrier frequency, the influence of the Doppler effect can be reduced. Although this may not be a problem for automotive networks, it is very important for wireless data transmission between aircraft flying at high speeds. Therefore, there are inherent characteristics that motivate the exploration of vehicle networks in the terahertz frequency band [61]. This is a likely future trend for the realization of V2X, in which THz technology can play a prominent role.
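The claim that a link appears static during one frame can be checked with back-of-the-envelope arithmetic. The sketch below assumes a 12,000-bit frame (roughly a 1500-byte packet) and a vehicle speed of 30 m/s; both numbers and the function names are illustrative assumptions rather than values from Ref. [61].

```python
def frame_duration_s(frame_bits, rate_bps):
    """Time needed to send one frame at the given data rate."""
    return frame_bits / rate_bps


def displacement_m(speed_mps, duration_s):
    """Distance a vehicle travels during the given time."""
    return speed_mps * duration_s


if __name__ == "__main__":
    frame_bits = 12_000   # assumed frame size (about 1500 bytes)
    speed_mps = 30.0      # assumed vehicle speed (108 km/h)
    for rate in (1e9, 100e9, 1e12):
        t = frame_duration_s(frame_bits, rate)
        print(f"{rate:.0e} b/s: {t:.2e} s per frame, "
              f"{displacement_m(speed_mps, t):.2e} m travelled")
```

At 1 Tb/s the assumed frame lasts about 12 ns, during which the vehicle moves roughly 0.36 µm, which is a small fraction of even a submillimeter THz wavelength; this is the sense in which the channel can be treated as static over a frame.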

Based on the results left over from the millimeter-wave band, the main attributes of terahertz communication are expected to include the following:

1. High frequency provides very large available bandwidth and, therefore, a potentially high data rate.


THz technology is expected to play a prominent role in unmanned driving. Its use in V2X is a likely future direction for connecting everything, since THz links can deliver information quickly and accurately.

#### **6. Part VI: Conclusion**

V2X is the key technology of the Internet of Vehicles. The Internet of Vehicles in the true sense consists of the network platform, the vehicle, and the driving environment. This chapter focuses on the investigation and review of the important research results and forward-looking technologies of optical wireless communication in the application of V2X. VLC communication technology maximizes the use of existing traffic infrastructure to build a multi-user communication network structure for people, vehicles, and traffic lights; the high data rates of THz and FSO communication technology provide strong support for V2X; and research on IRS equipment makes long-distance NLOS communication possible. In addition, these technologies and their key features are summarized, and their emerging future research and engineering directions are given. It is anticipated that, in building a smart city, optical and THz-based technology will play an important role in a future highly developed V2X networking era.

#### **Author details**

Mingbo Niu, Xiaoqiong Huang\* and Hucheng Wang Vehicle-Road Collaborative Laboratory, Chang'an University, China

\*Address all correspondence to: 2020232078@chd.edu.cn

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

*Vehicle-To-Anything: The Trend of Internet of Vehicles in Future Smart Cities DOI: http://dx.doi.org/10.5772/intechopen.105043*

#### **References**

[1] Jurgen R. V2V, GPS integration could improve safety. In: V2V / V2I Communications for Improved Road Safety and Efficiency.

[2] Martín-Sacristán D, Roger S, Garcia-Roger D. Low-latency V2X communication through localized MBMS with local V2X servers coordination. In: 2018 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting. 2018. pp. 1-8

[3] Huang J, Fei Z, Wang T, Wang X, Liu F, Haijun Z, et al. V2X-communication assisted interference minimization for automotive radars. China Communications. 2019;**16**(10):100-111

[4] Jurgen R. V2V and V2I Technical Papers. In: V2V/V2I Communications for Improved Road Safety and Efficiency.

[5] Malik RQ, Ramli KN, Kareem ZH, Habelalmatee MI, Abbas AH, Alamoody A. An overview on V2P communication system: Architecture and application. In: 2020 3rd International Conference on Engineering Technology and its Applications. 2020. pp. 174-178

[6] Matheus LEM, Vieira AB, Vieira LFM, Vieira MAM, Gnawali O. Visible light communication: Concepts, applications and challenges. IEEE Communications Surveys and Tutorials. 2019;**21**(4):3204-3237. DOI: 10.1109/COMST.2019.2913348

[7] Pathak PH, Feng X, Hu P, Mohapatra P. Visible light communication, networking, and sensing: A survey, potential and challenges. IEEE Communications Surveys and Tutorials. 2015;**17**(4):2047-2077. DOI: 10.1109/COMST.2015.2476474

[8] Memedi A, Dressler F. Vehicular visible light communications: A survey. IEEE Communications Surveys and Tutorials. 2021;**23**(1):161-181

[9] Burchardt H, Serafimovski N, Tsonev D, Videv S, Haas H. VLC: Beyond point-to-point communication. IEEE Communications Magazine. 2014;**52**(7):98-105

[10] Rohner C, Raza S, Puccinelli D, Voigt T. Security in visible light communication: Novel challenges and opportunities. IEEE Sensors Transducers. 2015;**192**(9):9-15

[11] Hansen CJ. WiGiG: Multi-gigabit wireless communications in the 60 GHz band. IEEE Wireless Communication. 2011;**18**(6):6-7

[12] Gomez A, Shi K, Quintana C. Beyond 100-Gb/s indoor wide field-of-view optical wireless communications. IEEE Photonics Technology Letters. 2015;**27**(4):367-370

[13] Yeh CH, Chow CW, Wei LY. 1250 Mbit/s OOK wireless white-light VLC transmission based on phosphor laser diode. IEEE Photonics Journal. 2019;**11**(3):1-5

[14] Zohaib A, Ahfay MH, Mather PJ, Sibley MJN. Improved BER for offset pulse position modulation using priority decoding over VLC system. In: IEEE Wireless Days, Manchester, UK. 2019

[15] Sui M, Zhou Z. The modified PPM modulation for underwater wireless optical communication. In: 2009 International Conference on Communication Software and Networks, Chengdu, China. 2009

[16] Islam MA, Chowdhury AB, Barua B. Free space optical communication with m-ary pulse position modulation under strong turbulence with different type of receivers. In: 2015 2nd International Conference on Electrical Information and Communication Technologies, Khulna, Bangladesh. 2015

[17] Chizari A, Jamali MV, AbdollahRamezani S, Salehi JA, Dargahi A. Designing a dimmable OPPM-based VLC system under channel constraints. In: 2016 10th International Symposium on Communication Systems, Networks and Digital Signal Processing, Prague, Czech Republic. 2016

[18] Armstrong J. OFDM for optical communications. IEEE Journal of Lightwave Technology. 2009;**27**(3): 189-204

[19] Li F, Zhang C. Performance Analysis on Hybrid OOK-and-OFDM Modulation in VLC System. In: 2019 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, Jeju, Korea. 2019

[20] Das S, Henniger H, Epple B. Requirements and challenges for tactical free-space lasercomm[C]. In: MILCOM 2008-2008 IEEE Military Communications Conference. 2008. pp. 1-10

[21] Andrews LC, Phillips RL, Hopen CY, AI-Habash MA. Theory of optical scintillation. Optical Society of America. 1999;**16**:1417-1429

[22] Haas H. LiFi: Conceptions, misconceptions and opportunities. In: 2016 IEEE Photonics Conference. IEEE. 2016

[23] Ravishankar MB, Rakshith RA, Aishwarya A, Thejaswini KN, Basavaraja G. Free space optics and radio frequency signals in spacial communication. In: 2019 International Conference on Intelligent Computing and Control Systems. 2019. pp. 1473-1477

[24] Chen S, Zhang Z, Cai S. High power budget coherent free space optical communication system based on fiber laser phased array. In: 2018 Asia Communications and Photonics Conference. 2018. pp. 1-3

[25] Kumar N, Rana AK. Impact of various parameters on the performance of free space optics communication system. Optik-International Journal for Light and Electron Optics. 2013;**124**:5774-5776

[26] Zocchi FE. A simple analytical model of adaptive optics for direct detection free space optical communication. Optics Communications. 2005;**248**:359-374

[27] Hamza AS, Deogun JS, Alexander DR. Free space optical multicast crossbar. IEEE/OSA Journal of Optical Communications and Networking. 2016;**8**(1):1-10

[28] Hu J, Zhang H, Di B, Li L, Bian K, Song L, et al. Reconfigurable intelligent surface based RF sensing: Design, optimization, and implementation. IEEE Journal on Selected Areas in Communications. 2020;**38**(11):2700-2716

[29] Index CVN. Cisco visual networking index: Forecast and methodology 2015– 2020. In: White paper, CISCO. 2015

[30] Li R. Towards a new Internet for the year 2030 and beyond. In: 3rd Annual ITU IMT-2020/5G Workshop Demo Day, Geneva, Switzerland. 2018. pp. 1-21

[31] Kerecman AJ. The tungsten-p type silicon point contact diode. In: MTT-S IEEE Int. Microw. Symp. Dig. 1973. pp. 30-34

[32] Fleming JW. High resolution submillimeter-wave Fourier-transform spectrometry of gases. IEEE Transactions on Microwave Theory and Techniques. 1974;**MTT-22**(12):1023-1025

*Vehicle-To-Anything: The Trend of Internet of Vehicles in Future Smart Cities DOI: http://dx.doi.org/10.5772/intechopen.105043*

[33] Siegel PH. Terahertz technology. IEEE Transactions on Microwave Theory and Techniques. 2002;**50**(3): 910-928

[34] Ferguson B, Zhang X-C. Materials for terahertz science and technology. Nature Materials. 2002;**1**(1):26

[35] Piesiewicz R, Kleine-Ostmann T, Krumbholz N, Mittleman D, Koch M. Short-range ultra-broadband terahertz communications: Concepts and perspectives. IEEE Antennas Propagation Magazine. 2007;**49**(6): 24-39

[36] Nagatsuma T. A 120-GHz integrated photonic transmitter. In: Proc. IEEE Int. Topical Meeting Microw. Photon. 2000. pp. 225-228

[37] Hirata A, Kosugi T, Takahashi H. 120-GHz-band wireless link technologies for outdoor 10-Gbit/s data transmission. IEEE Transactions on Microwave Theory and Techniques. 2012;**60**(3):881-895

[38] Takahashi H, Hirata A, Takeuchi J, Kukutsu N, Kosugi T, Murata K. 120-GHz-band 20-Gbit/s transmitter and receiver MMICs using quadrature phase shift keying. In: Proc. IEEE 7th Eur. Microw. Integr. Circuits Conf. 2012. pp. 313-316

[39] Akyildiz IF, Jornet JM, Han C. TeraNets: Ultra-broadband communication networks in the terahertz band. IEEE Communication Magazine. 2014;**21**(4):130-135

[40] Larsson EG, Edfors O, Tufvesson F, Marzetta TL. Massive MIMO for next generation wireless systems. IEEE Communication Magazine. 2014;**52**(2): 186-195

[41] Akyildiz IF, Jornet JM. Realizing ultra-massive MIMO communication in the terahertz band. Nano Communication Network. 2016;**8**:46-54

[42] Zakrajsek LM, Pados DA, Jornet JM. Design and performance analysis of ultra-massive multi-carrier multiple input multiple output communications in the terahertz band. In: Proc. SPIE Image Sens. Technol. Mater. Devices Syst. Appl. IV. Vol. 10209. 2017. pp. 102090A1-102090A11

[43] Muñoz SR. Multi-user ultra-massive MIMO for very high frequency bands (mmWave and THz): A resource allocation problem [M.S. thesis]. Barcelona, Spain: Dept. Comput. Architect, Universitat Politècnica de Catalunya; 2018

[44] Chaccour C, Amer R, Zhou B, Saad W. "On the reliability of wireless virtual reality at terahertz (THz) frequencies". arXiv preprint arXiv: 1905.07656, 2019

[45] Arnon S. Visible Light Communication. Cambridge, UK: Cambridge Univ. Press; 2015

[46] Khalighi MA, Uysal M. Survey on free space optical communication: A communication theory perspective. IEEE Communication Surveys Tuts. 2014;**16**(4):2231-2258

[47] Akyildiz IF, Jornet JM, Han C. Terahertz band: Next frontier for wireless communications. Physics Communication. 2014;**12**:16-32

[48] Moldovan A, Ruder MA, Akyildiz IF, Gerstacker WH. LOS and NLOS channel modeling for terahertz wireless communication with scattered rays. In: Proc. IEEE GC Wkshps. 2014. pp. 388-392

[49] Wang Z, Mao T, Wang Q. Optical OFDM for visible light communications. In: Proc. IEEE 13th Int. Wireless Commun. Mobile Comput. Conf. 2017. pp. 1190-1194

[50] Douik A, Dahrouj H, Al-Naffouri TY, Alouini M-S. Hybrid radio/free-space optical design for next generation backhaul systems. IEEE Transactions on Communications. 2016;**64**(6):2563-2577

[51] Komine T, Nakagawa M. Integrated system of white LED visible-light communication and power-line communication. IEEE Transactions on Consumer Electronics. 2003;**49**(1):71-79

[52] Ma X, Gao J, Yang F, Ding W, Yang H, Song J. Integrated power line and visible light communication system compatible with multi-service transmission. IET Communication. 2017;**11**(1):104-111

[53] Huang Z, Wang Z, Huang M, Li W, Lin T, He P, et al. Hybrid optical wireless network for future SAGO-integrated communication based on FSO/VLC heterogeneous interconnection. IEEE Photonics Journal. 2017;**9**(2):1-10

[54] Andrews LC, Philips RL. Laser Beam Propagation through Random Media. Bellingham, WA: SPIE Press; 2005

[55] Sun X, Djordjevic IB. Physical-layer security in orbital angular momentum multiplexing free-space optical communications. IEEE Photonics Journal. 2016;**8**(1):1-10

[56] Lopez-Martinez FJ, Gomez G, Garrido-Balsells JM. Physical layer security in free-space optical communications. IEEE Photonics Journal. 2015;**7**(2):7901014

[57] Wyner AD. The wire-tap channel. Bell System Technology Journal. 1975; **54**(8):1355-1387

[58] Soner B, Coleri S. Visible light communication based vehicle localization for collision avoidance and platooning. IEEE Transactions on Vehicular Technology. 2021;**70**(3):2167-2180

[59] Keskin MF, Sezer AD, Gezici S. Localization via Visible Light Systems. Proceedings of the IEEE. 2018;**106**(6): 1063-1088

[60] Brambilla M, Tagliaferri D, Nicoli M, Spagnolini U. Sensor and Map-Aided Cooperative Beam Tracking for Optical V2V Communications. In: 2020 IEEE 91st Vehicular Technology Conference. 2020. pp. 1-7

[61] Shahid M, Miquel JJ, Jocelyn A, Gerstacker Wolfgang H, Xiaodai D, Bo A. Terahertz communication for vehicular networks. In: IEEE Transactions on Vehicular Technology. 2017

#### **Chapter 6**

## Prediction of Large Scale Spatio-temporal Traffic Flow Data with New Graph Convolution Model

*Ping Wang, Tongtong Shi, Rui He and Wubei Yuan*

### **Abstract**

Prompt and accurate prediction of traffic flow helps traffic administrators analyze road occupancy and formulate dynamic, flexible traffic control in advance to improve road capacity. It can also provide more precise navigation guidance for road users. However, it is hard to predict large-scale spatiotemporal traffic flow data promptly and with high accuracy because of the complex interrelations and nonlinear dynamics involved. With the development of deep learning and related technologies, many prediction networks can now predict traffic flow from accumulated historical time-series data. In consideration of the regional characteristics of traffic flow, the emerging Graph Convolutional Network (GCN) model is systematically introduced together with representative applications. These successful applications point to a possible way of devising fast and appropriate traffic control strategies that could relieve traffic pressure, reduce potential conflicts, speed up emergency response, etc.

**Keywords:** traffic flow, GCN, traffic data, deep learning, ITS

#### **1. Introduction**

#### **1.1 Background and current status**

Traffic problems such as frequent congestion, serious accidents, and long commuting times have seriously degraded the travel experience of passengers and the efficiency of traffic operations [1]. To cope with these problems, researchers work on improving traffic control strategies based on the prediction of future traffic status [2]. Traffic flow is one of the important road conditions to assess [3]. With prompt and accurate prediction, better and faster-adjusted traffic control and guidance can be applied. Reliable traffic flow prediction is therefore one of the key factors in upgrading the traffic system from "passive adjustment" to "active control in advance"; even predicting the short-term future traffic status of road sections is useful for preventing congestion from deteriorating. For traffic management departments, early detection of traffic instability and potential abnormal risks based on reliable prediction data can improve many existing traffic management and control applications, such as traffic calming and signal control; for road users, real-time route updates based on dynamic traffic prediction results can adjust travel times and routes before congestion develops, so that vehicles can plan a path that avoids congested road sections and intersections, or a path with the shortest driving time, thereby improving traffic efficiency.

Current traffic prediction also faces the following challenges, as shown in **Figure 1**: (1) analyzing the spatial correlation of the road network: adjacent roads influence upstream and downstream traffic volumes to different degrees, so their traffic flows are spatially correlated, and it is a challenge to use the spatial location relationship to correlate neighboring traffic flow characteristics [4]; (2) the irregular structure of the traffic graph: unlike a regular grid layout, the structure of the traffic map is irregular; (3) the nonlinear retention of medium- and long-term prediction models: traffic flow changes drastically at peak times and is difficult to predict, and as the prediction horizon increases, the nonlinear retention ability of the model decreases and the time-series signal gradually decays, so how to better capture the time-series relationship of traffic flow and maintain stable time-series prediction is also a long-standing challenge [5].

#### **1.2 Related works**

Traffic flow forecasting uses historical traffic flow data to predict future traffic flow, which is a typical regression problem on traffic network time series [6]. To solve the traffic flow forecasting problem, factors such as traffic patterns, data types, spatial locations, and time periods need to be considered. Many computational forecasting methods are now widely used in traffic flow prediction and have achieved good results. As shown in **Figure 2**, common traffic flow prediction methods can be divided into three major categories [7]: ① early traffic flow prediction methods; ② machine-learning-based traffic flow prediction methods; ③ deep-learning-based traffic flow prediction methods.

#### *1.2.1 Early traffic flow prediction methods*

Early traffic flow prediction methods mainly model the relationship between traffic flow, speed, and density, regress the traffic flow data, and optimize the parameters to fit and predict traffic data. They mainly include statistical models and traffic simulation.

#### *1.2.1.1 Typical methods*

• Miska et al. [8] proposed cellular automata (CA) to simulate each participant of different flows and their interaction phenomena.

**Figure 1.**

*Challenges in traffic flow prediction; (a) road relevance from the web (https://www.ivsky.com/tupian/daolu\_ t948/); (b) the complex road network map of Shaanxi Province; (c) periodicity of traffic data.*

*Prediction of Large Scale Spatio-temporal Traffic Flow Data with New Graph Convolution… DOI: http://dx.doi.org/10.5772/intechopen.101756*

**Figure 2.**

*Classification of time-line based traffic flow prediction methods.*


#### *1.2.1.2 Advantages and disadvantages*

Statistical models and traffic simulations can approximately describe traffic flow prediction as a time-series problem. However, simulation systems and tools still consume considerable computational power and require skilled parameter setting to reach a steady state, and the complexity of traffic scenarios makes it difficult to obtain accurate prediction results from them. Besides, statistical methods are only applicable to linear data, whereas traffic flow data are nonlinear and complex; such methods are therefore not capable of handling them.

#### *1.2.2 Traffic flow prediction methods based on machine learning*

With the demand for high accuracy in intelligent traffic scenarios, the shortcomings of traditional prediction methods that cannot model the complex state of traffic flow become more and more prominent, and machine learning methods gradually take an important place in traffic flow prediction tasks.

#### *1.2.2.1 Typical methods*


• Qi et al. [14] proposed a Hidden Markov Model (HMM) for short-term highway traffic prediction.

#### *1.2.2.2 Advantages and disadvantages*

Machine learning methods can better model the stochastic processes and nonlinear properties of traffic flows and mostly perform better than traditional models. However, such methods do not consider the spatial and temporal correlation of traffic flow data and require extensive feature engineering. Therefore, they struggle to solve complex traffic flow prediction problems.

#### *1.2.3 Traffic flow prediction methods based on deep learning*

Deep learning has been very successful in the fields of computer vision, speech recognition, and natural language processing, and more and more scholars are applying deep neural networks (DNNs) to various real-world scenario tasks. In traffic flow prediction, the models can be classified into road section prediction and area prediction according to their prediction range characteristics.

#### *1.2.3.1 Typical methods*


ASTGCN [24] considers only low-order neighborhood relationships between nodes and ignores correlations between different historical time periods.

#### *1.2.3.2 Advantages and disadvantages*

The core of the traffic prediction problem lies in how to effectively capture the spatiotemporal dimensional features and correlations of the data. Traditional convolutional neural networks can effectively extract local features of data, but can only work on standard grid data. The graph convolution can directly extract features from graph structured data and automatically mine the spatial patterns of traffic data. The convolution operation along the time axis can extract the temporal patterns of traffic data. Therefore, this paper focuses on the deep learning model based on graph convolutional network to capture the spatial and temporal characteristics of traffic data and effectively solve the traffic flow prediction problem.

#### *1.2.4 Future directions for exploring traffic flow prediction models*

GCN has become a mainstream method in the field of traffic flow prediction, but it started late, and its theoretical foundation and research depth are far from enough. At present, it still faces many problems that need to be solved. There are three main directions as follows.


#### **2. Traffic prediction based on graph convolution**

This section introduces the principles and techniques of traffic flow prediction based on graph neural networks. First, an overview of graph neural networks is given, and the graph convolutional networks (GCNs) [25] used to capture the spatial dependence of traffic flows in the road network are introduced. Second, the method for modeling an actual road traffic network as a graph structure is introduced. Finally, the GCN is modeled on urban road networks and used to capture the topology of the graph in order to handle the spatiotemporal traffic prediction task, and the application scenarios of traffic flow prediction are added at the end of the paper.

#### **2.1 Basic graph theory and convolutional networks**

#### *2.1.1 Graph theory*

A graph is a common data structure and an important object of study in computer and data science [26]. A graph usually consists of two kinds of elements, vertices and edges, where the vertices are abstract representations of the objects of study and the edges represent the interconnections between pairs of objects. Graphs are often used to represent things and the specific relationships between them; in fact, a graph can represent any system with binary relationships. Graphs have a very wide range of real-life applications: social networks, citation systems, urban transportation networks, and biochemical molecules can all be effectively represented by graph structures.

#### *2.1.1.1 Basic concept*

In graph theory, a graph is usually represented as a set of vertices and edges [27], denoted as $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of vertices; the elements of this non-empty set are the vertices of the graph. The set of edges is denoted as $E$. A graph can be classified as directed or undirected depending on whether its edges have directionality. If the edges of a graph have directionality, they are called directed edges and the graph is a directed graph, as in **Figure 3(a)**; if the edges have no directionality, the graph is an undirected graph, as in **Figure 3(b)**. Graphs can also be classified as weighted or unweighted according to whether the edges carry specific weights [28]. Each edge in a weighted graph has a real-valued weight, as in **Figure 3(c)**, which represents the degree of connection or the "distance" between two vertices. For example, in a traffic network, the weight of an edge can characterize the physical distance between two vertices. The other category is the unweighted graph, which can also be understood as a weighted graph in which all the edge weights are equal.

#### *2.1.1.2 Algebraic representation of graphs*

As a common data structure, graphs have many kinds of algebraic representations; common storage representations include adjacency matrices [29], adjacency tables, and association matrices. Among them, adjacency matrices are widely used in graph representation learning because they represent the structural properties of graphs well and combine easily with matrix operations for analyzing the structural features of graphs.

**Figure 3.** *Basic types of common diagrams.*

If the two vertices of an edge in a graph are $v_i$ and $v_j$, then $v_i$ and $v_j$ are said to be neighbors of each other. We define the set of neighbors of $v_i$ as $N(v_i)$:

$$N(v_i) = \left\{ v_j \mid \exists\, e_{ij} \in E \text{ or } e_{ji} \in E \right\} \tag{1}$$

The degree of $v_i$ is defined as the number of edges with $v_i$ as an endpoint, denoted as $\deg(v_i)$; therefore, $\deg(v_i) = |N(v_i)|$. In a directed graph, the degree of a vertex can be divided into out-degree and in-degree. The number of directed edges starting at vertex $v_i$ is called the out-degree of $v_i$, and the number of directed edges ending at vertex $v_i$ is called the in-degree of $v_i$. The sum of the in-degree and out-degree of a vertex is equal to the degree of the vertex, and the sum of the degrees of all vertices is equal to twice the number of edges. $e_{ij}$ and $e_{ji}$ represent edges in opposite directions between two vertices: $e_{ij}$ represents an edge directed from $i$ to $j$, while $e_{ji}$ is directed from $j$ to $i$.

The degree matrix is the diagonal matrix of vertex degrees: the elements on the main diagonal are the vertex degrees and the remaining elements are 0. Accordingly, a directed graph has an in-degree matrix and an out-degree matrix. The adjacency matrix is a matrix used to represent the relationship between vertices. For graph $G = (V, E)$, the adjacency matrix can be expressed as:

$$A_{ij} = \begin{cases} 1 & \text{if } \langle v_i, v_j \rangle \in E \\ 0 & \text{else} \end{cases} \tag{2}$$

The core idea of the adjacency table of a graph is to have a neighbor table for each vertex of the vertex set. The association matrix is used to represent the direct association of nodes and edges and is defined as:

$$B_{ij} = \begin{cases} 1 & \text{if } v_i \text{ and } e_j \text{ are connected} \\ 0 & \text{else} \end{cases} \tag{3}$$

The Laplace matrix [30] is a special matrix that is often used in graph theory to study the structural properties of graphs. The Laplace matrix is defined as $L = D - A$, where $D$ is the degree matrix of the graph and $A$ is the adjacency matrix of the graph. **Figure 4** shows the Laplace matrix representation of a simple graph.
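The matrix definitions above can be made concrete with a short sketch. The example below assumes an undirected, unweighted graph given as an edge list; the function name `graph_matrices` and the four-vertex example graph are hypothetical and are not the graph of Figure 4.

```python
import numpy as np

def graph_matrices(n, edges):
    """Return adjacency A, degree D and Laplace matrix L = D - A
    of an undirected, unweighted graph with n vertices."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = 1
        A[j, i] = 1            # undirected: store both directions
    D = np.diag(A.sum(axis=1)) # vertex degrees on the main diagonal
    L = D - A
    return A, D, L

# Hypothetical example graph (an assumption for illustration only).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
A, D, L = graph_matrices(4, edges)
print(A)
print(D)
print(L)
```

Every row of the Laplace matrix sums to zero, and the trace of the degree matrix equals twice the number of edges, which matches the degree property stated earlier.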

#### *2.1.2 Graph convolutional networks*

Classical convolutional networks in deep learning mostly process regular data in Euclidean space. For ordered inputs with fixed dimensions (e.g., images, speech, video), the convolution operation together with the feature capture and compression of the pooling layers makes the network fit remarkably well. However, for unordered road network traffic data with variable dimensions, the traditional convolution operation becomes less suitable. Graph neural networks (GNNs), by contrast, can handle such irregular graphs by passing node features into the neural network during iteration and outputting the node states. The original GNNs drive the hidden state to convergence at a fixed point, based on fixed-point theory, which is ineffective for extracting edge information; moreover, in the specific scenario represented by the graph, some feature information is shared among nodes because of the fixed-point convergence, so little useful information is actually obtained. Therefore, two types of Graph Convolutional Network (GCN) were developed, based on the frequency (spectral) domain and on the spatial domain. Spatial domain convolution works like the traditional convolution method, which convolves directly at the pixels of an image; frequency domain convolution starts from graph signal processing, treating the kernel in the convolution as a filter and the learned features as signals for weighted summation.

**Figure 4.** *Matrix representation of the graph; (a) graph structure; (b) degree matrix; (c) adjacency matrix; (d) Laplace matrix.*

**Figure 5** illustrates the common network framework for graph convolution. First, the features of the neighboring nodes of the input graph are updated with a convolution layer, and then a ReLU activation function is applied, giving the basic structure of a convolution layer plus an activation function. This structure is stacked sequentially until the number of layers reaches the depth set for the model, and the output part transforms the node features into labels for the relevant task. Unlike the recurrent, parameter-sharing iteration of a GNN, a GCN is a multilayer stack in which the parameters differ from layer to layer.
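A minimal sketch of this stacked structure is given below. It assumes a commonly used propagation matrix (the adjacency matrix with self-loops, symmetrically normalized), random untrained weights, and a linear last layer standing in for the output part; the function names `normalize_adjacency` and `gcn_forward` and the example graph are hypothetical.

```python
import numpy as np

def normalize_adjacency(A):
    """Assumed propagation matrix: D~^{-1/2} (A + I) D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])              # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(A, X, weights):
    """Stack graph-convolution layers; each layer has its own weights.
    Hidden layers use ReLU, the last layer is left linear (output part)."""
    A_hat = normalize_adjacency(A)
    H = X
    for layer, W in enumerate(weights):
        H = A_hat @ H @ W                         # aggregate neighbors, then transform
        if layer < len(weights) - 1:
            H = np.maximum(H, 0.0)                # ReLU activation
    return H

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))                       # 4 nodes, 3 input features
weights = [rng.normal(size=(3, 8)),               # layer 1 parameters
           rng.normal(size=(8, 2))]               # layer 2 parameters (different)
out = gcn_forward(A, X, weights)
print(out.shape)
```

Because each layer has its own weight matrix, the stack differs from a recurrent GNN that would reuse a single set of parameters at every iteration.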

Further, GCNs mainly include graph convolution based on the spectral domain (frequency domain) and on the spatial domain. The spectral domain approach extends CNN-like operations to the spectral domain by analyzing the localization of graph convolution through spectral analysis. An example is Spectral Graph Convolution (SGC), which mainly focuses on the continued derivation and improvement of the core formulations of spectral graph theory to reduce the computational cost of the model from the perspective of parameter optimization. Spatial domain methods apply convolution filters directly to the nodes of the graph and their neighborhoods, as in Diffusion Graph Convolution (DGC).

**Figure 5.** *General framework of graph convolution.*

#### *2.1.2.1 Spectral Domain-based Graph Convolution Network (SGC)*

Spectral domain approach [31]: The absence of translation invariance on graphs makes it difficult to define convolutional neural networks in the node domain. The spectral domain approach uses the convolution theorem to define graph convolution from the spectral domain. The spectral domain graph convolution network is based on graph signal processing, where the convolution layer of the graph neural network is defined as a filter, i.e., the filter removes noise from the input signal to produce the output. In practical applications, it can only be used to process graph structures that are undirected and carry no information on the edges. The Fourier transform of the signal $f(x)$ and its inverse transform are:

$$F(w) = \varphi(f(x)) = \int_{-\infty}^{+\infty} f(x) \exp\left(-iwx\right) dx \tag{4}$$

$$f(x) = \varphi^{-1}(F(w)) \tag{5}$$

where $\varphi$ denotes the Fourier transform. The Fourier transform, which maps the time domain to the spectral domain, is essentially the integral of $f(x)$ against the basis function $\exp(-iwx)$. For the graph $G$ carrying the input signal, the Laplace matrix $L = D - A$ admits an eigendecomposition, and the normalized Laplace matrix $L$ is defined as:

$$L = I\_n - D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \tag{6}$$

where $D$ denotes the degree matrix of graph $G$, $A$ denotes the adjacency matrix, and $I_n$ is the identity matrix of order $n$. After eigendecomposition, $L$ can be expressed in the general form $L = U \Lambda U^T$, where $\Lambda$ is the diagonal matrix of eigenvalues and $U$ is the matrix whose columns are the eigenvectors corresponding to each eigenvalue. Since $U$ is an orthogonal matrix, the basis $\exp(-iwx)$ of the conventional Fourier transform is replaced by $U^T$, and the Fourier transform of the signal $x$ on the graph is obtained in matrix form as:

$$
\hat{x} = U^T x \tag{7}
$$

where $x$ refers to the original representation of the signal, $\hat{x}$ refers to the signal $x$ after transformation to the spectral domain, and $U^T$ denotes the transpose of the eigenvector matrix used to perform the Fourier transform. The inverse Fourier transform of the signal $x$ is:

$$x = U\hat{x} \tag{8}$$

Using the Fourier transform and the inverse transform on the graph, the graph convolution operation can be implemented as follows.

$$x *_G g = U\left(\left(U^T x\right) \odot \left(U^T g\right)\right) \tag{9}$$

where $*_G$ denotes the graph convolution operator, $x$ denotes the signal in the node domain on the graph, $g$ is the graph convolution kernel, and $\odot$ refers to the Hadamard product, i.e., the element-wise multiplication of two vectors. By replacing the vector $U^T g$ with the diagonal matrix $g_\theta$, the Hadamard product is transformed into a matrix multiplication, and the graph convolution operation is denoted as $U g_\theta U^T x$.
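As a numerical illustration of the graph Fourier transform and the spectral convolution above, the sketch below builds the normalized Laplace matrix of a small undirected graph, eigendecomposes it, and checks that the Hadamard-product form and the matrix form $U g_\theta U^T x$ agree. The example graph, the signal `x`, and the kernel `g` are arbitrary assumptions chosen for illustration.

```python
import numpy as np

def normalized_laplacian(A):
    """L = I - D^{-1/2} A D^{-1/2}; assumes no isolated vertices."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return np.eye(A.shape[0]) - A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = normalized_laplacian(A)

# Eigendecomposition L = U diag(lam) U^T; U is orthogonal because L is symmetric.
lam, U = np.linalg.eigh(L)

x = np.array([1.0, 2.0, 3.0, 4.0])     # signal on the nodes
g = np.array([0.5, -1.0, 0.25, 2.0])   # convolution kernel in the node domain

x_hat = U.T @ x                        # graph Fourier transform
x_back = U @ x_hat                     # inverse graph Fourier transform

conv_hadamard = U @ ((U.T @ x) * (U.T @ g))   # Hadamard-product form
g_theta = np.diag(U.T @ g)                    # kernel as a diagonal matrix
conv_matrix = U @ g_theta @ U.T @ x           # matrix form U g_theta U^T x

print(np.allclose(x_back, x), np.allclose(conv_hadamard, conv_matrix))
```

The full eigendecomposition used here is exactly the expensive step that the polynomial approximation in the following paragraphs is designed to avoid.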

To avoid the expensive computation of the Laplace eigenvalues and eigenvectors, Defferrard et al. [32] proposed ChebNet, which is based on Chebyshev polynomials. The filter on the eigenvalue matrix is approximated by a Chebyshev polynomial expansion as follows.

$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{\Lambda}) \tag{10}$$

where $\theta_k$ are the Chebyshev coefficients; $T_k(\tilde{\Lambda})$ is the $k$-th order Chebyshev polynomial of $\tilde{\Lambda}$; and $\tilde{\Lambda} = \frac{2\Lambda}{\lambda_{max}} - I_n$ is the normalized diagonal matrix of eigenvalues. Thus, the convolution operation can be expressed as follows:

$$x *_G g_\theta = U\left(\sum_{k=0}^{K-1} \theta_k T_k(\tilde{\Lambda})\right) U^T x = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})\, x \tag{11}$$

where $\tilde{L} = \frac{2L}{\lambda_{max}} - I_n$. By replacing the eigendecomposition part of the frequency domain convolution $g_\theta$ in the original GCN with the Chebyshev expansion $T_k(\tilde{\Lambda})$, the computational complexity of the graph convolution is reduced from $O(N^2)$ to $O(LE)$, and the eigendecomposition itself is effectively avoided. Here $E$ is the number of edges in the input graph and $L$ (in this complexity bound only) is the order of the Laplace operator polynomial. ChebNet thus significantly reduces computational complexity and improves computational efficiency.
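The equality of the two sides of Eq. (11) can be checked numerically. The sketch below evaluates the right-hand side with the standard Chebyshev recurrence $T_k(y) = 2y\,T_{k-1}(y) - T_{k-2}(y)$, which needs only matrix-vector products, and compares it with the spectral form on the left-hand side. The function name `cheb_conv`, the example graph, and the coefficients `theta` are hypothetical; the eigendecomposition appears only to obtain $\lambda_{max}$ and the reference value for the check.

```python
import numpy as np
from numpy.polynomial import chebyshev

def cheb_conv(L_tilde, x, theta):
    """Compute sum_k theta[k] * T_k(L_tilde) @ x via the Chebyshev recurrence."""
    t_prev = x                                   # T_0(L~) x = x
    out = theta[0] * t_prev
    if len(theta) > 1:
        t_curr = L_tilde @ x                     # T_1(L~) x = L~ x
        out = out + theta[1] * t_curr
        for k in range(2, len(theta)):
            t_next = 2.0 * (L_tilde @ t_curr) - t_prev
            out = out + theta[k] * t_next
            t_prev, t_curr = t_curr, t_next
    return out

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
L = np.eye(4) - A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

lam, U = np.linalg.eigh(L)                       # used only for the reference check
lam_max = lam[-1]
L_tilde = 2.0 * L / lam_max - np.eye(4)

x = np.array([1.0, 2.0, 3.0, 4.0])
theta = np.array([0.5, -0.3, 0.2])               # K = 3 assumed coefficients

fast = cheb_conv(L_tilde, x, theta)              # right-hand side of Eq. (11)

lam_tilde = 2.0 * lam / lam_max - 1.0
g_theta = chebyshev.chebval(lam_tilde, theta)    # sum_k theta_k T_k(lam~) per eigenvalue
spectral = U @ (g_theta * (U.T @ x))             # left-hand side of Eq. (11)

print(np.allclose(fast, spectral))
```

In practice $\lambda_{max}$ would be estimated or bounded rather than computed from a full eigendecomposition; the first-order simplification discussed next simply takes $\lambda_{max} = 2$.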

After that, Kipf et al. used first-order Chebyshev polynomials and simplified the spectral graph convolution by restricting the parameters, giving ChebNet better local connectivity properties. Let $K = 2$, $T_0(\tilde{L}) = 1$, $T_1(\tilde{L}) = \tilde{L}$, and $\lambda_{max} = 2$, so that $\tilde{L} = L - I_n$. Then the graph convolution calculation is simplified as:

$$
\mathbf{x} \ast\_{G} \mathbf{g}\_{\theta} \approx \theta\_0 \mathbf{x} + \theta\_1 (L - I\_n) \mathbf{x} = \theta\_0 \mathbf{x} - \theta\_1 D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \mathbf{x} \tag{12}
$$

where $\theta\_0$ and $\theta\_1$ are free parameters shared across the whole graph. Setting $\theta = \theta\_0 = -\theta\_1$ reduces the two parameters to a single one, and the graph convolution becomes:

*Prediction of Large Scale Spatio-temporal Traffic Flow Data with New Graph Convolution… DOI: http://dx.doi.org/10.5772/intechopen.101756*

$$\mathbf{x} \ast\_{G} \mathbf{g}\_{\theta} \approx \theta \left( I\_n + D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \right) \mathbf{x} \tag{13}$$

However, the eigenvalues of $I\_n + D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$ lie in $[0, 2]$, so applying this operator repeatedly across layers may cause vanishing or exploding gradients and numerical instability. For normalization, $\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ is therefore used instead of $I\_n + D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$, where $\tilde{A} = A + I\_n$ is the adjacency matrix with self-loops and $\tilde{D}$ is its degree matrix.
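As an illustration of this normalization, the sketch below builds the renormalized propagation matrix for a three-node toy graph and applies one propagation step. The helper names `renormalized_adjacency` and `gcn_layer`, the ReLU activation and the random feature and weight matrices are assumptions made for the example only.

```python
import numpy as np

def renormalized_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2}; self-loops guarantee nonzero degrees."""
    A_tilde = np.asarray(A, dtype=float) + np.eye(len(A))
    d_inv_sqrt = A_tilde.sum(axis=1) ** -0.5
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def gcn_layer(A_hat, X, W):
    """One propagation step ReLU(A_hat X W) with node features X and weights W."""
    return np.maximum(A_hat @ X @ W, 0.0)

# Toy star graph: node 0 is connected to nodes 1 and 2.
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])
A_hat = renormalized_adjacency(A)

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 2))    # 3 nodes, 2 input features (illustrative)
W = rng.normal(size=(2, 4))    # 2 -> 4 hidden features (illustrative)
H = gcn_layer(A_hat, X, W)

eigs = np.linalg.eigvalsh(A_hat)   # spectrum of the propagation matrix
```

On this toy graph the eigenvalues in `eigs` stay within $[-1, 1]$, so repeated multiplication by `A_hat` does not blow up the scale of the features in the way the unnormalized operator with eigenvalues up to 2 can.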

#### *2.1.2.2 Graph Convolutional Network (DGC) based on spatial domain*

Spatial-domain graph convolutional networks were first proposed in Neural Network for Graphs (NN4G). Unlike the spectral-domain approach, which is derived from signal processing theory, the spatial-domain approach starts from the nodes of the graph: it designs an aggregation function that gathers the features of neighboring nodes, adopts a message propagation mechanism, and asks how to use the features of a central node's neighbors accurately and efficiently to update the features of that central node. The essence of a CNN is weighted summation, and the spatial-domain graph convolutional network follows the same construction process to aggregate neighboring nodes by summation. Since the nodes in a graph are unordered and the number of neighbors varies from node to node, one idea is (1) to fix the number of neighboring nodes and (2) to sort the neighboring nodes. Once these two tasks are completed, the non-Euclidean structured data becomes ordinary Euclidean structured data, and traditional convolution algorithms can be migrated directly to the graph. Step (1) also makes it easier to apply GNNs to graphs with many nodes.
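The two steps above can be sketched as follows. This is only one possible reading: the ranking of neighbors by edge weight, the zero padding for nodes with fewer than `k` neighbors, and the names `fixed_size_neighborhoods` and `spatial_conv` are assumptions of the example rather than a specific published method.

```python
import numpy as np

def fixed_size_neighborhoods(W, k):
    """For each node, return the indices of its k strongest neighbors (-1 = padding)."""
    n = W.shape[0]
    idx = np.full((n, k), -1, dtype=int)
    for i in range(n):
        nbrs = np.flatnonzero(W[i])            # step (1): candidate neighbors
        nbrs = nbrs[nbrs != i]
        # step (2): impose an order, here by descending edge weight
        order = nbrs[np.argsort(-W[i, nbrs], kind="stable")]
        idx[i, :min(k, order.size)] = order[:k]
    return idx

def spatial_conv(x, idx, kernel):
    """Weighted sum over [center, neighbor_1, ..., neighbor_k] with a shared kernel."""
    x_pad = np.append(x, 0.0)                      # index -1 reads the zero padding
    patches = np.column_stack([x, x_pad[idx]])     # shape (n, k + 1), like a CNN patch
    return patches @ kernel

W = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
x = np.array([1.0, 10.0, 100.0])                   # one feature per node
idx = fixed_size_neighborhoods(W, k=2)
y = spatial_conv(x, idx, kernel=np.array([0.5, 0.3, 0.2]))
```

After the neighbors are fixed in number and ordered, every node sees a patch of the same length, so a single shared kernel can be applied exactly as in an ordinary convolution.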

Currently, GCN has become a fundamental model for traffic flow prediction research and a benchmark method for experiments. Although neither the spatial-domain nor the frequency-domain graph convolutional network was originally proposed for the traffic flow prediction problem, the natural graph structure of traffic data allows GCN to achieve higher efficiency and accuracy in traffic flow prediction than traditional methods.

#### **2.2 Modeling experiments for traffic prediction**

This section will first give a specific definition of the traffic flow prediction problem and then give the flow of the traffic flow prediction model based on spatiotemporal characteristics.

#### *2.2.1 Construction of traffic road network graph structure*

Traffic prediction is a typical time series prediction problem [33], and its road network traffic flow data exhibits a high degree of periodicity, which provides a great deal of potential for traffic prediction. **Figure 6** shows the traffic data for the first week of December for individual toll stations on the Shaanxi Provincial Freeway, demonstrating a high degree of periodicity.

Given the previous $M$ flow observations, the flow data measured at the $n$ sensor stations can be viewed as a matrix of size $M \times n$. The most likely flow measurements over the next $H$ time steps are:

$$\hat{F}\_{t+1}, \dots, \hat{F}\_{t+H} = \underset{F\_{t+1}, \dots, F\_{t+H}}{\arg\max} \log P(F\_{t+1}, \dots, F\_{t+H} \mid F\_{t-M+1}, \dots, F\_t) \tag{14}$$

**Figure 7.** *Spatiotemporal correlations. (a) Stations in a road network. (b) Dynamic spatial correlations.*

where $F\_t \in R^n$ is the vector of observations for the $n$ road segments at time step $t$, each element recording the historical observation of a single road segment. In road network traffic data, the observations $F\_t$ are not independent; they can be viewed as a graph signal defined on a weighted undirected graph $G$, as shown in **Figure 7**. The graph is expressed as $G\_t = (V\_t, E, W)$, where $V\_t$ is a finite set of vertices corresponding to the observations of the $n$ toll stations in the traffic network, $E$ is the set of edges representing the connections between stations, and $W \in R^{n \times n}$ is the weighted adjacency matrix of $G\_t$.
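In practice, the problem in Eq. (14) is usually turned into supervised training pairs by sliding a window over the recorded series. The following minimal sketch assumes the observations are stored as a `(T, n)` array; the helper name `make_windows` and the toy data are hypothetical.

```python
import numpy as np

def make_windows(F, M, H):
    """Split a (T, n) series into inputs (S, M, n) and targets (S, H, n)."""
    T = F.shape[0]
    S = T - M - H + 1                 # number of complete (history, future) pairs
    if S <= 0:
        raise ValueError("series too short for the requested M and H")
    X = np.stack([F[s:s + M] for s in range(S)])            # F_{t-M+1}, ..., F_t
    Y = np.stack([F[s + M:s + M + H] for s in range(S)])    # F_{t+1}, ..., F_{t+H}
    return X, Y

# Toy series: T = 10 time steps observed at n = 2 stations.
F = np.arange(20.0).reshape(10, 2)
X, Y = make_windows(F, M=4, H=2)
```

Each sample `X[s]` holds the $M$ most recent observations for all $n$ stations, and `Y[s]` holds the $H$ steps the model is asked to predict.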

#### *2.2.2 Traffic data acquisition and preprocessing*

#### *2.2.2.1 Traffic datasets*

Traffic flow prediction by deep learning requires the support of a large amount of real-world road traffic data. With the continuous improvement of traffic facilities, the amount of traffic data has grown explosively. Traffic flow prediction is built on precisely this large body of data, so understanding the commonly used traffic data is the basis for achieving traffic flow prediction. The sources of traffic data mainly include fixed-point road detectors, vehicle GPS records, bus IC cards, license plate recognition, cell phone data, etc. The common traffic data sets used for traffic flow prediction are listed in **Table 1**.

#### *2.2.2.2 Data preprocessing*

Traffic flow data mainly record parameters such as speed, flow rate and time. During data collection, detection equipment failure, instrument error, software failure, communication interference and environmental noise may occur, and even a sudden road failure can strongly affect the data, so that real-time data may be missing or abnormal. The overall validity-processing workflow for this kind of traffic data, organized by the type of defect, is shown in **Figure 8**.

#### • Abnormal data processing

The preprocessing methods for abnormal data fall into two categories. The first is data rejection, which can be used when the traffic data contain few erroneous values. Rejecting individual erroneous values does not affect the integrity or the trend of the data; if the proportion of erroneous data is large, however, rejection cannot be adopted, because discarding too much data destroys the integrity of the data and its trend. The second is peak denoising. Traffic data are highly nonlinear and fluctuate strongly at peak hours, which produces a noisy oscillation region, so peak denoising is needed. A commonly used method is empirical mode decomposition (EMD), which decomposes the fluctuations in the locally oscillating part of the trend.

#### **Table 1.**

*Common data sets for traffic flow prediction models.*

**Figure 8.** *Overall flow of data preprocessing.*

#### • Missing data processing [34]

Missing data arise when hardware or software faults prevent the detector from recording data, or when packets are lost during data communication. In road traffic, gaps in the collected data can result from excessive vehicle density, inaccurate collection by traffic flow detection instruments, transmission failures and many other causes, and the gaps may cover a single point in time, one period, or several periods. Typically, there are two classical missing patterns in time series data, as shown in **Figure 9**. **Figure 9(a)** shows exported toll records that have randomly lost observations at a single toll station; the white circles indicate the missing values. **Figure 9(b)** shows several consecutive time points with no observed values in the records of multiple toll stations, which is a more common pattern of missing spatiotemporal traffic data. The green curve in the green panel represents the observed values and the gray curve represents the missing values. This situation requires identifying and processing the missing data and then repairing them with interpolation and smoothing algorithms. The data are subsequently made dimensionless with initialization operators, because the units and orders of magnitude of the characteristic series of influencing factors are not uniform.

**Figure 9.** *Example of missing pattern of spatiotemporal data (traffic data as an example).*
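As a simple illustration of the interpolation-based repair mentioned in the missing-data discussion, the sketch below fills gaps in a single station's series by linear interpolation. It assumes missing values are encoded as `NaN`; the function name `fill_missing_linear` is hypothetical, and linear interpolation is only one of several possible repair methods.

```python
import numpy as np

def fill_missing_linear(series):
    """Fill NaN gaps in a 1-D series by linear interpolation between observed points."""
    y = np.asarray(series, dtype=float).copy()
    missing = np.isnan(y)
    if missing.all():
        raise ValueError("no observed values to interpolate from")
    t = np.arange(y.size)
    # np.interp holds the nearest observed value for gaps at either end of the series
    y[missing] = np.interp(t[missing], t[~missing], y[~missing])
    return y

flow = [10.0, np.nan, 30.0, np.nan, np.nan, 60.0, np.nan]
filled = fill_missing_linear(flow)
```

Short random gaps of the kind shown in Figure 9(a) are usually well served by this approach; long consecutive gaps across several stations, as in Figure 9(b), generally need methods that also exploit neighboring stations or periodic patterns.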

#### • Data normalization/standardization

Generally, the obtained traffic data are scattered, the distribution curve they present is fuzzy, and the distribution cannot be determined. The data therefore need to be normalized to regularize them and to improve the comparability between series, which facilitates the subsequent model prediction. Z-score normalization rescales the data to zero mean and unit variance, so that the weights are more evenly distributed in the subsequent model training, i.e.,

$$\mathbf{x}^\* = \frac{\mathbf{x} - \mu}{\sigma} \tag{15}$$

where $\mu$ and $\sigma$ are the mean and standard deviation of the data.
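A minimal sketch of Eq. (15) is given below, assuming that the statistics are estimated per station (per column) on the training split only and that constant columns are left unscaled; both choices, and the name `zscore_fit`, are assumptions of the example.

```python
import numpy as np

def zscore_fit(train):
    """Estimate per-column mean and standard deviation on the training data."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    return mu, np.where(sigma > 0, sigma, 1.0)   # avoid division by zero

# Toy data: 3 time steps, 2 stations; the second station is constant.
train = np.array([[100.0, 5.0], [200.0, 5.0], [300.0, 5.0]])
mu, sigma = zscore_fit(train)

z = (train - mu) / sigma          # Eq. (15)
restored = z * sigma + mu         # inverse transform for reporting predictions
```

Keeping `mu` and `sigma` makes it possible to map model outputs back to the original units before computing errors such as MAE or RMSE.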

#### *2.2.3 Classical graph convolution framework*

To handle the non-Euclidean structure of traffic network data, graph neural networks are often used to model spatial dependencies in traffic networks, and convolution defined in the frequency and spatial domains is used to improve the efficiency of graph analysis and network construction; this is the Graph Convolutional Network (GCN). Graph convolution extends traditional convolution to graph-structured data, and graph convolutional networks and their variants are widely used for spatiotemporal network prediction tasks with good performance. Most existing graph convolutional traffic flow forecasting models are spatiotemporal in nature, since most traffic data sets have both spatial and temporal attributes. The development of traffic flow prediction models based on graph convolutional networks is presented in **Figure 10**. Five of the most typical and most cited models are selected here for illustration.

**Figure 10.** *Traffic flow prediction model based on graph convolution.*

**Figure 11.** *Architecture of spatiotemporal graph convolutional networks.*

#### *2.2.3.1 STGCN predicts traffic flow*

The STGCN model proposed by Yu et al. [19] (IJCAI 2018; **Figure 11**) was the first to model traffic networks as graph structures while applying graph convolution to spatiotemporal sequences, and it uses purely convolutional structures to extract spatial and temporal features from the graph simultaneously.

The STGCN framework consists of two spatiotemporal convolutional blocks (ST-Conv blocks) and a fully-connected output layer at the end. Each ST-Conv block is formed as a "sandwich" structure, with two temporal gated convolution layers and one spatial graph convolution layer in between. A residual connection and a bottleneck strategy are applied inside each block.
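The "sandwich" structure can be sketched in a few lines of NumPy. The sketch follows the gated form $P \odot \sigma(Q)$ of the temporal convolution as we understand it from the STGCN paper, but it is not the authors' implementation: the residual connections and normalization are omitted, a first-order graph convolution stands in for the spatial layer, and all shapes, names and random parameters are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def temporal_gated_conv(X, W, b):
    """X: (T, n, C_in); W: (Kt, C_in, 2*C_out); b: (2*C_out,) -> (T-Kt+1, n, C_out)."""
    Kt, c_out = W.shape[0], W.shape[2] // 2
    steps = X.shape[0] - Kt + 1
    # unpadded 1-D convolution along the time axis, applied to every node
    conv = np.stack([np.einsum("knc,kco->no", X[t:t + Kt], W) for t in range(steps)]) + b
    P, Q = conv[..., :c_out], conv[..., c_out:]
    return P * sigmoid(Q)                      # gated linear unit

def spatial_graph_conv(X, A_hat, Theta):
    """X: (T, n, C_in) -> ReLU(A_hat X Theta) at every time step."""
    return np.maximum(np.einsum("ij,tjc,co->tio", A_hat, X, Theta), 0.0)

def st_conv_block(X, A_hat, p):
    h = temporal_gated_conv(X, p["W1"], p["b1"])     # temporal layer
    h = spatial_graph_conv(h, A_hat, p["Theta"])     # spatial layer (bottleneck)
    return temporal_gated_conv(h, p["W2"], p["b2"])  # temporal layer

rng = np.random.default_rng(0)
T, n, Kt = 12, 3, 3
A_hat = np.full((n, n), 1.0 / n)                     # placeholder propagation matrix
params = {
    "W1": rng.normal(size=(Kt, 1, 2 * 8)), "b1": np.zeros(16),
    "Theta": rng.normal(size=(8, 4)),                # 8 -> 4 channels: the bottleneck
    "W2": rng.normal(size=(Kt, 4, 2 * 8)), "b2": np.zeros(16),
}
X = rng.normal(size=(T, n, 1))                       # 12 steps, 3 nodes, 1 feature
out = st_conv_block(X, A_hat, params)

# Gate check: with zero weights, P = 1 and sigmoid(Q) = 0.5.
gate = temporal_gated_conv(np.zeros((5, 2, 1)), np.zeros((3, 1, 4)),
                           np.array([1.0, 1.0, 0.0, 0.0]))
```

Because the temporal convolutions are unpadded, each one shortens the sequence by `Kt - 1` steps, so a block maps 12 input steps to 8 output steps in this example.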

Although replacing LSTM-like recurrent structures with convolution does speed up training, it also loses information from the historical data, so the model can only achieve short-term rather than long-term prediction. In addition, its graph convolution models spatial dependence only by capturing information between different nodes, which does not seem to make good use of the potential relationships between different regions.

#### *2.2.3.2 DCRNN for predicting traffic flow*

The DCRNN model proposed by Li et al. [22] (ICLR 2018; **Figure 12**) models spatial correlation as a diffusion process on a directed graph, thereby modeling the dynamics of traffic flow. The resulting Diffusion Convolutional Recurrent Neural Network (DCRNN) is a deep learning framework for traffic prediction that incorporates both the spatial and the temporal dependency of the traffic flow within a seq2seq framework. Specifically, DCRNN captures the spatial dependency using bidirectional random walks on the graph and the temporal dependency using an encoder-decoder architecture with scheduled sampling.
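The bidirectional random-walk idea can be illustrated with a small diffusion convolution. The sketch below follows our reading of the DCRNN formulation, in which powers of the forward transition matrix $D\_O^{-1}W$ and the backward transition matrix $D\_I^{-1}W^{T}$ are weighted and summed; the function names, the toy graph and the coefficients are assumptions, and the recurrent encoder-decoder part is not shown.

```python
import numpy as np

def random_walk_matrix(W):
    """Row-normalize W; rows with zero total weight are left as zeros."""
    W = np.asarray(W, dtype=float)
    deg = W.sum(axis=1, keepdims=True)
    return np.divide(W, deg, out=np.zeros_like(W), where=deg > 0)

def diffusion_conv(W, x, theta_fwd, theta_bwd):
    """sum_k (theta_fwd[k] * P_fwd^k + theta_bwd[k] * P_bwd^k) x."""
    W = np.asarray(W, dtype=float)
    P_fwd = random_walk_matrix(W)        # D_O^{-1} W
    P_bwd = random_walk_matrix(W.T)      # D_I^{-1} W^T
    xf = xb = np.asarray(x, dtype=float)
    out = np.zeros_like(xf)
    for tf, tb in zip(theta_fwd, theta_bwd):
        out += tf * xf + tb * xb
        xf, xb = P_fwd @ xf, P_bwd @ xb  # one more diffusion step in each direction
    return out

# Directed toy road: 0 -> 1 -> 2.
W = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
x = np.array([1.0, 0.0, 0.0])            # signal only at the upstream node
P_fwd = random_walk_matrix(W)
y = diffusion_conv(W, x, theta_fwd=[1.0, 0.0], theta_bwd=[0.0, 1.0])
```

With these coefficients, the result keeps the original signal at node 0 and adds the signal that node 1 receives from its upstream neighbor through the backward transition matrix, which shows how the direction of the edges enters the convolution.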


#### *2.2.3.3 GMAN prediction of traffic flow*

The GMAN model proposed by Zheng et al. [23] (AAAI 2020; **Figure 13**) uses a spatiotemporal attention mechanism to model dynamic spatial relationships and nonlinear temporal relationships separately, and uses a gating mechanism to adaptively fuse the information extracted by the spatial and temporal attention mechanisms.

**Figure 12.**

*System architecture for the diffusion convolutional recurrent neural network designed for spatiotemporal traffic prediction.*

#### **Figure 13.**

*The overall framework of GMAN model. (a) The framework of GMAN. (b) Spatiotemporal embedding. (c) The ST-attention block.*

**Figure 14.** *The framework of Graph WaveNet.*

Because the whole traffic system is a network, the error at one node is amplified by other nodes, which affects the final prediction results. To address this problem, GMAN adopts an encoder-decoder architecture, in which the encoder extracts features and the decoder makes the prediction. A transform attention layer between the two converts the encoded traffic features into a sequential representation of future time steps, which serves as the input to the decoder. Both the encoder and the decoder are composed of ST-attention blocks. A spatiotemporal embedding (STE) block combines the spatial and temporal information, and its output is fed into the ST-attention blocks to handle the complex spatiotemporal correlation. Finally, experimental results on two real-world traffic prediction tasks (traffic volume prediction and traffic speed prediction) demonstrate the superiority of GMAN.

#### *2.2.3.4 Graph WaveNet prediction of traffic flow*

Wu et al. [35] (**Figure 14**) propose a novel graph neural network architecture, Graph WaveNet, for spatial–temporal graph modeling. To extract the spatial features of road networks, the model uses the idea of diffusion convolution and adds a novel adaptive adjacency matrix to compensate for the limitations of a fixed topology. On the time axis, it employs dilated causal convolution and a gating mechanism instead of a traditional recurrent structure. Validated on the METR-LA and PEMS-BAY data sets, Graph WaveNet achieved good results in terms of both training effect and training time.

**Figure 15.** *The framework of ASTGCN. SAtt: Spatial attention.*
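Two ingredients of Graph WaveNet can be sketched compactly. The adaptive adjacency below uses the form softmax(ReLU($E\_1 E\_2^{T}$)) over two node-embedding matrices, which is our understanding of the paper's formulation; here the embeddings are random placeholders rather than learned parameters. The dilated causal convolution computes $y\_t = \sum\_k w\_k x\_{t - d k}$ with zero padding on the left. All names are hypothetical.

```python
import numpy as np

def adaptive_adjacency(E1, E2):
    """Row-wise softmax of ReLU(E1 E2^T); E1, E2 are (n, d) node embeddings."""
    scores = np.maximum(E1 @ E2.T, 0.0)
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def dilated_causal_conv(x, w, dilation):
    """y[t] = sum_k w[k] * x[t - dilation * k], using zeros before the series starts."""
    x = np.asarray(x, dtype=float)
    pad = dilation * (len(w) - 1)
    x_pad = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for k, wk in enumerate(w):
        start = pad - dilation * k
        y += wk * x_pad[start:start + len(x)]
    return y

rng = np.random.default_rng(0)
E1, E2 = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))  # placeholders for learned embeddings
A_adp = adaptive_adjacency(E1, E2)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = dilated_causal_conv(x, w=[1.0, 1.0], dilation=2)       # y[t] = x[t] + x[t-2]
```

Stacking such convolutions with dilations 1, 2, 4, ... lets the receptive field grow exponentially with depth while each output still depends only on current and past inputs.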

#### *2.2.3.5 ASTGCN prediction of traffic flow*

Guo et al. [24] (**Figure 15**) propose a novel attention-based spatial–temporal graph convolutional network (ASTGCN) model to solve the traffic flow prediction problem. ASTGCN mainly consists of three independent components that respectively model three temporal properties of traffic flows, i.e., recent, daily-periodic, and weekly-periodic dependencies. More specifically, each component contains two major parts: 1) the spatial–temporal attention mechanism, which effectively captures the dynamic spatial–temporal correlations in traffic data; and 2) the spatial–temporal convolution, which simultaneously employs graph convolutions to capture the spatial patterns and standard convolutions to describe the temporal features. The outputs of the three components are fused by weighting to generate the final prediction results.
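The final weighted fusion can be written as an element-wise combination of the three component outputs, $\hat{Y} = W\_h \odot \hat{Y}\_h + W\_d \odot \hat{Y}\_d + W\_w \odot \hat{Y}\_w$, which is how we read the ASTGCN paper. In the sketch below the weights are fixed constants chosen for illustration, whereas in the model they are learned; the function name `fuse` and the toy shapes are assumptions.

```python
import numpy as np

def fuse(Y_h, Y_d, Y_w, W_h, W_d, W_w):
    """Element-wise weighted fusion of the recent, daily and weekly component outputs."""
    return W_h * Y_h + W_d * Y_d + W_w * Y_w

n, H = 3, 2                                   # 3 nodes, 2 predicted time steps
Y_h = np.full((n, H), 10.0)                   # recent component output
Y_d = np.full((n, H), 20.0)                   # daily-periodic component output
Y_w = np.full((n, H), 30.0)                   # weekly-periodic component output

# One weight per node and horizon step; constants here, learned in the model.
W_h, W_d, W_w = np.full((n, H), 0.5), np.full((n, H), 0.3), np.full((n, H), 0.2)
Y_hat = fuse(Y_h, Y_d, Y_w, W_h, W_d, W_w)
```

Because the weights are indexed by node and horizon, the model can rely more on the periodic components at stations with strong daily or weekly patterns and more on the recent component elsewhere.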

These five typical traffic flow prediction models above are compared in **Table 2**.

#### **2.3 Application scenarios for traffic flow prediction**

Many researchers have already applied the proposed traffic flow prediction models to various traffic scenarios and achieved excellent results. For example, predicting the traffic flow of a roadway in advance can provide drivers with better travel routes. It can also provide the prerequisites for traffic light optimization, etc. Here, three scenarios are chosen to illustrate the application of traffic flow prediction in the context of highways. These three scenarios are the work that has been done by our team so far, and the highway was chosen because the work in this area is more mature.

#### **Table 2.**

*Characteristics of typical models.*

#### *2.3.1 Scenario I. Quantitative assessment on truck-related road risk for the safety control*

The condition of truck flow is one of the critical factors influencing transportation safety and efficiency, and it is directly related to traffic accidents, maintenance scheduling, traffic flow interruption, risk control, and management. Estimating the flow of the various truck types makes it possible to identify the irregular flow variation introduced by different trucks and to assess the corresponding road risks quantitatively.

Jin et al. [36] first improved the gated recurrent unit (GRU), a deep learning approach, to estimate the traffic of various truck types. A multiple logistic regression method was then proposed to classify the road risk into three classes: safe, risky, and dangerous. According to the trend of the coefficient of speed variation (CSV), road risks are classified into three categories, as shown in **Figure 16**. The different risk classes can guide traffic control and management, and traffic information can be broadcast to drivers to help them choose their travel routes.

**Figure 16.** *Coefficient of speed variation (CSV) of passenger cars.*

**Figure 17.** *Road risk assessed by predicted truck flow on April 1, 2018.*

Finally, the road risk calculated by the predicted truck traffic is shown in **Figure 17**, from which the road risk status can be obtained at every moment.

#### *2.3.2 Scenario II. Improved manpower planning for highway toll gate*

In China, the relatively heavy queues at freeway toll booths and service areas during peak hours, coupled with the surplus of manpower scheduled during off-peak hours, are a major obstacle to efficient and cost-effective freeway operations. An intelligent manpower planning strategy is therefore needed to ensure both the efficiency of highway transportation management and road user satisfaction.

Jin et al. [37] developed a high-precision prediction of vehicle flow based on historical multisource traffic data. Based on the prediction results, an improved manpower planning strategy was proposed to schedule the work accordingly. The method was tested on a randomly selected toll station; **Figure 18** shows the daily traffic pattern of the Hechizhai highway toll station.

**Figure 18.** *Daily traffic pattern of the Hechizhai toll gate on the freeway.*

#### **Figure 19.**

*Reversible toll lane configuration suggestion. The red lines refer to the number of toll lanes designed in each direction. The blue and brown gradient lines indicate the change in the number of exit and entrance lanes per hour, respectively. The black dash lines depict the reversible toll lane change period.*

In **Figure 19**, the upper and lower parts show the lanes opened at the entrance and exit of the toll gate over one week, respectively. The two narrow black dotted lines indicate the morning and evening peak hours. During the morning peak, two entry lanes are clearly unused while the number of exit toll lanes has reached its upper threshold. The opposite phenomenon is seen in the evening peak. Therefore, we suggest that one or two of the entrance lanes be set as reversible lanes so that the traffic pressure can be relieved in peak hours. Note that this suggestion applies only when the entrance and exit of the toll gate are adjacent, as at the Hechizhai toll gate.

**Figure 20.** *Comparison of duplex lane simulation process.*

#### *2.3.3 Scenario III. Capacity analysis of toll stations*

Yuan et al. [38] used the results of traffic prediction to analyze the capacity of a toll station. Different queuing models were used to describe the capacity of typical lanes, and a comparison of the delay time and queue length of each model showed that the single-way model is more efficient in a typical system. The traffic index of the multiplex lane was also simulated; the simulation process is shown in **Figure 20**. The capacity of the multiplex lane was found to be larger than that of the typical MTC lane, which can relieve the traffic pressure during peak hours.

#### **3. Conclusions**

Traffic is the main driving force of urban development, and real-time and accurate traffic flow prediction is the key to the application of intelligent transportation systems. The graph convolutional neural network is an efficient model for processing graph data and has received a lot of attention from researchers in the past few years. This section attempts to summarize the recent graph convolutional neural network models and their applications to traffic flow prediction.

1. This section summarizes the GCN-based traffic flow prediction models. Starting from the basic definition of graph convolution, the basic principles of GCN are introduced with the focus on frequency-domain graph convolution and space-domain graph convolution. Then, the representative models are clarified, and the structure and characteristics of different prediction models are further categorized and reviewed.


### **Acknowledgements**

Thanks to the following researchers for their great support in the writing of this book, especially Wenbang Hao, Erlong Tan, Wanrong Xu, Zhen Jia, Yiwen Gao, Yajie Zhang, etc., who have put a lot of energy into many formulas and illustrations.

This work was supported in part by the National Key R&D Program of China (2020YFB1600400), the Key Research and Development Program of Shaanxi Province (No. 2020GY-020), the National Natural Science Foundation of China (Grant No. 51505037) and the Fundamental Research Funds for the Central Universities, CHD (Grant No. 300102320305).

#### **Author details**

Ping Wang\*, Tongtong Shi, Rui He and Wubei Yuan Chang'an University, Xian, China

\*Address all correspondence to: pingwang@chd.edu.cn

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Wei Chen H, Klaiber A. Does road expansion induce traffic? An evaluation of vehicle-kilometers traveled in China. Journal of Environmental Economics and Management. 2020;**104**:95-696. DOI: 10.1016/j.jeem.2020.102387

[2] Ye J, Zhao J, Ye K, Chengzhong X. How to build a graph-based deep learning architecture in traffic domain: A survey. IEEE Transactions on Intelligent Transportation Systems. 2020;**2**(6):1-20. DOI: 10.1109/TITS.2020.3043250

[3] Wang P, Hao W, Jin Y. Fine-grained traffic flow prediction of various vehicle types via fusion of multisource data and deep learning approaches. IEEE Transactions on Intelligent Transportation Systems. 2020;**5**:8-10. DOI: 10.1109/TITS.2020.2997412

[4] Tian Z, Jia L, Dong H, Zhang Z. Determination of key nodes in urban road traffic network. Shenyang, China: Proceeding of the 11th World Congress on Intelligent Control and Automation. 29 June-4 July 2014, IEEE; 2014. pp. 3396-3400. DOI: 10.1109/WCICA.2014.7053279

[5] Wang P, Xu W, Jin Y, Wang J, Li L. Forecasting traffic volume at a designated cross-section location on a freeway from large-regional toll collection data. IEEE Access. 2019;**7**:9057-9070. DOI: 10.1109/ACCESS.2018.2890725

[6] Jin Y, Xu W, Wang P. SAE Network: A Deep Learning Method for Traffic Flow Prediction. 2018 5th International Conference on Information, Cybernetics, and Computational Social Systems (ICCSS). 2018:241-246. DOI: 10.1109/ICCSS.2018.8572451

[7] Yu X, Sun L, Yang Y, Liu G. A short-term traffic flow prediction method based on spatial–temporal correlation using edge computing. Computers & Electrical Engineering. 2021;**93**:107219. DOI: 10.1016/j.compeleceng.2021.107219

[8] Miska MP. Microscopic Online Simulation for Real-Time Traffic Management. TRAIL Research School. 2007

[9] Ngoduy D, Wilson RE. Multianticipative nonlocal macroscopic traffic model. Computer-Aided Civil and Infrastructure Engineering. 2014;**29**(4):248-263

[10] Okutani I, Stephanedes YJ. Dynamic prediction of traffic volume through Kalman filtering theory. Transportation Research Part B: Methodological. 1984;**18**(1):1-11

[11] Kumar SV, Vanajakshi L. Short-term traffic flow prediction using seasonal ARIMA model with limited input data. European Transport Research Review. 2015;**7**(3):21

[12] Jeong YS, Byon YJ, Castro-Neto MM, Easa SM. Supervised weighting-online learning algorithm for short-term traffic flow prediction. IEEE Trans. on Intelligent Transportation Systems. 2013;**14**(4):1700-1707. DOI: 10.1109/TITS.2013.2267735

[13] Zhu S, Lin C, Chu Z. Bayesian network model for traffic flow estimation using prior link flows. Journal of Southeast University (English Edition). 2013;**29**(3):322-327. DOI: 10.3969/j.issn.1003-7985.2013.03.017

[14] Qi Y, Ishak S. A Hidden Markov Model for short term prediction of traffic conditions on freeways. Transportation Research Part C: Emerging Technologies. 2014;**43**:95-111. DOI: 10.1016/j.trc.2014.02.007

[15] Chen M, Yu X, Yang L. PCNN: Deep convolutional networks for short-term traffic congestion prediction. IEEE Transactions on Intelligent Transportation Systems. 2018;**19**(11):3550-3559. DOI: 10.1109/TITS.2018.2835523

[16] Lv Y, Duan Y, Kang W, Li Z. Traffic flow prediction with big data: A deep learning approach. IEEE Transactions on Intelligent Transportation Systems. 2015;**16**(2):865-873. DOI: 10.1109/TITS.2014.2345663

[17] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation. 1997;**9**(8):1735-1780. DOI: 10.1162/neco.1997.9.8.1735

[18] Cho K, van Merrienboer B, Gulcehre C, Bahdanau D. Learning phrase representations using RNN encoder-decoder for statistical machine translation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014

[19] Yu B, Yin H, Zhu Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. 2018. pp. 3634-3640

[20] Zhang J, Yu Z, Qi D. Deep spatiotemporal residual networks for citywide crowd flows prediction. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI'17). AAAI Press; 2017. pp. 1655–1661

[21] Pei H, Wei B, Chang KC-C, Lei Y, Yang B. Geom-GCN: Geometric Graph Convolutional Networks. arXiv preprint arXiv:2002.05287, 2020

[22] Li Y, Yu R, Shahabi C, Liu Y. Diffusion convolutional recurrent neural network: data-driven traffic forecasting. arXiv preprint arXiv:1707.01926, 2017

[23] Zheng C, Fan X, Cheng W, Qi J. GMAN: A graph multi-attention network for traffic prediction. Proceedings of the AAAI Conference on Artificial Intelligence. 2020;**34**(01):1234-1241. DOI: 10.1609/aaai.v34i01.5477

[24] Guo S, Lin Y, Feng N, Song C, Wan H. Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. Proceedings of the AAAI Conference on Artificial Intelligence. 2019;**33**:922-929. DOI: 10.1609/aaai.v33i01.3301922

[25] Kipf T, Welling M. Semi-supervised classification with graph convolutional networks. Published as a conference paper at ICLR. 2017. arXiv:1609.02907

[26] Liu Z, Zhou J. Introduction to Graph Neural Networks. Synthesis Lectures on Artificial Intelligence and Machine Learning. 2020;**14**(2):1-127

[27] Cheng Y, Liu Z, Cunchao T, Shi C, Sun M. Network Embedding: Theories, Methods, and Applications. Synthesis Lectures on Artificial Intelligence and Machine Learning. 2021;**15**(2):1-242. DOI: 10.2200/S01063ED1V01Y202012AIM048

[28] Vassilis NI, Marques AG, Giannakis GB. Tensor Graph Convolutional Networks for Multi-Relational and Robust Learning. IEEE Transactions on Signal Processing. 2020;**68**:6535-6546. DOI: 10.1109/TSP.2020.3028495

[29] Mestre Â. An Algebraic Representation of Graphs and Applications to Graph Enumeration. International Journal of Combinatorics. 2013;**2013**:347613, 14 pages. DOI: 10.1155/2013/347613


[30] Liao T, Wang W-Q, Huang B, Xu J. Learning laplacian matrix for smooth signals on graph. IEEE International Conference on Signal, Information and Data Processing (ICSIDP). 2019;**2019**:1-5. DOI: 10.1109/ICSIDP47821.2019.9173468

[31] Shi J, Cheung M, Du J, Moura JMF. Classification with vertex-based graph convolutional neural networks. 52nd Asilomar Conference on Signals, Systems, and Computers. 2018;**2018**:752-756. DOI: 10.1109/ACSSC.2018.8645378

[32] Tang S, Li B, Yu H. ChebNet: Efficient and Stable Constructions of Deep Neural Networks with Rectified Power Units using Chebyshev Approximations. arXiv. 2019

[33] Xiao J, Xie Y, Wen Y. The short-time traffic flow prediction at ramp junction based on wavelet neural network. IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC). 2021;**5**:664-667. DOI: 10.1109/IAEAC50856.2021.9390960

[34] Yang H, Pan Z, Tao Q. Online learning for time series prediction of AR model with missing data. Neural Processing Letters. 2019;**50**(3):2247-2263. DOI: 10.1007/s11063-019-10007-x

[35] Wu Z, Pan S, Long G. Graph wave net for deep spatial-temporal graph modeling. Twenty-eighth International Joint Conference on Artificial Intelligence IJCAI. arXiv preprint arXiv:1906.00121, 2019

[36] Jin Y, Jia Z, Wang P, Sun Z, Wen K, Wang J. Quantitative Assessment on Truck-Related Road Risk for the Safety Control via Truck Flow Estimation of Various Types. IEEE Access. 2019;**7**:88799-88810. DOI: 10.1109/ACCESS.2019.2924699

[37] Jin Y, Gao Y, Wang P, Wang J, Wang L. Improved Manpower Planning Based on Traffic Flow Forecast Using a Historical Queuing Model. IEEE Access. 2019;**7**:125101-125112. DOI: 10.1109/ACCESS.2019.2933319

[38] Yuan W, Wang P, Yang J, Meng Y. An alternative reliability method to evaluate the regional traffic congestion from GPS data obtained from floating cars. IET Smart Cities. 2021;**3**(2):79-90. DOI: 10.1049/smc2.12001

Section 4
