**3. Methodology**

In this study, the bridge condition rating data were provided by the Bridge Unit, Roads Branch, PWD Malaysia. The ratings, based on the PWD's Annual Bridge Inspection Manual [4], classify the condition of a bridge and of its components on a numerical scale of 1–5, from "no damage" (rating 1) to "heavily and critically damaged" (rating 5). The bridge components are classified as either primary or secondary elements. The primary components include the surface, deck slab, beam/girder, piers, and abutment, whereas the secondary components include the parapets, expansion joints, bearings, slope protection, and drainpipes. The bridge condition rating can be evaluated by processing the ratings and importance sets of the bridge components [15]. Suksuwan [16] evaluated an overall bridge condition rating based on the condition of the superstructure and substructure. The superstructure consists of two components, namely the bridge deck and accessories, whereas the substructure is divided into three components: pier, abutment, and foundation. The relationship between the bridge condition rating and its components can be written as *Y* = *f*{*X*<sub>1</sub>, *X*<sub>2</sub>, …, *X*<sub>p</sub>}, where *X*<sub>1</sub>, *X*<sub>2</sub>, …, *X*<sub>p</sub> are the *p* bridge component condition rating variables and *Y* represents the bridge condition rating.

Since data sets with complete bridge component condition data were very limited, utilizing the incomplete sets by treating the missing values appropriately is expected to improve the performance of the model. Furthermore, inspection of the available data sets revealed a large gap in the bridge condition rating distribution: no complete data sets were available for bridges with a condition rating of 4, and only four pairs of complete sets were available for a rating of 5. Consequently, finding the best method to handle this problem is an important step prior to constructing the bridge condition rating model.

#### **3.1. Data preparation**

Li et al. [13] utilized an ANN to evaluate bridge conditions based on substructure, superstructure, deck, and channel conditions. In their proposed model, the training cases converged very well, but for the test cases, the network prediction was consistent with the target in only about 60% of cases. They concluded that the low prediction accuracy arose because the data used in training the network were not sufficient for the network to generate proper weights to precisely model the input–output relationship. Another reason was inconsistency in the evaluation results due to subjective factors in the inspection data, which were used to train and test the neural network.

In most countries, there exists a large time gap between the construction of a bridge and the adoption and implementation of the relevant BMS [14]. Such BMS databases suffer from general shortcomings, such as inconsistent data sets and many unrecorded bridge component condition ratings. Bridge condition rating data can also go missing because information on some bridge components is difficult to obtain and expensive to test for. Missing data constitute the largest fraction of the difficulties in analyzing the data, in constructing predictive ratings, and in other decision-making processes that depend on these data. Furthermore, it is impossible to build a convincing classification model with missing data because they affect the integrity of the data set [6]. Zhimin et al. [6] used five methods to handle missing data in their classification problem: deleting missing data, replacing missing data with zero, replacing missing data with the mean value of all the data in the training set, replacing missing data with the mean value of the same-label data in the training set, and predicting the missing value using a feed-forward backpropagation ANN. Markey et al. [5] compared three methods for estimating missing data in the evaluation of ANN models for their approximation problems: simply replacing the missing data with zero or with the mean value from the training set, or using a multiple imputation procedure to handle the missing values.
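To make the simpler treatments concrete, the sketch below applies three of them — zero substitution, global column mean, and same-label column mean — to a toy rating matrix with missing entries (the data and function names are invented for illustration, not drawn from the studies cited above):

```python
import numpy as np

def impute(X, y, method):
    """Fill NaN entries of X (rows = inspection records, cols = component ratings).

    method: "zero"       -> replace with 0
            "mean"       -> replace with the column mean over all records
            "label_mean" -> replace with the column mean over records that
                            share the same bridge rating label y
    """
    X = X.astype(float).copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        if not miss.any():
            continue
        if method == "zero":
            X[miss, j] = 0.0
        elif method == "mean":
            X[miss, j] = np.nanmean(X[:, j])
        elif method == "label_mean":
            for lbl in np.unique(y[miss]):
                rows = miss & (y == lbl)
                X[rows, j] = np.nanmean(X[y == lbl, j])
    return X

# Toy data: 4 inspection records, 3 component ratings, bridge rating label y
X = np.array([[1.0, 2.0, np.nan],
              [2.0, np.nan, 3.0],
              [1.0, 2.0, 2.0],
              [3.0, 4.0, 4.0]])
y = np.array([1, 2, 1, 2])
```

The same-label variant mirrors the intuition that bridges with the same overall rating tend to have similar component ratings, so it usually distorts the distribution less than a global mean.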

The original data comprise 1244 data sets from the last 4 years of inspection records of 311 single-span concrete bridges. Among these, only 579 sets have complete rating data for both the components and the whole bridge condition. A large number of these 579 data sets contain repeated data. When constructing the bridge condition rating model through an ANN and MRA, such records provide nothing new, as the information is redundant. To avoid this redundancy, only one of the data sets with identical data is retained and the others are deleted. Furthermore, the available data sets are also adjusted to remove outlier data sets, since the bridge condition rating value cannot be larger than the maximum component rating. After deleting the redundant and outlier data sets, 157 data sets remain, almost all of them with a rating of 1, 2, or 3. However, bridges with a condition rating of 4 do not have complete component rating data, and only three pairs with a rating of 5 are available, as shown in **Figure 1**.
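The two cleaning rules described above — keep one copy of duplicated records and drop records whose overall rating exceeds the worst component rating — can be sketched as follows (invented toy records, not the PWD data):

```python
def clean(records):
    """records: list of (component_ratings_tuple, bridge_rating) pairs.

    Keeps one copy of each duplicate record and drops 'outliers' where the
    bridge condition rating is larger than the worst (maximum) component
    rating, which the rating scale does not allow.
    """
    seen, kept = set(), []
    for components, rating in records:
        key = (components, rating)
        if key in seen:
            continue            # redundant record: keep only the first copy
        seen.add(key)
        if rating > max(components):
            continue            # outlier: overall rating worse than any part
        kept.append(key)
    return kept

data = [((1, 2, 2), 2), ((1, 2, 2), 2),   # duplicate pair -> one kept
        ((1, 1, 1), 3),                   # outlier: 3 > max component 1
        ((2, 3, 2), 3)]
```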

As explained in the previous section, all data sets for bridges with a rating of 4 are incomplete. In this study, the data sets that have one to three components with missing values are considered and handled with different methods to provide more data and fill the gap in the bridge rating scale distribution. The numbers of data sets for M0, M1, M2, and M3 are 157, 226, 252, and 267, respectively.

**Figure 1.** Bridge rating distribution of complete data sets (M0).


| No. of data | *X*<sub>1</sub> | *X*<sub>2</sub> | *X*<sub>3</sub> | … | *X*<sub>p</sub> | Bridge rating (*Y*) |
|---|---|---|---|---|---|---|
| 1 | *x*<sub>11</sub> | *x*<sub>12</sub> | *x*<sub>13</sub> | … | *x*<sub>1p</sub> | *Y*<sub>1</sub> |
| 2 | *x*<sub>21</sub> | *x*<sub>22</sub> | *x*<sub>mis</sub> | … | *x*<sub>2p</sub> | *Y*<sub>2</sub> |
| 3 | *x*<sub>31</sub> | *x*<sub>32</sub> | *x*<sub>33</sub> | … | *x*<sub>mis</sub> | *Y*<sub>3</sub> |
| … | … | … | … | … | … | … |
| N | *x*<sub>n1</sub> | *x*<sub>n2</sub> | *x*<sub>n3</sub> | … | *x*<sub>np</sub> | *Y*<sub>n</sub> |

**Table 2.** The format of data distribution used in this study (*x*<sub>mis</sub> denotes a missing component rating).

**Table 2** illustrates data sets that contain complete data and missing data of bridge component condition ratings. Five methods are proposed to handle these missing data of bridge component ratings. These five methods are as follows: substituting with the local mean (SM), substituting with the local minimum (SMN), substituting with the local mode (SMD), substituting with the local mean value of the same component class (SMC), and substituting with the available bridge condition rating value (SBR) from the same label of data set.

In the SM method, the mean value of data label *n* (*X̄*<sub>n</sub>) is calculated using Eq. (1), and the missing value is then substituted with *X̄*<sub>n</sub>. Furthermore, the local minimum value (*x*<sub>min</sub>) is defined as the smallest value appearing in the same label of data set, while the local mode value (*x*<sub>mode</sub>) is defined as the value that appears most often in the same label of data set.

$$\bar{X}\_n = \frac{\sum\_{i=1}^{p} x\_{ni}}{p} \tag{1}$$


Developing a Bridge Condition Rating Model Based on Limited Number of Data Sets

http://dx.doi.org/10.5772/intechopen.71556



**Figure 2.** The distribution of data sets M0, M1, M2, and M3.

Meanwhile, in the SMC method, the missing value is substituted based on the class of the bridge component. If a primary component rating is missing, it is substituted with the average value of the other available primary component ratings; the same rule is applied to unavailable data of secondary components. In the SBR method, the missing component rating is simply substituted with the available bridge condition rating value from the same label of data set.
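The five substitution rules can be sketched compactly as below. The function and argument names are mine, and the grouping is my reading of the text: `label_values` are the available ratings of the missing component within the same label of data set, and `same_class_values` are the record's other available primary (or secondary) component ratings:

```python
import statistics

def substitute(value, label_values, same_class_values, bridge_rating, method):
    """Return a replacement for one possibly missing component rating.

    value             : the component rating, or None if missing
    label_values      : available ratings of this component in the same label
    same_class_values : available ratings of the other components of the same
                        class (primary or secondary) in the same record
    bridge_rating     : the record's overall bridge condition rating
    """
    if value is not None:
        return value                          # nothing to substitute
    if method == "SM":                        # local mean, cf. Eq. (1)
        return statistics.mean(label_values)
    if method == "SMN":                       # local minimum
        return min(label_values)
    if method == "SMD":                       # local mode
        return statistics.mode(label_values)
    if method == "SMC":                       # same-class mean in the record
        return statistics.mean(same_class_values)
    if method == "SBR":                       # the record's own bridge rating
        return bridge_rating
    raise ValueError(f"unknown method: {method}")
```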

After the missing data are handled with the above methods, the data are checked to remove redundant records that appear in the list. The remaining data sets show no significant difference in the number of data sets for M2 and M3 in comparison to M1; hence, only data sets M0 and M1 are chosen in this study. The distribution of data sets after removing the redundant data is shown in **Figure 2**.

#### **3.2. Multiple regression analysis model**

In this case, MRA deals with one output parameter (dependent variable), which is the bridge condition rating value, and nine input parameters (independent variables), which are the bridge component condition ratings. If the bridge condition rating is *y* and the bridge component condition ratings are *x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>3</sub>, …, *x*<sub>n</sub>, then the model is given by:

$$y = \beta\_0 + \beta\_1 \mathbf{x}\_1 + \beta\_2 \mathbf{x}\_2 + \beta\_3 \mathbf{x}\_3 + \dots + \beta\_n \mathbf{x}\_n + \varepsilon \tag{2}$$

Since more than one data set is used in this modeling problem, Eq. (2) can be written as follows:

$$y\_i = \beta\_0 + \beta\_1 \mathbf{x}\_{1i} + \beta\_2 \mathbf{x}\_{2i} + \beta\_3 \mathbf{x}\_{3i} + \dots + \beta\_n \mathbf{x}\_{ni} + \varepsilon \tag{3}$$

where *β* denotes the coefficients of the bridge component condition ratings, *n* is the number of bridge components considered, *i* = 1, 2, 3, …, *N* indexes the data sets, and *ε* represents the error of the model.
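Estimating the *β* coefficients in Eq. (3) is an ordinary least-squares problem. A minimal NumPy sketch with invented toy ratings (three components instead of nine, for brevity):

```python
import numpy as np

# Toy design matrix: N records, n component ratings each (here n = 3)
X = np.array([[1., 2., 2.],
              [2., 3., 2.],
              [1., 1., 1.],
              [3., 3., 4.],
              [2., 2., 3.]])
y = np.array([2., 3., 1., 4., 3.])

# Prepend a column of ones so beta[0] plays the role of the intercept beta_0
A = np.hstack([np.ones((X.shape[0], 1)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

y_hat = A @ beta                 # fitted bridge condition ratings
residuals = y - y_hat            # the epsilon term of Eq. (3)
```

By construction of the least-squares solution, the residuals are orthogonal to every column of the design matrix, which is a quick sanity check on the fit.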

#### **3.3. Artificial neural network model**


Four conditions of data sets are then trained with a chosen network architecture, namely complete data sets (M0), numbering 157 in total; data sets with one component missing (M1), numbering 226; data sets with one or two components missing (M2), numbering 252; and data sets with one, two, or three components missing (M3), numbering 267.

A feed-forward neural network with a single hidden layer, varying the number of hidden neurons from 1 to 28, and one output layer, as shown in **Figure 3**, was selected to train the data sets. The networks are trained with a variation of the backpropagation training algorithm, namely the Levenberg-Marquardt algorithm (trainlm). The trainlm algorithm is utilized as it has the fastest convergence in function approximation problems such as the bridge condition rating problem [17]. Bridge component condition rating data are used as input variables, consisting of nine inputs (surface, expansion joint, parapet, drainage, slope protection, abutment, bearing, deck/slab, and beam/girder), and the whole bridge condition rating is used as the output variable *Y*.
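The chapter trains in MATLAB with trainlm; scikit-learn offers no Levenberg-Marquardt solver, so the Python sketch below only mirrors the architecture search (one tanh hidden layer swept from 1 to 28 neurons, linear output), with lbfgs as a stand-in solver and invented placeholder data. Treat it as a structural illustration, not a reproduction of the study:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Stand-in data: 9 component ratings in, 1 bridge rating out (NOT the PWD data)
X = rng.integers(1, 6, size=(60, 9)).astype(float)
y = X.max(axis=1)                      # placeholder target, not the real rule

scores = {}
for hidden in range(1, 29):            # hidden neurons varied from 1 to 28
    net = MLPRegressor(hidden_layer_sizes=(hidden,),
                       activation="tanh",   # counterpart of tansig
                       solver="lbfgs",      # stand-in for trainlm
                       max_iter=1000,
                       random_state=0)
    net.fit(X, y)                      # output layer of MLPRegressor is linear
    scores[hidden] = net.score(X, y)   # R^2 on the training data
```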



**Figure 3.** Structure of typical ANN model.

During the training process, there is a possible risk of overfitting or overtraining the network. In this situation, the error on the training set is driven to a very small value, but when new data are presented to the network, the error becomes large. The network has memorized the training examples, but it has not learned to generalize to new situations [17, 18]. Therefore, in this work, the early stopping technique was used to monitor the training process to handle the over-training problem. In the early stopping technique, there is a need to divide the data set into three subsets: training, validation, and testing data sets. The training set is used to train the network, and the validation data set is required to validate the network according to the early stopping technique. The testing data sets are used to test the performance of the trained network. In this study, 60, 20, and 20% were used as the training, validation, and testing data sets, respectively.
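For reference, the three-way split can be sketched as follows; the chapter states only the 60/20/20 proportions, so the shuffling and rounding below are my own choices:

```python
import numpy as np

def split_60_20_20(n_samples, seed=0):
    """Shuffle indices and cut them into 60% training, 20% validation,
    and 20% testing subsets (any rounding remainder goes to testing)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.6 * n_samples)
    n_val = int(0.2 * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_60_20_20(157)  # 157 complete data sets
```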

Prior to training with the data sets, the network inputs and targets are normalized using the functions [pn, ps] = mapstd(p) and [tn, ts] = mapstd(t) in MATLAB so that they have a mean of 0 and a standard deviation of 1. The original network inputs and targets are given in the matrices p and t, and the normalized inputs and targets, pn and tn, are returned with a mean of 0 and a standard deviation of 1. The settings structures ps and ts contain the means and standard deviations of the original inputs and original targets, respectively. After the network has been trained, these settings are used to transform any future inputs applied to the network, and ts is required to convert the network outputs back into the units of the original targets. The functions an = sim(net, pn) and a = mapstd('reverse', an, ts) simulate the trained network and convert the network output back into the original units [19].
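mapstd simply standardizes each variable to zero mean and unit standard deviation and remembers the statistics for the reverse transform. An equivalent NumPy sketch (note: MATLAB's mapstd treats rows as variables; here, for readability, columns are the variables):

```python
import numpy as np

def mapstd(x):
    """Standardize each column of x to mean 0 and std 1.

    Returns the normalized array plus the (mean, std) settings needed to
    reverse the transform, mirroring [pn, ps] = mapstd(p).
    """
    mean = x.mean(axis=0)
    std = x.std(axis=0, ddof=1)
    std = np.where(std == 0.0, 1.0, std)  # guard against constant columns
    return (x - mean) / std, (mean, std)

def mapstd_reverse(xn, settings):
    """Undo the normalization, mirroring a = mapstd('reverse', an, ts)."""
    mean, std = settings
    return xn * std + mean

p = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]])
pn, ps = mapstd(p)
```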

A transfer function is used to produce the neuron output and limit the amplitude of the output of the neuron. It determines the relationship between the inputs and outputs of a neuron and a network [17]. In this study, the tangent sigmoid transfer function (tansig) and linear transfer function (purelin) are used in the hidden and output layer, respectively. The tansig function, as given in Eq. (4), produces outputs in the range of −1 to +1, and the purelin function, as given in Eq. (5), produces outputs in the range of −∞ to +∞.

$$\text{tansig}(x) = \frac{2}{1 + \exp(-2x)} - 1 \tag{4}$$

$$\text{purelin}(x) = x \tag{5}$$
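Numerically, tansig in Eq. (4) is just the hyperbolic tangent, which is easy to confirm:

```python
import numpy as np

def tansig(x):
    """Tangent sigmoid transfer function, Eq. (4); outputs lie in (-1, 1)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def purelin(x):
    """Linear transfer function, Eq. (5); outputs are unbounded."""
    return x

x = np.linspace(-5.0, 5.0, 11)
```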

The root mean squared error (RMSE) of the training set is used to measure the performance of the network, where the typical performance function used for training ANNs is the mean sum of squares of the network errors. The coefficient of determination (*R*<sup>2</sup>) of the linear regression line between the ANN outputs and the bridge condition rating targets is also used to measure the response of the trained network. The error of data label *k* (*e*<sub>k</sub>) and the RMSE of all the training sets are calculated using Eqs. (6) and (7).



$$e\_k = t\_k - a\_k \tag{6}$$

$$RMSE = \sqrt{\frac{\sum\_{i=1}^{N} (y\_{target} - y\_{predicted})^2}{N}} \tag{7}$$

Here, *t*<sub>k</sub> is the target of data label *k*, *a*<sub>k</sub> is the network output of data label *k*, and *N* is the number of data sets used in the network training process.
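Eqs. (6) and (7), together with the *R*<sup>2</sup> check, translate directly into NumPy (toy target and output vectors for illustration):

```python
import numpy as np

def rmse(target, predicted):
    """Root mean squared error over the N data sets, Eq. (7)."""
    e = target - predicted                 # per-label error, Eq. (6)
    return np.sqrt(np.mean(e ** 2))

def r_squared(target, predicted):
    """Coefficient of determination of the predicted vs. target values."""
    ss_res = np.sum((target - predicted) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

t = np.array([1.0, 2.0, 3.0, 2.0])   # toy targets
a = np.array([1.0, 2.0, 2.0, 2.0])   # toy network outputs
```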

The number of epochs for all the training algorithms is fixed at 1000. The initial values of the weights and biases are initialized with random values, so each run might produce different output values [20]. Therefore, each ANN is run 30 times, and the average values of RMSE and *R*<sup>2</sup> are reported.

The process of training with the incomplete data is virtually the same as that for training data with no missing features. Once the substitution has been made, the remaining steps of the training algorithm use the substituted features when updating the weights and biases of the network, so the missing features are handled automatically by the proposed treatments. The purpose of the training process is to map the relationship between the input and output parameters using the ANN, as given in Eq. (8).

$$y\_{predicted} = t\_2 \left[ v \cdot t\_1 (w \cdot x + b\_1) + b\_2 \right] \tag{8}$$

Here, *x* is the input vector, *y* is the output vector, *w* is the weight matrix for the connections between the input and hidden layers, *v* is the weight matrix for the connections between the hidden and output layers, *b*<sub>1</sub> is the bias in the hidden layer, *b*<sub>2</sub> is the bias in the output layer, *t*<sub>1</sub> is the transfer function for the neurons in the hidden layer, and *t*<sub>2</sub> is the transfer function for the neurons in the output layer [21].
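Eq. (8) can be traced explicitly with small random weight matrices. The sizes below are illustrative (nine inputs as in the study, but an arbitrary four hidden neurons and untrained random weights):

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden = 9, 4

w = rng.standard_normal((n_hidden, n_in))   # input-to-hidden weights
b1 = rng.standard_normal(n_hidden)          # hidden-layer bias
v = rng.standard_normal((1, n_hidden))      # hidden-to-output weights
b2 = rng.standard_normal(1)                 # output-layer bias

def t1(z):
    return np.tanh(z)                       # tansig hidden transfer, Eq. (4)

def t2(z):
    return z                                # purelin output transfer, Eq. (5)

def predict(x):
    """y_predicted = t2[v . t1(w . x + b1) + b2], Eq. (8)."""
    return t2(v @ t1(w @ x + b1) + b2)

x = np.ones(n_in)                           # one record of component ratings
y_pred = predict(x)
```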
