2.1 System resilience evaluation indicator framework

In general, existing urban resilience assessment frameworks consist of multiple dimensions of the urban system together with various evaluation indicators. In this chapter, the indicator framework directly applies the indicator matrix proposed by [24], who collected attributes of urban resilience from 36 assessment frameworks and categorized them into 5 dimensions with 122 indicators. Subsequently, the indicators from these dimensions are classified by evaluators into the "three pillars" of system resilience. This guides decision-makers in identifying and extracting the key indicators for resilience assessment of complex lifeline systems.


Machine Learning-Based Method for Urban Lifeline System Resilience Assessment in GIS


DOI: http://dx.doi.org/10.5772/intechopen.82748


After constructing the framework, the experts can select detailed indicators from the indicator pool proposed by [24], classify them into the "three pillars" of capacities, and then derive the indicators' interdependence. A two-layer ANP model is constructed from these indicators, in which the top layer is the "three pillars" of capacities and the bottom layer consists of the chosen indicators.

The experts choose an integer from −10 to 10 to represent the relative importance of any two indicators according to their perspective (when comparing "protection of wetlands and watersheds" with "availability and accessibility of resources," a negative value means that "protection of wetlands and watersheds" is more important than the latter, and a positive value means that "availability and accessibility of resources" is more important than the former). The absolute value reveals the relative importance of one indicator within a pair, i.e., selecting −10 and 10 means that one indicator strongly dominates or is strongly dominated by the other, respectively, while 0 means they are almost equally important.
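The scoring convention above can be made concrete with a small helper. The function below is a hypothetical illustration (the name `interpret_score` is not from the chapter) of how a single score in [−10, 10] encodes both the direction and the strength of dominance in a pairwise comparison:

```python
def interpret_score(score: int, indicator_a: str, indicator_b: str) -> str:
    """Interpret an expert pairwise score in [-10, 10].

    Negative: indicator_a dominates; positive: indicator_b dominates;
    the absolute value gives the strength, and 0 means roughly equal.
    """
    if not -10 <= score <= 10:
        raise ValueError("score must lie in [-10, 10]")
    if score == 0:
        return f"{indicator_a} and {indicator_b} are almost equally important"
    dominant = indicator_a if score < 0 else indicator_b
    return f"{dominant} is more important (strength {abs(score)}/10)"

print(interpret_score(-10, "protection of wetlands and watersheds",
                      "availability and accessibility of resources"))
```

This mirrors the convention in the text: −10 expresses that the first indicator strongly dominates, +10 that it is strongly dominated, and 0 near-equality.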

The ANP algorithm strictly follows the decision-making process introduced by [26]. The ANP then generates weighted and unweighted super-matrices from the expert scoring. The number of rows and columns of the unweighted super-matrix equals the number of indicators. The weighted super-matrix stores the weights of indicators within the same capacity category and the weights between capacities; explicitly, it is computed by column normalization of the unweighted super-matrix. The weight of each indicator is read from the limit matrix, which is computed iteratively from the weighted super-matrix until convergence.
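The column normalization and limit-matrix iteration can be sketched as follows. This is a minimal illustration on a toy 3×3 unweighted super-matrix (hypothetical values), not the full ANP procedure of [26]:

```python
import numpy as np

def column_normalize(m: np.ndarray) -> np.ndarray:
    """Divide each column by its sum, yielding a column-stochastic matrix."""
    return m / m.sum(axis=0, keepdims=True)

def limit_matrix(weighted: np.ndarray, tol: float = 1e-10,
                 max_iter: int = 10_000) -> np.ndarray:
    """Raise the weighted super-matrix to successive powers until convergence."""
    current = weighted.copy()
    for _ in range(max_iter):
        nxt = current @ weighted
        if np.abs(nxt - current).max() < tol:
            return nxt
        current = nxt
    return current

unweighted = np.array([[0.0, 2.0, 1.0],
                       [1.0, 0.0, 3.0],
                       [2.0, 1.0, 1.0]])   # toy pairwise-derived scores
weighted = column_normalize(unweighted)
limit = limit_matrix(weighted)
weights = limit[:, 0]   # at convergence every column holds the same weight vector
print(np.round(weights, 4))
```

At convergence the limit matrix has identical columns, and any column can be read off as the indicator weight vector, matching the description of the limit matrix above.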

#### 2.2 Hybrid K-means algorithm

The hybrid K-means algorithm is developed in this section: an overview of the algorithm is presented first, and the interpretation of each step is presented afterward. The notation used by the algorithm is listed as follows:



Geographic Information Systems and Science


• N: the total number of regions

• R: the total number of indicators

• D ∈ IR<sup>N×R</sup>: the column-normalized indicator data with N regions and R indicators. Column normalized means that each original element in a column is divided by the maximum value in that column. This forces each element of D into [0, 1] and thus normalizes all the data onto the same scale; normalization is a standard procedure before running the K-means algorithm when the indicators have incomparable physical units.

• Ws ∈ IR<sup>R×1</sup>: the weight of each indicator given by the ANP model

• S ∈ IR<sup>R×R</sup>: the initial super-matrix generated by the ANP model with R indicators

• aij, i, j ∈ {1, …, R}: the elements of a given super-matrix S

• [Li, Ui], i ∈ {1, …, R}: closed intervals, with L, U ∈ IR<sup>R×1</sup>

• H ∈ IR<sup>1×R</sup>: the entropy values Hi, i ∈ {1, …, R}
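The column-max normalization of D described above can be sketched as follows (toy data, hypothetical values):

```python
import numpy as np

def normalize_by_column_max(data: np.ndarray) -> np.ndarray:
    """Scale each indicator column into [0, 1] by dividing by its column maximum."""
    return data / data.max(axis=0, keepdims=True)

raw = np.array([[4.0, 10.0],
                [2.0, 25.0],
                [1.0,  5.0]])   # N = 3 regions, R = 2 indicators
d = normalize_by_column_max(raw)
print(d)   # every entry now lies in [0, 1]; each column maximum equals 1
```

After this step, indicators measured in incomparable physical units contribute on the same [0, 1] scale, as required before K-means.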


The contents of the hybrid K-means algorithm are summarized in Algorithm 1. In Algorithm 1, Step 1a computes a confidence interval for each indicator based on the bootstrapping method. Step 1b computes the entropy value from the super-matrix S. In Step 2, the optimal weight Ŵo equals the optimal solution of the optimization problem (1)–(4), where the weight confidence interval constraint is incorporated as (3). In (4), to ensure that each weight is greater than zero, we set ϵ = 0.001. In Step 3, Ws represents the subjective weight and Wo represents the objective weight. Step 4 outputs the weight-adjusted indicator data. Finally, Step 5 implements the classical K-means algorithm, whose structure follows what has been proposed in [27]. Set the number of clustering groups to k. After implementing Algorithm 1 and obtaining the clustering results, the term mj, j = 1, 2, …, k, denotes the centroid of each clustering group. The norm of each cluster centroid is computed as Cj = ‖mj‖2, j = 1, 2, …, k, and the norms are then ranked from low to high, that is, C(1) ≤ C(2) ≤ … ≤ C(k). The system resilience level of the points in a clustering group is identified by the norm of its centroid: the group whose centroid norm is C(j) is assigned resilience level j.
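The centroid-norm ranking described above can be sketched as follows (the K-means step itself is elided, and the centroids and the function name are hypothetical):

```python
import numpy as np

def rank_clusters_by_centroid_norm(centroids: np.ndarray) -> dict[int, int]:
    """Map each cluster index to a resilience level: smaller ||m_j||_2 -> lower level."""
    norms = np.linalg.norm(centroids, axis=1)   # C_j = ||m_j||_2
    order = np.argsort(norms)                   # ascending C_(1) <= ... <= C_(k)
    return {int(cluster): level + 1 for level, cluster in enumerate(order)}

centroids = np.array([[0.8, 0.7],   # cluster 0: large norm -> highest level
                      [0.1, 0.2],   # cluster 1: small norm -> level 1
                      [0.5, 0.4]])  # cluster 2: middle norm -> level 2
levels = rank_clusters_by_centroid_norm(centroids)
print(levels)   # {1: 1, 2: 2, 0: 3}
```

Because the indicator data are weight-adjusted and normalized into [0, 1], a centroid with a larger Euclidean norm corresponds to uniformly stronger indicator values, hence a higher resilience level.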

#### 2.2.1 Entropy and bootstrapping

In statistics, bootstrapping methods refer to tests or metrics that rely on random sampling with replacement. Bootstrapping allows assigning measures of accuracy (bias, variance, confidence intervals, prediction error, etc.) to sample estimates [28, 29]. In this chapter, Algorithm 2 shows the bootstrapping method that outputs a confidence interval for the weight of each indicator based on the super-matrix S of the ANP, and thus provides the feasible interval for the weight of each indicator for further adjustment and optimization.
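A percentile-bootstrap confidence interval in the spirit of Algorithm 2 can be sketched as follows (the sample row and the parameters `n`, `B`, `alpha` are hypothetical):

```python
import random

def bootstrap_ci(samples: list[float], n: int, B: int, alpha: float,
                 seed: int = 0) -> tuple[float, float]:
    """Percentile bootstrap CI for the mean: resample n values with replacement,
    B times, then take the alpha/2 and 1 - alpha/2 quantiles of the batch means."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choices(samples, k=n)) / n   # mean of one resampled batch
        for _ in range(B)
    )
    k1 = int(alpha * B / 2)                  # lower order-statistic index
    k2 = int(B - alpha * B / 2) - 1          # upper order-statistic index
    return means[k1], means[k2]

row = [0.12, 0.08, 0.15, 0.10, 0.09, 0.11, 0.14, 0.07]   # one row of S (hypothetical)
low, up = bootstrap_ci(row, n=len(row), B=1000, alpha=0.05)
print(f"[L_i, U_i] = [{low:.3f}, {up:.3f}]")
```

Each row of S would be processed this way to produce the per-indicator bounds [Li, Ui] used as constraint (3).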

The feasible interval of each indicator i is computed by resampling n samples from the initial sample space {ai1, ai2, …, aiR} and repeating this procedure B times; the average value of the samples in the resampled space is then computed for each batch b = 1, 2, …, B. Steps 2 and 3 of Algorithm 2 rank the average values of the resampled batches and compute the estimated 1 − α confidence interval of the average value. The feasible interval of each indicator equals this estimated confidence interval, i.e., $[L_i, U_i] = [\hat{\theta}_i^{*,(k_1)}, \hat{\theta}_i^{*,(k_2)}]$. Further, the optimization models (1)–(4) with the entropy objective are interpreted as follows. Entropy theory has been applied to a wide range of system resilience assessments, from engineering and economics to anthropology and the social sciences [30–33]. Entropy indicates the degree of disorder, uncertainty, or lack of information about the configuration of system modules [34]. The lower the entropy value, the higher the information utility. Here, the entropy value H computed by Algorithm 3 follows the definition in [35–37] and evaluates the information utility and reliability of the indicator weights generated by ANP.
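The entropy definition used here can be sketched as follows. This is a simplified illustration that applies only the column-share normalization and the entropy formula Hj = −(1/ln R) Σi fij ln fij; the toy matrix values are hypothetical:

```python
import math

def column_entropy(matrix: list[list[float]]) -> list[float]:
    """H_j = -(1/ln R) * sum_i f_ij ln f_ij, with f_ij the column-normalized shares."""
    R = len(matrix)
    entropies = []
    for j in range(R):
        col = [matrix[i][j] for i in range(R)]
        total = sum(col)
        shares = [v / total for v in col]           # f_ij for column j
        h = -sum(f * math.log(f) for f in shares if f > 0) / math.log(R)
        entropies.append(h)
    return entropies

S = [[0.5, 0.1, 0.4],
     [0.3, 0.8, 0.3],
     [0.2, 0.1, 0.3]]
print([round(h, 3) for h in column_entropy(S)])   # → [0.937, 0.582, 0.991]
```

Note that the second column, dominated by a single large element, has the lowest entropy and therefore the highest information utility, consistent with the interpretation above.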

Algorithm 1. Hybrid K-means.

Input: Original indicator data D ∈ IR<sup>N×R</sup>; weight matrix by the ANP model Ws ∈ IR<sup>R×1</sup>; and super-matrix S ∈ IR<sup>R×R</sup>;

Step 1: for i, j = 1, …, R do

Step 1a: Compute the bounds [Li, Ui], i ∈ {1, …, R}, from the row data {ai1, ai2, …, aiR}, i = 1, …, R, of S using the bootstrapping method summarized in Algorithm 2;

Step 1b: Compute the entropy value H from the column data a1j, a2j, …, aRj of S, as summarized in Algorithm 3;

Step 2: Define Wo ∈ IR<sup>R×1</sup> with elements wi, i ∈ {1, …, R}; given a positive threshold ϵ > 0, solve the linear program

$$\min_{W_o} \quad H \cdot W_o \tag{1}$$

$$\text{s.t.} \quad \sum_{i=1}^{R} w_i = 1, \tag{2}$$

$$L_i \le w_i \le U_i, \qquad i = 1, \ldots, R, \tag{3}$$

$$w_i \ge \epsilon, \qquad i = 1, \ldots, R, \tag{4}$$

which returns the optimal Ŵo = {ŵi}, i = 1, 2, …, R;

Step 3: Denote Ws ∈ IR<sup>R×1</sup> = {w′i}, i = 1, 2, …, R, and compute the adjusted weights by

$$W = \left\{ \frac{\hat{w}_i \times w'_i}{\sum_{i=1}^{R} (\hat{w}_i \times w'_i)} \right\}, \quad i = 1, \ldots, R, \tag{5}$$

Step 4: Construct the weight matrix W′ ∈ IR<sup>N×R</sup> from N copies of the array W<sup>⊤</sup>, and compute the weight-adjusted indicator data as the Hadamard product of the weight matrix W′ and the data matrix D, i.e., D′ = W′ ∘ D;

Step 5: Implement the K-means algorithm on D′;

Return the clustering result on the N regions;
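Because the objective (1) is linear and the constraints are a simplex condition (2) plus box bounds (3)–(4), the program can be solved by a simple greedy allocation rather than a general LP solver. The sketch below is an illustrative pure-Python solution (not from the chapter), assuming the bounds are feasible, i.e., sum(L) ≤ 1 ≤ sum(U), and that each Li already satisfies Li ≥ ϵ:

```python
def solve_weight_lp(H, L, U):
    """Minimize H·w subject to sum(w) = 1 and L_i <= w_i <= U_i.

    Greedy: start every weight at its lower bound, then push the remaining
    mass toward the indicators with the smallest entropy (highest utility).
    """
    assert sum(L) <= 1.0 <= sum(U), "bounds must admit a feasible point"
    w = list(L)
    remaining = 1.0 - sum(L)
    for i in sorted(range(len(H)), key=lambda i: H[i]):   # ascending entropy
        give = min(U[i] - L[i], remaining)
        w[i] += give
        remaining -= give
    return w

H = [0.2, 0.5, 0.9]   # column entropies (hypothetical)
L = [0.1, 0.1, 0.1]   # bootstrap lower bounds, already >= epsilon
U = [0.6, 0.6, 0.6]   # bootstrap upper bounds
w_opt = solve_weight_lp(H, L, U)
print([round(w, 3) for w in w_opt])   # → [0.6, 0.3, 0.1]
```

The low-entropy (high-utility) indicator absorbs as much weight as its upper bound allows, which is exactly the behavior the entropy objective is designed to produce.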

Algorithm 2. Bootstrapping.

Input: An initial sample space {ai1, ai2, …, aiR} for each i ∈ {1, 2, …, R}, taken from the weighted super-matrix S; a confidence level α;

Step 1: Randomly select B batches of samples of size n from the initial sample space; for each i ∈ {1, 2, …, R},

$$A_i^{*,b} = \left(a_{i1}^{*,b}, a_{i2}^{*,b}, \ldots, a_{in}^{*,b}\right), \qquad b = 1, 2, \ldots, B,$$

and for b = 1, 2, …, B compute

$$\hat{\theta}_i^{*,b} = \frac{1}{n} \sum_{j=1}^{n} a_{ij}^{*,b}, \qquad i \in \{1, 2, \ldots, R\};$$

Step 2: For each i ∈ {1, 2, …, R}, sort the values $\hat{\theta}_i^{*,1}, \hat{\theta}_i^{*,2}, \ldots, \hat{\theta}_i^{*,B}$ from low to high to obtain $\hat{\theta}_i^{*,(1)} \le \hat{\theta}_i^{*,(2)} \le \ldots \le \hat{\theta}_i^{*,(B)}$;

Step 3: Set k1 = α · B/2 and k2 = B − α · B/2, and take $\hat{\theta}_i^{*,(k_1)}, \hat{\theta}_i^{*,(k_2)}$ as the estimates of $\hat{\theta}_i^{*,\alpha/2}, \hat{\theta}_i^{*,1-\alpha/2}$, that is, the estimated 1 − α confidence interval of the statistic θ.

Algorithm 3. Computation of Entropy.

Input: Super-matrix S = {aij}, i = 1, …, R, j = 1, …, R.

Step 1: Normalize the matrix elements by $a'_{ij} = a_{ij} / \sum_{j=1}^{R} a_{ij}$, i, j ∈ {1, …, R};

Step 2: Normalize by column to obtain $f_{ij} = a'_{ij} / \sum_{i=1}^{R} a'_{ij}$;

Step 3: for j = 1, …, R, compute the entropy by definition:

$$H_j = -\frac{1}{\ln R} \sum_{i=1}^{R} f_{ij} \ln f_{ij};$$

Return the entropy value H.

For S, each column vector (a1j, a2j, …, aRj)<sup>⊤</sup>, j = 1, 2, …, R, denotes the weighting strategy under the jth indicator's criterion; therefore, the weighted aggregation of each column's entropy represents the total uncertainty metric of the indicator system. Summarized by the optimization models (1)–(4), the algorithm seeks the optimal weight strategy Ŵo that minimizes the weighted aggregation entropy value H · Wo subject to the feasible intervals computed from the bootstrapping confidence intervals. In other words, the algorithm finds an optimal weight strategy that maximizes the overall information utility. Such a strategy helps to eliminate the side effects of subjective judgments from the experts. Superimposing the subjective and objective weights together realizes a comprehensive weight assignment strategy.

3. Case study

A case study of system resilience evaluation for a water supply system under the risks of salt tide is presented in this section. The case study validates the proposed methodology.

3.1 Background information

This section evaluates the resilience of the Chenhang reservoir water supply system under salt tide hazards in the estuary of the Yangtze River. Formally, a salt tide is the emergency situation in which the chloride concentration in a water body exceeds the national standard level (250 mg per liter of water). Salt tide degrades water quality, causes soil salinization in coastal areas and cities, and has negative impacts on production and daily life. Recently, salt tide has become one of the disasters of greatest international concern for coastal cities. The invasion of saltwater during a salt tide limits access to high-quality municipal and industrial water and causes water shortage and scheduling problems in some megacities of China located in estuarine and coastal areas, such as Shanghai. Thus, the water shortage problem has become one of the main obstacles to eco-city construction and sustainable development. By experience, the intensity of salt tide intrusion changes with the period of tides, showing periodic properties. In general, September to April of the following year is the period during which the water intake in the Yangtze River is influenced by salt tide. Each intrusion of salt tide lasts for 6–7 days. Since multiple factors affect the duration and extent of the hazards (e.g., Yangtze River hydrology, channel traffic, weather, and wind), it is usually difficult to make detailed and accurate predictions for each intrusion. In recent years, the hazards have become more severe in the following aspects: long intrusion duration, high frequency, short intervals between intrusions, and independence from Yangtze River runoff. Given that the "Three Gorges Project" increased the ability to implement different water strategies upstream of the Yangtze River, extreme hydrological hazards have occurred more frequently, which makes research on how salt tide influences the water supply system more practical and crucial [38–40].


