Applications of Artificial Neural Networks


#### Chapter 5

### Object Recognition Using Convolutional Neural Networks

Richardson Santiago Teles de Menezes, Rafael Marrocos Magalhaes and Helton Maia

#### Abstract

This chapter presents the main techniques for detecting objects within images. In recent years there have been remarkable advances in areas such as machine learning and pattern recognition, both using convolutional neural networks (CNNs). This is mainly due to the increased parallel processing power provided by graphics processing units (GPUs). In this chapter, the reader will understand the details of the state-of-the-art algorithms for object detection in images, namely, faster region convolutional neural network (Faster RCNN), you only look once (YOLO), and single shot multibox detector (SSD). We present the advantages and disadvantages of each technique based on a series of comparative tests, using metrics such as accuracy, training difficulty, and implementation characteristics of the algorithms. We intend to contribute to a better understanding of the state of the art in machine learning and convolutional networks for solving problems involving computer vision and object detection.

Keywords: machine learning, convolutional neural network, object detection

#### 1. Introduction

Computer vision poses fascinating problems, such as image classification and object detection, both of which are part of an area called object recognition. These problems have seen robust scientific development in recent years, mainly due to advances in convolutional neural networks and deep learning techniques and to the increased parallel processing power offered by graphics processing units (GPUs). Image classification is the task of assigning to an input image one label from a fixed set of categories. This problem is central to computer vision because, despite its simplicity, it has a wide variety of practical applications, such as labeling skin cancer images [1] and using high-resolution images to detect natural disasters such as floods, volcanoes, and severe droughts, noting the impacts and damage caused [2–4].

The performance of image classification algorithms crucially relies on the features used to feed them [5]. This means that the progress of image classification techniques using machine learning relied heavily on the engineering of selecting the essential features of the images that make up the database. Obtaining these features thus became a daunting task, resulting in increased complexity and computational cost. Commonly, two independent steps are required for image classification, feature extraction and learning algorithm choice, and this approach has been widely developed and enhanced using support vector machines (SVMs).

The SVM algorithm, when considered as part of the supervised learning approach, is often used for tasks such as classification, regression, and outlier detection [6]. The most attractive feature of this algorithm is that its learning mechanism for multiple objects is simpler to analyze mathematically than that of traditional neural network architectures, thus allowing complex alterations with known effects on the core features of the algorithm [7]. In essence, an SVM maps the training data to a higher-dimensional feature space and constructs a separating hyperplane with maximum margin, producing a nonlinear separation boundary in the input space [8].
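The idea of mapping data to a higher-dimensional space so that a hyperplane can separate it may be illustrated with a minimal sketch. The feature map `(x, x^2)`, the weights `w`, and the bias `b` below are hand-picked assumptions for illustration; they are not learned by margin maximization as a real SVM would do.

```python
def feature_map(x):
    """Map a scalar input to a 2-D feature space (x, x^2)."""
    return (x, x * x)


def decide(x, w=(0.0, 1.0), b=-1.0):
    """Linear decision rule in feature space: sign(w . phi(x) + b).

    The hyperplane x^2 = 1 is linear in feature space but corresponds
    to the nonlinear boundary |x| = 1 in the input space.
    """
    phi = feature_map(x)
    score = w[0] * phi[0] + w[1] * phi[1] + b
    return 1 if score > 0 else -1


samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
labels = [decide(x) for x in samples]
print(labels)
```

No single threshold on `x` separates the outer points from the inner ones, yet a straight line does so once the squared coordinate is added.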

Today, the most robust object classification and detection algorithms use deep learning architectures, with many specialized layers that automate the filtering and feature extraction process. Machine learning algorithms such as linear regression, support vector machines, and decision trees all have their peculiarities in the learning process, but fundamentally they apply similar steps: make a prediction, receive a correction, and adjust the prediction mechanism based on the correction, which at a high level is quite similar to how a human learns. Deep learning brought a new approach to the problem, attempting to overcome previous drawbacks by learning abstractions in data following a stratified description paradigm based on nonlinear transformations [9]. A key advantage of deep learning is its ability to perform semi-supervised or unsupervised feature extraction over massive datasets.


Object Recognition Using Convolutional Neural Networks

DOI: http://dx.doi.org/10.5772/intechopen.89726


The ability to learn the feature extraction step in deep learning-based algorithms comes from the extensive use of convolutional neural networks (ConvNet or CNN). In this context, convolution is a specialized type of linear operation and can be seen as the simple application of a filter to a given input [10]. Repeated application of the same filter across an input results in a map of activations called a feature map, indicating the locations and strength of a detected feature in the input. By tweaking the parameters of the convolution, the network can adjust itself to reduce the error and therefore learn the best parameters to extract relevant information from the database.
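The application of a filter to produce a feature map can be sketched in plain Python. This is a minimal illustration, assuming no padding and a stride of one; like most deep learning libraries, it computes a cross-correlation (the kernel is not flipped). The image and kernel values are made up for the example.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            # Weighted sum of the image patch under the kernel.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# A dark region on the left and a bright region on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical filter that responds to a dark-to-bright horizontal step.
kernel = [[-1, 1]]

fm = conv2d_valid(image, kernel)
print(fm)
```

The feature map is nonzero only in the column where the vertical edge lies, which is the sense in which it records the location and strength of a detected feature.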

Many deep neural network (DNN)-based object detectors have been proposed in the last few years [11, 12]. This research investigates the performance of the state-of-the-art DNN models SSD and Faster RCNN applied to a classical detection problem in which the algorithms were trained to identify several animals in images. Furthermore, to exemplify the application in scientific research, the YOLO network was trained to solve the mice tracking problem. The following sections describe the DNN models mentioned earlier in more detail [13–15].

#### 2. Object detection techniques

#### 2.1 Single shot multibox detector

The single shot multibox detector [13] is one of the best detectors in terms of speed and accuracy. It comprises two main steps to detect objects: feature map extraction and the application of convolutional filters.

The SSD architecture builds on the VGG-16 network [16], and this choice was made based on the strong performance in high-quality image classification tasks and the popularity of the network in problems where transfer learning is involved.


Recent Trends in Artificial Neural Networks - From Training to Prediction


#### Figure 1.

The SSD network adds several feature layers to the end of the base network, which predict the offsets to default boxes of different scales and aspect ratios, and their associated confidences. Figure based on [13].

Instead of the original VGG fully connected layers, a set of auxiliary convolutional layers is added to the model, thus making it possible to extract features at multiple scales and progressively decrease the size of the input to each subsequent layer.
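To make the effect of multiple scales concrete, the sketch below counts the default boxes produced by a stack of progressively smaller feature maps. The feature map sizes and the number of boxes per location are assumptions taken from the SSD300 configuration reported in [13]; they are not stated in this chapter and may differ from the setup used in the experiments.

```python
# Assumed SSD300 configuration from [13]: square feature maps whose side
# shrinks layer by layer, each with a fixed number of boxes per location.
FEATURE_MAP_SIZES = [38, 19, 10, 5, 3, 1]
BOXES_PER_LOCATION = [4, 6, 6, 6, 4, 4]


def count_default_boxes(sizes, boxes_per_location):
    """Total default boxes: one set of boxes at every feature map cell."""
    return sum(s * s * k for s, k in zip(sizes, boxes_per_location))


total_boxes = count_default_boxes(FEATURE_MAP_SIZES, BOXES_PER_LOCATION)
print(total_boxes)
```

Large feature maps contribute many small boxes, while the final layers contribute a handful of boxes that cover most of the image.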

The bounding box generation consists of matching precomputed, fixed-size bounding boxes called priors with the original distribution of ground truth boxes. These priors are selected to keep the intersection over union (IoU) ratio equal to or greater than 0.5.
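The IoU criterion can be sketched as follows. This is a minimal illustration, assuming boxes are given as `(x_min, y_min, x_max, y_max)` tuples; the helper names and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_priors(priors, ground_truth, threshold=0.5):
    """Keep the priors whose IoU with the ground truth box is >= threshold."""
    return [p for p in priors if iou(p, ground_truth) >= threshold]


ground_truth = (0, 0, 10, 10)
priors = [(0, 0, 10, 10), (5, 0, 15, 10), (1, 1, 9, 9)]
matched = match_priors(priors, ground_truth)
print(matched)
```

The second prior overlaps only half of the ground truth box, which gives an IoU of one third, so it is not matched.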

The overall loss function defined in Eq. (1) is a linear combination of two terms: the confidence loss, which uses categorical cross-entropy to measure how confident the network is of the computed bounding box, and the location loss, which uses the L2 norm to measure how far the network's predicted bounding boxes are from the ground truth ones.

$$L(\mathbf{x}, \mathbf{c}, l, \mathbf{g}) = \frac{1}{N} \left( L_{\rm conf}(\mathbf{x}, \mathbf{c}) + \alpha L_{\rm loc}(\mathbf{x}, l, \mathbf{g}) \right) \tag{1}$$

where $N$ is the number of matched default boxes, $L_{\rm conf}$ and $L_{\rm loc}$ are the confidence and location losses, respectively, as defined in [13], and $\alpha$ is the weight of the location term. Figure 1 depicts how the convolutional kernels are applied to an input image in the SSD architecture.
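As a small worked example of Eq. (1), the sketch below combines a confidence loss and a location loss that are assumed to be already computed. The parameter name `alpha` stands for the weight on the location term, and returning zero when no default boxes are matched is an assumption made here to avoid a division by zero.

```python
def ssd_loss(conf_loss, loc_loss, num_matched, alpha=1.0):
    """Combine the two loss terms as in Eq. (1).

    conf_loss and loc_loss are assumed to be sums over the matched boxes.
    """
    if num_matched == 0:
        # Assumption: with no matched boxes there is nothing to average.
        return 0.0
    return (conf_loss + alpha * loc_loss) / num_matched


# Illustrative numbers only: 4 matched boxes, equal weighting.
loss = ssd_loss(conf_loss=6.0, loc_loss=2.0, num_matched=4, alpha=1.0)
print(loss)
```

Raising `alpha` makes localization errors count more heavily relative to classification errors.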

#### 2.2 You only look once

You only look once [14] is a state-of-the-art object detection algorithm that targets real-time applications. Unlike some of its competitors, it is not a traditional classifier repurposed as an object detector.

YOLO works by dividing the input image into a grid of S × S cells, where each of these cells is responsible for five bounding box predictions that describe the rectangle around the object. It also outputs a confidence score, which is a measure of the certainty that an object was enclosed. Therefore, the score does not have any relation with the kind of object present in the box, only with the box's shape.

For each predicted bounding box, a class is also predicted, working just like a regular classifier and resulting in a probability distribution over all the possible classes. The confidence score for the bounding box and the class prediction are combined into one final score that specifies the probability that each box includes a specific type of object. Given these design choices, most of the boxes will have low confidence scores, so only the boxes whose final score is above a threshold are kept.
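The combination of the box confidence with the class distribution, followed by thresholding, can be sketched as below. The product rule, the threshold of 0.5, the dictionary layout, and the box names are assumptions chosen for illustration; the text only states that a threshold is applied.

```python
def final_scores(box_confidence, class_probs):
    """Class-specific scores: box confidence times each class probability."""
    return [box_confidence * p for p in class_probs]


def keep_confident(boxes, threshold=0.5):
    """Return (name, best_class, score) for boxes scoring above threshold."""
    kept = []
    for box in boxes:
        scores = final_scores(box["confidence"], box["class_probs"])
        best_class = max(range(len(scores)), key=lambda k: scores[k])
        if scores[best_class] > threshold:
            kept.append((box["name"], best_class, scores[best_class]))
    return kept


# Two hypothetical predictions over two classes.
boxes = [
    {"name": "box_a", "confidence": 0.9, "class_probs": [0.8, 0.2]},
    {"name": "box_b", "confidence": 0.3, "class_probs": [0.5, 0.5]},
]
kept = keep_confident(boxes)
print(kept)
```

Only `box_a` survives: a box needs both a high confidence and a peaked class distribution to pass the threshold.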

Eq. (2) states the loss function minimized by the training step in the YOLO algorithm.

$$\begin{split} &\lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbf{1}_{ij}^{\text{obj}} \left[ \left( x_i - \hat{x}_i \right)^2 + \left( y_i - \hat{y}_i \right)^2 \right] \\ &+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbf{1}_{ij}^{\text{obj}} \left[ \left( \sqrt{w_i} - \sqrt{\hat{w}_i} \right)^2 + \left( \sqrt{h_i} - \sqrt{\hat{h}_i} \right)^2 \right] \\ &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbf{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 + \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbf{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\ &+ \sum_{i=0}^{S^2} \mathbf{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2 \end{split} \tag{2}$$

where $\mathbf{1}_{i}^{\text{obj}}$ indicates whether an object appears in cell $i$, $\mathbf{1}_{ij}^{\text{obj}}$ indicates that the $j$th bounding box predictor in cell $i$ is responsible for that prediction, and $\mathbf{1}_{ij}^{\text{noobj}}$ is its complement for predictors with no object assigned. The weights $\lambda_{\text{coord}}$ and $\lambda_{\text{noobj}}$ scale the localization terms and the confidence term of boxes without objects, respectively. The coordinates $x$ and $y$ represent the center of the box relative to the bounds of the grid cell, whereas the width $w$ and height $h$ are predicted relative to the whole image. Finally, $C$ denotes the confidence prediction, that is, the IoU between the predicted box and any ground truth box, and $p_i(c)$ is the probability of class $c$ in cell $i$.

Figure 2 describes how the YOLO network processes an image. Initially, the input is passed through a CNN that produces the bounding boxes with their respective confidence scores and generates the class probability map. Finally, the results of the previous steps are combined to form the final predictions.
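The size of the prediction tensor described in the caption of Figure 2, S × S × (B ∗ 5 + C), can be checked with a few lines of code. The values S = 7, B = 2, and C = 20 used below are illustrative assumptions and are not the configuration reported in this chapter.

```python
def yolo_output_shape(s, b, c):
    """Shape of the prediction tensor: S x S x (B * 5 + C).

    Each of the B boxes contributes 5 numbers (x, y, w, h, confidence),
    and each cell contributes C class probabilities.
    """
    return (s, s, b * 5 + c)


def count_values(shape):
    """Total number of scalars in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


shape = yolo_output_shape(s=7, b=2, c=20)
print(shape, count_values(shape))
```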


#### Figure 2.

YOLO models detection as a regression problem [17]. The input image is divided into an S × S grid, and for each grid cell, B bounding boxes, confidences for those boxes, and C class probabilities are predicted. These predictions are encoded as an S × S × (B ∗ 5 + C) tensor. Figure based on [17].

#### 2.3 Faster region convolutional neural network

The faster region convolutional neural network [15] is another state-of-the-art CNN-based deep learning object detection approach. In this architecture, the input image is passed through a convolutional network that produces a convolutional feature map. Instead of using the selective search algorithm to identify the region proposals, as was done in previous iterations [18, 19], a separate network is used to learn and predict these regions. The predicted region proposals are then reshaped using a region of interest (ROI) pooling layer, whose output is used to classify the image within the proposed region and predict the offset values for the bounding boxes.

#### Figure 3.

Faster RCNN acts as a single, unified network for object detection [15]. The region proposal network module serves as the "attention" of this unified network. Figure based on [15].

The strategy behind the region proposal network (RPN) training is to use a binary label for each anchor, where one represents the presence of an object and zero its absence. With this strategy, any IoU over 0.7 determines the object's presence, and any IoU below 0.3 indicates the object's absence.
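The labeling rule can be sketched as a small function. The text does not say what happens to anchors whose IoU falls between the two thresholds; returning `None` for them, meaning that they are ignored during training, is an assumption made for this example.

```python
def label_anchor(max_iou, pos_threshold=0.7, neg_threshold=0.3):
    """Assign a binary training label to an anchor.

    max_iou is the highest IoU between the anchor and any ground truth box.
    """
    if max_iou > pos_threshold:
        return 1  # object present
    if max_iou < neg_threshold:
        return 0  # object absent
    return None  # assumption: in-between anchors are left unlabeled


anchor_ious = [0.85, 0.10, 0.50]
anchor_labels = [label_anchor(v) for v in anchor_ious]
print(anchor_labels)
```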

Thus a multitask loss function shown in Eq. (3) is minimized during the training phase.

$$L\left(\left\{p_i\right\}, \left\{t_i\right\}\right) = \frac{1}{N_{cls}}\sum_i L_{cls}\left(p_i, p_i^*\right) + \lambda \frac{1}{N_{reg}}\sum_i p_i^* L_{reg}\left(t_i, t_i^*\right) \tag{3}$$

where $i$ is the index of the anchor in the batch, $p_i$ is the predicted probability of the anchor being an object, $p_i^*$ is the ground truth probability of the anchor, $t_i$ is the predicted bounding box coordinate, $t_i^*$ is the ground truth bounding box coordinate, and $L_{cls}$ and $L_{reg}$ are the classification and regression losses, respectively. The terms $N_{cls}$ and $N_{reg}$ normalize the two sums, and $\lambda$ balances their contributions.
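A minimal sketch of Eq. (3) follows. The chapter does not specify the forms of the two losses; the binary log loss used for the classification term and the smooth L1 loss used for the regression term are assumptions here, as are the default `lam=1.0` and the toy inputs.

```python
import math


def log_loss(p, p_star):
    """Binary cross-entropy between prediction p and label p_star (assumed L_cls)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(p_star * math.log(p) + (1 - p_star) * math.log(1.0 - p))


def smooth_l1(t, t_star):
    """Smooth L1 distance summed over box coordinates (assumed L_reg)."""
    total = 0.0
    for a, b in zip(t, t_star):
        d = abs(a - b)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def rpn_loss(probs, labels, boxes, gt_boxes, n_cls, n_reg, lam=1.0):
    """Multitask loss of Eq. (3): classification plus gated regression."""
    cls_term = sum(log_loss(p, s) for p, s in zip(probs, labels)) / n_cls
    # p_star gates the regression term: only positive anchors contribute.
    reg_term = sum(
        s * smooth_l1(t, ts) for s, t, ts in zip(labels, boxes, gt_boxes)
    ) / n_reg
    return cls_term + lam * reg_term


rpn_value = rpn_loss(
    probs=[0.5, 0.5],
    labels=[1, 0],
    boxes=[(0.0, 0.0, 0.0, 0.0), (5.0, 5.0, 5.0, 5.0)],
    gt_boxes=[(0.5, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)],
    n_cls=2,
    n_reg=1,
)
print(rpn_value)
```

The second anchor has a large box error, but because its label is zero it adds nothing to the regression term.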

Figure 3 depicts the unified network for object detection implemented in the Faster RCNN architecture. Using the recently popular terminology of neural networks with "attention" mechanisms [20], the region proposal network module tells the Fast RCNN module where to look [15].

#### 3. Datasets

A sample of the PASCAL VOC [21] dataset is used to exemplify the use of the SSD and Faster RCNN object detection algorithms. Six of the 20 available classes were selected. Table 1 describes the sample size selected for each class.

The images in the dataset were randomly divided as follows: 1911 for training, corresponding to 50%; 1126 for validation, corresponding to 25%; and the test set, also corresponding to 25%.
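A random division of this kind can be sketched with the standard library. The function name, the fixed seed, and the use of integer identifiers in place of image files are assumptions for illustration; the total of 4163 images is the figure reported for this sample elsewhere in the chapter.

```python
import random


def split_dataset(items, n_train, n_val, seed=0):
    """Shuffle items and cut them into train, validation, and test parts."""
    shuffled = list(items)
    # A dedicated generator keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


image_ids = range(4163)
train, val, test = split_dataset(image_ids, n_train=1911, n_val=1126)
print(len(train), len(val), len(test))
```

Whatever remains after the training and validation cuts becomes the test set, so the three parts never overlap.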

To further illustrate the applications of such algorithms in scientific research, the dataset used for the YOLO network presented in [22] was also analyzed. As described in [22], the dataset is composed of images from three studies that involve behavioral experiments with mice: ethological evaluation [23], automated home-cage [24], and CRIM13 [25].


#### Figure 4.

(a) Comparison of the mAP of the models during the training phase. (b) Time spent to execute each architecture on a single image.

#### Table 3.

Mean average precision results after 100 epochs of training.

| Network | Framework | Mean average precision (%) |
| --- | --- | --- |
| Faster RCNN | GluonCV [27] | 96.07 |
| SSD | GluonCV [27] | 84.35 |

#### Table 1.

SSD and RCNN network dataset description.


Table 2.

Description of the dataset for use with the YOLO network as earlier used in [22].


Table 2 describes the sample size selected from each of the datasets used in this paper. For the ethological evaluation [23], 3707 frames were used, captured in a top view of the arena of social interaction experiments among mice. For the automated home-cage [24], a sample of 3073 frames was selected from a side view of behavioral experiments. For the CRIM13 [25], a sample of 6842 frames was selected, 3492 from a side view and 3350 from a top view.

The dataset division used in [22] was reproduced, resulting in 6811 images for training, 3405 for validation, and 3406 for testing.

#### 4. Material and methods for object detection

In this work, the previously described SSD and Faster RCNN networks are compared in the task of localization and tracking of six species of animals in diversified environments. Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would improve our ability to study and conserve ecosystems [26]. Additionally, results from the YOLO network, reproduced from [22], are presented for detecting and tracking mice in videos recorded during behavioral neuroscience experiments. The task of mice detection consists of determining the location in the image where the animals are present, for each frame acquired.

The computational development presented here was performed on a computer with an AMD Athlon II X2 B22 CPU at 2.8 GHz, 8 GB of RAM, an NVIDIA GeForce GTX 1070 8 GB GPU, Ubuntu 18.04 LTS as OS, CUDA 9, and CuDNN 7. Our approach used the convolutional networks described in Section 2.

#### 5. Results and conclusion

The results obtained for the SSD and Faster RCNN networks in the experiments were based on the analysis of 4163 images, organized according to the dataset described in Section 3.

Figure 4(a) depicts the increasing development of the mean average precision values in the epochs of training. Both architectures reached high mean average precision (mAP) while successfully minimizing the values of their respective loss functions. The Faster RCNN network presented higher and better stability in precision, which can be seen by the smoothness in its curve. Figure 4(b) is a box plot of the time spent by each network on the classification of a single image, whereas the SSD came ahead with 17 � 2 ms as the mean and standard deviation values, and the Faster RCNN translated its higher computational complexity in the execution time with 30 � 2ms as the mean and standard deviation values, respectively.

Table 3 presents more results related to object detection performance. First, it shows the mean average precision, which is the mean value of the average precisions for each class, where average precision is the average value of 11 points on the precision-recall curve for each possible threshold, that is, all the probability of detection for the same class (Precision-Recall evaluation according to the terms described in the PASCAL VOC [21]).

Figure 5 shows some selected examples of object detection results on the dataset used. Each output box is associated with a category label and a softmax score in ½ � 0, 1 . A score threshold of 0:5 is used to display these images.

Figure 4.

• Ethological evaluation [23]: This research presents new metrics for chronic

Dataset Number of images Resolution Ethological evaluation [23] 3707 640 480 Automated home-cage [24] 3073 320 240 CRIM13 [25] 6842 656 490

Class Number of images

Bird 811 Cat 1128 Dog 1341 Horse 526 Sheep 357 Total 4163

Recent Trends in Artificial Neural Networks - From Training to Prediction

• Automated home-cage [24]: This study introduces a trainable computer vision system that allows the automated analysis of complex mouse behaviors; they are eat, drink, groom, hang, micromovement, rear, rest, and walk.

• Caltech Resident-Intruder Mouse dataset (CRIM13) [25]: It has videos recorded with superior and synchronized lateral visualization of pairs of mice

Table 2 describes the sample size selected from each of the datasets used in this paper. For the ethological evaluation [23], 3707 frames were used, captured in a top view of the arena of social interaction experiments among mice. For the automated home-cage [24], a sample of 3073 frames was selected from a side view of behavioral experiments. For the CRIM13 [25], a sample of 6842 frames was selected, 3492

The same dataset division used in [22] was also reproduced resulting in 6811

In this work, the previously described SDD and Faster RCNN networks are compared in the task of localization and tracking of six species of animals in diversified environments. Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would improve our ability to study and conserve ecosystems [26]. Additionally, results from the YOLO network, reproduced from [22], to detect and track mice in videos are recorded during

stress models of social defeat in mice.

Total 13, 622

Description of the dataset for use with the YOLO network as earlier used in [22].

SSD and RCNN network dataset description.

Table 1.

Table 2.

86

from a side view and 3350 from a top view.

involved in social behavior in 13 different actions.

images for training, 3405 for validation, and 3406 for the test.

4. Material and methods for object detection

(a) Comparison of the mAP models during the training phase. (b) Time spent to execute each architecture on a single image.


#### Table 3.

Mean average precision results after 100 epochs of training.
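The 11-point average precision behind the mAP values can be sketched as follows. This is a minimal illustration, not the evaluation code used in the chapter: it assumes the PASCAL VOC 2007 style interpolation (the best precision at any recall at or above each of the levels 0.0, 0.1, ..., 1.0), and the function names and the toy precision-recall points are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def eleven_point_ap(recalls: Sequence[float], precisions: Sequence[float]) -> float:
    """11-point interpolated average precision for one class.

    recalls[i] and precisions[i] describe one operating point of the
    precision-recall curve (one detection-score threshold).
    """
    total = 0.0
    for k in range(11):
        level = k / 10  # recall levels 0.0, 0.1, ..., 1.0
        # Interpolated precision: best precision at any recall >= level.
        candidates = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(candidates) if candidates else 0.0
    return total / 11


def mean_average_precision(curves: Dict[str, Tuple[List[float], List[float]]]) -> float:
    """Mean of the per-class average precisions."""
    aps = [eleven_point_ap(r, p) for r, p in curves.values()]
    return sum(aps) / len(aps)


# Hypothetical precision-recall points for two classes (not data from the chapter).
toy_curves = {
    "cat": ([0.2, 0.4, 0.6, 0.8, 1.0], [1.0, 1.0, 0.75, 0.6, 0.5]),
    "dog": ([0.5, 1.0], [1.0, 1.0]),
}
print(round(mean_average_precision(toy_curves), 4))
```

A recall level that the detector never reaches contributes zero precision, so a class with many missed objects is penalized even when its few detections are all correct.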

Figure 5. Output examples of the networks. (a)–(d) refer to SSD and (e)–(i) to Faster RCNN.
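The display rule used for these output examples (a score threshold of 0.5 on the softmax score of each box) can be sketched as below. The `Detection` type, the box layout, and the choice of "greater than or equal" at the threshold are assumptions made for illustration; the chapter does not specify them.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    label: str
    score: float                      # softmax score in [0, 1]
    box: Tuple[int, int, int, int]    # assumed (x_min, y_min, x_max, y_max) in pixels


def filter_detections(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Keep detections whose score is at least `threshold`, best first."""
    kept = [d for d in detections if d.score >= threshold]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Hypothetical network output for one image.
raw = [
    Detection("dog", 0.92, (34, 50, 210, 300)),
    Detection("cat", 0.41, (220, 80, 330, 240)),
    Detection("sheep", 0.50, (10, 10, 90, 120)),
]
for det in filter_detections(raw):
    print(f"{det.label}: {det.score:.2f} at {det.box}")
```

Raising the threshold trades recall for precision, which is the same trade-off that the precision-recall curve summarizes.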

Our approach, as in [22], also used two versions of the YOLO network to detect mice within three different experimental setups. The results obtained were based on the analysis of 13,622 images, organized according to the dataset described in Section 3.

The first version of YOLO trained was the YOLO Full network, which uses the Darknet-53 [14] convolutional architecture comprising 53 convolutional layers. The model was trained as described in [17], starting from an ImageNet [28] pre-trained model. It requires 235 MB of storage. We used a batch of eight images, a momentum of 0.9, and a weight decay of 5 × 10<sup>−4</sup>. The model took 140 hours to train.


A smaller and faster YOLO alternative, named YOLO Tiny, was also trained. To speed up the process, this "tiny" version comprises only a portion of the Darknet-53 [14] resources: 23 convolutional layers. The model requires only 34 MB of storage. The network training follows [17], fine-tuning an ImageNet [28] pre-trained model. We used a batch of 64 images, a momentum of 0.9, and a weight decay of 5 × 10<sup>−4</sup>. The model took 18 hours to train.
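To make the roles of the momentum (0.9) and weight decay (5 × 10<sup>−4</sup>) settings concrete, here is a minimal sketch of one common formulation of stochastic gradient descent with momentum and L2 weight decay. It is not the Darknet implementation; the learning rate and the toy loss are hypothetical, since the text does not report them.

```python
from typing import List, Tuple


def sgd_step(
    weights: List[float],
    grads: List[float],
    velocity: List[float],
    lr: float,
    momentum: float = 0.9,
    weight_decay: float = 5e-4,
) -> Tuple[List[float], List[float]]:
    """One SGD update with momentum and L2 weight decay."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g_total = g + weight_decay * w          # weight decay pulls weights toward zero
        v_next = momentum * v - lr * g_total    # momentum keeps part of the previous step
        new_velocity.append(v_next)
        new_weights.append(w + v_next)
    return new_weights, new_velocity


# Toy problem (an assumption for illustration): minimize (w - 3)^2.
w, v = [0.0], [0.0]
for _ in range(200):
    grad = [2.0 * (w[0] - 3.0)]
    w, v = sgd_step(w, grad, v, lr=0.01)
print(round(w[0], 3))
```

With weight decay the minimizer sits slightly below 3, because the decay term adds a small pull toward zero on top of the loss gradient.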

Figure 6 compares the two YOLO models used, YOLO Full and Tiny. Figure 6(a) shows the high accuracy of the Full architecture, with small oscillations of the accuracy curve during training. In Figure 6(b), the high accuracy is reached in the earliest epochs and remains practically unchanged up to the limit number of epochs. Both architectures reached high mean average precision values while successfully minimizing their loss functions. The Tiny version of the YOLO network presented better stability in precision, as the smoothness of its curve shows. The mean average precision reached by this re-implementation was 90.79% and 90.75% for the Full and Tiny versions of YOLO, respectively. The Tiny version is therefore a good alternative for experimental designs that require real-time response.

Figure 6(c) is a bar graph showing the mean time spent on the classification of a single image in both architectures. The smaller size of the Tiny version translates directly into execution time: 0.08 ± 0.06 s (mean ± standard deviation), whereas the Full version takes 0.36 ± 0.16 s.
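The throughput implied by these mean times follows from simple arithmetic, sketched below together with the way a mean and standard deviation are obtained from raw timings. The mean latencies are the ones reported above; the raw timing samples and function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def summarize_latency(samples_s: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of per-image times in seconds."""
    return mean(samples_s), stdev(samples_s)


def images_per_second(mean_latency_s: float) -> float:
    """Throughput implied by a mean per-image latency."""
    return 1.0 / mean_latency_s


# Mean per-image times reported in the text, in seconds.
reported = {"YOLO Full": 0.36, "YOLO Tiny": 0.08}
for name, latency in reported.items():
    print(f"{name}: {images_per_second(latency):.1f} images/s")

# Hypothetical raw timings, to show how a "mean ± std" figure is obtained.
timings = [0.07, 0.09, 0.08, 0.10, 0.06]
m, s = summarize_latency(timings)
print(f"{m:.3f} ± {s:.3f} s")
```

Whether a given rate counts as real time depends on the frame rate of the recording, which the text does not state.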


#### Figure 6.

(a) and (b) YOLO architecture evolution in mean average precision and minimization of the loss function during the training phase. (c) GPU time required to obtain the classification of an image in each of the networks.

Given the small difference in precision between the two versions of the YOLO object detector, real-time systems for experiments involving animal tracking are closer to reality with the Tiny architecture. Because it demands less computing power, systems in which actions are taken while the experiment takes place can be designed without the need for human intervention.

Figure 7 shows some examples of mice tracking performed on the three datasets used, which makes it possible to verify the operation of mouse tracking in different scenarios. In (a)–(c), the black mouse appears over a white background, and the video is recorded from a top-view camera in a typical configuration for behavioral experiments. In (d)–(f), the camera was positioned on the side of the experimental box; the algorithm performed the tracking correctly for different positions of the animal. Finally, in (g)–(i), the images were recorded by a top-view camera, and the scene contains a large amount of information besides the tracked object. Even so, the algorithm worked very well, even for two animals in the same arena.

This chapter presented an overview of the machine learning techniques using convolutional neural networks for image object detection. The main algorithms for solving this type of problem were presented: Faster RCNN, YOLO, and SSD. To exemplify the functioning of the algorithms, datasets recognized by the scientific literature and in the field of computer vision were selected, tests were performed, and the results were presented, showing the advantages and differences of each of the techniques. This content is expected to serve as a reference for researchers and those interested in this broadly developing area of knowledge.

References

542(7639):115

113-128

621-636

pp. 1-6

pp. 2600-2603

91

[1] Esteva A et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;

DOI: http://dx.doi.org/10.5772/intechopen.89726

Object Recognition Using Convolutional Neural Networks

[9] Pan WD, Dong Y, Wu D.

Applications. 2018. p. 159

2016

[10] Goodfellow I, Bengio Y,

IEEE; 2013. pp. 8599-8603

[12] Kriegeskorte N. Deep neural networks: A new framework for modeling biological vision and brain information processing. Annual Review of Vision Science. 2015;1:417-446

[13] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, et al. SSD: Single shot multibox detector. In: European Conference on Computer Vision. Cham: Springer; 2016. pp. 21-37

[14] Redmon J, Farhadi A. Yolov3: An Incremental Improvement. arXiv; 2018

[15] Ren S, He K, Girshick R, Sun J. Faster

detection with region proposal networks. In: Advances in Neural Information Processing Systems. 2015. pp. 91-99

[16] Simonyan K, Zisserman A. Very deep convolutional networks for largescale image recognition. arXiv preprint

[17] Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified,

Proceedings of the IEEE Conference on

real-time object detection. In:

Computer Vision and Pattern Recognition. 2016. pp. 779-788

arXiv:1409.1556; 2014

r-cnn: Towards real-time object

Classification of malaria-infected cells using deep convolutional neural networks. In: Machine Learning: Advanced Techniques and Emerging

Courville A. Deep Learning. MIT Press;

[11] Deng L, Hinton G, Kingsbury B. New types of deep neural network learning for speech recognition and related applications: An overview. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[2] Jayaraman V, Chandrasekhar MG, Rao UR. Managing the natural disasters from space technology inputs. Acta Astronautica. 1997;40(2–8):291-325

[3] Leonard M et al. A compound event framework for understanding extreme impacts. Wiley Interdisciplinary Reviews: Climate Change. 2014;5(1):

[4] Kogan FN. Global drought watch from space. Bulletin of the American Meteorological Society. 1997;78(4):

[5] Srinivas S, Sarvadevabhatla RK, Mopuri RK, Prabhu N, Kruthiventi SSS, Venkatesh Babu R. An introduction to deep convolutional neural nets for computer vision. In: Deep Learning for Medical Image Analysis. Academic

[6] de Menezes RST, de Azevedo Lima L, Santana O, Henriques-Alves AM, Santa Cruz RM, Maia H. Classification of mice head orientation using support vector machine and histogram of oriented

[7] Oskoei MA, Gan JQ, Hu H. Adaptive schemes applied to online SVM for BCI data classification. In: 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE; 2009.

[8] Hearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B. Support vector machines. IEEE Intelligent Systems and their Applications. 1998;13(4):1828

Press; 2017. pp. 25-52

gradients features. In: 2018 International Joint Conference on Neural Networks (IJCNN). IEEE; 2018.

#### Figure 7.

Output examples of the YOLO network. (a)–(c) refer to ethological evaluation [23], (d)–(f) refer to automated home-cage [24], and (g)–(i) refer to CRIM13 [25].

At the moment, we are experiencing the era of machine learning applications, and much should be developed in the coming years from the use and improvement of these techniques. Further improvements in the development of even more specific hardware and fundamental changes in related mathematical theory are expected shortly, making artificial intelligence increasingly present and important to the contemporary world.

### Author details

Richardson Santiago Teles de Menezes<sup>1</sup>, Rafael Marrocos Magalhaes<sup>2</sup> and Helton Maia<sup>1</sup>\*

1 Federal University of Rio Grande do Norte, Brazil

2 Federal University of Paraiba, Brazil

\*Address all correspondence to: helton.maia@gmail.com

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Recent Trends in Artificial Neural Networks - From Training to Prediction

Object Recognition Using Convolutional Neural Networks DOI: http://dx.doi.org/10.5772/intechopen.89726

#### References

[1] Esteva A et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115

[2] Jayaraman V, Chandrasekhar MG, Rao UR. Managing the natural disasters from space technology inputs. Acta Astronautica. 1997;40(2–8):291-325

[3] Leonard M et al. A compound event framework for understanding extreme impacts. Wiley Interdisciplinary Reviews: Climate Change. 2014;5(1):113-128

[4] Kogan FN. Global drought watch from space. Bulletin of the American Meteorological Society. 1997;78(4):621-636

[5] Srinivas S, Sarvadevabhatla RK, Mopuri RK, Prabhu N, Kruthiventi SSS, Venkatesh Babu R. An introduction to deep convolutional neural nets for computer vision. In: Deep Learning for Medical Image Analysis. Academic Press; 2017. pp. 25-52

[6] de Menezes RST, de Azevedo Lima L, Santana O, Henriques-Alves AM, Santa Cruz RM, Maia H. Classification of mice head orientation using support vector machine and histogram of oriented gradients features. In: 2018 International Joint Conference on Neural Networks (IJCNN). IEEE; 2018. pp. 1-6

[7] Oskoei MA, Gan JQ, Hu H. Adaptive schemes applied to online SVM for BCI data classification. In: 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE; 2009. pp. 2600-2603

[8] Hearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B. Support vector machines. IEEE Intelligent Systems and their Applications. 1998;13(4):18-28

[9] Pan WD, Dong Y, Wu D. Classification of malaria-infected cells using deep convolutional neural networks. In: Machine Learning: Advanced Techniques and Emerging Applications. 2018. p. 159

[10] Goodfellow I, Bengio Y, Courville A. Deep Learning. MIT Press; 2016

[11] Deng L, Hinton G, Kingsbury B. New types of deep neural network learning for speech recognition and related applications: An overview. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE; 2013. pp. 8599-8603

[12] Kriegeskorte N. Deep neural networks: A new framework for modeling biological vision and brain information processing. Annual Review of Vision Science. 2015;1:417-446

[13] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, et al. SSD: Single shot multibox detector. In: European Conference on Computer Vision. Cham: Springer; 2016. pp. 21-37

[14] Redmon J, Farhadi A. Yolov3: An Incremental Improvement. arXiv; 2018

[15] Ren S, He K, Girshick R, Sun J. Faster r-cnn: Towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems. 2015. pp. 91-99

[16] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556; 2014

[17] Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. pp. 779-788

[18] Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014. pp. 580-587

[19] Girshick R. Fast r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision. 2015. pp. 1440-1448

[20] Chorowski JK, Bahdanau D, Serdyuk D, Cho K, Bengio Y. Attention-based models for speech recognition. In: Advances in Neural Information Processing Systems. 2015. pp. 577-585

[21] Everingham M et al. The Pascal visual object classes (VOC) challenge. International Journal of Computer Vision. 2010;88(2):303-338

[22] Peixoto HM, Teles RS, Luiz JVA, Henriques-Alves AM, Santa Cruz RM. Mice Tracking Using the YOLO Algorithm. Vol. 7. PeerJ Preprints; 2019. p. e27880v1

[23] Henriques-Alves AM, Queiroz CM. Ethological evaluation of the effects of social defeat stress in mice: Beyond the social interaction ratio. Frontiers in Behavioral Neuroscience. 2016;9:364

[24] Jhuang H et al. Automated home-cage behavioural phenotyping of mice. Nature Communications. 2010;1:68

[25] Burgos-Artizzu XP, Dollár P, Lin D, Anderson DJ, Perona P. Social behavior recognition in continuous video. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE; 2012. pp. 1322-1329

[26] Norouzzadeh MS et al. Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning. Proceedings of the National Academy of Sciences of the United States of America. 2018;115(25):E5716-E5725


[27] Guo J, He H, He T, Lausen L, Li M, Lin H, et al. GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing. arXiv preprint arXiv:1907.04433; 2019

[28] Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE; 2009. pp. 248-255

[29] Chen X-L et al. Remote sensing image-based analysis of the relationship between urban heat island and land use/ cover changes. Remote Sensing of Environment. 2006;104(2):133-146

#### Chapter 6

[18] Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.

Recent Trends in Artificial Neural Networks - From Training to Prediction

Sciences of the United States of America. 2018;115(25):E5716-E5725

[27] Guo J, He H, He T, Lausen L, Li M, Lin H, et al. GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing. arXiv preprint arXiv:1907; 2019. p. 04433

[28] Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE; 2009.

[29] Chen X-L et al. Remote sensing image-based analysis of the relationship between urban heat island and land use/ cover changes. Remote Sensing of Environment. 2006;104(2):133-146

pp. 248-255

[19] Girshick R. Fast r-cnn. In:

[20] Chorowski JK, Bahdanau D,

Advances in Neural Information Processing Systems. 2015. pp. 577-585

[21] Everingham M et al. The Pascal visual object classes (VOC) challenge. International Journal of Computer Vision. 2010;88(2):303-338

[22] Peixoto HM, Teles RS, Luiz JVA, Henriques-Alves AM, Santa Cruz RM. Mice Tracking Using the YOLO

Algorithm. Vol. 7. PeerJ Preprints; 2019.

[23] Henriques-Alves AM, Queiroz CM. Ethological evaluation of the effects of social defeat stress in mice: Beyond the social interaction ratio. Frontiers in Behavioral Neuroscience. 2016;9:364

[24] Jhuang H et al. Automated homecage behavioural phenotyping of mice. Nature Communications. 2010;1:68

[25] Burgos-Artizzu XP, Dollár P, Lin D, Anderson DJ, Perona P. Social behavior recognition in continuous video. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE;

2012. pp. 1322-1329

92

[26] Norouzzadeh MS et al.

Automatically identifying, counting, and describing wild animals in camera-

Proceedings of the National Academy of

trap images with deep learning.

Proceedings of the IEEE International Conference on Computer Vision. 2015.

Serdyuk D, Cho K, Bengio Y. Attentionbased models for speech recognition. In:

p. 580587

pp. 1440-1448

p. e27880v1

## Prediction of Wave Energy Potential in India: A Fuzzy-ANN Approach

Soumya Ghosh and Mrinmoy Majumder

#### Abstract

Wave energy converters suffer from both unsatisfactory conversion efficiency and high cost, which is why the popularity of wave energy as an alternative to conventional energy sources remains low. Besides wave height and period, many other factors influence the amount of "utilizable" wave energy potential. The present study attempts to identify these important factors and predict power potential as a function of them. Accordingly, a polynomial neural network was utilized, and fuzzy logic was applied to identify the most important factors. According to the results, wave height was found to have the maximum importance, followed by wave period, water depth, and salinity. In total, 12 different neural network models were developed to predict the same output, among which the model with all four inputs was found to have optimal performance.

Keywords: wave energy, power potential, fuzzy logic, artificial neural network

#### 1. Introduction

Wave energy is considered one of the most promising marine renewable resources, with global wave power estimated at around 2 TW [1]. Marine renewable energy sources such as wave power, tides, and currents have often been misunderstood, even though they offer strong predictability and other favorable physical properties [2]. Wave energy presents a number of advantages with respect to other CO2-free energy sources: high power density, a relatively high utilization factor, and, last but not least, low environmental and visual impact [3]. Wave energy resource assessments fall into two categories. Renewable energy is continually available, but due to the complexity of conversion and storage procedures and uncertainty in availability, such sources of energy have so far been used with caution [4]. Most of the drawbacks were found to vary with location. Some of the advantages are high energy density [5] and good predictability, as well as reduced negative environmental impacts on beaches [6], marine ecosystems [7], and the wave climate [8]. In terms of energy consumption, India ranks fourth, just after the United States, China, and Russia. Electricity consumption in India is expected to rise to around 2280 BkWh by 2021–2022 and to around 4500 BkWh by 2031–2032 [9]. Various methods have been used to estimate wave power potential, but most of them are subjective and linear and cannot be adapted to various situations. In the present study, a new method for estimating wave power potential is proposed; it is an objective, cognitive, and unbiased method that estimates the wave energy potential of a location while accounting for the most important nonlinearity.

DOI: http://dx.doi.org/10.5772/intechopen.84676

#### 2.2 Group method of data handling (GMDH)

The self-adaptive heuristic ANN-based method is one of the machine learning approaches based on the polynomial theory of complex systems, designed by Ivakhnenko (1971). Generally, the first-order (linear) Kolmogorov-Gabor polynomial including n nodes can be used as the transfer function [13]:

$$Y = f(x_1, x_2, \ldots, x_n) = a_0 + a_1 x_1 + a_2 x_2 + \ldots + a_n x_n \tag{1}$$

where $Y$ is the middle candidate solution, $x = (x_1, \ldots, x_n)$ is the vector of given initial solutions, and $a = (a_0, \ldots, a_n)$ is the vector of coefficients or weights. New middle candidate solutions can be obtained according to the inputs of the current layer and the transfer function. Self-organizing models of optimal convolution are constructed by an inductive algorithm supervised by the original GMDH. The approach is based entirely on the input-output relationships of a given dataset, without the need for user intervention. The GMDH network is known as a self-organized approach that solves various complex problems in nonlinear systems [14].

The main advantage of the GMDH model is in building analytical functions within feed-forward networks based on quadratic polynomials whose weighting coefficients are obtained using the regression method [15].

#### 3. Methodology

The methods were used to estimate the rank of importance of the parameters based on the study objective shown in Figure 1. The procedure for estimating the wave power potential involves applying the MCDM method to estimate the priority value of the parameters and GMDH to reveal the relationship between the input and output parameters.

The MCDM methodology deduces the importance of the parameters based on their citation frequency, expert inputs, and availability of data. All three methods were used to estimate the rank of importance of the parameters based on the study objective and on criteria like efficiency and cost. The detailed hierarchy of MCDM methodologies is shown in Figure 2. The model uses fuzzy logic to determine

#### Figure 1.

Schematic diagram of present investigation.

#### 1.1 Objective

The objective of this study was to develop a predictive method that incorporates both objectivity and adaptability. To this end, multi-criteria decision-making (MCDM) methods such as fuzzy logic decision-making (FLDM) and cognitive methods such as the group method of data handling (GMDH) were utilized. As far as the authors know, fuzzy-based MCDM cascaded with GMDH has not previously been used to estimate wave power potential.

#### 1.2 Future aspect

Cognitive studies of site selection for wave energy power plants have rarely been attempted, which is why the authors of the present study propose a novel methodology for selecting the most favorable sites for wave energy generation using MCDM and ANN technologies. Finally, evaluating the decision alternatives with another multi-criteria decision-making method instead of fuzzy logic, and comparing the results with those of the present study, could be a subject for future research.

#### 2. Methodology

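The new method comprises two steps:

I. Application of MCDM, i.e., FLDM, to find the weight of importance

II. Application of GMDH to provide a predictive infrastructure for making the method resource independent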

Sections 2.1 and 2.2 discuss the strengths, weaknesses, and applicability of the method in this study.

#### 2.1 Fuzzy logic

Fuzzy set theory was first introduced in early work on mathematical programming [10]. Fuzzy logic resembles human reasoning in its use of imprecise information to make decisions. Many such problems can be formulated as the minimization of functionals defined over a class of admissible domains. Nondeterministic conditions affect both the design variables and the allowable limits. A stochastic problem can be transformed into its deterministic form by using the expected value and the chance-constrained programming technique; a fuzzy mathematical formulation can serve as a substitute for this approach [11]. The advantage of fuzzy logic lies in its ability to express, on a fuzzy scale, the importance of factors that are similarly important, whereas its disadvantages stem only from the qualitative variables that can be used. Fuzzy logic can be applied to problems such as determining a suitable location for a biogas plant, assessing geothermal potential, and designing the control of power management [12].

#### 2.2 Group method of data handling (GMDH)

GMDH is a self-adaptive, heuristic, ANN-based method and one of the machine learning approaches based on the polynomial theory of complex systems, designed by Ivakhnenko (1971). Generally, the first-order (linear) Kolmogorov-Gabor polynomial with n nodes can be used as the transfer function [13]:

$$Y = f(\mathbf{x}\_1, \mathbf{x}\_2, \dots, \mathbf{x}\_n) = a\_0 + a\_1\mathbf{x}\_1 + a\_2\mathbf{x}\_2 + \dots + a\_n\mathbf{x}\_n \tag{1}$$

where Y is the middle candidate solution, x<sub>1</sub>, …, x<sub>n</sub> are the given initial solutions, and a is the vector of coefficients or weights. New middle candidate solutions can be obtained according to the inputs of the current layer and the transfer function.

Self-organizing models of optimal convolution are constructed by an inductive algorithm supervised by the original GMDH. The algorithm relies entirely on the input-output relationships of a given dataset, without the need for user intervention. The GMDH network is known as a self-organized approach that solves various complex problems in nonlinear systems [14].

The main advantage of the GMDH model is in building analytical functions within feed-forward networks based on quadratic polynomials whose weighting coefficients are obtained using the regression method [15].
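To illustrate how the weighting coefficients of a quadratic polynomial can be obtained by regression, the following is a minimal sketch of a single GMDH-style neuron. It assumes the common two-input quadratic form (the chapter does not spell out the polynomial), fits it by ordinary least squares through the normal equations, and uses hypothetical function names (`quadratic_terms`, `solve_linear_system`, `fit_neuron`, `predict`). It is not the authors' implementation.

```python
def quadratic_terms(x1, x2):
    """Terms of the assumed two-input quadratic polynomial:
    a0 + a1*x1 + a2*x2 + a3*x1*x2 + a4*x1^2 + a5*x2^2."""
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]


def solve_linear_system(matrix, rhs):
    """Solve matrix @ w = rhs by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(rhs)
    # Augmented matrix [matrix | rhs], copied so the inputs stay untouched.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    weights = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * weights[j] for j in range(i + 1, n))
        weights[i] = (aug[i][n] - tail) / aug[i][i]
    return weights


def fit_neuron(samples, targets):
    """Least-squares fit via the normal equations (X^T X) w = X^T y."""
    rows = [quadratic_terms(x1, x2) for x1, x2 in samples]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, x1, x2):
    """Evaluate the fitted polynomial at one input pair."""
    return sum(w * t for w, t in zip(weights, quadratic_terms(x1, x2)))
```

In a full GMDH network, such neurons would be fitted for pairs of inputs and the best ones kept for the next layer; that selection step is omitted here. The fit needs at least six samples that are not degenerate, otherwise the normal equations are singular.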

#### 3. Methodology

Recent Trends in Artificial Neural Networks - From Training to Prediction

The methods were used to estimate the rank of importance of the parameters based on the study objective shown in Figure 1. The procedures to estimate the wave power potential involve the application of the MCDM method to estimate the priority value of the parameters and GMDH to reveal the relationship between the input and output parameters.

The MCDM methodology deduces the importance of the parameters based on their citation frequency, expert inputs, and availability of data. All three methods were used to estimate the rank of importance of the parameters based on the study objective and on criteria like efficiency and cost. The detailed hierarchy of MCDM methodologies is shown in Figure 2. The model uses fuzzy logic to determine weights of importance as derived from the rank of importance and the aggregation method.

Figure 1. Schematic diagram of present investigation.

Figure 2. Figure showing the hierarchy of the MCDM methodologies.

#### 3.1 Case study

Figure 3 presents the geographical locations of five points (locations 1–5), which are used to define the wave energy potential of different locations. The data of wave height, wind speed, and water depth for the five locations were collected from the National Data Buoy Center. The most recent reanalysis dataset was produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) [16]. The five locations in the Bay of Bengal (BoB) are used to estimate the wave power potential in an Indian scenario.

Figure 3. Locations of the study area.

| Location (Indian scenario) | Wave height (m) | Wave period (s) | Water depth (m) | Salinity (psu) |
|---|---|---|---|---|
| Location 1: Chennai (10.911854, 80.581172E) | 3.5 | 6.2 | 3000 | 34.7 |
| Location 2: Kikanda (13.787577, 80.642017E) | 2.6 | 7.4 | 2500 | 33.5 |
| Location 3: Puducherry (15.511064 N, 81.523419E) | 3.1 | 6.5 | 700 | 35.6 |
| Location 4: Bhubaneswar (17.870545 N, 84.384105E) | 2.4 | 8.2 | 1400 | 32.8 |
| Location 5: Visakhapatnam (19.192926 N, 85.702425E) | 1.6 | 8.4 | 2200 | 33.2 |

Table 1. Magnitude of the parameter with respect to the selected location.

The wave height (Hs) and wave period (Te) are obtained from the spectral moment as shown in Eqs. (2) and (3), which give the significant wave height and the energy period, respectively:

$$H\_s = \mathbf{4}\sqrt{m\_0} \tag{2}$$

$$\text{Energy period (Te)} = \frac{m\_{-1}}{m\_0} \tag{3}$$
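As a worked illustration of Eqs. (2) and (3), the sketch below computes the spectral moments from a discretely sampled wave spectrum. It assumes the standard definition of the n-th moment as the integral of f^n S(f) over frequency, which the chapter does not state explicitly, and approximates it with the trapezoidal rule. The function names and the sample spectrum are hypothetical.

```python
import math


def spectral_moment(freqs, spectrum, order):
    """Trapezoidal estimate of m_n = integral of f**n * S(f) df.

    freqs must be strictly positive when order is negative.
    """
    total = 0.0
    for i in range(len(freqs) - 1):
        left = freqs[i] ** order * spectrum[i]
        right = freqs[i + 1] ** order * spectrum[i + 1]
        total += 0.5 * (left + right) * (freqs[i + 1] - freqs[i])
    return total


def significant_wave_height(freqs, spectrum):
    """Eq. (2): Hs = 4 * sqrt(m0)."""
    return 4.0 * math.sqrt(spectral_moment(freqs, spectrum, 0))


def energy_period(freqs, spectrum):
    """Eq. (3): Te = m_{-1} / m0."""
    return spectral_moment(freqs, spectrum, -1) / spectral_moment(freqs, spectrum, 0)


# Hypothetical spectrum: frequencies in Hz, spectral density in m^2/Hz.
example_freqs = [0.05, 0.10, 0.15, 0.20, 0.25]
example_spectrum = [0.0, 4.0, 2.5, 1.0, 0.0]
example_hs = significant_wave_height(example_freqs, example_spectrum)
example_te = energy_period(example_freqs, example_spectrum)
```

With frequencies in Hz and spectral density in m²/Hz, `example_hs` is in metres and `example_te` in seconds. A finer frequency grid reduces the trapezoidal error.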

More details about locations 1–5 are provided in Table 1, where the corresponding water depth and the geographical coordinates are indicated for each of the five selected locations.

#### 3.2 Development of the cognitive method

In total, 12 GMDH-based models were developed with the same 4 inputs and 1 output, the wave power potential. The number of inputs was varied from three to five, and the input and output data were transformed using tangent and cube root functions. The top three parameters were identified with the help of the fuzzy logic MCDM method. According to the EPI, the 3 best-performing models among the 12 developed for the present study were selected for further validation.

The performance of all 12 models was analyzed by mean absolute error (MAE) [17] and correlation (R) [18]. Error metrics such as MAE are inversely related to model accuracy, whereas correlation is directly related to model performance. The performance of a model during the checking (c) or testing phase is a more important indicator of model reliability than its performance in the training (t) phase [19]. The three selected models were tested for reliability with the help of root-mean-square error (RMSE), mean relative error (MRE), correlation (R), and percent bias (PBIAS) between the predicted and observed data. The equivalent performance index (EPI) was prepared to represent the performance of the models (see Eq. (4)).

Figure 4. The 12 models developed for prediction of suitable site selection.

$$\text{EPI} = \frac{\text{R}}{\text{MAE} + \text{MRE} + \text{RMSE} + \text{PBIAS}} \tag{4}$$
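Eq. (4) can be sketched in code as follows. The chapter does not give the exact formulas for the individual metrics, so the definitions of MRE and PBIAS used here are common conventions taken as assumptions, and all function names are hypothetical.

```python
import math


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root-mean-square error."""
    squared = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mre(observed, predicted):
    """Mean relative error (assumed: mean of |p - o| / |o|, o nonzero)."""
    return sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted)) / len(observed)


def pbias(observed, predicted):
    """Percent bias (assumed: 100 * sum(o - p) / sum(o); sign conventions vary)."""
    return 100.0 * sum(o - p for o, p in zip(observed, predicted)) / sum(observed)


def correlation(observed, predicted):
    """Pearson correlation coefficient R."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def epi(observed, predicted):
    """Eq. (4): R divided by the sum of the four error metrics, as written."""
    denominator = (mae(observed, predicted) + mre(observed, predicted)
                   + rmse(observed, predicted) + pbias(observed, predicted))
    return correlation(observed, predicted) / denominator
```

A higher EPI indicates high correlation together with small errors. Because PBIAS enters the denominator with its sign, a perfect or strongly over-predicting model can make the denominator zero or negative; how the chapter handles that case is not stated.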

The names of the models considered in the study are given in Figure 4. The nomenclature was prepared by placing the number of inputs first, followed by the initial letter of the training algorithm, the data transformation function, and lastly the model number.

#### 3.3 Sensitivity analysis

The sensitivity of the best model among those considered in the study was also tested to verify whether the importance of the input parameters is reflected in the model result.

#### 4. Results and discussion

Figure 5 shows the score and the rank of the criteria based on the fuzzy logic method. Literature surveys and wave height were found to be the most important criterion and alternative, respectively, whereas data availability and salinity were identified as the least important criterion and alternative, respectively. According to the MCDM results in Figure 5, wave height (0.4084) and salinity (0.3897) have the highest and lowest importance, respectively, with respect to location selection for wave power plants. The performance analysis of the 12 models prepared for prediction is depicted in Figure 4.

Figure 5. Fuzzy logic results.

Figure 6. Figure showing the sensitivity analysis of input variable.

The result of the sensitivity analysis is shown in Figure 6, and case study results are shown in Table 2.

Figure 7 shows the comparison of predicted and observed output during training and testing. The performance analysis of the 12 models revealed that model "5CIONF6" was the most consistent of all the models in the study. The best-performing models were trained with GMDH, their input and output were transformed by the cube root function, and all five variables were used as input.

| Location | Wave height | Energy period | Water depth | Salinity | Indicator | Rank |
|---|---|---|---|---|---|---|
| Location 1: Chennai | 0.26515 | 0.16893 | 0.30612 | 0.20435 | 0.01177 | 1 |
| Location 2: Kikanda | 0.19696 | 0.20163 | 0.25510 | 0.19729 | 0.00724 | 3 |
| Location 3: Puducherry | 0.23484 | 0.17711 | 0.07142 | 0.20965 | 0.00975 | 2 |
| Location 4: Bhubaneswar | 0.18181 | 0.22343 | 0.14285 | 0.19316 | 0.00780 | 4 |
| Location 5: Visakhapatnam | 0.12121 | 0.22888 | 0.22448 | 0.19552 | 0.00261 | 5 |

Table 2. The performance analysis of the locations for wave energy power potential in the case study area.

Figure 7. The comparison of actual and predicted value of the index both with training and testing data.


Figure 8. Location-based model output indicator vs. actual power potential.

Figure 6 shows that the model is most sensitive to wave height and least sensitive to salinity. Figure 8 compares the power potential predicted by the selected model for locations on the eastern coastline of India with the actual power obtained from the wave power equation. The model was satisfactory with respect to our objective.

#### 4.1 Study area

In the investigation, the suitability of five locations for the installation of wave stations was determined by the new methodology. Location 1 (Chennai) is more practical than the four alternative locations for utilization of wave energy potential. The wave power potential per meter of wave crest of the five locations was also calculated as recommended by [20] in Eq. (5).

$$P = \frac{\rho g^2}{64\pi} H\_{m0}^2 T\_e \approx \left(0.5 \frac{\text{kW}}{\text{m}^3\,\text{s}}\right) H\_{m0}^2 T\_e \tag{5}$$

where P is the wave power per unit crest length (kW/m), ρ is the sea water density (kg/m<sup>3</sup>), g is the gravitational acceleration (m/s<sup>2</sup>), H<sub>m0</sub> is the significant wave height (m), denoted Hs in Eq. (2), and Te is the energy period (s).
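The wave power relation in Eq. (5) can be evaluated with a few lines of code. The sketch below assumes typical values for sea water density (1025 kg/m³) and gravitational acceleration (9.81 m/s²), neither of which is specified in the chapter, and the function name is hypothetical. With these constants the coefficient is about 0.49 kW/(m³·s), in line with the approximate 0.5 in Eq. (5).

```python
import math

SEAWATER_DENSITY = 1025.0  # kg/m^3, assumed typical value (not from the chapter)
GRAVITY = 9.81             # m/s^2, assumed value (not from the chapter)


def wave_power_kw_per_m(wave_height_m, energy_period_s,
                        rho=SEAWATER_DENSITY, g=GRAVITY):
    """Eq. (5): wave power per unit crest length, returned in kW/m."""
    watts_per_m = rho * g ** 2 / (64.0 * math.pi) * wave_height_m ** 2 * energy_period_s
    return watts_per_m / 1000.0


# Illustrative call with a 2 m significant wave height and an 8 s energy period.
example_power = wave_power_kw_per_m(2.0, 8.0)
```

Power grows with the square of the wave height and linearly with the energy period, so doubling the wave height quadruples the result. The values returned by this sketch are not expected to match the magnitudes reported in the chapter, which may rest on different constants or scaling.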

For the Indian scenario, the power potentials of the five locations were found to be 12385.224 kW, 8157.452 kW, 10186.215 kW, 7702.158 kW, and 3506.674 kW. The model output values for locations 1–5 were found to be 0.011777, 0.007245, 0.009758, 0.007801, and 0.002619, respectively. The power potential and the model value were found to be consistent with each other. Figure 8 plots the model output for each location against the normalized power potential, that is, each location's power potential divided by the sum over the five locations. The normalized values were 0.295324188, 0.194513469, 0.242889081, 0.183657038, and 0.083616223 for locations 1–5, respectively.
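The normalized values quoted above are consistent with dividing each location's power potential by the sum over all five locations. The short sketch below shows that arithmetic with the reported power potentials; the function name is hypothetical, and the sum-based normalization is inferred from the numbers rather than stated in the chapter.

```python
def normalize_by_sum(values):
    """Scale values so that they add up to 1."""
    total = sum(values)
    return [v / total for v in values]


# Power potentials reported for locations 1-5 (kW).
power_potential = [12385.224, 8157.452, 10186.215, 7702.158, 3506.674]
normalized_power = normalize_by_sum(power_potential)
```

The resulting list agrees with the reported normalized values to about six decimal places, and the ordering of locations by normalized power matches the ranking of the model output values.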

#### 5. Conclusion

The present study attempts to predict the wave energy potential of different coastal regions with the help of the four most relevant factors. The study utilized fuzzy MCDM and GMDH models to develop a framework to predict the wave energy potential. In total, four factors were identified from the literature survey as the most important for the calculation of wave energy potential. Twelve different models were developed by varying the inputs within these 4 factors, with power potential as output. Data representing various scenarios were generated and used to train the models. The arc tangent function was used in six cases to transform the data of the input, the output, or both. Performance metrics like RMSE, MAE, PBIAS, and R were used to find the equivalent performance of the models. The model with all the factors as input was found to be more efficient than the other 11 models. The accuracy of the model was found to be above 99.99%. The power potential of five different locations on the Indian coastal belt was used as a case study. The model output and the result from the power potential equation were compared and found to be coherent with each other, although the magnitudes of the results differ considerably.

#### Nomenclature

ANN: artificial neural network

MCDM: multi-criteria decision-making

FLDM: fuzzy logic decision-making method

GMDH: group method of data handling

NSE: Nash-Sutcliffe model efficiency coefficient

RSR: RMSE-observation standard deviation ratio

PBIAS: percent bias

R: correlation

PI: performance index


### Author details

Soumya Ghosh\* and Mrinmoy Majumder

School of Hydro-Informatics Engineering, National Institute of Technology Agartala, Tripura, India

\*Address all correspondence to: soumyaee@gmail.com

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### References

[1] Gunn K, Stock-Williams C. Quantifying the global wave power resource. Renewable Energy. 2012;44:296-304

[2] Henfridsson U, Neimane V, Strand K, Kapper R, Bernhoff H, Danielsson O, et al. Wave energy potential in the Baltic Sea and the Danish part of the North Sea, with reflections on the Skagerrak. Journal of Renewable Energy. 2007;32(12):2069-2084

[3] Mackay EB, Bahaj AS, Challenor PG. Uncertainty in wave energy resource assessment. Part 2: Variability and predictability. Journal of Renewable Energy. 2010;35(8):1809-1819

[4] Xie WT, Dai YJ, Wang RZ, Sumathy K. Concentrated solar energy applications using Fresnel lenses: A review. Journal of Renewable and Sustainable Energy Reviews. 2011;15(6):2588-2606

[5] Iglesias LM, Carballo R, Castro A, Fraguela JA, Frigaard P. Wave energy potential in Galicia (NW Spain). Journal of Renewable Energy. 2009;34:2323-2333

[6] Abanades J, Greaves D, Iglesias G. Wave farm impact on the beach profile: A case study. Journal of Coastal Engineering. 2014;86:36-44

[7] Azzellino A, Conley D, Vicinanza D, Kofoed JP. Marine renewable energies: Perspectives and implications for marine ecosystems. The Scientific World Journal. 2013;2013:1-3

[8] Veigas M, Ramos V, Iglesias GA. Wave farm for an island: Detailed effects on the nearshore wave climate. Journal of Energy. 2014;69:801-812

[9] Garg P. Energy scenario and vision 2020 in India. Journal of Sustainable Energy and Environment. Aug 2012;3(1):7-17

[10] Zimmermann HJ. Fuzzy Set Theory and its Applications. 2nd ed. Boston, Dordrecht, London: Kluwer Academic Publishers; 1991

performance. Journal of Climate Research. 2005;30(1):79-82

DOI: http://dx.doi.org/10.5772/intechopen.84676

Prediction of Wave Energy Potential in India: A Fuzzy-ANN Approach

[18] Pascual-González J, Guillén-Gosálbez G, Mateo-Sanz JM, Jiménez-Esteller L. Statistical analysis of the EcoInvent database to uncover

Production. 2016;112:359-368

relationships between life cycle impact assessment metrics. Journal of Cleaner

[19] Noori N, Kali L. Coupling SWAT and ANN models for enhanced daily stream flow prediction. Journal of Hydrology. 2016;533:141-151

[20] Ghosh S, Chakraborty T, Saha S, Majumder M, Pal M. Development of the location suitability index for wave energy production by ANN and MCDM techniques. Renewable and Sustainable Energy Reviews. 2016;59:1017-1028

103

[11] Sevkli M. An application of the fuzzy ELECTRE method for supplier selection. Journal of International Journal of Production Research. 2010; 48(12):3393-3405

[12] Franco C, Bojesen M, Leth Hougaard J, Nielsen K. A fuzzy approach to a multiple criteria and geographical information system for decision support on suitable locations for biogas plants. Journal of Applied Energy. 2015;140:304-315

[13] Anastasakis L, Mort N. The development of self-organization techniques in modelling: A review of the group method of data handling (GMDH). Technical Report. University of Sheffield, Department of Automatic Control and Systems Engineering; 2001

[14] Hwang HS. Fuzzy GMDH-type neural network model and its application to forecasting of mobile communication. Journal of Computers and Industrial Engineering. 2006;50(4): 450-457

[15] Kalantary F, Ardalan H, Nariman-Zadeh N. An investigation on the Su–N SPT correlation using GMDH type neural networks and genetic algorithms. Journal of Engineering Geology. 2009; 109(1):144-155

[16] de Antonio FO. Wave energy utilization: A review of the technologies. Journal of Renewable and Sustainable Energy Reviews. 2010;14(3):899-918

[17] Willmott CJ, Matsuura K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model

Prediction of Wave Energy Potential in India: A Fuzzy-ANN Approach DOI: http://dx.doi.org/10.5772/intechopen.84676

performance. Journal of Climate Research. 2005;30(1):79-82

References

296-304

32(12):2069-2084

2588-2606

(1):7-17

102

[1] Gunn K, Stock-Williams C. Quantifying the global wave power resource. Renewable Energy. 2012;44:

[2] Henfridsson U, Neimane V, Strand K, Kapper R, Bernhoff H, Danielsson O, et al. Wave energy potential in the Baltic Sea and the Danish part of the North Sea, with reflections on the Skagerrak. Journal of Renewable Energy. 2007;

Recent Trends in Artificial Neural Networks - From Training to Prediction

[10] Zimmermann HJ. Fuzzy Set Theory and its Applications. 2nd ed. Boston, Dordrecht, London: Kluwer Academic

[11] Sevkli M. An application of the fuzzy ELECTRE method for supplier selection. Journal of International Journal of Production Research. 2010;

[12] Franco C, Bojesen M, Leth Hougaard J, Nielsen K. A fuzzy approach to a multiple criteria and geographical information system for decision support on suitable locations for biogas plants. Journal of Applied

Energy. 2015;140:304-315

[13] Anastasakis L, Mort N. The development of self-organization techniques in modelling: A review of the

group method of data handling

(GMDH). Technical Report. University of Sheffield, Department of Automatic Control and Systems Engineering; 2001

[14] Hwang HS. Fuzzy GMDH-type neural network model and its application to forecasting of mobile communication. Journal of Computers and Industrial Engineering. 2006;50(4):

[15] Kalantary F, Ardalan H, Nariman-Zadeh N. An investigation on the Su–N SPT correlation using GMDH type neural networks and genetic algorithms. Journal of Engineering Geology. 2009;

[16] de Antonio FO. Wave energy utilization: A review of the technologies. Journal of Renewable and Sustainable Energy Reviews. 2010;14(3):899-918

[17] Willmott CJ, Matsuura K.

Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model

450-457

109(1):144-155

Publishers; 1991

48(12):3393-3405

[3] Mackay EB, Bahaj AS, Challenor PG. Uncertainty in wave energy resource assessment. Part 2: Variability and predictability. Journal of Renewable Energy. 2010;35(8):1809-1819

[4] Xie WT, Dai YJ, Wang RZ, Sumathy

[5] Iglesias LM, Carballo R, Castro A, Fraguela JA, Frigaard P. Wave energy potential in Galicia (NW Spain). Journal of Renewable Energy. 2009;34:2323-2333

[6] Abanades J, Greaves D, Iglesias G. Wave farm impact on the beach profile:

[7] Azzellino A, Conley D, Vicinanza D, Kofoed JP. Marine renewable energies: Perspectives and implications for marine ecosystems. The Scientific World Journal. 2013;2013:1-3

[8] Veigas M, Ramos V, Iglesias GA. Wave farm for an island: Detailed effects on the nearshore wave climate. Journal of Energy. 2014;69:801-812

[9] Garg P. Energy scenario and vision 2020 in India. Journal of Sustainable Energy and Environment. Aug 2012;3

A case study. Journal of Coastal Engineering. 2014;86:36-44

[18] Pascual-González J, Guillén-Gosálbez G, Mateo-Sanz JM, Jiménez-Esteller L. Statistical analysis of the EcoInvent database to uncover relationships between life cycle impact assessment metrics. Journal of Cleaner Production. 2016;112:359-368

[19] Noori N, Kali L. Coupling SWAT and ANN models for enhanced daily stream flow prediction. Journal of Hydrology. 2016;533:141-151

[20] Ghosh S, Chakraborty T, Saha S, Majumder M, Pal M. Development of the location suitability index for wave energy production by ANN and MCDM techniques. Renewable and Sustainable Energy Reviews. 2016;59:1017-1028


#### **Chapter 7**

## Deep Learning Training and Benchmarks for Earth Observation Images: Data Sets, Features, and Procedures

*Mihai Datcu, Gottfried Schwarz and Corneliu Octavian Dumitru*

### **Abstract**

Deep learning methods are often used for image classification or local object segmentation. The corresponding test and validation data sets are an integral part of the learning process and also of the algorithm performance evaluation. High and particularly very high-resolution Earth observation (EO) applications based on satellite images primarily aim at the semantic labeling of land cover structures or objects as well as of temporal evolution classes. However, one of the main EO objectives is the retrieval of physical parameters such as temperatures, precipitation, and crop yield predictions. Therefore, we need reliably labeled data sets and tools to train the developed algorithms and to assess the performance of our deep learning paradigms. Generally, imaging sensors generate a visually understandable representation of the observed scene. However, this does not hold for many EO images, where the recorded images only depict a spectral subset of the scattered light field, thus generating an indirect signature of the imaged object. This makes EO image understanding a new and particular challenge for Machine Learning (ML) and Artificial Intelligence (AI). This chapter reviews and analyzes the new approaches of EO imaging leveraging the recent advances in physical process-based ML and AI methods and signal processing.

**Keywords:** Earth observation, synthetic aperture radar, multispectral, machine learning, deep learning

#### **1. Introduction**

This chapter introduces the basic properties, features, and models for very specific Earth observation (EO) cases recorded by very high-resolution (VHR) multispectral, Synthetic Aperture Radar (SAR), and multi-temporal observations. Further, we describe and discuss procedures and machine learning-based tools to generate large semantic training and benchmarking data sets. The particularities of relative data set biases and cross-data set generalization are reviewed, and an algorithmic analysis frame is introduced. Finally, we review and analyze several examples of EO benchmarking data sets.

In the following, we describe what has to be taken into account when we want to benchmark the classification results of satellite images, in particular the classification capabilities, throughputs, and accuracies offered by modern machine learning and artificial intelligence approaches.

Our underlying goal is the identification and understanding of the semantic content of satellite images and their application-oriented interpretation from a user perspective. In order to determine the actual performance of automated image classification routines, we need to find and select test data and to analyze the performance of our classification and interpretation routines in an automated environment.

A particular point to be understood is what type of data exists for remote sensing images that we want to classify. We are faced with long processing chains for the scientific analysis of image data, starting with uncalibrated "raw" sensor data, followed by dedicated calibration steps, subsequent feature extraction, object identification and annotation, and ending with quantitative scientific research and findings about the processes and effects being monitored in the geophysical environment of our planet with respect to climate change, disaster risks, crop yield predictions, etc.

In addition, we have to mention that free and open-access satellite products have revolutionized the role of remote sensing in Earth system studies. In our case, the data being used are based on multispectral (i.e., multi-color) sensors such as Landsat with 7 bands, Sentinel-2 [4] with 13 bands, Sentinel-3 with 21 bands, and MODIS with 36 bands, but also on SAR sensors such as Sentinel-1 [6], TerraSAR-X [26], or RADARSAT. For a better understanding of their imaging potential, we will describe the most important parameters of these images. For multispectral sensors, there exist several well-known and publicly available land cover benchmarking data sets comprising typical remote sensing image patches, while comparable SAR benchmarking data sets are very scarce and dedicated.

The main aspects being treated are:

• ML paradigms to support the semantic annotation of very large data sets, that is, using hybrid methods integrating Support Vector Machines (SVMs), Bayesian, and Deep Neural Networks (DNNs) algorithms in active learning paradigms by using initially small and controllable training data sets, and progressively growing the volume of labeled data by transfer learning.

• Proposing solutions to the semantic aspects of the spatial annotations for different sensor resolutions and spatial scales.

• Discussing the implications of the sensory and semantic gaps.

In this chapter, we assume that we can rely on already processed data with sufficient calibration accuracy and accurate annotation allowing us to understand all imaging parameters and their accuracy. We also assume that we can profit from reliably documented image data and that we can continue with data analytics for image understanding and high-level interpretation without any further precautions.

The latter steps have to be organized systematically in order to guarantee reliable results. A common strategy is to split these tasks into three phases: first, basic software functionality testing; second, training and optimization of the software parameters by means of selected reference data; and finally, benchmarking of the overall software functionality, such as processing speed and attainable results. This systematic approach leads to quantifiable and comparable results as described in the following sections.

*Deep Learning Training and Benchmarks for Earth Observation Images: Data Sets, Features… DOI: http://dx.doi.org/10.5772/intechopen.90910*

In recent years, the field of deep learning has expanded explosively in many domains, predominantly in computer vision, speech recognition, and text analysis. For example, during 2019, more than 500 articles per month were published in the field of deep learning. Thus, reports on the state of the art can hardly keep pace with this development. In Ref. [1], published in January 2019, more than 330 references were analyzed reviewing the theoretical and architectural aspects for Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), including Long Short-Term Memories (LSTMs) and Gated Recurrent Units (GRUs), Auto-Encoders (AEs), Deep Belief Networks (DBNs), Generative Adversarial Networks (GANs), and Deep Reinforcement Learning (DRL). The review paper [1] also summarizes 20 deep learning frameworks, two standard development kits, and 49 benchmark data sets in all domains, of which three are dedicated to hyperspectral remote sensing. In addition, Ball et al. [2] describe the landscape of deep learning from all perspectives (theory, tools, applications, and challenges) as of 2017. This article analyzes 419 references. A more recent overview from April 2019 [3] summarizes more than 170 references reporting on applications of deep learning in remote sensing.

#### **2. Remote sensing images**

Typical remote sensing images acquired by aircraft or satellite platforms can be characterized based on the operational capabilities of these platforms (such as their flight path, their capabilities for instrument pointing, and the on-board data storage and data downlink capacities), the type of instruments and their sensors (such as optical images with distinctive spectral bands [4, 5] or radar images such as synthetic aperture radars [6]), and opportunities for the repetitive acquisition of geographically overlapping image time series (for instance, for vegetation monitoring to predict optimal crop harvesting dates).

Current imaging instruments can provide raw data with more than eight bits per sample, can perform initial data processing and annotation already on board, and can downlink compressed data with error-correcting codes. After downlinking the image data to ground stations, the received data will be stored and processed by dedicated computing facilities. A common remote sensing strategy is to perform a systematic level-by-level processing (generating so-called products that comprise image data together with metadata documenting relevant image acquisition and processing parameters).

A common approach is to follow a unified concept in which Level-0 products contain unprocessed but re-ordered detector data, Level-1 data represent radiometrically calibrated intensity images, and Level-2 data are geometrically corrected and map-projected. Level-3 data are higher level products such as semantic maps or overlapping time-series data. In general, users have access to different product levels and can select and download products from databases via image catalogs and so-called quick-look (also called thumbnail) images.
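The level concept described above can be captured in a small lookup structure. The following Python sketch is illustrative only: the names `ProductLevel`, `DESCRIPTIONS`, and `describe` are hypothetical, and the descriptions paraphrase the text rather than any agency's formal product specification.

```python
from enum import IntEnum


class ProductLevel(IntEnum):
    """Processing levels as named in the text (hypothetical identifiers)."""
    LEVEL_0 = 0
    LEVEL_1 = 1
    LEVEL_2 = 2
    LEVEL_3 = 3


# Short descriptions paraphrased from the paragraph above.
DESCRIPTIONS = {
    ProductLevel.LEVEL_0: "unprocessed but re-ordered detector data",
    ProductLevel.LEVEL_1: "radiometrically calibrated intensity images",
    ProductLevel.LEVEL_2: "geometrically corrected and map-projected data",
    ProductLevel.LEVEL_3: "higher level products (semantic maps, time series)",
}


def describe(level: ProductLevel) -> str:
    """Return a one-line description of a product level."""
    return f"Level-{int(level)}: {DESCRIPTIONS[level]}"


# List all levels in processing order.
for level in ProductLevel:
    print(describe(level))
```

Using an ordered enumeration keeps the levels comparable, so a processing chain could, for instance, check that its input is at least Level-1 before attempting classification.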

Some additional products have to be generated interactively by the users. Typical examples are image content classifications and trend analyses following mathematical approaches. Today, these interactive steps migrate from purely interactive and simple tools to commonly accepted machine learning tools. At the moment, the majority of machine learning tools use "deep" learning approaches; here, the problem is decomposed into several layers to find a good representation of image content categories [7]. These aspects will be dealt with in more detail in Section 4.

What we have to outline first are some important parameters of remote sensing images. One critical point of typical remote sensing images is their enormous size, calling for big data environments with powerful processors and large data stores. A second important point is the geometric and radiometric resolution of the image pixels, resulting in different target types that can be identified and discriminated during classification. While the typical pixel-to-pixel spacing of air-borne cameras corresponds to centimeters on the ground, the pixel spacing of high-resolution space-borne instruments flown on low polar orbits mostly lies in the range of half a meter to a few meters. In contrast, imaging from more distant geostationary or geosynchronous orbits results in low-resolution images. As for the number of brightness levels of each pixel, modern cameras often provide more than eight bits of resolution. **Table 1** shows some typical parameters of current satellites with imaging instruments.
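To make the size argument concrete, the following Python sketch estimates the uncompressed data volume of a single image. It assumes the typical image size of 10⁴ × 10⁴ pixels listed in Table 1, 12-bit samples stored in two bytes, and the 13 bands quoted earlier for Sentinel-2. The function name `raw_image_bytes` is hypothetical, and real products differ in tile size and per-band resolution.

```python
import math


def raw_image_bytes(lines: int, columns: int, bands: int, bits_per_sample: int) -> int:
    """Uncompressed size in bytes, with each sample padded to whole bytes."""
    bytes_per_sample = math.ceil(bits_per_sample / 8)
    return lines * columns * bands * bytes_per_sample


# Illustrative values: 10^4 x 10^4 pixels, 13 bands, 12-bit samples (2 bytes each).
size = raw_image_bytes(10_000, 10_000, bands=13, bits_per_sample=12)
print(f"{size / 1e9:.1f} GB")  # 2.6 GB
```

Even under these simplified assumptions, a single multispectral scene occupies gigabytes, so time series of many scenes quickly reach the scale that calls for big data environments.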

Further, the pixels of an image can be complemented by additional information obtained by feature extraction and automated object identification (used as image content descriptors) as well as publicly available information from auxiliary external databases (e.g., geographical references or geophysical parameters). These data allow the provision of accurate quantitative results in physical units; however, one has to be aware of the fact that while many phenomena become visible, some internal relationships may remain invisible without dedicated additional investigations. **Table 1** shows some typical parameters of current satellite images.

In addition to the standard image products as described above, any additional automated or interactive analysis and interpretation of remote sensing images calls for intelligent strategies to quickly select distinct and representative images, to generate image time series, to extract features, to identify objects, to recognize hitherto hidden relationships and correlations, to exploit statistical descriptive models describing additional relationships, and to apply techniques for the annotation and visualization of global/local image properties (that have to be stored and administered in databases).

While typical traditional image content analysis tools either use full images, sequences of small image patches, collections of mid-size image segments or countless individual pixels together with routines from already established toolboxes (e.g., Orfeo [9]), or advanced machine learning approaches exploiting innovative machine learning strategies, as for instance, transfer learning [8] or the use of adversarial networks [10]. However, any use of advanced approaches requires the


**109**

*Deep Learning Training and Benchmarks for Earth Observation Images: Data Sets, Features…*

preparation and conduction of tests that allow a benchmarking of the new software routines, notably methods and tools to generate and analyze data for testing, training, verification, and final benchmarking. These testing activities have to be

Feature extraction and classification (edges, corners, ridges, texture, color, interest points, shapes)

As can be seen from **Table 2**, there exist already quite a number of traditional image content analysis tools. Some of them generate pre-processed images for subsequent analysis by human image interpreters, while others allow the identification and extraction of objects. However, these tools do not yet exploit the most recent

Analysis of pixel statistics and use of computer vision algorithms (e.g., histograms of gradients, local binary

**3. Machine learning, artificial intelligence, and data science for remote** 

Currently, we see a lot of public interest in machine learning (ML), artificial intelligence (AI), and data science (DS). We have to make sure what we mean by

• ML is often used if we describe technical developments where a computer

system is trained and used to find and classify objects in data sets. A prominent example is the identification and interpretation of traffic signs for automated driving, typically use cases where a computer system is coupled with a camera and other sensors, and the traffic signs have to be recognized independent of different illumination and weather conditions, a vast range of potential driving speeds, varying distances and perspectives, other cars moving within the field of view of the camera, supplementary information provided by text panels or adjacent traffic signs, and constraints to be observed such as the maximum reasonable processing time. In essence, we can consider these applications as a reduction of many image pixels into single features (from a given list of cases and options) or a combination of features (e.g., max speed of 30 mph except on weekends). In most cases, the ML software is tested and trained by many

• AI combines the full functionality of ML with additional decision-making and reaction capabilities. This additional decision-making can be implemented by continuous understanding of the current overall situation, the extraction of reactions from given rule sets (supported by continuously updated

*DOI: http://dx.doi.org/10.5772/intechopen.90910*

Clipping of outliers and de-noising Color coding of brightness levels

Histogram manipulation (e.g., stretching) Normalization and contrast enhancement

Transformations and filtering of coefficients

patterns, speeded-up robust features)

Box-car filtering (e.g., high-pass filtering, smoothing)

Extraction of content-oriented regions and objects

*Typical capabilities of traditional image content analysis tools.*

supported by efficient visualization tools.

automated machine learning techniques.

typical examples as well as counterexamples.

**sensing**

**Table 2.**

these buzzwords:

#### **Table 1.**

*Typical imaging parameters of current satellites.*

*Deep Learning Training and Benchmarks for Earth Observation Images: Data Sets, Features… DOI: http://dx.doi.org/10.5772/intechopen.90910*


#### **Table 2.**

*Recent Trends in Artificial Neural Networks - From Training to Prediction*

calling for big data environments with powerful processors and large data stores. A second important point is the geometrical and radiometrical resolution of the image pixels, resulting in different target types that can be identified and discriminated during classification. While the typical pixel-to-pixel spacing of air-borne cameras corresponds to centimeters on the ground, space-borne instruments with high resolution flown on low polar orbits mostly lie in the range of half a meter to a few meters. In contrast, imaging from more distant geostationary or geosynchronous orbits results in low-resolution images. As for the number of brightness levels of each pixel, modern cameras often provide more than eight bits of resolution. **Table 1** shows some typical parameters of current satellites with imaging

Further, the pixels of an image can be complemented by additional information obtained by feature extraction and automated object identification (used as image content descriptors) as well as publicly available information from auxiliary external databases (e.g., geographical references or geophysical parameters). These data allow the provision of accurate quantitative results in physical units; however, one has to be aware of the fact that while many phenomena become visible, some internal relationships may remain invisible without dedicated additional investiga-

In addition to the standard image products as described above, any additional automated or interactive analysis and interpretation of remote sensing images calls for intelligent strategies how to quickly select distinct and representative images, how to generate image time series, to extract features, to identify objects, to recognize hitherto hidden relationships and correlations, to exploit statistical descriptive models describing additional relationships, and to apply techniques for the annotation and visualization of global/local image properties (that have to be stored and

While typical traditional image content analysis tools either use full images, sequences of small image patches, collections of mid-size image segments or countless individual pixels together with routines from already established toolboxes (e.g., Orfeo [9]), or advanced machine learning approaches exploiting innovative machine learning strategies, as for instance, transfer learning [8] or the use of adversarial networks [10]. However, any use of advanced approaches requires the

pixels 104

Sub-meter to tens of m Meters to several meters

Number of overlapping bands Viewing/incidence angle polar or geo. orbit

**SAR instruments**

 × 104 pixels

C-band, X-band, L-band, etc.

V and H polarization scan modes interferometry

(amplitudes or intensities)

tions. **Table 1** shows some typical parameters of current satellite images.

**Optical cameras and spectrometers**

bands

views fusion of bands

Target areas (typ.) Land, ocean, ice, atmosphere Land, ocean, ice

Pixel types (typ.) Detector counts reflectances Complex-valued "detected data"

104 × 104

Bands (typ.) 300 to 1000 nm and infrared

Special modes (typ.) Dynamical targeting stereo

**108**

instruments.

administered in databases).

**High-resolution imaging instruments**

Spatial resolution

Important parameters

*Typical imaging parameters of current satellites.*

columns)

(typ.)

(typ.)

**Table 1.**

Image size (typ. lines ×

*Typical capabilities of traditional image content analysis tools.*

preparation and conduction of tests that allow a benchmarking of the new software routines, notably methods and tools to generate and analyze data for testing, training, verification, and final benchmarking. These testing activities have to be supported by efficient visualization tools.

As can be seen from **Table 2**, there exist already quite a number of traditional image content analysis tools. Some of them generate pre-processed images for subsequent analysis by human image interpreters, while others allow the identification and extraction of objects. However, these tools do not yet exploit the most recent automated machine learning techniques.

#### **3. Machine learning, artificial intelligence, and data science for remote sensing**

Currently, we see a lot of public interest in machine learning (ML), artificial intelligence (AI), and data science (DS). We have to be clear about what we mean by these buzzwords:


parameters), and the handling of unexpected emergency cases. In the case of autonomous driving, one can think of a lane change on a motorway after a reason for a lane change has been found, and from a number of alternative reactions, when a lane change appears as the best reaction. Then the current situation has to be checked when a lane change becomes possible, and a sequence of subsequent actions is executed.

• DS as a scientific and technical discipline of its own shall provide all guiding principles that are needed from end-to-end system design up to data analytics and image understanding—including the system layout and verification, the selection of components and tools, the implementation and installation of the components and their verification, and the benchmarking of the full functionality. In the case of remote sensing applications, we also have to include all aspects of sensor calibration, comparisons with the findings of other researchers via Internet, and traceable scientific data interpretation.

As our applications are mostly use cases dealing with remote sensing images, we can limit ourselves to the main ML paradigms that support the semantic annotation of very large data sets. Based on the current state-of-the-art developments, we consider that there are three fundamental and internationally accepted image classification approaches that are currently important for remote sensing applications, and two additional learning principles useful for satellite images:


• *Bayesian networks*: a Bayesian network consists of a probabilistic graphical model representing a set of variables together with their conditional dependencies. It can be used for parameter learning and is based on traditional formulas derived by Bayes [11].

• *Support Vector Machines* (SVMs): SVMs support classification and regression tasks by identifying basic support points that are used to define a robust separation plane between all sample points. In general, the resulting separation surface has nonlinear characteristics. In order to obtain a separation plane with linear characteristics, the sample points are mapped into a higher-dimensional space in which they can be separated linearly. This mapping exploits so-called kernel functions [12]. A well-known SVM software package is [13], which also explains how to train and verify a new SVM.

• *Neural Networks*: neural networks follow the concept of biological neurons that trigger a positive response if the input signal corresponds to a known object. Thus, technical implementations mostly consist of three levels, namely a visible input layer followed by an internal processing layer that is not visible to the user (in principle, an artificial neural network), and a visible output layer. An extension of general neural networks are deep neural networks; here, the processing layer is split into several linked internal sublayers that allow a more detailed analysis of the input data (e.g., on selected scales). The internal network parameters (i.e., the filter coefficients) are derived ("learned") by means of typical (and atypical) image samples and manual labeling by users [11].

• *Active learning*: this learning strategy combines automated learning with interactive steps involving the user during important decisions. This can be accomplished by a visualization interface where a user can select or deselect image patches that do or do not belong to a specific target class. For further details, see [14].

• *Transfer learning*: the idea of transfer learning is to train a network for a given task and then to exploit or "translate" the resulting network parameters to another use case. A typical example cited in [8] is that knowledge gained while learning to recognize cars in images is applied when trying to recognize trucks.

One of the most critical points for satellite image classification is the dependence of the classification results on the resolution (pixel spacing) of the images. Experience gained by many authors demonstrates that the identified classes and their local assignment within image patches are strongly resolution-dependent, as higher resolution will often lead to a higher number of visible and identified semantic categories. Thus, the performance of any semantic interpretation of images must be considered as a data-dependent metric: this potential difficulty should prevent us from blind direct performance comparisons.

A similar point to be mentioned is the risk of sensory and semantic gaps encountered during image classification. Sensory gaps result from cases where a sensing instrument cannot measure the full range of potential cases with all their physical effects and details that could exist in a real-world scene and that we cannot record and identify with uniform confidence. A similar potential pitfall for image understanding can result from semantic gaps. For instance, during interactive labeling by test persons, different people could assign different categories to image patches due to their educational background, professional experiences, etc. For further details, see [15].

The number of available approaches, algorithms, and tools is growing continuously. Some examples, such as Caffe [16], TensorFlow [17], and PyTorch [18], have become very widespread in academia. In contrast to these established solutions, a large number of fresh publications are submitted every day. As an example, the ArXiv preprint repository [19] collects in its "computer science" and "statistics" directories hundreds of new machine learning papers per day.

#### **4. Networks for deep learning**

Many experiments with image classification systems have shown that traditional single-level ("shallow") algorithms perform worse than multi-level ("deep") concepts where distinct filtering operations are applied on each level, and the results of the previous levels can be used on each deeper level; the final result is obtained by combining the specific results of each separate level. The reason for the better performance of multi-level algorithms is that one can apply distinct filters specifically tailored to each level. Typical examples are multi-resolution filters that detect image characteristics on several scales: when we look at satellite images of urban settlements, a business district normally has larger high-rise buildings and broader streets than a residential suburb with interspersed low-rise buildings and individual gardens.

From a high-level perspective, we can say that learning works best with deep learning approaches exploiting dedicated "network" structures. Here, we understand networks as design structures of the data flows and the arrangement of pixel handling steps governing the processing of our images. This concept also supports more intricate label assignment concepts such as primary labels defining the main category of an image patch supplemented by secondary labels that provide additional information about "mixed classes" or supplementary spatial details of a given image patch.
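The multi-level filtering idea described at the start of this section can be illustrated with a small sketch. It is not the authors' implementation: the function names (`downsample`, `edge_strength`, `multiscale_features`) and the use of a mean horizontal gradient as the per-level filter are assumptions chosen only for illustration. The sketch halves the image resolution from level to level and applies the same simple filter on each level, so that fine and coarse structures respond on different levels.

```python
from typing import List

Image = List[List[float]]


def downsample(img: Image) -> Image:
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def edge_strength(img: Image) -> float:
    """Mean absolute difference between horizontally adjacent pixels."""
    diffs = [abs(row[c + 1] - row[c]) for row in img for c in range(len(row) - 1)]
    return sum(diffs) / len(diffs) if diffs else 0.0


def multiscale_features(img: Image, levels: int = 3) -> List[float]:
    """Apply the same filter on each level of a resolution pyramid."""
    features = []
    current = img
    for _ in range(levels):
        features.append(edge_strength(current))
        if len(current) < 2 or len(current[0]) < 2:
            break  # the image cannot be halved any further
        current = downsample(current)
    return features


# Hypothetical 8x8 test patterns: one-pixel stripes versus four-pixel blocks.
fine = [[float(c % 2) for c in range(8)] for _ in range(8)]
coarse = [[float((c // 4) % 2) for c in range(8)] for _ in range(8)]
print(multiscale_features(fine))    # strongest response on level 0
print(multiscale_features(coarse))  # strongest response on level 2
```

The fine stripes vanish after the first averaging step, while the coarse blocks only become prominent on the coarsest level; a deep network learns such level-specific filters instead of having them fixed by hand.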

In the meantime, some types of networks have emerged that have proven their robustness in the case of satellite images to be annotated semantically. In the following, we list four types of networks that have proven their usefulness for satellite image interpretation:


• *Deep Neural Networks* (DNNs): as described in [20], these networks consist of several layers and comprise an input layer, an output layer, and at least one hidden layer in between. Each layer performs dedicated pixel processing. The corresponding training phase can be understood as deep learning.

• *Recursive Neural Networks* (not to be confused with recurrent neural networks; both network types appear as RNNs): when we have structured input data, these data can be efficiently handled by recursive neural networks that are often used for speech processing and understanding. Recursive neural networks can also be used for natural scenes such as images containing recursive structures [23]. RNN algorithms identify the units that an image contains and how the units interact. Thus, one can use RNNs for semantic scene segmentation and annotation.

• *Convolutional Neural Networks* (CNNs): these networks have been conceived for low-error classification of big images with a very large number of classes. As described by [21], one can classify more than a million images and assign more than 1000 different classes. This is accomplished internally by five convolutional layers, three fully connected layers, and millions of internal parameters. To reduce overfitting, the method applies regularization by randomly disregarding internal elements during training ("dropout method").

• *Generative Adversarial Networks* (GANs): an adversarial network allows the mutual training of two competing multilayer perceptron models *G* and *D* following an adversarial process: *G* captures the data distribution, while *D* estimates the probability that a sample comes from the training data rather than from *G*. In addition, *D* maps the high-dimensional input data to semantic category labels. For further details, see [24].

Besides the network types listed above, we also need an overall algorithmic architecture embedding the networks. For our applications, a "U" approach has proven to be a useful concept for satellite image content analysis. A "U" approach contains a descending branch followed by an ascending branch and is conceived for handling a progressively shrinking number of elements until a final core element (a main category) is found, followed by stepwise complementary semantic information. Further details can be found in [21].

In our experience, most general remote sensing applications can be solved efficiently by CNNs or similar approaches. However, quite a number of innovative alternatives have been proposed during the last years, for example, common auto-encoders, recursive approaches for time series, and adversarial networks for fast learning with only a few examples. In our case, we suggest using CNNs for non-critical satellite image applications, while highly complicated or time-critical applications could call for innovative approaches as already described above.

#### **5. Training and benchmarking**

When we train a classification network and verify its performance, the main goal is to train the system for correct category assignments or semantic annotations (labels), that is, to add supplementary information to each satellite image patch that we analyze.

The semantic annotations can either be learned in a preparatory phase or be taken from catalogs of already existing categories. If we aim at long-term analyses of satellite images, a good approach is to use the same catalogs during the entire lifetime of the analysis or to re-run the entire system with updated catalogs.

The easiest approach is to select typical examples for each category and to assign the given labels to all new image data. However, if we follow this straightforward approach, we will probably encounter some difficulties when image patches with unexpected content arrive. A first remedy is to add an additional "unknown" category and to assign this label to all image patches that do not fit well to one of the given categories. Further, experience with machine learning systems has shown that good classification results can also be reached when we systematically select positive as well as negative examples (i.e., counterexamples) for each category leading to a comprehensive coverage and understanding of each category. This process can be accomplished manually by knowledgeable operators (i.e., image interpretation experts) [22]. Another approach is data augmentation: If we do not have sufficient examples of a necessary category, one can create additional realistic data by simply flipping or rotating already available images.
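The flipping and rotating mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: an image patch is represented as a nested list of pixel values, and the helper names (`flip_horizontal`, `rotate90`, `augment`) are hypothetical. The sketch generates all distinct combinations of 90-degree rotations and mirror flips of a patch.

```python
from typing import List

Patch = List[List[int]]


def flip_horizontal(patch: Patch) -> Patch:
    """Mirror the patch left to right."""
    return [row[::-1] for row in patch]


def rotate90(patch: Patch) -> Patch:
    """Rotate the patch clockwise by 90 degrees."""
    return [list(col) for col in zip(*patch[::-1])]


def augment(patch: Patch) -> List[Patch]:
    """Return the distinct rotated and flipped variants of a patch."""
    variants: List[Patch] = []
    current = [list(row) for row in patch]
    for _ in range(4):  # rotations by 0, 90, 180, and 270 degrees
        for candidate in (current, flip_horizontal(current)):
            if candidate not in variants:  # skip duplicates of symmetric patches
                variants.append(candidate)
        current = rotate90(current)
    return variants


patch = [[1, 2],
         [3, 4]]
for variant in augment(patch):
    print(variant)
```

All variants inherit the label of the original patch, so a single labeled example can yield up to eight training samples. Symmetric patches produce fewer distinct variants because duplicates are skipped.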

This simple example leads us to systematic methods for database creation. One has to find a comprehensive and fairly balanced set of examples that covers the expected total variety of cases. Thus, we avoid so-called database biases [23]. In addition, one has to make sure that the inclusion of additional examples does not lead to overfitting or excessive runtimes. This can be accomplished by setting up a validation testbed where these potential pitfalls can be tested and where the final performance of the created database structure can be verified. One has to be aware of the fact that database access times may strongly depend on the available computer systems, their interconnections, and the selected type of database.
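As a minimal sketch of how the balance of a labeled database could be checked and how a validation subset could be set aside, consider the following example. The function names (`class_balance`, `stratified_split`), the 20% validation fraction, and the class names are assumptions for illustration and are not taken from the chapter.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def class_balance(labels: Sequence[Hashable]) -> Tuple[Dict[Hashable, int], float]:
    """Return the per-class counts and the ratio of largest to smallest class."""
    counts = Counter(labels)
    return dict(counts), max(counts.values()) / min(counts.values())


def stratified_split(samples: Sequence, labels: Sequence[Hashable],
                     val_fraction: float = 0.2, seed: int = 0):
    """Split into training and validation sets, class by class.

    Each class contributes roughly val_fraction of its samples (at least one)
    to the validation set, so rare classes are not lost by chance.
    """
    rng = random.Random(seed)
    by_class: Dict[Hashable, List] = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    train, val = [], []
    for label in sorted(by_class, key=str):
        items = by_class[label]
        rng.shuffle(items)
        n_val = max(1, round(len(items) * val_fraction))
        val += [(s, label) for s in items[:n_val]]
        train += [(s, label) for s in items[n_val:]]
    return train, val


# Hypothetical database of 15 patch identifiers with two classes.
labels = ["water"] * 10 + ["urban"] * 5
samples = list(range(15))
counts, ratio = class_balance(labels)
train, val = stratified_split(samples, labels)
print(counts, ratio, len(train), len(val))
```

A large ratio hints at a potential database bias. The validation set is kept apart from training so that overfitting shows up as a gap between training and validation performance.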

These approaches led to a number of publicly available databases with label annotations for civilian remote sensing data. There are several semantically annotated databases based on optical (most often multispectral) data, while there are only a few databases based on SAR data. Some advanced remote sensing database examples are [25–27]. Of course, their general applicability and transferability depend on the actual image resolution, the imaging geometry, and the noise content of the images. Current state-of-the-art systems are being assessed based on end-to-end tests that also cover practical aspects such as the runtime depending on the database design and the selected test images, the amount and organization of available labels, the correctness of the obtained annotations, and the overall implementation and validation effort.
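Two of the criteria named above, the correctness of the obtained annotations and the runtime, can be measured with a small harness such as the following sketch. It rests on assumptions: `benchmark` and `mean_brightness_classifier` are hypothetical names, and the brightness threshold classifier is only a stand-in for a trained network.

```python
import time
from collections import Counter
from typing import Callable, Dict, List, Sequence

Patch = List[List[float]]


def mean_brightness_classifier(patch: Patch) -> str:
    """Toy stand-in for a trained classifier: threshold the mean pixel value."""
    pixels = [value for row in patch for value in row]
    return "bright" if sum(pixels) / len(pixels) > 0.5 else "dark"


def benchmark(classifier: Callable[[Patch], str],
              patches: Sequence[Patch],
              true_labels: Sequence[str]) -> Dict[str, object]:
    """Run the classifier on all patches and collect accuracy, confusions, runtime."""
    start = time.perf_counter()
    predicted = [classifier(p) for p in patches]
    runtime = time.perf_counter() - start

    # Keys are (true label, predicted label) pairs; values are their counts.
    confusion = Counter(zip(true_labels, predicted))
    correct = sum(n for (truth, pred), n in confusion.items() if truth == pred)
    return {
        "accuracy": correct / len(true_labels),
        "confusion": confusion,
        "runtime_s": runtime,
    }


patches = [[[0.9, 0.8]], [[0.1, 0.2]], [[0.7, 0.9]], [[0.2, 0.6]]]
true_labels = ["bright", "dark", "dark", "dark"]
report = benchmark(mean_brightness_classifier, patches, true_labels)
print(report["accuracy"], dict(report["confusion"]))
```

The confusion counts show which categories are mixed up, which is often more informative than a single accuracy value. The measured runtime depends on the hardware and on how the patches are retrieved from the database.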

#### **6. Perspectives**

As for remote sensing images, there already exist several semantically annotated collections of typical high-resolution satellite images: a number of collections of optical images and a few collections of SAR images. However, these collections often seem to be potpourris of interesting snapshots rather than systematically selected samples based on regionally typical target classes and their visibility as a function of different instrument types. The situation is aggravated by the current lack of systematically selected benchmarking data that could be used as well-known reference data for quality and performance assessments such as classification tasks or throughput testing.

These deficiencies have to be resolved in the near future as more and more high-resolution images become publicly available, while end-users already expect reliable automated image classification and content understanding results for more and more high-level applications. We can expect that the progress in deep learning will also lead to much progress in many other fields of image processing, even beyond the field of remote sensing; thus, the remote sensing community should be aware of what is published by the image processing and environmental protection communities at large.

#### **7. Conclusions**

While high-resolution imaging has made much progress for many remote sensing applications, standardized image classification benchmarking still deserves more progress. On the one hand, several benchmarking concepts and tools could still be gleaned from other disciplines; on the other hand, an optimal solution of test cases for SAR image interpretation still needs more progress in basic approaches of how to verify actual image classification results and the identification of dubious cases.

#### **Acknowledgements**

We appreciate the cooperation with Politehnica University of Bucharest (UPB) in Romania and our project partners from the European H2020 projects CANDELA (under grant agreement No. 776193) and ExtremeEarth (under grant agreement No. 825258).

#### **Author details**

Mihai Datcu1,2\*, Gottfried Schwarz1 and Corneliu Octavian Dumitru1

1 German Aerospace Center (DLR), Remote Sensing Technology Institute, Wessling, Germany

2 Politehnica University of Bucharest, Bucharest, Romania

\*Address all correspondence to: mihai.datcu@dlr.de

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**115**

2019]

*Deep Learning Training and Benchmarks for Earth Observation Images: Data Sets, Features…*

[10] Adversarial machine learning. Available at: https://en.wikipedia.org/ wiki/Adversarial\_machine\_learning

[11] Bayesian network. Available at: https://en.wikipedia.org/wiki/Bayesian\_

[12] Support vector machine. Available at: https://en.wikipedia.org/wiki/

Support-vector\_machine [Accessed May

[13] LIBSVM -- A Library for Support Vector Machines. Available at: https:// www.csie.ntu.edu.tw/~cjlin/libsvm/

[14] Active learning. Available at: https:// en.wikipedia.org/wiki/Active\_learning

[15] Bahmanyar R, Murillo A. Evaluating the sensory gap for earth observation images using human perception and an LDA-based computational model. In Image Processing (ICIP), 2015 IEEE International Conference on pp. 566-570

[16] Caffe software. Available at: https:// caffe.berkeleyvision.org/ [Accessed May

[17] TensorFlow. Available at: https:// www.tensorflow.org [Accessed: March

[18] PyTorch. Available at: https:// pytorch.org/ [Accessed: March 2019]

[19] arXiv e-Print archive. Available at: https://arxiv.org/ [Accessed: March

[20] Krizhevsky A, Sutskever I, Hinton GC. ImageNet Classification with Deep Convolutional Neural Networks. Available at: https://papers. nips.cc/4824- imagenet-with-deepconvolutional-neural-networks.pdf

network [Accessed May 2019]

[Accessed May 2019]

[Accessed May 2019]

[Accessed May 2019]

2019]

2019]

2019]

2019]

*DOI: http://dx.doi.org/10.5772/intechopen.90910*

[1] Alom Z, Taha T, Yakopcic C, Westberg S, Sidike P, Nasrin S, et al. State-of-the-art survey on deep learning theory and architectures. Electronics. 2019;**8**:292. Available at: https://www.

mdpi.com/2079-9292/8/3/292

arxiv.org/abs/1709.00308

pii/S0924271619301108

2019]

2019]

2019]

April 2019]

[2] Ball J, Anderson D, Chan CS. A comprehensive survey of deep learning in remote sensing: Theories, tools and challenges for the community. Journal of Applied Remote Sensing. 2017;**11**(4):042609. Available at: https://

[3] Ma L, Liu Y, Zhang X, Ye Y, Yin G, Johnson B. Deep learning in remote sensing applications: A meta-

analysis and review. ISPRS Journal of Photogrammetry and Remote Sensing. 2019;**152**:166-177. Available at: https:// www.sciencedirect.com/science/article/

[4] ESA Sentinel-2. Available at: https:// sentinels.copernicus.eu/web/sentinel/ missions/sentinel-2 [Accessed: April

*Recent Trends in Artificial Neural Networks - From Training to Prediction*

function of different instrument types. The situation is aggravated by the current lack of systematically selected benchmarking data that could be used as well-known reference data for quality and performance assessments such as classification tasks or throughput testing.

These deficiencies have to be solved in the near future as more and more high-resolution images become publicly available, while the end-users already expect reliable automated image classification and content understanding results for more and more high-level applications. We can expect that the progress in deep learning will also lead to much progress in many other fields of image processing, even beyond the field of remote sensing; thus, remote sensing should be aware of what is published by the image processing and environmental protection communities at large.

#### **7. Conclusions**

While high-resolution imaging has made much progress for many remote sensing applications, standardized image classification benchmarking still deserves more progress. On the one hand, several benchmarking concepts and tools could still be gleaned from other disciplines; on the other hand, an optimal solution of test cases for SAR image interpretation still needs more progress in basic approaches of how to verify actual image classification results and the identification of dubious cases.

#### **Acknowledgements**

We appreciate the cooperation with Politehnica University of Bucharest (UPB) in Romania and our project partners from the European H2020 projects CANDELA (under grant agreement No. 776193) and ExtremeEarth (under grant agreement No. 825258).

*Deep Learning Training and Benchmarks for Earth Observation Images: Data Sets, Features… DOI: http://dx.doi.org/10.5772/intechopen.90910*

#### **Author details**

Mihai Datcu1,2\*, Gottfried Schwarz1 and Corneliu Octavian Dumitru1

1 German Aerospace Center (DLR), Remote Sensing Technology Institute, Wessling, Germany

2 Politehnica University of Bucharest, Bucharest, Romania

\*Address all correspondence to: mihai.datcu@dlr.de

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **References**

[1] Alom Z, Taha T, Yakopcic C, Westberg S, Sidike P, Nasrin S, et al. State-of-the-art survey on deep learning theory and architectures. Electronics. 2019;**8**:292. Available at: https://www.mdpi.com/2079-9292/8/3/292

[2] Ball J, Anderson D, Chan CS. A comprehensive survey of deep learning in remote sensing: Theories, tools and challenges for the community. Journal of Applied Remote Sensing. 2017;**11**(4):042609. Available at: https://arxiv.org/abs/1709.00308

[3] Ma L, Liu Y, Zhang X, Ye Y, Yin G, Johnson B. Deep learning in remote sensing applications: A meta-analysis and review. ISPRS Journal of Photogrammetry and Remote Sensing. 2019;**152**:166-177. Available at: https://www.sciencedirect.com/science/article/pii/S0924271619301108

[4] ESA Sentinel-2. Available at: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-2 [Accessed: April 2019]

[5] ESA Sentinel-3. Available at: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-3 [Accessed: April 2019]

[6] ESA Sentinel-1. Available at: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-1 [Accessed: April 2019]

[7] Deep learning. Available at: https://www.deeplearningbook.org [Accessed: April 2019]

[8] Transfer learning. Available at: https://en.wikipedia.org/wiki/Transfer\_learning [Accessed: May 2019]

[9] Orfeo Toolbox. Available at: https://www.orfeo-toolbox.org/ [Accessed: May 2019]

[10] Adversarial machine learning. Available at: https://en.wikipedia.org/wiki/Adversarial\_machine\_learning [Accessed: May 2019]

[11] Bayesian network. Available at: https://en.wikipedia.org/wiki/Bayesian\_network [Accessed: May 2019]

[12] Support vector machine. Available at: https://en.wikipedia.org/wiki/Support-vector\_machine [Accessed: May 2019]

[13] LIBSVM -- A Library for Support Vector Machines. Available at: https://www.csie.ntu.edu.tw/~cjlin/libsvm/ [Accessed: May 2019]

[14] Active learning. Available at: https://en.wikipedia.org/wiki/Active\_learning [Accessed: May 2019]

[15] Bahmanyar R, Murillo A. Evaluating the sensory gap for earth observation images using human perception and an LDA-based computational model. In: 2015 IEEE International Conference on Image Processing (ICIP). pp. 566-570

[16] Caffe software. Available at: https://caffe.berkeleyvision.org/ [Accessed: May 2019]

[17] TensorFlow. Available at: https://www.tensorflow.org [Accessed: March 2019]

[18] PyTorch. Available at: https://pytorch.org/ [Accessed: March 2019]

[19] arXiv e-Print archive. Available at: https://arxiv.org/ [Accessed: March 2019]

[20] Krizhevsky A, Sutskever I, Hinton GE. ImageNet Classification with Deep Convolutional Neural Networks. Available at: https://papers.nips.cc/4824-imagenet-with-deepconvolutional-neural-networks.pdf

[21] Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Vol. 9351. Basel, Switzerland: Springer International Publishing, LNCS; 2015. pp. 234-241. Available at: https://arxiv.org/abs/1505.04597

[22] Murillo Montes de Oca A, Bahmanyar R, Nistor N, Datcu M. Earth observation image semantic bias: A collaborative user annotation approach. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS). 2017;**10**(6):2462-2477. DOI: 10.1109/JSTARS.2017.2697003

[23] Socher R, Lin C, Ng AY, Manning CD. Parsing Natural Scenes and Natural Language with Recursive Neural Networks. In: 28th International Conference on Machine Learning (ICML 2011). Available at: https://nlp.stanford.edu/pubs/SocherLinNgManning\_ICML2011.pdf

[24] Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative Adversarial Nets. Available at: http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf

[25] Xia G-S, Bai X, Ding J, Zhu Z, Belongie S, Luo J, et al. DOTA: A large-scale dataset for object detection in aerial images. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah, USA. 2018

[26] Dumitru CO, Schwarz G, Datcu M. SAR image land cover datasets for classification benchmarking of temporal changes. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS). 2018;**11**(5):1571-1592

[27] Sumbul G, Charfuelan M, Demir B, Markl V. BIGEARTHNET: A large-scale benchmark archive for remote sensing image understanding. In: IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Yokohama, Japan. 2019


#### **Chapter 8**


## Data Mining Technology for Structural Control Systems: Concept, Development, and Comparison

*Meisam Gordan, Zubaidah Ismail, Zainah Ibrahim and Huzaifa Hashim*

#### **Abstract**

Structural control systems are classified into four categories, that is, passive, active, semi-active, and hybrid systems. These systems must be designed in the best way to control harmonic motions imposed on structures. Therefore, a precise and powerful computer-based technology is required to increase the damping characteristics of structures. In this direction, data mining has provided numerous solutions to structural damped system problems as an all-inclusive technology due to its computational ability. This chapter provides a broad, yet in-depth, overview of data mining, including a knowledge view (i.e., concept, functions, and techniques) as well as an application view in damped systems, shock absorbers, and harmonic oscillators. To this end, various data mining techniques are classified into three groups, that is, classification-, prediction-, and optimization-based data mining methods, in order to present the development of this technology. According to this categorization, the applications of statistical, machine learning, and artificial intelligence techniques in the vibration control system research area are compared. Then, some related examples are detailed in order to indicate the efficiency of data mining algorithms. Last but not least, capabilities and limitations of the most applicable data mining-based methods in structural control systems are presented. To the best of our knowledge, the current research is the first attempt to illustrate the data mining applications in this domain.

**Keywords:** data mining, structural damped systems, vibration control, machine learning, artificial intelligence, statistical analysis

#### **1. Introduction**

In recent years, there have been extensive theoretical and experimental investigations into various problems encountered in different structures, from basic structural components (e.g., beams and plates) to complex structural systems (e.g., bridges and buildings). This is because structures are built to support loads, namely, static or dynamic loads, arising from different forces (e.g., tension, compression, torsion, bending, and shear). Many structures therefore need to be designed to withstand dynamic loads even though they spend most of the time supporting static loads [1, 2]. Static loads are those that are gradually applied and remain in place for a long duration. These loads are not time dependent. As an illustration, a live load on a structure is considered a static load. Besides, most of the loadings applied to civil engineering structures, including seismic loadings, are usually considered as equivalent static loads [3, 4]. On the other hand, time-dependent dynamic loads such as machinery vibrations, earthquakes, wind storms, sea waves, and traffic can cause intensive and continuous vibrational motions, which can change the structural properties (i.e., mass, stiffness, or damping) and lead to changes in the dynamic responses, such as natural frequencies, mode shapes, and damping ratios [5–8]. Therefore, in-service structural systems in civil engineering such as tall buildings, long hydraulic structures, and long-span bridges are damage-prone under these loads during their service life [9–14]. Moreover, these loads can cause intensive and sustained vibrational motions, which can be harmful to human occupants. For these reasons, vibration is a serious concern in civil structures, because damage can impair the functionality and safety of the structure. However, the risk of structural damage can be decreased by using a vibration control system to increase the damping characteristics of the structure. The advantage of a damping device is that it improves the ability of the structure to dissipate a portion of the energy released during a dynamic loading event [15–18].
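The benefit of added damping can be illustrated with a minimal sketch. The following Python example, which is not taken from the cited studies, integrates the free vibration of a single-degree-of-freedom oscillator, x'' + 2ζω_n x' + ω_n² x = 0, and compares the displacement that remains after several cycles for two damping ratios. The model, the symbols ζ (damping ratio) and ω_n (natural circular frequency), the function name, and all numerical values are assumptions chosen only for illustration.

```python
import math


def free_vibration_peak(zeta, omega_n=2.0 * math.pi, x0=1.0,
                        t_start=5.0, t_end=10.0, dt=1e-3):
    """Integrate x'' + 2*zeta*omega_n*x' + omega_n**2 * x = 0 from an initial
    displacement x0 (zero initial velocity) and return the largest |x|
    observed in the time window [t_start, t_end]."""
    x, v, t = x0, 0.0, 0.0
    peak = 0.0
    while t < t_end:
        # Acceleration from the damping force and the restoring force.
        a = -2.0 * zeta * omega_n * v - omega_n ** 2 * x
        # Semi-implicit Euler: update velocity first, then displacement.
        v += a * dt
        x += v * dt
        t += dt
        if t >= t_start:
            peak = max(peak, abs(x))
    return peak


# Hypothetical damping ratios: 2% (lightly damped) and 10% (with added damping).
lightly_damped = free_vibration_peak(zeta=0.02)
heavily_damped = free_vibration_peak(zeta=0.10)
print(f"residual peak, zeta=0.02: {lightly_damped:.3f}")
print(f"residual peak, zeta=0.10: {heavily_damped:.3f}")
```

Under these assumptions, the oscillator with the larger damping ratio retains a much smaller displacement after the same time, which is the energy dissipation effect that a damping device is meant to provide.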

Over the last few decades, taller and wider structures have been built because of enormous developments in civil engineering. As mentioned earlier, these structures will be subject to external loads which can cause vibrational problems. Consequently, it is essential to control the vibrational motions to reduce the response and to improve the performance, safety, flexibility, serviceability, and reliability of these structures. Generally, structural control systems include four main groups, which are passive, active, semi-active, and hybrid devices. The classification of these supplemental energy dissipation devices is based on their operational mechanisms [19–21].

Data mining is the analysis of datasets to discover relationships, new correlations, and trends and to extract useful data in the form of patterns. This process has therefore been used to identify valid, valuable, and understandable forms of data [22, 23]. Accordingly, in recent years, this technology has provided various solutions to structural damped systems because of its powerful computational capacity. In this regard, many researchers have studied and examined various data mining techniques for passive, semi-active, active, and hybrid damped systems. Along the same lines, this chapter attempts to present the recent developments of well-known data mining techniques in vibration control devices. Before going into the details, it is important to point out the fundamental principles of data mining. Hence, data mining concepts including definition, background, functions, and techniques are discussed in the following section. Then, the concepts of applicable algorithms and their applications in damped systems are detailed in Section 3. Furthermore, applicable examples of data mining algorithms are presented for better understanding.

#### **2. Data mining concept**

Data can be defined as any fact, number, or text which can be processed by a computer. As the patterns obtained through data mining may be very difficult to find, it is sometimes compared to gold mining in rivers (**Figure 1**). The term "gold mining" refers to the search for gold in rocks or sand. Data mining is a search for information and knowledge. The origin of data mining traces back to the development of artificial intelligence in the 1950s. The development of data mining is shown in **Figure 2**. In general, data mining has two classes, which are descriptive mining and predictive mining, using various techniques and functions (see **Figure 3** and **Table 1**).

**Figure 1.** *Gold mining and data mining.*

**Figure 2.** *History of data mining development.*

#### *Data Mining Technology for Structural Control Systems: Concept, Development, and Comparison DOI: http://dx.doi.org/10.5772/intechopen.88651*

The techniques play important roles in obtaining effective models from observations. Data mining techniques also fall into three main groups, which are statistical techniques, machine learning techniques, and artificial intelligence techniques. Each of these techniques has particular algorithms for running the models to get the best solution. For instance, artificial neural network (ANN), Bayesian analysis, ant colony optimization, ICA, support vector machine, principal component analysis, particle swarm optimization (PSO), genetic algorithm, fuzzy logic, regression analysis, clustering, classification, and decision tree are classified under data mining techniques. Furthermore, the functions of data mining are categorized into clustering, prediction, classification, exploration, and association. The purpose of clustering is to divide the samples into groups with related behavior. The numerical prediction activity determines patterns, rules, or models to predict continuous or discrete target values, which can also be used for other functions. Classification is used to recognize rules which can be applied in future work to determine whether a previously unknown item belongs to a known class. Exploration is used to determine the dimensionality of the input data, and, finally, the association activity is used to detect frequently co-occurring related objects. Depending on their particular uses, assumptions, and drawbacks, one or a combination of some of these tasks can be used to find the hidden information [24–27].
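As a minimal sketch of two of these functions, the following Python example groups samples with a plain k-means procedure (clustering) and then assigns a previously unseen sample to the nearest learned group (classification). It is an illustration only: the two-feature samples, the choice of k-means, and the names `nearest` and `kmeans` are assumptions that do not come from the cited studies.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])))


def kmeans(points, initial_centroids, n_iter=20):
    """Plain k-means: alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: attach every sample to its nearest centroid.
        labels = [nearest(pt, centroids) for pt in points]
        # Update step: move each centroid to the mean of its assigned samples.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Hypothetical two-feature samples (for example, two normalized response measures).
samples = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centers = kmeans(samples, [samples[0], samples[3]])
print(labels)  # → [0, 0, 0, 1, 1, 1]

# Classification of a previously unseen sample against the learned groups.
print(nearest((4.5, 4.7), centers))  # → 1
```

The initial centroids are passed in explicitly so that the result is reproducible; in practice they are often chosen at random, which can change the grouping.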




**Figure 3.** *Data mining functions.*



#### **3. Data mining algorithms**

#### **3.1 Support vector machine (SVM)**

SVM is a classification- and prediction-based technique that was first introduced by Vapnik in 1963 [28]. It is based on learning theory, and because of its high accuracy and good generalization capability, it has the potential to produce high-quality predictions in numerous tasks. Therefore, SVM has various applications in areas such as machine learning, data classification, and pattern recognition [29, 30]. The basic SVM models are the linear SVM with linear functions and the nonlinear SVM with kernel functions. The aim of an SVM classifier is to determine a separating hyperplane that divides the given data into two classes (i.e., a positive class and a negative class) in an optimal way. The optimal separating hyperplane is determined by solving an optimization problem [31].
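To make the idea of finding a separating hyperplane by optimization more concrete, the sketch below trains a linear soft-margin SVM by minimizing the regularized hinge loss, (λ/2)‖w‖² + mean(max(0, 1 − y(w·x + b))), with full-batch subgradient descent. This is only one simple way to approach the problem and is not the solver used in the cited works; the toy data, the hyperparameters `lam`, `lr`, and `epochs`, and the function names are assumptions made for illustration.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=2000):
    """Minimize lam/2 * ||w||^2 + mean(hinge loss) by subgradient descent.

    X is a list of feature tuples and y a list of labels in {-1, +1}.
    Returns the hyperplane parameters (w, b) of the decision rule w.x + b.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularization term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample lies inside the margin: hinge loss is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable toy data: negative class near (1, 1),
# positive class near (4, 4).
X = [(1.0, 1.0), (1.5, 0.5), (0.5, 1.5), (4.0, 4.0), (4.5, 3.5), (3.5, 4.5)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
print([predict(w, b, xi) for xi in X])
```

A nonlinear SVM would replace the inner product `w.x` with a kernel function, which this sketch does not attempt.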

SVM has been used in structural control systems. For instance, an SVM-based semi-active control strategy was reported in [32] for the numerical model of a multi-storey structure. In this study, four seismic waves, namely the El Centro, Hachinohe, and Kobe waves and the Shanghai artificial wave, with peak ground accelerations all scaled to 0.1 g, were taken into consideration. As shown in **Figure 4**, a three-storey shear-type frame structure with dampers was considered as a case study in this work.

The seismic responses of the top storey of the structure with the structure-damper system, with the structure-SVM system, and without a control device are shown in **Figure 5**. This figure shows that the structure-SVM system model has perfectly learned the control effectiveness of the structure-damper system. This observation indicated that the structure-SVM system model was significantly better than the structure-damper system.

In order to further examine the seismic response reduction of the controlled structure using the present algorithm, the displacement response of every floor under these four seismic waves is shown in **Figure 6**. Under the Hachinohe wave, the peak displacement response of every floor, especially the top floor, with the structure-SVM system model was remarkably smaller than that with the structure-damper system. The authors thus verified once again that the proposed structure-SVM system model is more effective than the structure-damper system.

The comparative results of this study demonstrate that general semi-active dampers designed using the SVM-based semi-active control algorithm were capable of providing a higher level of response reduction.

**Figure 4.** *Structure-SVM semi-active control system model and implementation flow chart.*

| Data mining technique | Category | Learning type |
| --- | --- | --- |
| Support vector machine | Machine learning | Supervised |
| Decision tree | Statistical | Supervised |
| Clustering | Statistical | Unsupervised |
| Principal component analysis | Machine learning | Unsupervised |
| Regression | Statistical | Supervised |
| Meta-heuristics | Artificial intelligence | – |
| Classification | Statistical | Supervised |
| Bayesian | Machine learning | Supervised |
| Artificial neural network | Artificial intelligence | Supervised/unsupervised |
| Fuzzy | Artificial intelligence | Supervised/unsupervised |

**Table 1.** *Data mining techniques.*

**Figure 5.** *Seismic responses of the structural top storey under the El Centro wave with PGA = 0.1 g using general semi-active dampers and SVM-based semi-active control algorithm.*

**Figure 6.** *Displacement responses of every storey under the Hachinohe seismic wave using general semi-active dampers and SVM-based semi-active control algorithm.*

#### **3.2 Artificial neural network (ANN)**

The artificial neural network, a self-organizing, prediction-based computational technique, was first proposed in the 1980s. This algorithm can approximate many functions through pattern recognition [33]. It can also be used effectively to reconstruct nonlinear relationships learned from training data [34]. A typical ANN model has two parts, that is, processing units (neurons) and the connections between them [34], with the neurons arranged in layers of the network. A layered ANN structure, called the multilayer perceptron (MLP), is one of the most widespread ANN methods. Generally, a conventional ANN has three layers: an input layer, a hidden layer, and an output layer. ANNs can also be categorized by their network topology, such as feedforward and feedback, or by their learning algorithms, such as supervised learning and unsupervised learning [35].
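The three-layer structure described above can be sketched in a few lines of Python. The example below is a minimal illustration rather than any of the networks from the cited studies: it runs the forward pass of an MLP with one hidden layer, and its weights are set by hand (an assumption made here so that the result is easy to check) instead of being learned from training data. With these weights the network reproduces the XOR function, a simple nonlinear relationship that a single linear neuron cannot represent.

```python
def relu(z):
    """Rectified linear activation used by the hidden neurons."""
    return max(0.0, z)


def mlp_forward(x, W_hidden, b_hidden, w_out, b_out):
    """Forward pass: input layer -> one hidden layer (ReLU) -> linear output neuron."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hand-set (hypothetical) parameters: two inputs, two hidden neurons, one output.
W_hidden = [[1.0, 1.0], [1.0, 1.0]]  # one row of input weights per hidden neuron
b_hidden = [0.0, -1.0]
w_out = [1.0, -2.0]
b_out = 0.0

for x in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]:
    print(x, mlp_forward(x, W_hidden, b_hidden, w_out, b_out))
# (0.0, 0.0) 0.0
# (0.0, 1.0) 1.0
# (1.0, 0.0) 1.0
# (1.0, 1.0) 0.0
```

In a trained network, the weights and biases would be adjusted by a learning algorithm, for example supervised learning on input and target pairs.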


**Figure 7.**

*Neural network model for a passive control system.*

*Data Mining Technology for Structural Control Systems: Concept, Development, and Comparison*


*DOI: http://dx.doi.org/10.5772/intechopen.88651*


A variety of studies focus on the application of ANN in structural control systems. For instance, according to [36, 37], ANN has a great capacity to improve the functionality of active control systems owing to its high pattern recognition capability. It can also be used for semi-active [38, 39] and passive damping systems [40]. The following reviews some related examples that indicate the applicability of ANN in damping systems.

Suresh et al. [36] applied a nonlinearly parameterized neural network as a novel controller scheme for the active control of earthquake-excited nonlinear base-isolated buildings. Numerical simulations were performed on a full-scale numerical test-bed base-isolated building with an isolation system comprising hysteretic lead-rubber bearings. They showed that the proposed approach could achieve good response reductions for a wide range of near-fault earthquakes, without a corresponding increase in the superstructure response.

**Figure 7** demonstrates a neural network model that was developed by [40] which shows the application of ANN in passive damping devices. In this study, the ANN was employed in order to predict the inelastic demand of structural systems with viscoelastic dampers in terms of peak displacement, effective damping, and effective time period. The authors established that the ANN could be effectively used for new designs as well as for checking the response of any retrofitted structure for the chosen design spectrum. In addition, they concluded that artificial neural networks also were useful in quickly deciding the amount of damping and the number of dampers required to reduce the peak displacement and help in restricting further damage.

A smart active control system called NEURO-FBG, which combines fiber Bragg grating (FBG) sensors and neural networks, has been proposed by [41] for a steel building. In this study, the development procedure of the converter and controller was illustrated by means of a "NEURO-FBG converter" and a "NEURO-FBG controller." The NEURO-FBG smart control system was designed to be a robust and reliable active control system with "smart" performance. To achieve this goal, a specific methodology was proposed comprising three parts, that is, a structural surveillance system, three converters, and a controller (see **Figure 8**). The analytical results show that the NEURO-FBG system could effectively control the response of the structure and provide a more reliable system than ordinary active control. Later, the authors verified their method experimentally [42]. According to their experimental results, the proposed active control system can be successfully applied to buildings.

*Recent Trends in Artificial Neural Networks - From Training to Prediction*

**Figure 8.** *Block diagram of NEURO-FBG smart control system.*

#### **Figure 9.**

*The architecture of the neural network modeling.*

**Figure 9** shows the architecture of an ANN-based real-time force tracking scheme for magnetorheological (MR) dampers, which was applied numerically and experimentally on a five-storey shear frame by [43]. In this study, the forward and inverse MR damper dynamics were modeled by the neural network method using constant and half-sinusoidal current tests. As it can be seen in **Figure 9**, the ANN modeling was implemented for the forward and inverse MR damper model. It was concluded that the experimental validation of the neural network modeling both the forward and inverse MR damper dynamics showed the accuracy of these models.

A semi-active control strategy that combined a neuro-control system including a multilayer ANN with a back-propagation training algorithm with a smart damper was proposed to reduce seismic responses of structures [38]. A set of numerical simulations was performed to verify the effectiveness of the proposed method.


To this end, two controllers were used: (1) a primary control algorithm based on a cost function, in which a sensitivity evaluation algorithm was employed to replace an emulator neural network and to produce the desired active control force, and (2) a secondary bang-bang-type controller that caused the smart damper to generate the desired active control force, so long as this force was dissipative. The cost function is defined as the squared sum of the offsets between the actual and the desired responses; the main purpose was therefore to minimize the cost function while training the network. **Figure 10** shows the control diagram for semi-active neuro-control using a smart damper, together with the three-storey building with a single smart damper that was used as the numerical model. The authors showed that the proposed semi-active control system using ANN and smart dampers was a promising tool for the control of real structures.
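The two ingredients described in this paragraph, the squared-sum cost and the dissipativity check of the secondary controller, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [38]: the function names, the force limit `max_force`, and the sign convention (the damper force counts as dissipative when it opposes the damper velocity, that is, when force times velocity is negative) are all hypothetical.

```python
def cost(actual, desired):
    """Squared sum of the offsets between actual and desired responses."""
    return sum((a - d) ** 2 for a, d in zip(actual, desired))


def clipped_damper_force(desired_force, damper_velocity, max_force):
    """Bang-bang style clipping of the desired control force.

    Assumed convention: the force is dissipative only if it opposes the
    damper velocity (desired_force * damper_velocity < 0). A non-dissipative
    request is rejected (zero command); a dissipative one is passed through,
    limited to the assumed capacity max_force.
    """
    if desired_force * damper_velocity >= 0.0:
        return 0.0
    return max(-max_force, min(max_force, desired_force))


# Example: offsets of 1 and 2 give a cost of 1**2 + 2**2 = 5
example_cost = cost([1.0, 2.0], [0.0, 0.0])
```

In a training loop, `cost` would be evaluated on the simulated structural response and minimized with respect to the network weights, while `clipped_damper_force` would sit between the network output and the damper command.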


**Table 2.** *Comparison of the effectiveness of passive TMD and STMD-FLC.*

| Case | Wind speed (m/s) | Maximum torsion (rad) |
| --- | --- | --- |
| Uncontrolled | 55.52 | 0.02 |
| Controlled with passive TMD | 98 | 0.02 |
| Controlled with STMD-FLC | 110 | 0.0063 |


#### **3.3 Fuzzy logic**


Fuzzy logic was first proposed by Lotfi Zadeh in 1965. It has been employed in different applications such as pattern recognition, classification, and decision-making [44]. The basic configuration of a fuzzy technique consists of four important components: fuzzification, fuzzy rule base, fuzzy inference, and defuzzification. Fuzzification is a mapping from a crisp input to fuzzy membership sets. The fuzzy rule base holds a set of rules on fuzzy variables described by membership functions. Fuzzy inference is the decision-making mechanism of the fuzzy system. The defuzzifier converts the fuzzy consequences of the different rules into crisp values [45]. Fuzzy logic is a model-free technique for structural system identification; the most important advantages of fuzzy systems are their highly parallel implementation, nonlinearity, and adaptability [46]. Applications of fuzzy logic in SHM are detailed in **Table 2**.
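The four components listed above can be made concrete with a small single-input controller. The sketch below is only an illustration: the triangular membership functions, the three input sets, the two output sets, the rule base, and the discretized centroid defuzzifier are all assumptions and do not reproduce any controller from the cited works.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets: input on [-1, 1], output on [0, 1]
IN_SETS = {"negative": (-2.0, -1.0, 0.0), "zero": (-1.0, 0.0, 1.0), "positive": (0.0, 1.0, 2.0)}
OUT_SETS = {"small": (-0.5, 0.0, 0.5), "large": (0.5, 1.0, 1.5)}
# Hypothetical rule base: IF input is <key> THEN output is <value>
RULES = {"negative": "large", "zero": "small", "positive": "large"}


def fuzzify(x):
    """Fuzzification: crisp input -> membership degree in each input set."""
    return {name: tri(x, *abc) for name, abc in IN_SETS.items()}


def infer(memberships):
    """Inference: each rule's firing strength caps its output set (max over rules)."""
    strength = {}
    for in_label, out_label in RULES.items():
        strength[out_label] = max(strength.get(out_label, 0.0), memberships[in_label])
    return strength


def defuzzify(strength, steps=200):
    """Defuzzification: centroid of the aggregated, clipped output sets on [0, 1]."""
    num = den = 0.0
    for i in range(steps + 1):
        y = i / steps
        mu = max(min(s, tri(y, *OUT_SETS[label])) for label, s in strength.items())
        num += y * mu
        den += mu
    return num / den if den > 0.0 else 0.0


def controller(x):
    """Full pipeline: fuzzify -> infer with the rule base -> defuzzify."""
    return defuzzify(infer(fuzzify(x)))
```

With these assumed sets, an input near zero yields a small crisp output and an input near either extreme yields a large one, which is the qualitative behaviour one would want from, for example, a damping-ratio selector.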

#### **Figure 10.**

*Diagram of control for semi-active neuro-control using smart damper.*

Adaptive fuzzy control strategy [47], fuzzy gain scheduling [48], semi-active fuzzy logic control system [49], model-based fuzzy logic controller (MBFLC) [50], optimal fuzzy logic controller [51], fuzzy controller [52–54], neuro-fuzzy [55–57], genetic fuzzy logic controller (GFLC) [58], fuzzy control strategy based on a neural network forecasting model [59], and wavelet-neuro-fuzzy control [37] are some of the important applications of fuzzy logic in structural control systems. The following examples illustrate the applicability of fuzzy logic in damping systems.

A semi-active fuzzy control system was introduced by [49] to reduce the seismic responses in variable orifice dampers. A numerical study was conducted to investigate the effectiveness of the proposed approach. The results revealed that the fuzzy logic controller (FLC) was capable of improving the structural responses. Another semi-active fuzzy control system, comprising a semi-active tuned mass damper (STMD) system with variable damping, was proposed by [60] to control the flutter instability of long-span suspension bridges. In this study, the variable damping of the system was chosen through a fuzzy logic controller. The STMD-FLC methodology was applied to increase the flutter wind speed of the test structure, which was a suspension bridge. To select the level of semi-active damping ratio, a fuzzy logic feedback controller was incorporated into a closed-loop control system. The displacement and velocity quantities were used as the inputs to the fuzzy logic controller, and the level of STMD damping ratio was its output for each degree of freedom, namely, vertical and torsional (see **Figure 11**). The FLC system was designed based on Mamdani's fuzzy inference method.

**Figure 11.**

*Membership functions of the fuzzy logic controller variables: (a) membership functions for displacement, (b) membership functions for velocity, and (c) membership functions for semi-active tuned mass damper (TMD) damping ratio [60].*

In addition, a comparison of the effectiveness of passive TMD and STMD-FLC was carried out in this research, which is shown in **Table 2**. The table clearly shows the superior performance of semi-active control over the passive control.

The description of the fuzzy input membership function abbreviations is as follows: NL = negative large, NM = negative medium, NS = negative small, ZR = zero, PS = positive small, PM = positive medium, and PL = positive large; and those of the output are as follows: ZR = zero, VS = very small, S = small, L = large, and VL = very large.

Adaptive network-based fuzzy inference system (ANFIS) is a hybrid learning algorithm which combines the back-propagation gradient descent and least squares techniques to generate a fuzzy inference system. The membership functions in ANFIS are adjusted according to a given set of input and output data. The main objective of ANFIS is to integrate the finest features of neural networks and fuzzy systems. Accordingly, the outputs of ANFIS can be seen in two steps, that is, (1) representation of prior knowledge into a set of constraints to reduce the optimization search space from fuzzy system and (2) adaptation of back-propagation to structured network to automate fuzzy control parametric tuning from neural network. Therefore, ANFIS is one of the best trade-offs between neural and fuzzy systems providing smoothness due to the fuzzy control interpolation and adaptability, due to the neural network backpropagation [61].

ANFIS has proven to be an excellent function approximation tool. For example, an ANFIS controller was developed by [61] for reduction of environmentally induced vibration in multiple-degree-of-freedom building structure with MR damper. The systems were excited using two different earthquake random vibration loadings. **Figure 12** illustrates the comparison of the displacement response at the top of the structure with and without control under El Centro and Hachinohe earthquakes. The figure shows that ANFIS clearly could reduce the displacement amplitude in both vibration loadings.

**Figure 12.** *Displacement response of the structure under El Centro and Hachinohe earthquakes.*


#### **3.4 Clustering**

Clustering is an unsupervised statistical data analysis technique, which is used in pattern recognition, image analysis, and bioinformatics. This method is employed to divide datasets into separate subsets of similar items (clusters) according to typical patterns identified in the clustering analysis [62]. Successful clustering requires maximum intra-cluster similarity as well as minimum inter-cluster similarity. K-means is one of the most representative partitioning clustering algorithms, with quite reliable convergence to a local optimum. However, it can be applied only to numerical datasets. Furthermore, K-means handles noisy data and outliers poorly [63]. Clustering can also help to decrease the distance between datasets and improve the similarity of datasets in each cluster [64, 65].
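As a concrete illustration of the K-means partitioning mentioned above, the following sketch alternates between assigning each point to its nearest centroid and recomputing each centroid as the mean of its members. It is a plain-Python sketch under stated assumptions: the initial centroids are supplied by the caller, squared Euclidean distance is used, and the toy data are invented for the example.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, centroids, iterations=20):
    """Basic K-means: returns (final centroids, cluster label of each point)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: group points by nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(c)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break  # converged to a local optimum
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Toy data: two well-separated groups of 2-D points (invented for illustration)
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
```

Because the result depends on the starting centroids, the algorithm only reaches a local optimum; a single far-away outlier would also pull a centroid toward it, which reflects the sensitivity to noise noted above.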

A combination of fuzzy C-means clustering and subtractive clustering has been developed numerically for nonlinear system identification of a seismically excited building-MR damper system. It was demonstrated from the simulation that the proposed fuzzy model is effective in identifying nonlinear behavior of the building-MR damper system subjected to the 1940 El Centro earthquake. **Figure 13** compares the displacement and acceleration responses of the original simulation model with those of the identified model. Note that the original simulation model means an analysis model of the building equipped with an MR damper. As can be seen from the figure, overall good agreements between the original values and the identified model were found in the time histories of both displacement and acceleration responses [66].

#### **3.5 Genetic algorithm (GA)**

GA, which is one of the most powerful optimization-based algorithms, was first proposed by John Holland in the 1970s. In GA, a chromosome encodes a candidate solution and comprises a group of genes that represent the parameters to be optimized. The algorithm starts from random solutions in a current population; the next generation is then created using crossover and mutation operators [67]. In general, GA is an attractive tool for difficult optimization problems owing to benefits such as parallelism, a capacity for global search, adaptation, and no need for the gradient of the objective function. Considering these benefits, GA has been successfully applied in the optimal design of TMDs [68–70] and multiple tuned liquid column dampers (MTLCD) [71], optimization of earthquake energy dissipation systems [72], optimization of active control systems in high-rise buildings [73], optimal damper distribution [74], smart control systems [75], etc.
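The population, crossover, and mutation steps described above can be sketched as follows. This is a generic, minimal GA and not the algorithm of any cited study: tournament selection, single-point crossover, Gaussian mutation, one-individual elitism, the parameter values, and the `sphere` test function are all assumptions made for the example.

```python
import random


def sphere(chromosome):
    """Toy objective to minimize: sum of squared genes (minimum 0 at the origin)."""
    return sum(g * g for g in chromosome)


def genetic_algorithm(fitness, n_genes, bounds, pop_size=30, generations=50,
                      crossover_rate=0.9, mutation_rate=0.1, seed=0):
    """Minimize fitness over real-valued chromosomes; returns (best, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [[rng.uniform(lo, hi) for _ in range(n_genes)] for _ in range(pop_size)]
    best = min(population, key=fitness)
    history = [fitness(best)]

    def select(pop):
        """Binary tournament: the fitter of two random individuals wins."""
        a, b = rng.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best individual over unchanged
        while len(children) < pop_size:
            p1, p2 = select(population), select(population)
            if n_genes > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_genes - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            for i in range(n_genes):
                if rng.random() < mutation_rate:  # Gaussian mutation, clipped to bounds
                    child[i] = min(hi, max(lo, child[i] + rng.gauss(0.0, 0.1 * (hi - lo))))
            children.append(child)
        population = children
        best = min(population, key=fitness)
        history.append(fitness(best))
    return best, history


best, history = genetic_algorithm(sphere, n_genes=3, bounds=(-5.0, 5.0))
```

In a structural application the genes might encode, for instance, TMD mass ratio, damping, and tuning frequency, with `fitness` returning a simulated response measure; note that only function values are needed, never gradients.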

#### **Figure 13.**

*Comparison of original simulation and obtained results from the proposed model. (a) displacement response, (b) acceleration response.*


An optimization strategy of hydraulic actuators, that is, an implicit redundant representation (IRR) genetic algorithm with a non-dominated sorting II (NS2) GA, namely, NS2-IRR GA, was implemented numerically by [73] in order to minimize the distribution of control devices in large-scale structures as well as optimize the dynamic responses of structures. It was shown that the proposed NS2-IRR GA-based control system was effective in finding not only optimal locations and numbers of actuators in structures but also minimum responses of the buildings. In the same line, **Figure 14**, which compares the dynamic behavior of the proposed approach with those of the benchmark control system in reducing displacements of the 20-storey building, clearly indicates the effectiveness of GA in minimizing the displacement/drift responses of the building structure.

#### **3.6 Particle swarm optimization**


**Figure 14.**

*Comparison of displacement responses of benchmark and NS2-IRR genetic algorithm approaches under the El Centro earthquake.*

**Figure 15.** *Results for the 1981 Imperial Valley (El Centro-06): controlled displacement.*

PSO, which was first proposed by Kennedy and Eberhart [76], is a population-based artificial intelligence optimization technique. The approach simulates the social behavior of organisms, such as bird flocking, and serves as a suitable tool for global optimization [77]. In PSO, a particle represents a potential solution, and each particle has two updatable features: position and velocity. PSO is easy to apply and has a great computational capacity. In comparison with other optimization approaches, PSO is more efficient and requires fewer function evaluations while giving results of the same or better quality. However, it has some weaknesses, such as becoming trapped in a local optimum in a complex search space. The inability to perform a precise local search around a local optimum is another drawback of PSO [78, 79].
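The position and velocity updates that define a particle can be sketched as below. This is a common textbook form of PSO with an inertia weight, given here as an assumption rather than as the variant used in the cited studies; the coefficient values `w`, `c1`, and `c2`, the clipping of positions to the bounds, and the `sphere` test function are chosen only for illustration.

```python
import random


def sphere(x):
    """Toy objective to minimize: sum of squares (minimum 0 at the origin)."""
    return sum(v * v for v in x)


def pso(objective, dim, bounds, n_particles=20, iterations=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective; returns (best position, best value, best-value history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's best position so far
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm's best position so far
    history = [gbest_val]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward own best + pull toward swarm best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position: move by the velocity, kept inside the search bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


gbest, gbest_val, history = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

Because every particle is attracted to the single swarm best, the whole swarm can collapse onto a local optimum in a multimodal search space, which is the weakness noted in the paragraph above.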

PSO was applied in [80] to optimize the parameters of a TMD-viscously damped system, including the optimum mass ratio, damper damping, and tuning frequency. The results were calculated by means of three numerical examples for different nonstationary ground acceleration systems to demonstrate the efficiency of the proposed method. To this end, the system was subjected to ground accelerations with different PSO-based power spectra in order to minimize either the maximum displacement or the acceleration mean square response. The authors of this research reported that PSO was quite easy to program for applications in practical engineering.

Another method, called the wavelet PSO-based linear quadratic regulator (WPSOB-LQR), was presented numerically in [81] to find the optimal control force of an active TMD via a PSO-based linear quadratic regulator (LQR) and wavelet analysis. To this end, PSO was used to determine the gain matrices through the online update of the weighting matrices used in the controller, which eliminated trial and error. **Figure 15** shows the time history of the displacement response obtained using a previously developed LQR control method and the proposed WPSOB-LQR approach. As can be observed from this figure, the displacement was significantly reduced using the proposed method. Moreover, the authors stated that the proposed method was practicable and worthwhile for vibration control of structures.

#### **4. Conclusion**

In this chapter, data mining was briefly described, the most applicable techniques were reviewed, and the applications of machine learning, artificial intelligence, and statistical algorithms in structural control systems were presented. Furthermore, for each technique, several examples were given to familiarize readers with the corresponding field and to provide an overall background of the research done by investigators worldwide. The following are some of the important conclusions.

Fuzzy, GA, and ANN are the most applicable methods in structural control systems. Among them, fuzzy controllers were the most powerful techniques for solving numerous problems. In fact, fuzzy algorithms could provide uncomplicated and robust solutions for control systems in order to deal with uncertainty, such as selecting damping ratios and reducing the responses of structures. ANN was another self-organizing technique, which has been used to control and predict the seismic responses of structures with energy dissipation systems. As far as ANN was concerned, local minima, overlearning, and excessive dependence on experience in the choice of structures and types were its inevitable limitations, while SVM could overcome these limitations and has been successfully applied to time series forecasting. Likewise, SVM could provide some special advantages in small-sample problems and in nonlinear and high-dimensional pattern recognition. MATLAB was the main programming language used to develop data mining techniques in this area.

*Data Mining Technology for Structural Control Systems: Concept, Development, and Comparison DOI: http://dx.doi.org/10.5772/intechopen.88651*

#### **Acknowledgements**


This research was funded by the University of Malaya (UM) and the Ministry of Higher Education (MOHE), Malaysia (Grant numbers: IIRG007A, GPF015A-2018 and RG561-18HTM).

#### **Notes/thanks/other declarations**

The authors would like to express their sincere thanks to the University of Malaya and the Ministry of Education, Malaysia, for their support given through research grants.

#### **Author details**

Meisam Gordan\*, Zubaidah Ismail, Zainah Ibrahim and Huzaifa Hashim

Department of Civil Engineering, University of Malaya, Kuala Lumpur, Malaysia

\*Address all correspondence to: meisam.gordan@gmail.com

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### **References**

[1] Hanif MU, Ibrahim Z, Jameel M, Ghaedi K, Aslam M. A new approach to estimate damage in concrete beams using non-linearity. Construction and Building Materials. 2016;**124**:1081-1089

[2] Hanif MU, Ibrahim Z, Jameel M, Ghaedi K, Hashim H. Simulation-based non-linear vibration model for damage detection in RC beams. European Journal of Environmental and Civil Engineering. March 2019:1-26. DOI: 10.1080/19648189.2019.1578270

[3] Filiatrault A. Principles of Passive Supplemental Damping and Seismic Isolation. Pavia: Iuss Press; 2006

[4] Ghayeb HH, Razak HA, Sulong NHR. Development and testing of hybrid precast concrete beam-to-column connections under cyclic loading. Construction and Building Materials. 2017;**151**:258-278. DOI: 10.1016/j.conbuildmat.2017.06.073

[5] Gordan M, Ismail Z, Razak HA, Ibrahim Z. Vibration-based structural damage identification using data mining. In: 24th International Congress on Sound and Vibration. London; 2017

[6] Gordan M, Ghaedi K. Experimental study on the effectiveness of tuned mass damper on a steel frame under harmonic load. In: 4th International Congress on Civil Engineering, Architecture & Urban Development, Shahid Beheshti University, Tehran. Tehran, Iran: Shahid Beheshti University; 2016

[7] Hanif MU, Ibrahim Z, Ghaedi K, Javanmardi A, Rehman SK. Finite element simulation of damage in RC beams. Journal of Civil Engineering, Science and Technology. 2018;**9**:50-57

[8] Ghaedi K, Ibrahim Z, Javanmardi A. A new metallic bar damper device for seismic energy dissipation of civil structures. IOP Conference Series: Materials Science and Engineering. 2018;**431**:1-7. DOI: 10.1088/1757-899X/431/12/122009

[9] Javanmardi A, Ibrahim Z, Ghaedi K, Jameel M, Khatibi H, Suhatril M. Seismic response characteristics of a base isolated cable-stayed bridge under moderate and strong ground motions. Archives of Civil and Mechanical Engineering. 2017;**17**:419-432. DOI: 10.1016/j.acme.2016.12.002

[10] Javanmardi A, Ibrahim Z, Ghaedi K, Khan NB, Ghadim HB. Seismic isolation retrofitting solution for an existing steel cable-stayed bridge. PLoS One. 2018;**13**:1-22. DOI: 10.1371/journal.pone.0200482

[11] Ghaedi K, Jameel M, Ibrahim Z, Khanzaei P. Seismic analysis of roller compacted concrete (RCC) dams considering effect of sizes and shapes of galleries. KSCE Journal of Civil Engineering. 2016;**20**:261-272. DOI: 10.1007/s12205-015-0538-2

[12] Ghaedi K, Hejazi F, Ibrahim Z, Khanzaei P. Flexible foundation effect on seismic analysis of roller compacted concrete (RCC) dams using finite element method. KSCE Journal of Civil Engineering. 2017;**22**:1-13

[13] Ghaedi K, Ibrahim Z, Adeli H. Invited review: Recent developments in vibration control of building and bridge structures. Journal of Vibroengineering. 2017;**19**:3564-3580

[14] Ghaedi K, Khanzaei P, Vaghei R, Fateh A, Javanmardi A, Gordan M, et al. Reservoir hydrostatic pressure effect on roller compacted concrete (RCC) dams. Malaysian Construction Research Journal. 2016;**19**:1-9

[15] Gordan M. Experimental Investigation of Passive Tuned Mass Damper and Fluid Viscous Damper on a Slender Two Dimension Steel Frame. Johor, Malaysia: University Technology of Malaysia; 2014

[16] Gordan M, Haddadiasl A, Marsono AK, Md Tap M. Investigation the behavior of a four-storey steel frame using viscous damper. Applied Mechanics and Materials. 2015;**735**:149-153

[17] Gordan M, Izadifar M, Haddadiasl A, Ahad J, Abadi R, Mohammadhosseini H. Interaction of across-wind and along-wind with tall buildings. Australian Journal of Basic and Applied Sciences. 2014;**8**:96-101

[18] Ghaedi K, Javanmardi A, Gordan M, Hamed K, Abdollah M. Application of 2D and 3D finite element modelling of gravity dams under seismic loading. In: 3rd National Graduate. Conference, Universiti Tenaga Nasional, Kuala Lumpur, Putrajaya Campus; 2015. pp. 264-269

[19] Ghaedi K, Ibrahim Z, Jameel M, Javanmardi A, Khatibi H. Seismic response analysis of fully base-isolated adjacent buildings with segregated foundations. Advances in Civil Engineering. 2018;**2018**:1-21. DOI: 10.1155/2018/4517940

[20] Ghaedi K, Ibrahim Z, Javanmardi A, Rupakhety R. Experimental study of a new Bar damper device for vibration control of structures subjected to earthquake loads. Journal of Earthquake Engineering. 2018:1-19. DOI: 10.1080/13632469.2018.1515796

[21] Javanmardi A, Ibrahim Z, Ghaedi K, Benisi Ghadim H, Hanif MU. State-of-the-art review of metallic dampers: Testing, development and implementation. Archives of Computational Methods in Engineering. 2019:1-24. DOI: 10.1007/s11831-019-09329-9

[22] Miranda T, Correia AG, Santos M, Ribeiro L, Cortez P. New models for strength and deformability parameter calculation in rock masses using data-mining techniques. International Journal of Geomechanics. 2011;**11**:44-58

[23] Buchheit RB, Garrett JH Jr, Lee SR, Brahme R. A knowledge discovery case study for the intelligent workplace. Computing in Civil and Building Engineering. 2000:914-921. DOI: 10.1061/40513(279)119

[24] Gordan M, Razak HA, Ismail Z, Ghaedi K. Recent developments in damage identification of structures using data mining. Latin American Journal of Solids and Structures. 2017;**14**:2373-2401. DOI: 10.1590/1679-78254378

[25] Liao S-H, Chu P-H, Hsiao P-Y. Data mining techniques and applications – A decade review from 2000 to 2011. Expert Systems with Applications. 2012;**39**:11303-11311

[26] Alves V, Cremona C, Cury A. On the use of symbolic vibration data for robust structural health monitoring. Proceedings of the Institution of Civil Engineers. 2015;**169**:715-723

[27] Gordan M, Razak HA, Ismail Z, Ghaedi K. Data mining based damage identification using imperialist competitive algorithm and artificial neural network. Latin American Journal of Solids and Structures. 2018;**15**:1-14

[28] Vapnik V. The Nature of Statistical Learning Theory. New York: Springer-Verlag; 1995

[29] He H-X, Yan W. Structural damage detection with wavelet support vector machine: Introduction and applications. Structural Control and Health Monitoring. 2007;**14**:162-176

[30] Tinoco J, Gomes Correia A, Cortez P. Support vector machines applied to uniaxial compressive strength prediction of jet grouting columns. Computers and Geotechnics. 2014;**55**:132-140

[31] Kishore B, Satyanarayana MRS, Sujatha K. Efficient fault detection using support vector machine based hybrid expert system. International Journal of Systems Assurance Engineering and Management. 2014:34-40. DOI: 10.1007/s13198-014-0281-y

[32] Li C, Liu Q. Support vector machine based semi-active control of structures: A new control strategy. Structural Design of Tall and Special Buildings. 2011;**20**:711-720

[33] Ahmed R, El Sayed M, Gadsden SA, Tjong J, Habibi S. Artificial neural network training utilizing the smooth variable structure filter estimation strategy. Neural Computing and Applications. 2015;**27**:537-548

[34] Ali A, Amin SE, Ramadan HH, Tolba MF. Enhancement of OMI aerosol optical depth data assimilation using artificial neural network. Neural Computing and Applications. 2013;**23**:2267-2279

[35] Azimzadegan T, Khoeini M, Etaat M, Khoshakhlagh A. An artificial neural-network model for impact properties in X70 pipeline steels. Neural Computing and Applications. 2012;**23**:1473-1480

[36] Suresh S, Narasimhan S, Nagarajaiah S. Direct adaptive neural controller for the active control of earthquake-excited nonlinear base-isolated buildings. Structural Control and Health Monitoring. 2012;**19**:370-384

[37] Mitchell R, Kim Y, El-Korchi T, Cha Y-J. Wavelet-neuro-fuzzy control of hybrid building-active tuned mass damper system under seismic excitations. Journal of Vibration and Control. 2012;**19**:1881-1894

[38] Jung H-J, Lee H-J, Yoon W-H, Oh J-W, Lee I-W. Semiactive neurocontrol for seismic response reduction using smart damping strategy. Journal of Computing in Civil Engineering. 2004;**18**:277-281

[39] Chen ZH, Ni YQ. On-board identification and control performance verification of an MR damper incorporated with structure. Journal of Intelligent Material Systems and Structures. 2011;**22**:1551-1565

[40] Vaidyanathan CV, Kamatchi P, Ravichandran R. Artificial neural networks for predicting the response of structural systems with viscoelastic dampers. Computer-Aided Civil and Infrastructure Engineering. 2005;**20**:294-302

[41] Lin T, Chang K, Chung L, Lin Y. Active control with optical fiber sensors and neural networks. I: Theoretical analysis. Journal of Structural Engineering. 2006;**132**:1293-1304

[42] Lin T, Chang K, Lin Y. Active control with optical fiber sensors and neural networks. II: experimental verification. Journal of Structural Engineering. 2006;**132**:1304-1314

[43] Weber F, Bhowmik S, Høgsberg J. Extended neural network-based scheme for real-time force tracking with magnetorheological dampers. Structural Control and Health Monitoring. 2014;**21**:225-247

[44] Rutkowski L. Flexible Neuro-Fuzzy Systems: Structures, Learning and Performance Evaluation. Poland: Technical University of Czestochowa, Springer Science & Business Media; 2004

[45] Nyongesa HO. Enhancing neural control systems by fuzzy logic and evolutionary reinforcement. Neural Computing and Applications. 1998;**7**:121-130


[46] Nerves AC, Krishnan R. Active control strategies for tall civil structures. Proceedings of IECON'95-21st Annual Conference on IEEE Industrial Electronics. 1995;**2**:962-967


[47] Zhou L, Chang C, Wang L. Adaptive fuzzy control for nonlinear building–magnetorheological damper system. Journal of Structural Engineering. 2003;**129**:905-913

[48] Wongprasert N, Symans MD. Experimental evaluation of adaptive elastomeric Base-isolated structures using variable-orifice fluid dampers. 2005;**131**:867-877

[49] Ghaffarzadeh H, Dehrod EA, Talebian N. Semi-active fuzzy control for seismic response reduction of building frames using variable orifice dampers subjected to near-fault earthquakes. Journal of Vibration and Control. 2012;**19**:1980-1998

[50] Kim Y, Langari R, Hurlebaus S. Control of a seismically excited benchmark building nonlinear fuzzy control. Journal of Structural Engineering. 2010;**136**:1023-1026

[51] Ahlawat AS, Ramaswamy A. Multiobjective optimal fuzzy logic controller driven active and hybrid control systems for seismically excited nonlinear buildings. Journal of Engineering Mechanics. 2004;**130**:416-423

[52] Samali B, Al-dawod M, Kwok KCS, Naghdy F. Active control of cross wind response of 76-story tall building using a fuzzy controller. Journal of Engineering Mechanics. 2004;**130**:492-498

[53] Soleymani M, Khodadadi M. Adaptive fuzzy controller for active tuned mass damper of a benchmark tall building subjected to seismic and wind loads. Structural Design of Tall and Special Buildings. 2014;**23**:781-800

[54] Lin C-J, Yau H-T, Lee C-Y, Tung K-H. System identification and semiactive control of a squeeze-mode magnetorheological damper. IEEE/ASME Transactions on Mechatronics. 2013;**18**:1691-1701

[55] Kim H, Roschke PN. Fuzzy control of base-isolation system using multiobjective genetic algorithm. Computer-Aided Civil and Infrastructure Engineering. 2006;**21**:436-449

[56] Reigles DG, Symans MD. Supervisory fuzzy control of a base-isolated benchmark building utilizing a neuro-fuzzy model of controllable fluid viscous dampers. Structural Control and Health Monitoring. 2006;**13**:724-747

[57] Shook DA, Roschke PN, Ozbulut OE. Superelastic semi-active damping of a base-isolated structure. Structural Control and Health Monitoring. 2008;**15**:746-768

[58] Mohtat A, Yousefi-Koma A, Dehghan-Niri E. Active vibration control of seismically excited structures by ATMDs: Stability and performance robustness perspective. International Journal of Structural Stability and Dynamics. 2010;**10**:501-527

[59] Guo Y-Q, Fei S-M, Xu Z-D. Simulation analysis on intelligent structures with magnetorheological dampers. Journal of Intelligent Material Systems and Structures. 2007;**19**:715-726

[60] Pourzeynali S, Datta TK. Semiactive fuzzy logic control of suspension. Journal of Structural Engineering. 2005;**131**:900-912

[61] Gu ZQ, Oyadiji SO. Application of MR damper in structural control using ANFIS method. Computers and Structures. 2008;**86**:427-436. DOI: 10.1016/j.compstruc.2007.02.024

[62] Ghaedi K, Ibrahim Z. Earthquake prediction. In: Zouaghi T, editor. Earthquakes - Tectonics, Hazard Risk Mitig. Rijeka, Croatia: InTech; 2017. pp. 205-227. DOI: 10.5772/65511

[63] Symeonidis A, Mitkas P. Data mining and knowledge discovery: A brief overview. In: Agent Intelligence Through Data Mining. United States: Springer; 2005. pp. 11-40

[64] Chen TY, Huang JH. Application of data mining in a global optimization algorithm. Advances in Engineering Software. 2013;**66**:24-33

[65] Xiao F, Fan C. Data mining in building automation system for improving building operational performance. Energy and Buildings. 2014;**75**:109-118

[66] Kim Y, Langari R, Hurlebaus S. MIMO fuzzy identification of building-MR damper systems. Journal of Intelligent Fuzzy Systems. 2011;**22**:185-205

[67] Aghajanloo M, Sabziparvar A. Artificial neural network – Genetic algorithm for estimation of crop evapotranspiration in a semi-arid region of Iran. Neural Computing and Applications. 2012;**23**:1387-1393

[68] Pourzeynali S, Salimi S. Robust multi-objective optimization design of active tuned mass damper system to mitigate the vibrations of a high-rise building. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science. 2014

[69] Singh MP, Singh S, Moreschi LM. Tuned mass dampers for response control of torsional buildings. Earthquake Engineering and Structural Dynamics. 2002;**31**:749-769

[70] Mohebbi M, Joghataie A. Designing optimal tuned mass dampers for nonlinear frames by distributed genetic algorithms. Structural Design of Tall and Special Buildings. 2012;**21**:57-76

[71] Ahadi P, Mohebbi M, Shakeri K. Using optimal multiple tuned liquid column dampers for mitigating the seismic response of structures. ISRN Civil Engineering. 2012;**2012**:1-6

[72] Hejazi F, Toloue I, Jaafar MS, Noorzaei J. Optimization of earthquake energy dissipation system by genetic algorithm. Computer-Aided Civil and Infrastructure Engineering. 2013;**28**:796-810

[73] Cha Y-J, Kim Y, Raich AM, Agrawal AK. Multi-objective optimization for actuator and sensor layouts of actively controlled 3D buildings. Journal of Vibration and Control. 2012;**19**:942-960

[74] Wongprasert N, Symans MD. Application of a genetic algorithm for optimal damper. Journal of Engineering Mechanics. 2004;**130**:401-406

[75] Lin T, Chu Y, Chang K, Chang C. Renovated controller designed by genetic algorithms. Earthquake Engineering and Structural Dynamics. 2009;**38**:457-475

[76] Kennedy J, Eberhart R. Particle swarm optimization. Proceedings of ICNN'95 - International Conference on Neural Networks. 1995;**4**:1942-1948

[77] Ghayeb HH, Razak HA, Sulong NHR, Hanoon AN, Abutaha F, Ibrahim HA, et al. Predicting the mechanical properties of concrete using intelligent techniques to reduce CO2 emissions. Materiales de Construcción. 2019;**69**:1-20

[78] Gholizadeh S, Fattahi F. Design optimization of tall steel buildings by a modified particle. Structural Design of Tall and Special Buildings. 2014;**23**:285-301


*Data Mining Technology for Structural Control Systems: Concept, Development, and Comparison*

*DOI: http://dx.doi.org/10.5772/intechopen.88651*


[79] Gundogdu O, Egrioglu E, Aladag CH, Yolcu U. Multiplicative neuron model artificial neural network based on Gaussian activation function. Neural Computing and Applications. 2015;**27**:927-935

*Recent Trends in Artificial Neural Networks - From Training to Prediction*


[80] Leung AYT, Zhang H, Cheng CC, Lee YY. Particle swarm optimization of TMD by non-stationary base excitation during earthquake. Earthquake Engineering and Structural Dynamics. 2008;**37**:1223-1246

[81] Amini F, Hazaveh NK, Rad AA. Wavelet PSO-based LQR algorithm for optimal structural control using active tuned mass dampers. Computer-Aided Civil and Infrastructure Engineering. 2013;**28**:542-557

### *Edited by Ali Sadollah and Carlos M. Travieso-Gonzalez*

Artificial intelligence (AI) is everywhere, and it is here to stay. Most aspects of our lives are now touched by AI in one way or another, from deciding which books or flights to buy online to whether our job applications are successful, whether we receive a bank loan, and even what treatment we receive for cancer. Artificial neural networks (ANNs), as a part of AI, can solve problems such as regression and classification with high levels of accuracy.

This book discusses the use of ANNs for solving time series and clustering problems. Combining ANNs with optimization methods, particularly metaheuristics regarded as global optimizers, yields a strong and reliable prediction tool for handling real-life applications. The book also demonstrates how different fields of study use ANNs, showing their wide reach and relevance.

Published in London, UK © 2020 IntechOpen © ktsimage / iStock

Recent Trends in Artificial Neural Networks - from Training to Prediction
