### 3. Artificial intelligence (AI) techniques

Artificial intelligence (AI) can be described as the imitation of human intelligence processes by machines, especially computer systems. These processes include acquiring information from sets of data and using the logic of their interdependencies to reach approximate or definite conclusions while self-correcting [18]. The term AI was coined by John McCarthy, an American computer scientist, in 1956 at the Dartmouth Conference, where the discipline was born [19]. According to the Artificial Intelligence Applications Institute (AIAI), AI areas of application include: case-based reasoning, a technique for utilizing historical datasets to guide diagnosis and fault finding; evolutionary algorithms, an adaptive search technique with very broad applicability in scheduling, optimization, and model adaptation; planning and workflow, the modeling, task setting, planning, execution, coordination, and presentation of activity-related information; intelligent systems, an approach to building knowledge-based systems; and knowledge management, the identification of knowledge assets in an organization and support for knowledge-based work [20].

The advantages of AI techniques include, but are not limited to: the ability to model complex, nonlinear processes without a priori assumptions about the relationship between input and output variables; the potential to generate accurate analysis and results from large historical databases; the ability to analyze large datasets to recognize patterns and characteristics in situations where rules are unknown or the relationships and dependencies among variables are complex; and cost-effectiveness, since many AI algorithms execute quickly once they have been trained. The ability to train the system with datasets, instead of writing programs, makes AI more cost-effective, and changes can be implemented easily when the need arises. Multiple algorithms can also be combined, exploiting the competitive advantages of each, to develop an ensemble AI tool. AI techniques can be deployed to solve routine, tedious tasks, which are then completed faster and with fewer errors and defects than by humans [21].
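As an illustration of that last point, the sketch below (not from the chapter) averages two dissimilar regressors into a simple ensemble with scikit-learn; the inputs are synthetic stand-ins rather than field data:

```python
import numpy as np
from sklearn.ensemble import VotingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(3)
X = rng.uniform(size=(200, 3))                  # hypothetical drilling inputs
y = 8 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)

# Average the predictions of two dissimilar learners so that each
# compensates for the other's weaknesses (a minimal ensemble).
ensemble = VotingRegressor([
    ("ann", MLPRegressor(max_iter=2000, random_state=0)),
    ("svr", SVR()),
])
ensemble.fit(X, y)
print(ensemble.predict(X[:3]))                  # ensemble = mean of both models
```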


| Model | Number of inputs | Input variables | Output |
|-------|------------------|-----------------|--------|
| ANN | 6 | Rock strength, rock type, abrasion, WOB, RPM, and mud weight [25] | ROP and wear |
| ANN | 9 | UCS, bit size, bit type, drillability coefficient, gross hours drilled, WOB, RPM, drilling mud density, and AV (apparent viscosity) [22] | ROP |
| ANN | 20 | Differential pressure, hydraulics, hole depth, pump pressure, density of the overlying rock, equivalent circulating density, hole size, formation drillability, permeability and porosity, drilling fluid type, plastic viscosity of mud, yield point of mud, initial gel strength of mud, 10 min gel strength of mud, bit type and its properties, weight on the bit and rotary speed, bit wear, and bit hydraulic power [5] | ROP |
| ANN | 7 | Depth, bit weight, rotary speed, tooth wear, Reynolds number function, ECD, and pore pressure gradient [23] | ROP |
| SVR | 12 | Viscosity, MW, pump rate, well deviation, RPM, WOB, depth, formation, bit size, and bit tooth wear [6] | ROP |
| ANN | 9 | Formation drillability, formation abrasiveness, bearing wear, tooth wear, pump rate, rotating time, rotary torque, WOB, and rotary speed [24] | ROP |
| ANN | 13 | Bit type, IADC codes, bit diameter, bit status, measured depth, true vertical depth, weight on bit, rotary speed, torque, pump flow rate, standpipe pressure, mud weight, and formation mineralogy [26] | ROP |

Table 1. Summary of some recent applications of AI in ROP prediction.


Limitations of AI techniques include some of them being tagged as "black boxes," which merely attempt to chart a relationship between input and output variables based on a training data set. This raises concerns about the ability of such tools to generalize to situations that were not well represented in the data set. However, application of the right domain knowledge helps to address this limitation. Other limitations are the lack of a human touch, enormous processing time for large datasets, and the requirement for high computational resources and skills.

Despite some of the disadvantages of AI techniques, their overwhelming advantages have made them endearing in different fields, including the exploration and exploitation of oil and gas. Recent advancements in the collection and transmission of real-time drilling data, coupled with the insufficiency of empirical ROP models to unveil real-time downhole conditions, have made researchers shift to AI techniques for prediction purposes. Furthermore, the effects of all factors affecting ROP and downhole conditions are inherent in the collected surface drilling data. Applying data-driven predictive analysis has proven useful in decoding the hidden information in these drilling data.



Table 1 shows some recent work done using artificial intelligence to predict ROP. ANN has been the most often used technique. What is also clear from the literature review is that the selection of inputs is not consistent, and some inputs may be difficult to obtain in some instances. Also, for optimization purposes while drilling, some of the variables included in the models are not controllable factors that can be adjusted in real time.

#### 3.1. Some artificial intelligence techniques

Below are some of the AI techniques considered in this study. A summary of their characteristics is presented in Table 2.

#### 3.1.1. Artificial neural network (ANN)

Artificial neural networks (ANNs) are designed based on the examination of biological central nervous systems and their neurons, axons, dendrites, and synapses. Similarly, an ANN is composed of elements called "neurons," "units," or "processing elements" (PEs). Each PE has an input/output (I/O) specification, and the PEs are connected together to form a network of nodes that mimics biological neural networks, hence the name "artificial neural network" (ANN).



The use of ANN as a reliable universal estimator for constructing nonlinear models from data is very common. It is capable of approximating both linear and nonlinear functions defined over a range of data to the desired degree of accuracy using an appropriate number of hidden neurons; this has been proven mathematically [27]. Being data-driven models, ANNs learn from the training data presented to them and do not require any a priori assumptions about the problem, not even information about statistical distributions. In petroleum engineering, the training data may be assembled from experimental data, past field data, numerical reservoir simulation, real-time data, or a combination of these [5]. Though assumptions are not required, knowledge of the statistical distribution of the input data and domain knowledge of the problem can help to speed up training. Features such as the ability to run parallel processes and to apply learning instead of programming have made ANN an efficient tool applied in various fields of engineering [28]. In the training process, the weights and biases of the network are adjusted on the basis of learning rules; upon completion of training, these fixed weights and biases act as the memory of the network.
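To make the training process concrete, here is a minimal, hypothetical sketch using scikit-learn's MLPRegressor; the three inputs are synthetic stand-ins for drilling parameters (e.g., WOB, RPM, flow rate), and none of the settings come from the chapter:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 3))                   # synthetic drilling inputs
y = 20 * X[:, 0] * X[:, 1] + 5 * X[:, 2] + rng.normal(scale=0.5, size=500)

# Weights and biases are adjusted by the learning rule during fit();
# after training they are fixed and act as the network's "memory".
model = make_pipeline(
    StandardScaler(),                            # well-conditioned inputs train faster
    MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0),
)
model.fit(X, y)
print("R^2 on training data:", model.score(X, y))
```

The scaling step reflects the point above: knowledge of the input distributions, while not required, speeds up training.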

Some of the advantages of ANN are: the ability to handle linear and nonlinear models, as complex linear and nonlinear relationships can be derived using neural networks; flexible input/output, as neural networks can operate with one or more descriptors and/or response variables and can be used with both categorical and continuous data; and noise tolerance, as neural networks are less sensitive to noise than statistical regression models. Some of the major limitations are: black box models, as it is not possible to explain how the results were calculated in any meaningful way; and optimizing parameters, as there are many parameters to be set in a neural network, and optimizing the network can be challenging, especially to avoid overtraining [23, 27, 29–32].

#### 3.1.2. Extreme learning machine (ELM)

Extreme learning machines (ELM) are derived from ANN; an ELM is, however, a generally unified single-layer feed-forward network framework that requires less human intervention and has thus been found to run faster than most conventional neuron-based techniques. This is notably due to the fact that the learning parameters of its hidden nodes, including input weights and biases, are assigned randomly without any dependency, and to the simple generalized operation involved in determining the output weights. The training phase of the ELM algorithm is completed efficiently using a fixed nonlinear transformation, which makes for a fast learning process. The efficiency of ELM in online or real-time applications cannot be overemphasized, as it determines all the network parameters analytically and therefore avoids unnecessary human intervention [33].

Also, the universal approximation ability of the standard ELM with additive or radial basis function (RBF) activation functions has been proved [7, 33]. The successful application of ELM to many real-world problems is well documented, especially for classification and regression problems on very large-scale datasets. ELM is very efficient and effective as an innovative training algorithm for single-hidden layer feed-forward neural networks (SLFNs) [33].

Some of the merits and limitations of ELM can be summarized as follows. ELM reduces the computational burden without sacrificing generalization capability in the expectation sense. ELM needs much less training time than popular ANN and SVM/SVR methods. The prediction accuracy of ELM is usually slightly better than that of ANN and close to that of SVM/SVR in many applications. Compared with ANN and SVR, ELM can be implemented easily, since there is no parameter to be tuned except an insensitive parameter L, and many nonlinear activation functions can be used [33]. Its limitations are that ELM suffers from both uncertainty and the generalization degradation problem and that, for the widely used Gaussian-type activation function, its generalization capability is degraded [34].
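The core ELM recipe (random hidden weights, analytic output weights) is short enough to sketch directly. The minimal NumPy implementation below, with synthetic stand-in data, is an illustration rather than the chapter's own code:

```python
import numpy as np

rng = np.random.default_rng(42)

def elm_train(X, y, L=50):
    """Train an ELM: hidden weights/biases are random; output weights
    are solved analytically via the Moore-Penrose pseudo-inverse."""
    W = rng.normal(size=(X.shape[1], L))      # random input weights (never tuned)
    b = rng.normal(size=L)                    # random hidden biases (never tuned)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))    # fixed nonlinear transformation
    beta = np.linalg.pinv(H) @ y              # analytic output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Synthetic example: 4 inputs (hypothetical WOB, RPM, flow rate, depth) -> ROP
X = rng.uniform(size=(200, 4))
y = 3.0 * X[:, 0] + np.sin(2 * np.pi * X[:, 1]) + 0.1 * rng.normal(size=200)
W, b, beta = elm_train(X, y, L=50)
print("train RMSE:", np.sqrt(np.mean((elm_predict(X, W, b, beta) - y) ** 2)))
```

Note that only L (the number of hidden nodes) is chosen by the user; W and b are never tuned, which is exactly why training reduces to a single pseudo-inverse.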

#### 3.1.3. Support vector regression (SVR)

Support vector regression (SVR) methodology involves a group of related supervised learning methods employed for both regression and classification problems. These methods fall in the category of generalized linear classifiers (GLCs). In SVRs, a maximal hyperplane is constructed to separate a high-dimensional space of input vectors mapped with the feature space. The method was initially designed as a classifier and was modified in a later study by Vapnik [35] into the support vector regressor (SVR) for regression problems. Its robustness under a single-model estimation condition has been testified to [36]. Hence, it can be considered invaluable for the estimation of both real-valued and indicator functions, as is common in pattern recognition and regression problems, respectively.

When used as a regressor, SVRs attempt to choose the "best" model from a list of possible models (i.e., approximating functions) f(x, ω), where the set of generalized parameters is given by ω. Generally, "good" models are those that can generalize their good predictive performance to an out-of-sample test set; this is often determined by how well the model minimizes the cost function while training on the training data. The core feature of SVR in control of its attractive properties is the notion of an ε-insensitive loss function. SVR is suitable for estimating the dominant model under a multiple-model formulation, where the objective function can be viewed as a primal problem whose dual form can be obtained by constructing a Lagrange function and introducing a set of (dual) variables.
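For reference, the ε-insensitive loss takes the standard form below (written out here for clarity; the chapter names it without stating it):

$$
L_\varepsilon\bigl(y, f(x, \omega)\bigr) =
\begin{cases}
0, & \text{if } \lvert y - f(x, \omega) \rvert \le \varepsilon, \\
\lvert y - f(x, \omega) \rvert - \varepsilon, & \text{otherwise.}
\end{cases}
$$

Residuals smaller than ε incur no penalty at all, which is why the fitted model ends up depending only on samples on or outside the ε-tube, the support vectors.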

SVR's generalization characteristics are ensured by the special properties of the optimal hyperplane, which maximizes the distance to training examples in a high-dimensional feature space; it has been shown to exhibit excellent performance [32]. The merits and limitations of SVRs can be summarized thus. Merits: SVRs can deal with very high-dimensional data; they can learn very elaborate concepts; and they usually work very well. Limitations: they require both positive and negative examples; a good kernel function must be selected; they consume lots of memory and CPU time; and there are some numerical stability problems in solving the constrained optimization [30, 37, 38]. Analysis of (linear) SVR indicates that the regression model depends mainly on the support vectors on the border of the ε-insensitive zone, so the SVR solution is very robust to "outliers" (i.e., data samples outside the ε-insensitive zone). These properties make SVR very attractive for use in an iterative procedure for multiple-model estimation.
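The following minimal sketch (an illustration under assumed settings, not the chapter's code) fits an RBF-kernel SVR with scikit-learn and reports how many samples ended up as support vectors, i.e., on or outside the ε-insensitive zone:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 3))                  # hypothetical drilling inputs
y = 10 * X[:, 0] + np.sin(3 * X[:, 1]) + rng.normal(scale=0.2, size=300)

# epsilon sets the half-width of the insensitive zone; samples inside it
# contribute no loss, so the fit depends only on the support vectors.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
svr.fit(X, y)
print("support vectors:", svr.named_steps["svr"].support_.size, "of", len(X))
```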

#### 3.1.4. Least square support vector regression (LS-SVR)

LS-SVRs are reformulated versions of the original SVR algorithm for classification and function estimation that maintain the advantages and attributes of the original SVR theory. LS-SVRs are closely related to regularization networks and Gaussian processes but additionally emphasize and exploit primal-dual interpretations [39]. LS-SVR possesses excellent generalization performance and is associated with low computational cost. It requires less effort in model training than the original SVR, owing to its simplified algorithm: it minimizes a quadratic penalty on the slack variables, which allows the quadratic programming problem to be reduced to a set of matrix inversion operations in the dual space, and this takes less time than solving the SVR quadratic problem [40]. Robustness, sparseness, and weightings can be incorporated into LS-SVRs where needed, and a Bayesian framework with three levels of inference has also been developed [41]. Some of its limitations include being ineffective at handling non-Gaussian noise and being sensitive to outliers [42].
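To show what "reduced to matrix inversion operations" means in practice, here is a minimal, hypothetical LS-SVR sketch in NumPy; the RBF kernel and all parameter values are illustrative assumptions, not the chapter's:

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf(A, B, sigma=1.0):
    """RBF kernel matrix between row-vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvr_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVR KKT system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))   # one linear solve
    return sol[0], sol[1:]                                 # bias b, dual weights alpha

def lssvr_predict(Xq, X, b, alpha, sigma=1.0):
    return rbf(Xq, X, sigma) @ alpha + b

X = rng.uniform(size=(150, 2))
y = np.sin(4 * X[:, 0]) + X[:, 1] + 0.05 * rng.normal(size=150)
b, alpha = lssvr_fit(X, y)
print("train RMSE:", np.sqrt(np.mean((lssvr_predict(X, X, b, alpha) - y) ** 2)))
```

The single linear solve replaces the quadratic program of classic SVR, which is where the training-time savings come from; the trade-offs, as noted above, include sensitivity to outliers and to non-Gaussian noise.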

| Model | Characteristics | Advantages | Limitations |
|-------|-----------------|------------|-------------|
| ANN | Nonlinearity; input-output mapping; supervised learning while working through training samples; evidential response; neurobiological analogy; very large scale integration (VLSI) applicability | Ability to run parallel processes and apply learning; complex linear and nonlinear relationships can be derived using ANN; flexible input/output; less sensitive to noise | Black box models: it is not possible to explain how the results were calculated in any meaningful way; requirements of elaborate training examples; many optimizing parameters to be set in defining the model to avoid overtraining; consumes lots of computer resources |
| ELM | Input weights and biases are assigned randomly without any dependency; fast learning process by using a fixed nonlinear transformation in the training phase; an innovative training algorithm for single-hidden layer feed-forward neural networks (SLFNs) | Online real-time application; avoids unnecessary human intervention; reduces computational burden; needs less training time; prediction accuracy slightly better than ANN; easy implementation | Suffers from uncertainty; suffers from the generalization degradation problem; black box models |
| SVR | Supervised learning; a maximal hyperplane is constructed to separate a high-dimensional space of input vectors mapped with the feature space; its core attractive feature is the notion of an ε-insensitive loss function | Invaluable for the estimation of both real-valued and indicator functions; handles very high dimensional data; can learn very elaborate concepts; more stable; robust to "outliers" (i.e., data samples outside the ε-insensitive zone) | Time consuming for training, testing, and validation of models; uses a complex quadratic programming approach, making it difficult for very large datasets; consumes lots of computer resources; black box model |
| LS-SVR | Closely related to regularization networks and Gaussian processes but additionally emphasizes and exploits primal-dual interpretations; simplified algorithm | Requires less effort in model training in comparison to the original SVR, owing to its simplified algorithm | Highly sensitive to outliers; ineffective at handling non-Gaussian noise |

Table 2. Summary of AI techniques used in the case study.

### 4. Case study

A case study is presented below to illustrate one of the advantages inherent in combining AI techniques with domain expert knowledge for improved prediction and optimization of drilling rate of penetration.

#### 4.1. Data description

In this study, data from two development wells in the onshore Niger Delta hydrocarbon province were used for the development and testing of the models in each of the AI algorithms compared. The field is about 95 square kilometers in extent, with a northwest-southeast-trending, dual-culmination rollover anticline. The wells chosen represent the best in terms of drilling performance, as measured by best ROP and bit runs, for all three hole sections considered. The formations encountered are mainly consolidated intercalations of shales and shallow-marine shoreface sands with a normal compaction trend, a typical clastic depositional environment of the Niger Delta. The field is mainly a gas field, with some of the reservoirs having significant oil rims.

The wells used for the study were selected for ROP prediction because they were best in class in terms of drilling performance, a result of carefully optimized drilling parameters and practices. The repeatability of such a feat is highly desirable, hence the choice of these wells. The formations encountered are well correlated across the field, with lateral continuity. The two wells fairly represent the field: Well-A is located in the eastern flank of the field, while Well-B is located 8 km to the west of Well-A and just about 3 km from the field's western boundary. While Well-A is highly deviated and deeper in reach, with a maximum inclination of 74° at a total depth of 11,701 ft TVD, Well-B is slightly deviated, with a maximum inclination of 23° at a total depth of 9000 ft TVD. The wells are also similar in terms of drilling equipment: the same rig was used for their construction, and the bit types and bottom hole assemblies (BHAs) used were the same; hence, they were both drilled with the same bottom hole hydraulics. Details of the bits used in the three hole sections included in this research are presented in Table 3.

| Hole section | BHA No. | Type | Make/Model | IADC code | Initial status | Nozzle size | TFA | IADC dull grade |
|--------------|---------|------|------------|-----------|----------------|-------------|-----|-----------------|
| 16" hole section | 1 | Tri-cone bit | Baker Hughes Christensen bits/MXL-DS3DDT | 135 | New | 22×3; 1×20 | 1.42 | 6-5-WT-A-E-1/16-FC-PR |
| 12-1/4" hole section | 1 | PDC bit | VAREL PDC (VTD713 P2DGX) | — | New | 16×5; 18×2 | 1.479 | 2-2-CT-A-X-1/16-WO-TD |
| 8-1/2" pilot hole section | 1 | PDC bit | BM 563 | — | New | 16×2; 13×8 | 1.17 | 1-1-WT-A-X-1-NO-TD |

Table 3. Bit details.

