The **A** may be further represented by

$$\mathbf{A} = f(\mathbf{AC}, \mathbf{AL}) \tag{3}$$

where **AC** represents the vector of applied agricultural chemical characteristics, such as the type of agricultural chemical (insecticide, herbicide, etc.), application rate, and application season, and **AL** represents the land use patterns, such as the type of crop grown and the percentage of cropped area.

Here, the agricultural chemical considered is the herbicide atrazine, and the crop considered is corn. In this study, a fuzzy rule based model with FCM simulates the stream system behavior from inputs of agricultural practices and the corresponding observed concentration measurements. In effect, the model tries to emulate the mechanism that produced the data set. In this way, the model learns a mathematical description of the physical system and can therefore be used as a tool for stream system simulation. The cluster centers of inputs and outputs obtained using the FCM model represent typical characteristics of the system behaviour, and hence are utilized in forming the rule base of the fuzzy model.

**3. Methodology**

Statistical methodologies have traditionally been used to predict diffuse pollutants in streams. However, herbicide transport is a complex and uncertain phenomenon, and traditional methods such as regression cannot incorporate uncertainty in model predictions. The present work discusses methodologies based on recent soft computing techniques such as fuzzy logic, artificial neural networks (ANN), and their hybrids. The application of the proposed methodology is illustrated with real data to estimate the diffuse pollution concentration in a stream system due to the application of a typical herbicide, atrazine, in corn fields with limited data availability.

#### **3.1 Modeling approach**

Models based on fuzzy logic and ANN, also known as intelligent or soft computing models, are potentially capable of fitting nonlinear functions or relationships. Identification of the model architecture is a decisive factor in the simulation and comparison, and is crucial in the ANN model building process. While the inputs and outputs of the ANN model are problem dependent, there is no direct, precise way to determine the optimal number of hidden nodes (Nayak et al., 2005). The model architecture is selected through a trial and error procedure (Singh et al., 2004). The fuzzy model, on the other hand, may be considered as a mapping of the input space into the output space through partitions of the multidimensional feature space of inputs and outputs. Each partition represents a fuzzy set with a membership function.

#### **3.2 Fuzzy rule based system**

Fuzzy logic emerged as a more general form of logic that can handle the concept of partial truth. The pioneering work of Zadeh (1965) on fuzzy logic has been used as the foundation for a fuzzy modeling methodology that allows an easier transition between humans and computers for decision making and a better way to handle imprecise and uncertain information. Human beings think verbally, not numerically. Fuzzy logic systems involve verbal statements and are therefore more in line with human perception (Zadeh, 2000). Fuzzy logic has an advantage over many statistical methods in that the performance of a fuzzy expert system is not dependent on the volume of historical data available. Since these expert systems produce a result based on logical linguistic rules, extreme data points in a small data set do not unduly influence these models. Because of these characteristics, fuzzy logic may be a more suitable method for diffuse pollution forecasting than the usual regression modeling techniques used by many researchers (e.g. Goolsby and Battaglin, 1993; Larson and Gilliom, 2001; Tesfamichael et al., 2005) for estimating diffuse pollution concentration in streams or other water bodies.

#### **3.2.1 Fuzzy rule based system architecture**

The most common way to represent human knowledge is to form it into natural language expression of the type,

$$\text{IF premise (antecedent), THEN conclusion (consequent)}\tag{4}$$

The form in expression (4) is commonly referred to as the IF-THEN rule based form (Ross, 1997). It typically expresses an inference such that if a fact (premise, hypothesis, antecedent) is known, then another fact called a conclusion (consequent) can be inferred or derived. Fuzzy logic systems are rule based systems that implement a nonlinear mapping (Dadone and VanLandingham, 2000) between stresses (represented by consequents) and state variables (represented by antecedents). Creating a fuzzy rule based system may be summarized in four basic steps (Ross, 1997; Mahabir et al., 2003; Singh and Singh, 2005).


Subjective decisions are frequently required in fuzzy logic modeling, particularly in defining the membership functions for variables. In cases such as this study, where large data sets are not available to define every potential occurrence scenario for the fuzzification of the model, expert opinion is used to create the logic in the rule base system.

#### **3.2.2 Membership functions**

Membership functions used to describe linguistic knowledge are the most subjective and context-dependent part of fuzzy logic modeling (Vadiee, 1993). Each variable must have membership functions, usually represented by linguistic terms, defined for the entire range of possible values. The key idea in fuzzy logic is to allow any object to belong partially to different subsets of the universal set instead of belonging to a single set completely. Partial belonging to a set can be described numerically by a membership function that takes values between 0 and 1 inclusive. Intuition, inference, rank ordering, angular fuzzy sets, neural networks, genetic algorithms, and inductive reasoning are among the many ways to assign membership values or functions to fuzzy variables (Ross, 1997). Fuzzy membership functions may take many forms, but in practical applications simple linear functions, such as triangular ones, are preferable due to their computational efficiency (Khrisnapuram, 1998). In this study, triangular shapes are used to represent the membership functions.

Prediction of Herbicides Concentration in Streams 233

#### **3.3 Fuzzy c-means partitioning**

Fuzzy rule based models represent the system behaviour by means of if-then fuzzy rules. The basic requirement of a fuzzy rule based model is to fuzzify, or partition, the input and output representations of a physical system. Assigning the number, shape, and overlap of the membership functions is the most complex part of building a fuzzy rule based model. In most cases, the optimality of the membership functions assigned to the different fuzzy variables is not guaranteed. FCM is one of the methods for determining fuzzy partitions of the available data sets into a predetermined number of groups. The data points are divided into groups of points that are close to each other. Each data point belongs to a group or cluster with a membership function. Closeness between data points is defined by a distance metric or data center, and each metric yields a different partitioning. These cluster centers are used to assign the overlaps of the triangular membership functions in this study.
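As a rough illustration of how cluster centers could define overlapping triangular membership functions, the sketch below places one triangle peak on each center and puts its feet on the neighbouring centers. That placement rule, the shoulder treatment of the outermost sets, and the function names are assumptions made for illustration; the text does not specify how the overlaps were assigned. The four centers in the usage lines are the application-rate values listed for four fuzzy centers in Table 2.

```python
def triangular(x, left, peak, right):
    """Membership of x in a triangular fuzzy set with the given feet and peak."""
    if x == peak:
        return 1.0
    if left < x < peak:
        return (x - left) / (peak - left)
    if peak < x < right:
        return (right - x) / (right - peak)
    return 0.0


def memberships_from_centers(x, centers):
    """Membership of x in each triangular set whose peak is a cluster center.

    Assumption: each triangle's feet sit on the neighbouring centers, and the
    two outermost sets act as shoulders (membership 1 beyond the outer centers).
    """
    centers = sorted(centers)
    last = len(centers) - 1
    grades = []
    for j, peak in enumerate(centers):
        if (j == 0 and x <= peak) or (j == last and x >= peak):
            grades.append(1.0)  # shoulder region outside the outer centers
            continue
        left = centers[j - 1] if j > 0 else peak
        right = centers[j + 1] if j < last else peak
        grades.append(triangular(x, left, peak, right))
    return grades


# Application-rate centers (lb/acre) for the four-center case in Table 2.
rate_centers = [1.26, 1.31, 1.33, 1.36]
labels = ["low", "medium", "high", "very high"]
grades = memberships_from_centers(1.32, rate_centers)
for label, grade in zip(labels, grades):
    print(f"{label}: {grade:.2f}")
```

With this placement, a value midway between two centers belongs about equally to the two adjacent sets and not at all to the others, so the grades at any point between the outer centers sum to one.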

Fuzzy c-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters. The FCM method (developed by Dunn (1973) and improved by Bezdek (1981)) is frequently used in pattern recognition. It is based on minimization of the following objective function:

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C_N} u_{ij}^m \left\| x_i - c_j \right\|^2, \quad 1 \le m < \infty \tag{5}$$

where m is any real number greater than 1, $u_{ij}$ is the degree of membership of $x_i$ in cluster j, $x_i$ is the ith d-dimensional measured data point, $c_j$ is the d-dimensional center of cluster j, and $\|\cdot\|$ is any norm expressing the similarity between a measured data point and a center. N is the total number of data points, and $C_N$ is the total number of fuzzy centers. Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the membership $u_{ij}$ and the cluster centers $c_j$ updated by:

$$u_{ij} = \frac{1}{\sum_{k=1}^{C_N} \left(\frac{\left\| x_i - c_j \right\|}{\left\| x_i - c_k \right\|}\right)^{\frac{2}{m-1}}} \tag{6}$$

$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^m \, x_i}{\sum_{i=1}^{N} u_{ij}^m} \tag{7}$$


The iteration stops when $\max_{ij} \left| u_{ij}^{(k+1)} - u_{ij}^{(k)} \right| < \varepsilon$, where ε is a termination criterion between 0 and 1 and k is the iteration step. This study implements the FCM algorithm (Matlab version 6.5) with m = 2 and ε equal to $10^{-5}$ to obtain the pre-specified fuzzy centers.
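The update loop described by equations (6) and (7) and the stopping rule can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the Matlab implementation used in the study: the Euclidean norm, the random initialisation, the `max_iter` cap, the handling of a point that coincides with a center, and the toy data are all choices made here.

```python
import random


def fcm(points, n_clusters, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy c-means on a list of equal-length tuples.

    Returns (centers, u), where u[i][j] is the membership of point i in cluster j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1 (assumption).
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Equation (7): centers are membership-weighted means of the data.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / w_sum
                for d in range(dim)))

        # Equation (6): memberships from relative distances to the centers.
        new_u = []
        for x in points:
            dist = [sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5 for c in centers]
            if min(dist) == 0.0:
                # Point coincides with a center: give it crisp membership there.
                hits = [1.0 if dj == 0.0 else 0.0 for dj in dist]
                new_u.append([h / sum(hits) for h in hits])
            else:
                power = 2.0 / (m - 1.0)
                new_u.append([
                    1.0 / sum((dj / dk) ** power for dk in dist) for dj in dist])

        # Stopping rule: largest change in any membership falls below eps.
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(n_clusters))
        u = new_u
        if change < eps:
            break
    return centers, u


# Toy one-dimensional data with two obvious groups (hypothetical values).
data = [(0.9,), (1.0,), (1.1,), (4.8,), (5.0,), (5.2,)]
centers, u = fcm(data, n_clusters=2)
print(sorted(round(c[0], 2) for c in centers))
```

Because every membership row is normalised by construction, each data point's memberships across the clusters sum to one, which is the property that lets a point belong partly to two neighbouring fuzzy sets.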

#### **3.4 Fuzzy rule based system with FCM for estimation of diffuse pollution concentration in streams**

The watershed of a stream plays a vital role in influencing the diffuse pollution concentration in the stream. Basic Steps 1 through 4, discussed earlier in the rule based system section, are implemented by partitioning the input and output spaces into fuzzy regions with FCM, generating fuzzy rules from the available data pairs, assigning a degree to each rule, constructing a combined fuzzy rule base, and mapping from the input space to the output space using the rule base and a defuzzification procedure (Wang and Mendel, 1992).

The vectors AC and AL, as represented by equation (2), are characterized for the specified watershed of the streams. As explained earlier, AC represents the vector of applied agricultural chemical characteristics, such as the type of agricultural chemical (insecticide, herbicide, etc.), application rate, and application season. AL represents the land use patterns, such as the type of crop grown and the percentage of cropped area, and C represents the observed agricultural diffuse pollution concentrations in the stream. Patterns were generated using a known set of input-output data pairs. The input values of AC and AL and the corresponding output values of C for a particular year constitute a pattern. While AC and AL are constant for a particular year, C varies temporally and spatially across the monitoring station sites.

Fuzzy rules are the building blocks of fuzzy rule based systems. Partitioning the fuzzy variables into linguistic variables is a necessary step in designing the rule base system. Fuzzy partitions for the input and output variables are defined or generated according to the type of data, as discussed in the membership functions section (Singh, 2008). In this work, the FCM model is used to supply the optimum number of data centers to partition the input and output fuzzy variables.

Redundant and inconsistent rules can arise from data patterns that have the same antecedent parts. As mentioned, each rule is assigned a degree or weight by multiplying the membership grades of the inputs and outputs for that rule. In the standard approach, the rule with the largest degree is adopted (Wang and Mendel, 1992). As an improvement, the degree of each rule is multiplied by a redundancy index to obtain the effective degree for that rule. The redundancy index may be defined as:

$$\text{Redundancy Index (R.I.)} = \frac{r_i}{T_r} \tag{8}$$

where $r_i$ represents the redundant rules with the same antecedents i, and $T_r$ represents the sum of all the redundant rules. The final fuzzy rule base includes the rules having the highest effective degree.
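One possible reading of equation (8) can be sketched as follows. The interpretation is an assumption: here $r_i$ is taken as the number of data patterns that generated a given rule (same antecedents and same consequent), and $T_r$ as the number of patterns that share those antecedents. Taking the largest degree among repeated occurrences of a rule is also an assumption, and the rules and degrees in the usage lines are hypothetical.

```python
from collections import defaultdict


def build_rule_base(candidate_rules):
    """Resolve conflicting rules using degree x redundancy index.

    candidate_rules: list of (antecedent, consequent, degree) tuples, one per
    data pattern, where degree is the product of the membership grades.
    Returns {antecedent: (consequent, effective_degree)}.
    """
    counts = defaultdict(int)         # r_i: occurrences of each (antecedent, consequent)
    totals = defaultdict(int)         # T_r: patterns sharing the antecedent
    best_degree = defaultdict(float)  # largest degree seen for each rule
    for antecedent, consequent, degree in candidate_rules:
        key = (antecedent, consequent)
        counts[key] += 1
        totals[antecedent] += 1
        best_degree[key] = max(best_degree[key], degree)

    rule_base = {}
    for (antecedent, consequent), r_i in counts.items():
        # Effective degree = degree x R.I., with R.I. = r_i / T_r.
        effective = best_degree[(antecedent, consequent)] * r_i / totals[antecedent]
        if antecedent not in rule_base or effective > rule_base[antecedent][1]:
            rule_base[antecedent] = (consequent, effective)
    return rule_base


# Hypothetical candidate rules: (application rate, cropped area) -> concentration.
candidates = [
    (("high", "medium"), "high", 0.60),
    (("high", "medium"), "high", 0.55),
    (("high", "medium"), "low", 0.80),
    (("low", "low"), "low", 0.90),
]
for antecedent, (consequent, degree) in build_rule_base(candidates).items():
    print(antecedent, "->", consequent, round(degree, 3))
```

In this toy case the standard largest-degree rule would keep the single "low" consequent (degree 0.80), whereas weighting by the redundancy index favours the "high" consequent that two of the three patterns support.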


The fuzzy inference mechanism uses the fuzzified inputs and the rules stored in the rule base to process the incoming input data and produce an output. The fuzzy rules are processed by the fuzzy set operations discussed in the rule based system section as basic steps for a fuzzy rule based system. The fuzzy rule based design is accepted as satisfactorily completed when its performance during training and testing satisfies stopping criteria based on selected statistical parameters.

#### **3.5 ANN based methodology for estimation of diffuse pollution concentration in streams**

The ANN learns to solve a problem by developing a memory capable of associating a large number of example input patterns with a resulting set of outputs or effects. ANN is discussed in ASCE Task Committee (2000), among others. An overview of artificial neural networks and neural computing, including the basics and origins of ANN and the biological neuron model, can be found in Hassoun (1999), Schalkoff (1997), and Zurada (1997). Details of the ANN model building process and the selection of the best performing ANN model for a given problem are available in Singh et al. (2004).

As in the fuzzy model building for estimation of diffuse pollution concentrations in streams, the AC and AL values for a particular year in a watershed are the inputs, and the corresponding C values in the stream are the output of the ANN model. The values of AC, AL and C for a particular year constitute a data pattern. A standard back propagation algorithm (Rumelhart et al., 1986) with a single hidden layer is employed to capture the dynamic and complex relationship between the inputs and outputs using the available patterns. The ANN architecture that performed better than the other evaluated architectures on the performance evaluation criteria, in both training and testing, was selected as the final architecture.
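A single-hidden-layer network trained by back propagation can be sketched in plain Python as below. The sigmoid activations, learning rate, number of hidden nodes, number of epochs, and the scaled training patterns are all assumptions chosen for illustration; the study selected its architecture by trial and error and does not report these settings here.

```python
import math
import random


def train_ann(patterns, n_hidden=3, rate=0.5, epochs=2000, seed=1):
    """Train a one-hidden-layer network by standard back propagation.

    patterns: list of (inputs, target) with values scaled to roughly [0, 1].
    Returns (predict, mse_history), where predict maps an input list to a float.
    """
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    # Each weight row holds the incoming weights followed by a bias weight.
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def forward(x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in w_hid]
        output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1])
        return hidden, output

    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, target in patterns:
            hidden, output = forward(x)
            err = target - output
            sq_err += err * err
            # Output delta, then hidden deltas propagated back through w_out.
            d_out = err * output * (1.0 - output)
            d_hid = [d_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                     for j in range(n_hidden)]
            for j in range(n_hidden):
                w_out[j] += rate * d_out * hidden[j]
            w_out[-1] += rate * d_out
            for j in range(n_hidden):
                for k in range(n_in):
                    w_hid[j][k] += rate * d_hid[j] * x[k]
                w_hid[j][-1] += rate * d_hid[j]
        history.append(sq_err / len(patterns))

    return (lambda x: forward(x)[1]), history


# Hypothetical scaled patterns: ([AC, AL], C), one per year.
patterns = [([0.1, 0.2], 0.15), ([0.4, 0.5], 0.45),
            ([0.7, 0.6], 0.65), ([0.9, 0.9], 0.90)]
predict, history = train_ann(patterns)
print(f"first-epoch MSE {history[0]:.4f}, last-epoch MSE {history[-1]:.4f}")
```

Repeating the training with different values of `n_hidden` and comparing the resulting errors on held-out patterns is one way to carry out the trial and error selection mentioned above.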

#### **3.6 Performance evaluation criteria**

The performance of the developed models is evaluated using performance indices on both the training and testing sets. A variety of performance evaluation criteria are available (e.g. Nash and Sutcliffe, 1970; WMO, 1975; ASCE Task Committee on Definition of Criteria for Evaluation of Watershed Models, 1993) which could be used for evaluation and intercomparison of different models. The following performance indices are selected in this study based on their relevance to the evaluation process; other criteria could also be used.

#### **3.6.1 Correlation coefficient (R)**

The correlation coefficient measures the statistical correlation between the predicted and actual values. It is computed as:

$$R = \frac{\sum_{i=1}^{n} (X_{ai} - \overline{X}_{ai})(X_{pi} - \overline{X}_{pi})}{\sqrt{\sum_{i=1}^{n} (X_{ai} - \overline{X}_{ai})^2 \sum_{i=1}^{n} (X_{pi} - \overline{X}_{pi})^2}} \tag{9}$$


where $X_{ai}$ and $X_{pi}$ are the measured and computed values of diffuse pollution concentration in streams; $\overline{X}_{ai}$ and $\overline{X}_{pi}$ are the averages of the $X_{ai}$ and $X_{pi}$ values respectively; i is the index number; and n is the total number of concentration observations.

A higher value of R indicates a better model, with 1 meaning perfect statistical correlation and 0 meaning no correlation at all.

#### **3.6.2 Root mean square error (RMSE)**

Mean-squared error is the most commonly used measure of success of numeric prediction, and root mean-squared error is its square root, taken to give it the same dimensions as the predicted values themselves. Because the errors are squared, this measure exaggerates large prediction errors, where the prediction error is the difference between the predicted value and the actual value of a test case. The root mean squared error (RMSE) is computed as:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_{ai} - X_{pi})^2}\tag{10}$$

For a perfect fit, Xai = Xpi and RMSE = 0. So, the RMSE index ranges from 0 to infinity, with 0 corresponding to the ideal.

#### **3.6.3 Standard error of estimates (SEE)**

The standard error of estimate (SEE) is an estimate of the mean deviation of the regression from observed data. It is defined as (Allen, 1986):

$$SEE = \sqrt{\frac{\sum_{i=1}^{n} (X_{ai} - X_{pi})^2}{n-2}} \tag{11}$$

#### **3.6.4 Model efficiency (Nash–Sutcliffe coefficient)**

The model efficiency (MENash), an evaluation criterion proposed by Nash and Sutcliffe (1970), is employed to evaluate the performance of each of the developed models. It is defined as:

$$\text{ME}_{\text{Nash}} = 1.0 - \frac{\sum_{i=1}^{n} (X_{ai} - X_{pi})^2}{\sum_{i=1}^{n} (X_{ai} - \overline{X}_{ai})^2} \tag{12}$$

A value of 90% and above indicates very satisfactory performance, a value in the range of 80–90% indicates fairly good performance, and a value below 80% indicates an unsatisfactory fit.
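The four indices in equations (9) to (12) can be computed with a few lines of plain Python. The function names and the sample values are hypothetical and serve only to show the arithmetic; the SEE function uses the squared residuals, as in the standard definition of the standard error of estimate.

```python
import math


def correlation(actual, predicted):
    """Correlation coefficient R, equation (9)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def sum_squared_error(actual, predicted):
    """Sum of squared differences between measured and computed values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean square error, equation (10)."""
    return math.sqrt(sum_squared_error(actual, predicted) / len(actual))


def see(actual, predicted):
    """Standard error of estimate, equation (11); needs more than 2 points."""
    return math.sqrt(sum_squared_error(actual, predicted) / (len(actual) - 2))


def nash_efficiency(actual, predicted):
    """Nash-Sutcliffe model efficiency, equation (12), as a fraction."""
    mean_a = sum(actual) / len(actual)
    spread = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - sum_squared_error(actual, predicted) / spread


# Hypothetical measured and computed concentrations.
measured = [1.0, 2.0, 3.0, 4.0]
computed = [1.1, 1.9, 3.2, 3.9]
print(f"R    = {correlation(measured, computed):.4f}")
print(f"RMSE = {rmse(measured, computed):.4f}")
print(f"SEE  = {see(measured, computed):.4f}")
print(f"ME   = {100 * nash_efficiency(measured, computed):.1f}%")
```

For these sample values the sum of squared errors is 0.07 and the spread of the measured values about their mean is 5, so the efficiency is 1 - 0.07/5 = 0.986, or 98.6% on the percentage scale used above.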

#### **4.1 Evaluation of fuzzy c-means centers**

The FCM model represented by equation (5) is used to partition the input data into fuzzy partitions. The FCM algorithm is implemented using MATLAB version 6.5 with ε equal to $10^{-5}$ to obtain the pre-specified fuzzy centers. The 3, 4, and 5 fuzzy centers for the inputs application rate and weighted percentage area obtained using the FCM model are shown in Table 2. Instead of iterating for the optimal number of fuzzy centers, prior knowledge about the fuzzy partitioning for the fuzzy rule based models was utilized in implementing the fuzzy c-means algorithm.

| Fuzzy partitions | Weighted percentage area | Application rate (lb/acre) |
| --- | --- | --- |
| 3 fuzzy centers | 80.38, 86.68, 90.75 | 1.26, 1.31, 1.37 |
| 4 fuzzy centers | 79.50, 84.02, 87.21, 90.88 | 1.26, 1.31, 1.33, 1.36 |
| 5 fuzzy centers | 80.00, 86.67, 87.00, 89.17, 91.0 | 1.26, 1.31, 1.33, 1.35, 1.41 |

Table 2. Different Fuzzy Partition Centers Using FCM Model

#### **4.2 Training and testing the fuzzy rule based model with FCM**

The seven years of data (1992-1998) are used for training and the three years of data (1999-2001) are used for testing the fuzzy rule based model with FCM. The model is assumed to perform satisfactorily when the model efficiency coefficient (MENash) given by equation (12) is greater than 90 percent and the other performance indices are also improved. Although arbitrary, this may be used as a stopping criterion to limit the processing of the large number of rules that arises with an increase in linguistic fuzzy variables for the inputs.

Performance of fuzzification of the inputs application rate and weighted percentage area was studied by assigning 3, 5, and 7 fuzzy variables without using FCM (Singh, 2008). Though fuzzification with 7 variables worked better than fuzzification with 3 and 5 variables, fuzzification with 5 fuzzy variables is comparable to fuzzification with 7 variables, as shown in Table 3. Fuzzy rule based models with 3, 5 and 7 fuzzy variables are represented by the Fuzzy\_3M, Fuzzy\_5M, and Fuzzy\_7M models respectively in Table 3. As 3 partitions are not adequate, four fuzzy partitions were specified for the fuzzy rule based system with FCM model. The four centers shown in Table 2, obtained using FCM, are partitioned into four linguistic fuzzy variables: low, medium, high, and very high. A
