**Discrete Event Simulation Combined with Multiple Criteria Decision Analysis as a Decision Support Methodology in Complex Logistics Systems**

Thiago Barros Brito, Rodolfo Celestino dos Santos Silva, Edson Felipe Capovilla Trevisan and Rui Carlos Botter

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/50107

#### **1. Introduction**

Discrete Event Simulation (DES) is a decision support tool that is extensively used to solve logistics and industrial problems. Indeed, the scope of DES is now extremely broad in that it includes manufacturing environments, supply chains, transportation systems and computer information systems [1]. However, although its usage has spread dramatically, few authors, practitioners or users are able to fully understand and apply the methodology in order to derive its full potential.

While the DES methodology alone is a tool that improves user comprehension of a system, it has sometimes been incorrectly stigmatized as a "crystal ball." Indeed, a DES model should not be built to accurately predict the behavior of a system, but rather used to allow decision makers to fully understand and respond to the behavior of the variables (elements, resources, queues, etc.) of the system and the relations between those variables. However, depending on the complexity of the system, a deeper analysis and evaluation of the system behavior and of variable tradeoffs may be a complicated task, since logistics problems, by nature, are composed of several elements interacting simultaneously, influencing each other in a complex relationship network, often under conditions that involve randomness. Further, the observation and evaluation of numerous decision criteria is required, led by multiple goals (often intangible and even antagonistic) and commonly running across long time horizons where risks and uncertainties are salient elements.

© 2012 Brito et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In order to expand the capacity of DES to support decision making, other decision support methodologies may be incorporated, thereby adding greater value to the model and strengthening the overall capacity of the decision-making process. Consequently, the proposal of this chapter is to incorporate Multiple Criteria Decision Analysis (MCDA) into a DES model.


In this context, the DES model is built to analyze the operational performance of the system's variables, based on several alternative system configurations. From this point on, a multicriteria decision model should be applied to the DES results, bringing to light and taking into account an evaluation of the decision-making priorities and judgments of decision makers over the decision criteria and thus formally studying the tradeoff between the performances of the decision criteria in the DES model. Therefore, the main objectives of this chapter are to:

- Understand the capabilities of DES as a decision support methodology in complex logistics systems;
- Show the most important aspects of a decision-making process;
- Build and implement a Decision Support System (DSS) that merges the DES and MCDA methodologies to serve as a catalyst to improve the decision-making process;
- Present a real case study to analyze the establishment and operational configuration of a new steel production plant, an example of a complex and multifaceted logistics system; and
- Draw conclusions on the application of this hybrid DSS methodology.

## **2. The application of DES in complex logistics systems as a DSS**

A DES model is a mathematical/logical structure that represents the relationships among the components of a system. It has long been one of the mainstream computer-aided decision-making tools, thanks in part to the availability of powerful computers. Traditionally, DES has been efficiently employed to simulate complex logistics systems owing to its capacity to replicate the behavior of the system, to represent all its relevant physical aspects and to provide decision-making insights into how to respond.

The DES methodology presented in this chapter is based on the steps proposed by [2]. Those steps are summarized and graphically represented by [3], which divides the development of the model into three main stages (Figure 1):

a. Conception: definition of the system and its objectives, as well as data collection and conceptual modeling;
b. Implementation: preparation of the computer model itself, verification and validation; and
c. Analysis: simulation runs and sensitivity and results analysis.

**Figure 1.** Development of a simulation model [3]

In fact, a DES methodology represents a wider concept, with possible applications in numerous industries and expertise areas, from ordinary daily activities (e.g., the service process in a bank) to situations of elevated complexity (e.g., understanding the evolution of a country's economic indicators or its weather forecasting system).


However, the work of [3] defines the DES methodology in the reverse way, namely by clarifying what a DES model is not:

- A crystal ball: the simulation is not able to predict the future, but it can predict, within a selected confidence interval, the behavior of the system;
- A mathematical model: the simulation does not correspond to a mathematical/analytical set of expressions whose outputs represent the behavior of the system;
- An optimization tool: the simulation is a tool of analysis of scenarios that can be combined with other tools of optimization;
- A substitute for logical/intelligent thought: the simulation is not able to replace human reasoning in the decision-making process;
- A last-resort tool: currently, the simulation is one of the most popular techniques applied in operational research (OR); or
- A panacea to resolve all considered problems: the simulation technique works efficiently in a specific class of problems.
However, the complete definition of the DES methodology must be based on the advantages of its use. The characterization of the simulation tool as the concatenation of building a model that represents a system and subsequently experimenting with it, aiming to observe and understand the behavior of the system, already suggests its main purpose, namely to allow the observer/modeler to carry out "what if" analyses of the system. Indeed, "what if" is the best statement to illustrate the purpose of the simulation methodology. [4] emphasizes the potential of "what if" analysis when affirming that the decision maker, in possession of a DES model, is capable of assuming any appropriate situation or scenario and analyzing the response of the system under such circumstances. Asking "what if" means nothing more than exploring the model, challenging its parameters and examining the impact of the proposed changes on the system.

Then, one can define the main functions of the DES simulation technique as:

- To analyze a new system before its implementation;
- To improve the operation of an already existent system;
- To better understand the functioning of an already existent system; and
- To compare the results from hypothetical situations ("what if" analysis).

Further, the main reasons for its usage include the following [4]:

- The real system still does not exist; the simulation is used as a tool to project the future;
- Experimenting with the real system is expensive; a simulation methodology is used in order to avoid unnecessary expenses with regard to system stoppages and/or modifications to the *modus operandi* of the system; and
- Experimenting with the real system is not appropriate; simulation should be used in order to avoid replicating extreme situations with the real system (e.g., a fire in a building).
A disadvantage of simulation is that even though one can explore wide-ranging problems, they cannot usually be "solved." The simulation methodology does not provide the user with information about the "correct" solution of the problem explored. Instead, it provides support for the pursuit of the alternatives that best fit the user's understanding of the problem. Thus, each user, through his or her own vision of the problem, can find particular (and often different) answers for the same model. This characteristic is emphasized by [5] in the work presented in a compilation organized by [6], in which the ultimate goal of modeling and the simulation methodology is discussed.

This discussion begins with the seemingly endless appeal of computational models and the promise that one day, supported by the growing power of computational processing, users will be able to completely and accurately represent a given system using the "perfect" model. However, it is unlikely that a model representing 100% of a given system will ever be built. Even the simplest of models carries a huge list of internal and external relationships between its components in a process under constant renewal and adaptation.

This conclusion reflects the inevitable necessity of working with models that are "incomplete." This is equivalent to carrying out simulation studies within the boundaries that govern the interpretation and representation of real systems. Decision makers are rarely conceptually capable of recognizing the validity of an "incomplete" model, resulting in their inability to work under such boundary conditions. This means that, under the watchful eyes of an "unprepared" decision maker, working with an "incomplete" model may not seem to be an alternative that provides valid results or allows useful analysis.

However, this assumption is not true. Working with "incomplete" models that represent and link those elements relevant for understanding and fulfilling the aspirations of the modeler is a requirement. The importance of this topic is such that [7] proposes techniques to reduce the complexity of simulation models in the conception and design stage and proves the feasibility of this procedure without loss of utility to the model. Still, for a significant number of decision makers, accepting this modeling and simulation methodology as a reliable tool for systems analysis is complicated. However, what should be the ultimate goal of such a modeling and simulation technique?

[8] states that a modeling and simulation methodology, considering both its potential and weaknesses, might play an important role in the process of "changing the mentality" of decision makers. As such, the developed model must fundamentally represent the decision maker's perception of the modeled system, no matter how incomplete or inaccurate it is. Built from the perspective of the decision maker, a model can become a "toy," allowing him or her to play fearlessly and avoid arousing any distrust, thus providing valid results and allowing useful analysis.

The technique of modeling and simulation fits well with the final goal of becoming an element of learning and its prediction function. In fact, both these goals represent nothing more than achieving a good understanding of the real system so that one can act efficiently on it. Furthermore, this technique should be part of a broader effort to solve the problem, which may range from the application of complementary system-solving methodologies (optimization models, mathematical/statistical analysis, financial and economic analysis, etc.) to its application to the psychological/rational aspects of the decision makers and/or senior executives of the company.

## **3. Decision-making processes, DSSs and tradeoff studies**

Whenever there exists a single variable objective/utility function or when a decision is based on a single attribute, no decision making is involved: the decision is implicit in a single measurement unit. However, it is recognized that logistics systems are most commonly related to multiple attributes, objectives, criteria and value functions. As the alternatives become more complex, immersed in multiple relationships and interactions between variables and functions, and as it becomes necessary to combine those numerous aspects into a single measure of utility, some methodological help in the decision-making process becomes essential.

As stated by [9], decision making is a dynamic process; it is a complex search for information, full of detours, enriched by feedback from all directions, characterized by gathering and discarding information, and fueled by fluctuating uncertainty as well as indistinct and conflicting concepts. Moreover, the human being is a reluctant decision maker rather than a swiftly calculating machine.

For these reasons, successful decision making is one of the least understood and least well-regarded capabilities in most organizations. Even though, as previously presented in this chapter, DES helps frame the problem and establish a defensible course of action, making good decisions and setting priorities is a further and much harder task. A DES model uses analysis to break things down in order to provide information only, not necessarily the right answers or directions for the decision maker. Thus, DES modeling offers great potential for modeling and analyzing logistics processes. For example, DES models can dynamically model different samples of parameter values such as arrival rates or service intervals, which can help discern process bottlenecks and investigate suitable alternatives. However, while the DES output is tangible, decision making must often rely on intangible information, which raises the question of how to help organizational decision makers harness the incredible complexity of the interaction between a logistics problem's variables and the wealth of data available from the analysis of DES models.


#### **3.1. DSSs**

DSSs, a type of information system designed to support semi-structured or unstructured managerial activity [10], are ideally suited to bridge the gap between information (tangible and intangible) and decision makers. A properly designed DSS (such as that shown in Figure 2) is an interactive software-based system intended to help decision makers compile useful information from a combination of raw data, documents, personal knowledge and business models in order to identify problems and help make decisions.

**Figure 2.** Structuring a DSS application [11]. (The figure outlines the stages of structuring and quantifying the decision: decision-making problem structuring, alternatives generation, objectives and attributes specification, determination of the possible impacts and of their magnitude and probability, preferences determination through quantification of the decision makers' value judgments, and evaluation and comparison of the alternatives with sensitivity analysis.)

DSSs, especially in the form of spreadsheets, have become mainstream tools that organizations routinely use to improve managerial decision making [12], often by importing data from enterprise-wide information systems into spreadsheets to address specific business problems. Moreover, DSSs also have the potential to serve as a catalyst to improve the decision-making process, as they provide the capability to organize and share as well as to create knowledge, providing structure and new insights to managers [13].

The most important capability of a DSS might be the possibility of carrying out tradeoff studies. A tradeoff study is the choice of one alternative solution when you have limited decision-making resources that may result in long- or short-term outcomes [14,15]. The main objective of a tradeoff study is thus to arrive at a single final score for each of a number of competing alternative scenarios, using normalizing criteria scoring functions and combining these scores through weighted combining functions.
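To make these combining functions concrete, the following minimal sketch (in Python, with illustrative criteria names, bounds and weights that are not taken from this chapter) normalizes each criterion to a [0, 1] score and combines the scores through a weighted sum:

```python
# Minimal tradeoff-study sketch: normalize each criterion to [0, 1],
# then combine the normalized scores with weights that sum to 1.
# All names and numbers below are illustrative placeholders.

def normalize(value, worst, best):
    """Linear scoring function: worst -> 0.0, best -> 1.0."""
    return (value - worst) / (best - worst)

# Raw performance of two hypothetical scenarios on two criteria.
scenarios = {
    "A": {"cost": 120.0, "service_level": 0.92},
    "B": {"cost": 95.0, "service_level": 0.85},
}

# Direction-aware bounds: for cost, lower is better, so worst = highest.
bounds = {"cost": (130.0, 90.0), "service_level": (0.80, 0.95)}
weights = {"cost": 0.6, "service_level": 0.4}  # elicited from decision makers

for name, perf in scenarios.items():
    score = sum(weights[c] * normalize(perf[c], *bounds[c]) for c in weights)
    print(f"Scenario {name}: overall score = {score:.3f}")
```

A linear scoring function is only one option; in practice the shape of each scoring function should itself be elicited from the decision makers, as discussed below.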

However, the biggest contribution of a DSS application to evaluating a logistics problem is in pointing out solutions based on decision-making judgments, thus capturing companies' aspirations and worries. For this reason, when conducting anything other than a rough or obvious tradeoff study, careful and honed expert attention must be given to properly choose the criteria scoring functions, weights and inputs – especially if they are in any way subjective. This approach requires, during the process of capturing decision-making judgments, the adoption of an OR intervention tool that should pursue the so-called *facilitated modeling process*. This process requires the operational researcher to carry out the whole intervention jointly with the client, from helping structure and define the nature of the problem of interest to supporting the evaluation of priorities and development of plans for subsequent implementation [16].

### **3.2. Facilitated modeling process**


The traditional way of conducting OR intervention in logistics problems is so-called *expert modeling*, namely when the decision maker hires OR consultants to objectively analyze a problem situation. The result of this kind of intervention is often the recommendation of an optimal (or semi-optimal) solution. Nevertheless, when dealing with problems at a strategic level, complexity rises and the expert mode of intervention may not be appropriate. [16] present two of the main reasons for its inadequacy:

- The lack of agreement on the scope and depth of the problem situation to be addressed; and
- The existence of several stakeholders and decision makers with distinct and often conflicting perspectives, objectives, values and interests.
Facilitated modeling intervention aims to overcome these issues by structuring decisions in complex and strategic logistics problems. It mainly helps in the negotiation of conflicts of interest during all phases of a decision-making process, taking into consideration different opinions and ideas related to the scope and definition of the problem as well as divergences in the output analysis, values and interest in the results.

In a facilitated modeling approach, the OR consultant must work not only as an analyst, but also as a facilitator and negotiator of conflicts in order to reach a common, satisfactory and useful decision about the problem definition, investigation and resolution. Almost every step taken in the intervention – from defining the problem to creating and analyzing models and providing recommendations – is conducted interactively with the team, in a so-called "helping relationship" [17] between OR consultants and their clients. In Figure 3, [16] define the activities of an OR consultant working as a facilitator in all steps of a facilitated modeling process.

**Figure 3.** Activities of an OR consultant in a facilitated modeling process [16]. (The figure covers the aim of the intervention and the activities of framing problems, formulating problems, defining metrics, collecting data, evaluating options, presenting results and committing for action.)

Further literature review based on [16] describes the four main assumptions taken under the facilitated intervention modes:

- Even the tangible aspects of problem framing and formulating, defining metrics and evaluating results have their salience and importance depending on how decision makers subjectively interpret them [18,19]. Different decision makers will perceive a given tangible variable in diverse ways because of their distinct interests and goals. This complicates problem modeling for decision makers [16];
- Following from the above consideration, subjectivity is unavoidable and thus it should be considered when solving a problem. The different perceptions of a problem as well as distinct points of view and backgrounds may lead to a much better interpretation of a given problem and bring new ideas and concepts, leading to a better result [20,21];
- Despite the fact that experts recommend optimal (or semi-optimal) solutions, decision makers are rarely interested in the best alternative, but rather in a satisfactory one. Satisfactory alternatives also represent a feasible solution in the political, financial and environmental fields. Further, all logistics systems are inevitably too complex to be integrally modeled and they are thus subject to necessary simplifications and assumptions. Any model should thus be seen as a guide, a general indicator of the system behavior, rather than a precise indicator of the performances of the system's variables [22,23]; and
- The decision maker's involvement in the whole process increases the commitment to implement the proposed solution. This involvement increases confidence in the decision-making process. It is a psychological factor of having one's voice heard and one's ideas, preferences and beliefs taken into consideration during all steps of the process [24–26].
It is important to note that neither of the two modes of intervention (i.e., expert or facilitated) is necessarily the best. [22] and [26] argue that for operational and well-defined problems, when there is a clear objective to be optimized or an unquestionable structure of the problem, the expert mode is usually appropriate. However, complex problems may require a facilitated intervention. In this case, the facilitator encourages divergent thinking while helping participants explicitly explore their particular perspectives of the problem. The next step is to stimulate and drive convergent thinking, consolidating a single and fully representative interpretation and perspective of the problem.

Consequently, facilitated modeling is the doorway to building an efficient DSS application. In this paper, we propose the employment of facilitated modeling via the implementation of an MCDA model. Section 4 describes the development and employment of a DSS tool to support strategic decisions about the planning and sizing of a complex logistics system – in this case, a steel production plant and its logistical elements (stockyards, transportation fleet, etc.). Such a tool is able to analyze and evaluate its performance and execute a tradeoff study of possible configurations and operational results.

## **4. Hybrid DSS: A combination of DES and MCDA**

The developed DSS tool represents a hybrid software application combining the techniques of DES modeling with MCDA. The DES methodology will be supported by the Scenario Planning (SP) methodology, which uses hypothetical future scenarios to help decision makers think about the main uncertainties they face and devise strategies to cope with these uncertainties [27]. The SP methodology can be described as the following set of steps:

- Define a set of n strategic options (ai).
- Define a set of m future scenarios (sj).
- Each decision alternative is a combination of a strategic option in a given scenario (ai–sj).
- Define a value tree, which represents the fundamental objectives of the organization.
- Measure the achievement of each decision alternative (ai–sj) on each objective of the value tree using a 100–0 value scoring system.
- Elicit the weights of each objective in the value tree using swing weighting (anchoring on the worst and best decision alternatives in each criterion).
- Aggregate the performances of each decision alternative (ai–sj) using the weights attached to the objectives in the value tree, finding an overall score for the decision alternative.
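The sketch below illustrates these steps end to end, assuming hypothetical options, scenarios, scores and swing weights (none of which come from the case study): each (ai–sj) alternative is scored on a 0–100 scale per objective, the swing weights are normalized, and a weighted overall value is computed.

```python
# Sketch of the SP/MCDA aggregation steps (hypothetical data throughout).
from itertools import product

options = ["a1", "a2"]        # strategic options (ai)
scenarios = ["s1", "s2"]      # future scenarios (sj)
objectives = ["cost", "reliability"]

# Step: score each decision alternative (ai-sj) on each objective,
# on a 0-100 value scale (100 = best, 0 = worst).
scores = {
    ("a1", "s1"): {"cost": 70, "reliability": 90},
    ("a1", "s2"): {"cost": 40, "reliability": 95},
    ("a2", "s1"): {"cost": 100, "reliability": 60},
    ("a2", "s2"): {"cost": 80, "reliability": 55},
}

# Step: swing weights elicited from decision makers (anchored on the
# worst-to-best "swing" of each criterion), then normalized to sum to 1.
raw_swings = {"cost": 100, "reliability": 60}
total = sum(raw_swings.values())
weights = {k: v / total for k, v in raw_swings.items()}

# Step: aggregate into an overall score per decision alternative.
for ai, sj in product(options, scenarios):
    overall = sum(weights[o] * scores[(ai, sj)][o] for o in objectives)
    print(f"{ai}-{sj}: overall value = {overall:.1f}")
```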



The SP approach is an extension of MCDA. The SP/MCDA methodology was thus applied in this work using the propositions of [27], which confirm the use of the MCDA methodology as a supporting tool to decision makers in situations of high complexity with potentially significant and long-term impacts. MCDA is a structured DSS technique for dealing with problems in which multiple and complex criteria influence decision making [28], as it allows for the visualization of the rational/logical structure of the problem by representing and quantifying the importance of its elements, relating them to the overall goal and allowing the execution of further tradeoff studies as well as benchmark and sensitivity analyses. The methodology organizes and synthesizes information, includes measures objectively and considers the value judgments of decision makers [29] in an interactive and iterative process. The value judgments of decision makers are captured as preference compensation, thus creating a robust tradeoff instrument.

Several authors have reviewed the utilization of the MCDA methodology as a decision support tool. The 10 major advantages of MCDA, summarized by [28], are the maintenance of the unity of the problem, complexity understanding, criteria interdependence relationship representation, capability of measuring criteria preference, maintenance of consistency, synthesis, tradeoff evaluation, consideration of decision makers' value judgments and consensus reaching. Thus, the goal sought by the MCDA methodology is to identify good and robust alternatives, granting coherence and offering a good tradeoff between different objectives that guide problem resolution. In that way, the multi-criteria analysis in this work will be performed after the results of the DES model have been obtained.

## **5. Case study**

#### **5.1. Problem and objectives**

A Brazilian steel company is establishing a new plant in the country's northeast region. The inputs to the plant production as well as the finished goods will all be handled through a private port located very close to the plant. Iron ore and coal are among the main steelmaking process inputs. Coal is imported from various locations around the world and is delivered to the terminal by a chartered vessel fleet, according to a procurement schedule. Iron ore is owned by the company and thus it comes from two distinct Brazilian regions, northeast (NE) and southeast (SE), with remarkable differences in physical properties. The transportation of iron ore from its original locations to the company's private port will be performed by the company's private dedicated fleet, which will operate in a closed-loop circuit. The company's private port operates two berths for unloading, which are able to accommodate small capesize vessels (DWT 120,000 tons). One berth is dedicated exclusively to iron ore unloading and the other to coal unloading.

Thus, the main objectives of this study are (i) to size the company's own vessel fleet (dedicated to supplying iron ore to the plant) and (ii) to determine the storage area assigned to each of the two types of iron ore (SE and NE), which must be stored separately owing to their distinct physical characteristics and properties, so that the plant steelmaking process is never restricted or interrupted by a poor supply of inputs. This work does not cover the transportation, storage or processing of coal.

#### **5.2. Problem definition**

The first step of the problem is the intervention of the OR consultant as a facilitator that focuses on identifying the decision-making group and assessing how it comprehends and evaluates the problem, scopes the decision situation and structures the problem efficiently. These steps correspond to Steps 1 to 4 in Figure 4. The aim here is to put together the full problem representation by considering all the aspects of the facilitated modeling process presented in Section 3.2.

**Figure 4.** Complete modeling process of a hybrid DSS application (based on [30]).

The next step corresponds to the DES+SP phase of the problem resolution. Based on all the information derived from Steps 1 to 4 in Figure 4 by the OR consultant, the DES model representing the real system should be able to evaluate all the results and variables necessary to help measure system performance, according to the decision maker's criteria. Further, the DES model is built to analyze the proposed logistics system based on several possible system configurations. From this point on, a DSS application, based on a multicriteria analysis of the results obtained from the DES model of each proposed alternative, was carried out. Through this analysis, it was possible to:

- Determine the "best" size of the iron ore supply vessel fleet required to meet demand for the planned cargo; and
- Assess the capacity of the stockyards for the two types of iron ore (SE and NE).

#### **5.3. Input parameters**

All input parameters were provided by the company or derived from the in-depth statistical analysis of the available data. In all considered scenarios, an annual iron ore demand of 5 MTPY (million tons per year) was considered. As mentioned before, iron ore is supposed to be supplied by a dedicated vessel fleet operating in a closed-loop system. Moreover, the project fleet is composed of small capesize vessels, while the largest ship able to dock at the port has a 120,000-ton capacity. Table 1 lists the input data for all scenarios.


| Parameter | Value | Unit |
|---|---|---|
| Planned Demand | 5 | mtpy |
| Vessel Capacity | 120,000 | tonnes |
| Travel Time (Plant–NE) | 2.7 | days |
| Berthing Time (NE Port) | 1.5 | days |
| Travel Time (Plant–SE) | 7.9 | days |
| Berthing Time (SE Port) | 1.4 | days |
| Berthing Time (Private Port) | 3.25 | days |

**Table 1.** Input data for all scenarios.
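As an illustration of how the closed-loop vessel circuit of Section 5.1 might be expressed in a DES tool, the sketch below uses the open-source SimPy library with the Table 1 parameters. The single iron ore berth matches the problem description; the triangular noise around the deterministic travel times and the fixed 70/30 NE/SE voyage mix are assumptions made purely for illustration, and this is not the model actually built for the study.

```python
# Simplified closed-loop vessel cycle in SimPy, using the Table 1 times (days).
# Assumptions for illustration only: one iron ore berth, triangular noise
# around the deterministic travel times, and a fixed 70%/30% NE/SE voyage mix.
import random

import simpy

VESSEL_CAPACITY = 120_000            # tonnes (Table 1)
TRAVEL = {"NE": 2.7, "SE": 7.9}      # one-way travel time, days (Table 1)
LOAD = {"NE": 1.5, "SE": 1.4}        # berthing time at origin port, days
UNLOAD_PRIVATE = 3.25                # berthing time at the private port, days


def vessel(env, name, berth, delivered):
    """One vessel cycling forever between an origin port and the plant."""
    while True:
        origin = "NE" if random.random() < 0.7 else "SE"   # assumed mix
        yield env.timeout(TRAVEL[origin] * random.triangular(0.9, 1.1))
        yield env.timeout(LOAD[origin])                    # load at origin
        yield env.timeout(TRAVEL[origin] * random.triangular(0.9, 1.1))
        with berth.request() as req:                       # queue for the berth
            yield req
            yield env.timeout(UNLOAD_PRIVATE)              # unload at the plant
        delivered[name] = delivered.get(name, 0) + VESSEL_CAPACITY


random.seed(42)
env = simpy.Environment()
berth = simpy.Resource(env, capacity=1)    # single iron ore berth
delivered = {}
for i in range(2):                         # a two-vessel fleet scenario
    env.process(vessel(env, f"vessel_{i}", berth, delivered))
env.run(until=365)                         # simulate one year of operation
print(f"Annual delivered tonnage: {sum(delivered.values()):,}")
```

Re-running such a model with three vessels, different stock rules or chartering policies is what generates the alternative scenarios evaluated in Section 5.4.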

However, a number of variables were considered in the simulation run process:

- SE/NE iron ore percentage: the iron ore employed in the steelmaking process is originally from either the SE or the NE regions of Brazil (see Figure 5). Owing to the specific physical and technical characteristics of each iron ore type, the percentage of SE iron ore may vary from 30% to 40% of the final composition of the steel process output. Although the production department prefers working with the maximum percentage of SE iron ore because of its enhanced physical properties, the procurement and transportation departments prefer working with the minimum percentage of SE iron ore, given the larger distance from the company's private port to the SE port compared with the NE port;
- Company fleet: number of vessels in the company's private fleet;
- Stock capacities: storage capacities (in tons) for each type of iron ore (SE and NE); and
- Chartering: this variable determines whether vessels are chartered during the periods when the vessels of the company fleet are docked for maintenance. Dockage is carried out every 2½ years, and ships may be unavailable for 7 to 40 days. Chartering vessels with the same operational characteristics is particularly difficult, especially for short time periods.

Thus, by varying the proposed variables, it was possible to create a set of simulation scenarios, which are evaluated later.

**Figure 5.** Representation of the iron ore transportation process from the SE and NE ports to the company's private port.

#### **5.4. Scenario creation**


Ten viable scenarios were created for further evaluation using MCDA. These scenarios cover a range of input parameters and variables of the DES model, as listed in Table 2.

| Scenario | Vessels Fleet | % Min. SE | NE Stock Capacity (tonnes) | SE Stock Capacity (tonnes) | Rely on Chartering? |
|---|---|---|---|---|---|
| Scenario 1 | 2 | 30 | 550,000 | 225,000 | No |
| Scenario 2 | 2 | 30 | 550,000 | 225,000 | Yes |
| Scenario 3 | 2 | 35 | 500,000 | 275,000 | No |
| Scenario 4 | 2 | 35 | 500,000 | 275,000 | Yes |
| Scenario 5 | 2 | 40 | 475,000 | 300,000 | No |
| Scenario 6 | 2 | 40 | 475,000 | 300,000 | Yes |
| Scenario 7 | 2 | 35 | 375,000 | 275,000 | Yes |
| Scenario 8 | 3 | 30 | 185,000 | 235,000 | No |
| Scenario 9 | 3 | 35 | 170,000 | 275,000 | No |
| Scenario 10 | 3 | 40 | 155,000 | 315,000 | No |

**Table 2.** Description of the analyzed scenarios.

In Table 2, the first seven scenarios simulate a two-vessel operation, while the last three encompass a three-vessel fleet. The first varied parameter is the need for vessel chartering during the fleet docking period. Thereafter, up to Scenario 6, the proportion of iron ore from each source (NE and SE) changes. Scenario 7 is a sensitivity analysis of Scenario 4, with a reduced storage capacity. In Scenarios 8–10, the proportion of iron ore from SE and NE is altered, but under the assumption of a three-vessel operation.



This table identifies a clear tradeoff between the number of vessels in the company fleet and the storage capacity required for each iron ore type, for example, by comparing Scenario 1 with Scenario 8. The simulation results are presented in Section 5.7.
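For concreteness, the scenario set in Table 2 can be encoded as a small configuration structure that drives the DES runs. The sketch below is only illustrative; the class and field names are ours, not part of the original model:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    """One row of Table 2 (illustrative encoding; names are ours)."""
    fleet_size: int    # vessels in the company's private fleet
    min_se_pct: float  # minimum share of SE iron ore
    stock_ne: int      # NE stockyard capacity (tonnes)
    stock_se: int      # SE stockyard capacity (tonnes)
    chartering: bool   # rely on chartered vessels during docking?

SCENARIOS = {
    1: Scenario(2, 0.30, 550_000, 225_000, False),
    2: Scenario(2, 0.30, 550_000, 225_000, True),
    3: Scenario(2, 0.35, 500_000, 275_000, False),
    4: Scenario(2, 0.35, 500_000, 275_000, True),
    5: Scenario(2, 0.40, 475_000, 300_000, False),
    6: Scenario(2, 0.40, 475_000, 300_000, True),
    7: Scenario(2, 0.35, 375_000, 275_000, True),
    8: Scenario(3, 0.30, 185_000, 235_000, False),
    9: Scenario(3, 0.35, 170_000, 275_000, False),
    10: Scenario(3, 0.40, 155_000, 315_000, False),
}
```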

#### **5.5. Decision criteria: Value functions and multi-criteria analysis**

The decision-making process implies capturing the value judgments of decision makers through the assignment of value functions to the relevant criteria and sub-criteria and the further positioning of the results of the scenarios on a value function scale. All evaluations and considerations were performed with the participation of representatives of the following areas of the company: Operations, Procurement, Transportation (Railroad and Navigation), Inventory Management and Finance.

The relevant criteria and sub-criteria considered in the system characterization, their descriptions and value functions are described below. The assignment of the scores associated with all decision criteria to each of the 10 previously considered scenarios is presented, as derived from the DES results.

- Power plant stoppages: Number of days per year that the plant stops production because of the lack of iron ore supply. The value function of this criterion is given as follows: when no interruption occurs in the operation of the steel production plant (0 days of interruption), the scenario gets the maximum score (1). One day of interruption gets a score of 0.5, 2 days of interruption correspond to a score of 0.25 and 3 days to a score of 0.125. From there, the score decreases linearly down to the scenario with 18 days of interruption, which scores 0. Between breakpoints, the value function varies linearly and thus aims at representing the extremely high costs of resuming production after any stoppage (Figure 6).

**Figure 6.** Determination of value function – days of production plant stoppage.
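This piecewise-linear value function can be reproduced with a simple interpolation over its breakpoints. A minimal sketch, assuming the breakpoints read off Figure 6 and using `numpy` (our choice of tooling, not the chapter's):

```python
import numpy as np

# Breakpoints of the value function in Figure 6:
# (days of stoppage, score); linear between points.
DAYS = [0.0, 1.0, 2.0, 3.0, 18.0]
SCORE = [1.0, 0.5, 0.25, 0.125, 0.0]

def stoppage_score(days: float) -> float:
    """Score for a given number of days of plant stoppage per year."""
    return float(np.interp(days, DAYS, SCORE))

assert stoppage_score(0) == 1.0
assert stoppage_score(2) == 0.25
assert stoppage_score(18) == 0.0
```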

- Investment net present value (NPV): As the system modeled represents the internal logistics operation of the company, there is no revenue generation. Investment NPV is therefore directly related to the need for financial investment into the project (size of the company's fleet, need for vessel chartering, etc.). The results for Investment NPV are obtained based on the parameters provided by the company (Table 3).


| Parameter | Unit |
|---|---|
| Vessel Acquisition Value | Mi US\$ |
| Financed Percentage | % |
| Interests | % |
| Amortization Period | years |
| Grace Period | years |
| Vessel's Service Life | years |
| Return Rate | %/year |
| NPV Financed (per vessel) | Mi US\$ |
| NPV Own Capital (per vessel) | Mi US\$ |
| Chartering Costs (per vessel) | US\$/day |

**Table 3.** Economic parameters of the investment.


The Investment NPV value function displays linear behavior, with a maximum score (1) assigned to the lowest total Investment NPV scenario and a minimum score (0) to the highest Investment NPV scenario (Figure 7).

**Figure 7.** Determination of value function – NPV.
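In code, such a min–max value function for a cost-type criterion (lower is better) reduces to a few lines; the function name and example figures below are hypothetical:

```python
def cost_score(value: float, values: list[float]) -> float:
    """Linear value function for a cost-type criterion:
    the lowest observed value scores 1, the highest scores 0."""
    lo, hi = min(values), max(values)
    if hi == lo:                 # degenerate case: all scenarios equal
        return 1.0
    return (hi - value) / (hi - lo)

# Example with hypothetical NPV figures (illustrative numbers only):
npvs = [200.0, 210.0, 260.0]
print([round(cost_score(v, npvs), 2) for v in npvs])  # [1.0, 0.83, 0.0]
```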

- Annual fleet operational costs: This takes into account all the operational costs of the company fleet, such as fuel, port costs and running costs (crew, insurance, administrative costs, taxes, etc.). The components of the fleet operational costs are presented in Table 4.



| Parameter | Unit |
|---|---|
| Fuel Cost (at route) | (US\$/day)/vessel |
| Fuel Cost (at port) | (US\$/day)/vessel |
| Running Costs | (US\$/day)/vessel |
| Mooring Cost at Plant Port | (US\$/mooring)/vessel |
| Mooring Cost at NE Port | (US\$/mooring)/vessel |
| Mooring Cost at SE Port | (US\$/mooring)/vessel |

**Table 4.** Components of the fleet operational costs.

Similar to NPV, the value function of this criterion is linear, with a maximum score (1) assigned to the scenario with the lowest total operational costs and a minimum score (0) assigned to the highest operational costs (Figure 8).

**Figure 8.** Determination of value function – operational costs.

- Stock below the safety level: This represents the percentage of time that the plant's stock remains below the minimum inventory safety level without interrupting the steelmaking process. The safety stock level is defined as 15 days of the plant's input consumption. This parameter aims at representing the risk of interruption to plant production. The value function of this criterion assigns the maximum score (1) to a zero percentage (0%) of observed days with stock below the safety level and the minimum score (0) to the highest percentage. The variation between these extremes is linear (Figure 9).

**Figure 9.** Determination of value function – time below the safety stock level.


- SE/NE iron ore percentages: Operationally, owing to its physical characteristics, the plant would rather work with SE than with NE iron ore. The scenarios are simulated within a discrete distribution of the percentage of SE iron ore (40%, 35% and 30%) and the value function is given as follows: 40% is valued as the maximum (1), 35% is assigned an intermediate score (0.5) and 30% is valued as the minimum (0) (Figure 10).

**Figure 10.** Determination of value function – SE/NE iron ore origin percentage.

- Stock capacity: The company project includes a stockyard that is able to store 775,000 tons of iron ore. For obvious reasons, configurations with lower storage areas are preferred, representing less area commitment. Thus, in accordance with the established value function, the scenario with the lowest storage capacity gets the maximum score (1) and that with the highest capacity gets the minimum score (0), with linear variation between these extremes (Figure 11).


**Figure 11.** Determination of value function – stock capacity.

- Average supported queuing time: This refers to the average time that vessels can queue at the iron ore terminals without affecting the delivery of inputs. Vessels have to obey the queuing disciplines at both iron ore terminals. This is an uncertain parameter, since a scenario that supports only short queues is riskier, in terms of fulfilling the planned demand, than one that supports high queuing levels. Moreover, the behavior of queue patterns at Brazilian iron ore terminals is regulated by fluctuations in global demand. The scenario with the largest average supported queuing time scores 1 (maximum), while that with the shortest time scores 0 (minimum) (Figure 12).

**Figure 12.** Determination of value function – supported queuing time.

- Chartering: This criterion assumes only binary values, namely relying or not on chartering spare vessels. Thus, scenarios with no chartering reliance receive the maximum score (1) and scenarios where chartering spare vessels is considered to be an option receive the minimum score (0). As previously mentioned, such behavior occurs because of the difficulty in chartering vessels that meet the specific operational characteristics demanded, especially for short time periods (Figure 13).

**Figure 13.** Determination of value function – chartering.


- New mission allocation waiting time: This represents the average number of hours that each vessel of the company fleet waits to be allocated to a new mission (a new route to one of the iron ore suppliers). A higher new mission waiting time means fleet idleness on the one hand, but represents less risk to the plant's input supply on the other. The value function assigns the maximum score (1) to the lowest waiting time observed and the minimum score (0) to waiting times greater than 24 hours. Between 0 and 24 hours, the variation of the value function is linear (Figure 14).

**Figure 14.** Determination of value function – new mission allocation time.

#### **5.6. Model validation**

Because the real system is still a future project, there is no historical operation database against which to validate the model's results. Thus, in order to validate the model, its outputs were compared with analytical calculations of the fleet's operational parameters.

Suppose the following initial scenario (fully deterministic):


- Vessel fleet composed of three Panamax class ships (70,000 tons);
- 30% of the cargo comes from SE;
- Queues at loading terminals of 5 days in the NE terminal and 7 days in the SE terminal;
- Travel times of 1.5 days (one way) from the plant port to the NE terminal and 4 days (one way) from the plant port to the SE terminal;
- 1 day mooring time at the loading terminals and 2.5 days at the unloading terminal;
- No restriction on storage;
- One unloading berth at the plant terminal;
- No downtime (offhire) nor docking; and
- No unloading queues at either terminal.

Vessel cycle times can then be calculated as shown in Table 5.

| Route | Empty Trip | Loading Queue | Loading | Loaded Return Travel | Unloading Queue | Unloading | Total |
|---|---|---|---|---|---|---|---|
| **NE** | 1.50 | 5.00 | 1.00 | 1.50 | 0.00 | 2.50 | **11.50** |
| **SE** | 4.00 | 7.00 | 1.00 | 4.00 | 0.00 | 2.50 | **18.50** |

**Table 5.** Cycle time composition (in days).

As the availability of vessels in these analytical calculations is 100%, in 25 years we have 9,125 days of operation. Using the cycle times data shown in Table 5, and keeping the proportions of SE iron ore at 30%, we make the following calculations:

11.5 days / cycle × 70% × # of cycles + 18.5 days / cycle × 30% × # of cycles = 9,125 days.

Thus, the number of full cycles (round trips) per vessel is 670, i.e., 469 cycles between the plant and the NE terminal and 201 cycles between the plant and the SE terminal. Under these conditions (with no queuing when unloading) and with a three-vessel fleet, it would be possible to operationalize 2,010 cycles in 25 years, adding up to 140.7 million tons, or 5.628 MTPY. Further, considering 2.5 days of berth occupancy for each unloading process, we reach a berth occupation rate of 55% (5,025 days out of 9,125 available days).
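The arithmetic above can be restated in a few lines for checking; the numbers come from Table 5 and the fleet assumptions, nothing new is introduced:

```python
DAYS_25_YEARS = 25 * 365                   # 9,125 operating days per vessel
avg_cycle = 0.70 * 11.5 + 0.30 * 18.5      # 13.6 days per demand-weighted cycle
cycles_per_vessel = int(DAYS_25_YEARS / avg_cycle)    # 670 full cycles

fleet_cycles = 3 * cycles_per_vessel       # 2,010 cycles for three vessels
tonnage = fleet_cycles * 70_000            # 140.7 million tons in 25 years
berth_occupancy = fleet_cycles * 2.5 / DAYS_25_YEARS  # ~55%

print(cycles_per_vessel, fleet_cycles, tonnage / 1e6, round(berth_occupancy, 2))
# 670 2010 140.7 0.55
```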

A DES model run was then carried out under the same criteria. The results obtained from the DES simulation model are shown in Table 6, compared with the analytical results.

| Description | # Cycles in 25 years | Demand (million tonnes) | Berth Occupancy Rate |
|---|---|---|---|
| Analytical Calculation | 2,010 | 140.70 | 55% |
| DES Simulation Model | 1,961 | 137.25 | 54% |
| **Accuracy** | 98% | 98% | 98% |

**Table 6.** Comparison between the analytical calculations and the DES model results.

The 98% average adherence of the DES simulation model was considered to be satisfactory and thus the model was validated.
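For readers who want to reproduce this validation experiment, the fragment below is a minimal sketch of the deterministic closed-loop in SimPy. SimPy is our choice of library, since the chapter does not name its simulation tool, and the routing rule that holds the SE share near 30% is a simplification of our own:

```python
import simpy

# Cycle legs in days (Table 5), excluding unloading at the plant berth.
CYCLE = {"NE": 1.5 + 5.0 + 1.0 + 1.5,   # empty trip + queue + loading + return
         "SE": 4.0 + 7.0 + 1.0 + 4.0}
UNLOAD, HORIZON, SE_SHARE = 2.5, 25 * 365, 0.30

def vessel(env, berth, stats):
    """One vessel cycling between the loading terminals and the plant."""
    while True:
        total = stats["NE"] + stats["SE"]
        route = "SE" if stats["SE"] < SE_SHARE * (total + 1) else "NE"
        yield env.timeout(CYCLE[route])      # sail out, queue, load, sail back
        with berth.request() as req:         # single unloading berth
            yield req
            yield env.timeout(UNLOAD)
        stats[route] += 1
        stats["busy"] += UNLOAD

env = simpy.Environment()
berth = simpy.Resource(env, capacity=1)
stats = {"NE": 0, "SE": 0, "busy": 0.0}
for _ in range(3):                           # three Panamax vessels
    env.process(vessel(env, berth, stats))
env.run(until=HORIZON)

cycles = stats["NE"] + stats["SE"]
print(f"cycles={cycles}, Mt={cycles * 70_000 / 1e6:.1f}, "
      f"occupancy={stats['busy'] / HORIZON:.0%}")
```

With a single unloading berth, the three vessels occasionally coincide at the plant, which is consistent with the DES figures in Table 6 falling slightly below the queue-free analytical values.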

#### **5.7. DES simulation result**

Twenty replications (of 25 years each) of the DES model were run for each scenario described in Section 5.4. The results are shown in Table 7.


| Scenario | % Demand Met | Lack of Inputs (days/year) | NPV Total (norm.) | Total Annual Operational Costs (norm.) | % Time Below Safety Stock | Avg. Supported Queuing Time – NE (days/cycle) | Avg. Supported Queuing Time – SE (days/cycle) | New Mission Allocation Time (h/cycle) |
|---|---|---|---|---|---|---|---|---|
| Scenario 1 | 99 | 2 | 0.65 | 0.68 | 5 | 1.75 | 1.25 | 44 |
| Scenario 2 | 100 | 0 | 0.70 | 0.69 | 0 | 3.50 | 2.50 | 11 |
| Scenario 3 | 99 | 1 | 0.66 | 0.69 | 5 | 1.75 | 1.25 | 35 |
| Scenario 4 | 99 | 0 | 0.71 | 0.69 | 2 | 3.50 | 2.50 | 7 |
| Scenario 5 | 99 | 12 | 0.66 | 0.70 | 13 | 1.75 | 1.25 | 22 |
| Scenario 6 | 100 | 0 | 0.72 | 0.71 | 3 | 1.75 | 1.25 | 29 |
| Scenario 7 | 99 | 18 | 0.71 | 0.69 | 25 | 1.75 | 1.25 | 4 |
| Scenario 8 | 100 | 0 | 0.99 | 0.95 | 0 | 5.25 | 3.75 | 161 |
| Scenario 9 | 100 | 0 | 1.00 | 0.97 | 0 | 5.25 | 3.75 | 146 |
| Scenario 10 | 100 | 0 | 1.00 | 1.00 | 0 | 5.25 | 3.75 | 118 |

**Table 7.** Results obtained by the DES model.
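Replicated results such as those in Table 7 are typically produced by re-running the stochastic model under independent random seeds and averaging the key performance indicators. A schematic sketch, in which `run_scenario` is a hypothetical stand-in for a full 25-year DES run, not a function from the chapter:

```python
import random
from statistics import mean

def run_scenario(config, seed: int) -> dict:
    """Placeholder for one 25-year DES run returning KPIs.
    Hypothetical stand-in: real values would come from the simulation."""
    rng = random.Random(seed)
    return {"demand_met": rng.uniform(98, 100),
            "stoppage_days": rng.uniform(0, 3)}

def replicate(config, n: int = 20) -> dict:
    """Average each KPI over n independent replications."""
    runs = [run_scenario(config, seed) for seed in range(n)]
    return {k: mean(r[k] for r in runs) for k in runs[0]}

print(replicate(config=None))
```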


The analysis of Table 7 demonstrates that the scenarios operating with fleets of three vessels (Scenarios 8–10) reached a higher performance level regarding the operational criteria and service levels (average supported queuing time, time below the safety stock level, days of input lacking). Furthermore, these scenarios are less risky to the system, less susceptible to uncertainties, less demanding on storage areas and more tolerant of queues at the iron ore suppliers' terminals. However, the costs of these configurations are higher than those of the other scenarios, both in the initial investment needed and in the operational costs.

Among the first seven scenarios, which assume a two-vessel fleet, the comparison of similar scenarios that differ only in the reliance or not on chartering spare vessels (e.g., Scenarios 1 and 2, 3 and 4, 5 and 6) allows us to conclude that the chartering process improves the operational results despite leading to increased costs. Moreover, it is noticeable that a higher percentage of SE iron ore incurs higher costs because of the greater distance between the input supplier and the steel production plant. Section 5.8 presents the MCDA.

#### **5.8. MCDA**

The decision-making process was based on the assignment of weights to the decision criteria listed in Section 5.5, followed by the assignment of scores associated with all the decision criteria in each of the 10 previously considered scenarios. Table 8 shows the importance classification of the decision criteria and the normalized weights associated with each of them. The criteria order of importance was defined unanimously by the group of decision makers.



| # | Criterion | Priority | Weight (100/Priority) | Normalized Weight |
|---|---|---|---|---|
| 1 | Power Plant Stoppages | 1 | 100.0 | 30 |
| 2 | Net Investment Present Value (NPV) | 2 | 50.0 | 15 |
| 3 | Total Annual Operational Costs | 2 | 50.0 | 15 |
| 4 | % Time Below Safety Stock | 3 | 33.3 | 10 |
| 5 | Average Supported Queuing Time | 4 | 25.0 | 8 |
| 6 | Stock Capacities | 5 | 20.0 | 6 |
| 7 | NE/SE Iron Ore Input Proportion | 5 | 20.0 | 6 |
| 8 | Vessel Chartering | 6 | 16.7 | 5 |
| 9 | New Mission Allocation Time | 6 | 16.7 | 5 |
| | **Sum** | | **332** | **100** |

**Table 8.** Importance of the decision criteria and normalized weights.

The criterion considered to be most important for the company was the number of days per year when the plant stops production owing to poor supply. This is an extremely critical criterion. Subsequently, the criteria related to costs are the most important (i.e., NPV and operational costs), followed by those related to operational risks (i.e., the safety stock level and uncertainty related to the average supported queuing time at iron ore terminals). After those criteria, the subsequent priorities are storage capacity, proportion of NE/SE iron ore input, stipulation of chartering vessels and new mission waiting time. From the simulation results shown in Table 7, the scores associated with all considered scenarios are presented in Table 9.


| Scenario | Criterion 1 | Criterion 2 | Criterion 3 | Criterion 4 | Criterion 5 | Criterion 6 | Criterion 7 | Criterion 8 | Criterion 9 |
|---|---|---|---|---|---|---|---|---|---|
| Scenario 1 | 0.38 | 1.00 | 1.00 | 0.80 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 |
| Scenario 2 | 1.00 | 0.86 | 0.97 | 1.00 | 0.00 | 0.00 | 0.50 | 0.00 | 0.65 |
| Scenario 3 | 0.50 | 0.97 | 0.97 | 0.80 | 0.50 | 0.00 | 0.00 | 1.00 | 0.00 |
| Scenario 4 | 1.00 | 0.83 | 0.97 | 0.92 | 0.50 | 0.00 | 0.50 | 0.00 | 0.85 |
| Scenario 5 | 0.10 | 0.97 | 0.94 | 0.48 | 1.00 | 0.00 | 0.00 | 1.00 | 0.10 |
| Scenario 6 | 1.00 | 0.80 | 0.91 | 0.88 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Scenario 7 | 0.00 | 0.83 | 0.97 | 0.00 | 0.50 | 0.35 | 0.00 | 0.00 | 1.00 |
| Scenario 8 | 1.00 | 0.03 | 0.16 | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 0.00 |
| Scenario 9 | 1.00 | 0.00 | 0.09 | 1.00 | 0.50 | 0.93 | 1.00 | 1.00 | 0.00 |
| Scenario 10 | 1.00 | 0.00 | 0.00 | 1.00 | 1.00 | 0.86 | 1.00 | 1.00 | 0.00 |

**Table 9.** Score by scenario and by criterion.

Thus, the application of the normalized weights considered for each criterion (Table 8) results in a final score for each scenario. The scenarios are ranked in Table 10.


| Rank # | Scenario | Final Score |
|---|---|---|
| 1 | Scenario 4 | 0.78 |
| 2 | Scenario 2 | 0.74 |
| 3 | Scenario 6 | 0.72 |
| 4 | Scenario 10 | 0.64 |
| 5 | Scenario 9 | 0.62 |
| 6 | Scenario 3 | 0.61 |
| 7 | Scenario 8 | 0.60 |
| 8 | Scenario 1 | 0.55 |
| 9 | Scenario 5 | 0.50 |
| 10 | Scenario 7 | 0.38 |

**Table 10.** Ranking of final scores for the 10 scenarios.
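The whole aggregation behind Tables 8–10 is a weighted sum. The sketch below derives the normalized weights from the priorities via the 100/priority rule and recomputes the final score of Scenario 4 from the published (rounded) scores:

```python
priorities = [1, 2, 2, 3, 4, 5, 5, 6, 6]      # Table 8, criteria 1-9
raw = [100 / p for p in priorities]           # the 100/priority rule
weights = [round(100 * r / sum(raw)) for r in raw]
print(weights)  # [30, 15, 15, 10, 8, 6, 6, 5, 5]

# Scenario 4 scores from Table 9:
scores_s4 = [1.00, 0.83, 0.97, 0.92, 0.50, 0.00, 0.50, 0.00, 0.85]
final = sum(w * s for w, s in zip(weights, scores_s4)) / 100
print(round(final, 2))
# 0.77 -- Table 10 reports 0.78; the small gap comes from rounding
# in the published scores and weights.
```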


Table 10 shows that the scenario that has the highest final score is Scenario 4. The final scores of Scenarios 2 and 6 are, however, close to that of Scenario 4. Scenario 2 differs from Scenario 4 only by showing a smaller proportion of SE iron ore, while Scenario 6 employs a higher proportion of SE iron ore than does Scenario 4. However, Scenario 6 supports less queuing time compared with Scenarios 4 and 2.

Scenario 10 is ranked fourth, virtually tied with Scenarios 9, 8 and 3. Scenario 3 is similar to Scenario 4, but with no vessels chartering and a lower average supported queuing time. The difference between Scenarios 10, 9 and 8, which are those with a dedicated three-vessel fleet operation, is in the proportion of SE iron ore employed in the steelmaking process: 40%, 35% and 30%, respectively.

Given the proximity of the final scores of the three best-ranked scenarios (Scenarios 4, 2 and 6), a reasonable configuration must be chosen among them. These three scenarios are composed of fleets of two vessels, which leads to close NPV values and total operational costs, similar total storage capacities (775,000 tons), a reliance on chartering vessels during fleet docking periods and no interruptions in the steelmaking process. Therefore, the final selection among these three scenarios is based on the average supported queuing time at the suppliers' terminals and the SE iron ore percentage.

Scenario 2, second in the overall ranking, has the lowest SE iron ore percentage (30%), while Scenario 6, third in the overall ranking, has the highest SE proportion (40%). However, Scenario 6 supports only 50% of the average queuing time of Scenarios 2 and 4 (1.75 days versus 3.5 days). The final recommendation is thus Scenario 4, because of its high average supported queuing time (compared with Scenario 6) and its intermediate percentage of SE iron ore (compared with Scenario 2).

#### **5.9. Sensitivity analysis**

After obtaining the first recommended alternatives, further analyses may be performed through a sensitivity analysis, by changing the weights of the criteria and priorities as well as by generating new alternative solutions. Another option is the reapplication of the MCDA model after the elimination of the less promising alternatives (in this case, Scenarios 1, 5 and 7, which obtained final scores lower than 0.60). Following the removal of these scenarios, the normalized scores are redistributed and the evaluation of the remaining alternatives becomes a more robust process. Although part of the range of evaluation scenarios and possible solutions is lost, the decision-making process certainly becomes more meticulous and accurate.


In addition, sensitivity analysis regarding other aspects such as value functions may be performed in order to observe the behavior and responses of the system as a whole to variations in data inputs. Another point to consider is the participation of several specialists to establish the criteria and their importance weights, as this commitment increases credibility to the study.
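One simple way to implement the weight-sensitivity check is to perturb one criterion weight at a time, renormalize, and observe whether the ranking changes. A sketch with our own helper names, shown here for two scenarios only:

```python
def final_scores(weights, score_table):
    """Weighted-sum score per scenario; weights must sum to 1."""
    return {s: sum(w * v for w, v in zip(weights, vals))
            for s, vals in score_table.items()}

def perturb(weights, i, delta):
    """Shift weight i by delta and renormalize so the sum stays 1."""
    w = list(weights)
    w[i] = max(0.0, w[i] + delta)
    total = sum(w)
    return [x / total for x in w]

# Example: is the winner stable when criterion 1 loses 5 points of weight?
weights = [0.30, 0.15, 0.15, 0.10, 0.08, 0.06, 0.06, 0.05, 0.05]
table = {"Scenario 4": [1.00, 0.83, 0.97, 0.92, 0.50, 0, 0.50, 0, 0.85],
         "Scenario 2": [1.00, 0.86, 0.97, 1.00, 0.00, 0, 0.50, 0, 0.65]}
ranked = sorted(final_scores(perturb(weights, 0, -0.05), table).items(),
                key=lambda kv: kv[1], reverse=True)
print(ranked[0][0])  # Scenario 4: the recommendation is stable here
```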

#### **6. Conclusion**

Firstly, it is important to keep in mind that the DSS methodology proposed in this chapter is not a solution optimizer that necessarily indicates the best decisions to make. This chapter contributes to the decision-making literature by showing how DSSs can bridge the gap between enterprise systems and decision makers. Implementing the proposed DSS would provide companies with a distinct competitive advantage: when following the hybrid methodology steps exemplified, the proposed DSS tool can guide and orient decision making based on technical and practical fundamentals. Moreover, such a methodology may involve a team of several experts in the definition of criteria and weights, corroborating decision-making credibility. In that way, human evaluation capabilities and judgments should never be left alone in any decision-making process.

However, the main conclusion of this study confirms the efficacy of using DES in a broader and more complex environment compared with the development and application of a simplistic model. The proposed DSS tool works as a catalyst that improves the decision-making process, drawing on the capabilities of a DES model and surpassing its shortcomings through the employment of MCDA to allow for further tradeoff studies. The combined DES and MCDA methodology has been shown to be effective as a decision-making support tool for complex logistics problems. Further, the developed DSS tool, with some minor modifications, would be applicable to the evaluation of similar logistics systems.

Merging MCDA and DES can enhance the interaction between model development and users/customers, thereby improving model development and the analysis of the results. This interaction is an important quality issue in DES models, especially those that promote social change, namely those that help users make better decisions [31]. Therefore, quality improvement for DES can be investigated when using MCDA in combination.

Another important contribution of this chapter is the possibility of choosing alternatives based on various relevant and often antagonistic criteria, usually ignored by skewed decision-making processes, which are primarily guided by the financial aspects (costs/income) of each proposed solution. Moreover, in a conventional simulation study, scenario evaluations are usually based on a single criterion (related mostly to operational aspects) and the classification of two scenarios: viable or not viable. By observing, developing and working from a more extensive and complete evaluation perspective, including the participation of several experts in setting criteria and priorities, decision making becomes a more inclusive and trustful process.

#### **6.1. Future research and recommendations**


We believe that there is a great field to be explored in management science and DSSs by applying the presented hybrid methodology, especially in strategic environments such as Supply Chain Management and Supply Chain Strategy. For example, a manufacturing plant could be redesigned based on the company's objectives and the marketing department's intentions for a specific set of products (for further discussion, see [32,33]).


In addition, MCDA has a potential interface with simulation/optimization problems during the definition of the objective function. Converting objectives and values into an input/output assessment framework broadens the scope of the study. In logistics systems, for example, DES has a greater capability to deal with randomness, increasing comprehension of the system and thus generating new ideas and solutions. Further, MCDA improves the visualization, measurement and weighting of values and objectives through a set of attributes, while the methodologies of optimization for simulation (see [34]) enhance the elaboration, comparison and determination of the optimal or most efficient scenario. A framework for this more complex DSS methodology is presented in Figure 15, illustrating how the three methodologies may be implemented together.

**Figure 15.** Framework of the simulation/optimization/MCDA methodology.
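Read as pseudocode, the framework in Figure 15 amounts to a loop in which an optimizer proposes configurations, the DES model evaluates them and the MCDA value model scores the results. The sketch below is schematic; all three callables are user-supplied assumptions, not components defined in the chapter:

```python
def optimize_with_mcda(propose, simulate, mcda_score, budget: int = 50):
    """Schematic simulation-optimization loop guided by an MCDA value model.
    propose(history) -> candidate config; simulate(config) -> KPI dict;
    mcda_score(kpis) -> scalar in [0, 1]. All three are user-supplied."""
    history, best = [], (None, float("-inf"))
    for _ in range(budget):
        config = propose(history)       # optimizer suggests a scenario
        kpis = simulate(config)         # DES evaluates it
        score = mcda_score(kpis)        # MCDA aggregates the criteria
        history.append((config, score))
        if score > best[1]:
            best = (config, score)
    return best
```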


## **Author details**

Thiago Barros Brito\*, Rodolfo Celestino dos Santos Silva, Edson Felipe Capovilla Trevisan and Rui Carlos Botter

*University of Sao Paulo, Department of Naval Engineering, CILIP (Innovation Center for Logistics and Ports Infrastructure), Sao Paulo, Brazil* 

<sup>\*</sup> Corresponding Author

## **7. References**



[1] Altiok, T., Melamed, B. (2007). Simulation Modeling and Analysis with Arena. Academic Press.
[2] Pedgen, C.D., Shannon, R.E., Sadowski, R.P. (1995). Introduction to Simulation Using SIMAN, 2nd Edition. McGraw-Hill, New York.
[3] Chwif, L., Medina, A.C. (2006). Modeling and Simulation of Discrete Events: Theory & Practice, 2nd Edition. São Paulo.
[4] Freitas, P.J.F. (2001). Introduction to Systems Modeling and Simulation. Visual Books Ltd, 2nd Edition, São Paulo.
[5] De Geus, A.P. (1994). Foreword: Modeling to predict or to learn? In: System Dynamics Series, pp. xiii–xv, Productivity Press, Portland, OR.
[6] Morecroft, J.D.W., Sterman, J.D. (1994). Modeling for Learning Organizations. System Dynamics Series, Productivity Press, Portland, OR.
[7] Chwif, L. (1999). Discrete Event Simulation Model Reduction in Design Step: A Causal Approach. PhD Thesis, University of Sao Paulo.
[8] Wack, P. (1985). Scenarios: the gentle art of re-perceiving. Harvard Business Review, September–October.
[9] Zeleny, M. (1982). Multiple Criteria Decision Making. McGraw-Hill Book Company, 1st Edition, New York.
[10] Turban, E., Aronson, J.E., Liang, T.P., Sharda, R. (2007). Decision Support and Business Intelligence Systems. Prentice Hall.
[11] Belton, V., Stewart, J.T. (2001). Multiple Criteria Decision Analysis – An Integrated Approach. Kluwer Academic Publishers, London.
[12] Arnott, D., Pervan, G. (2005). A critical analysis of decision support systems research. Journal of Information Technology, 20 (2).
[13] Holsapple, C.W., Whinston, A.B. (2000). Decision Support Systems: A Knowledge-based Approach. West Publishing, St. Paul, MN.
[14] Smith, E.D. (2006). Tradeoff Studies and Cognitive Biases. PhD Thesis, Arizona University.
[15] Landman, J.R. (1993). The Persistence of the Possible. Oxford University Press, New York, NY.
[16] Franco, L.A., Montibeller, G. (2010). Facilitated Modeling in Operational Research. European Journal of Operational Research 205, pp. 489–500.
[17] Schein, E.H. (1998). Process Consultation Revisited: Building the Helping Relationship. Addison Wesley.
[18] Eden, C. (1982). Problem Construction and the Influence of OR. Interfaces 12 (2), pp. 50–60.
[19] Eden, C., Jones, S., Sims, D., Smithin, T. (1981). The Intersubjectivity of Issues and Issues of Intersubjectivity. The Journal of Management Studies 18 (1), pp. 37–47.
[20] Rosenhead, J., Mingers, J. (2001). A New Paradigm of Analysis. In: Rosenhead, J., Mingers, J. (Eds.), Rational Analysis for a Problematic World Revisited: Problem Structuring Methods for Complexity, Uncertainty, and Conflict. Wiley, Chichester, pp. 1–19.
[21] Eden, C., Sims, D. (1979). On the Nature of Problems in Consulting Practice. OMEGA: The International Journal of Management Science 7 (2), pp. 119–127.
[22] Eden, C., Ackermann, F. (2004). Use of 'Soft OR' Models by Clients: What do they want from them? In: Pidd, M. (Ed.), Systems Modelling: Theory and Practice. Wiley, Chichester, pp. 146–163.
[23] Phillips, L. (1984). A Theory of Requisite Decision Models. Acta Psychologica 56 (1–3), pp. 29–48.
[24] Friend, J., Hickling, A. (2005). Planning Under Pressure: The Strategic Choice Approach, 3rd Edition. Elsevier.
[25] Phillips, L. (2007). Decision Conferencing. In: Edwards, W., Miles, R., Jr., von Winterfeldt, D. (Eds.), Advances in Decision Analysis: From Foundations to Applications. Cambridge University Press, New York, pp. 375–399.
[26] Rosenhead, J., Mingers, J. (2001). A New Paradigm of Analysis. In: Rosenhead, J., Mingers, J. (Eds.), Rational Analysis for a Problematic World Revisited: Problem Structuring Methods for Complexity, Uncertainty, and Conflict. Wiley, Chichester, pp. 1–19.
[27] Montibeller, G., Franco, L.A. (2007). Decision and Risk Analysis for the Evaluation of Strategic Options. In: O'Brien, F.A., Dyson, R.G. (Eds.), Supporting Strategy: Frameworks, Methods and Models. Wiley, Chichester, pp. 251–284.
[28] Saaty, T.L. (2001). Decision Making for Leaders. RWS Publications, Pittsburgh.
[29] Montibeller, G., Franco, L.A. (2008). Multi-criteria Decision Analysis for Strategic Decision Making. In: Handbook of Multicriteria Analysis, Volume 103, Part 1, pp. 25–48. Springer, 1st Edition, Gainesville.
[30] Barcus, A., Montibeller, G. (2008). Supporting the Allocation of Software Development Work in Distributed Teams with Multi-Criteria Decision Analysis. Omega 36, pp. 464–475.
[31] Robinson, S. (2002). General concepts of quality for discrete-event simulation. European Journal of Operational Research 138, pp. 103–117.
[32] Fisher, M.L. (1997). What is the right supply chain for your product? Harvard Business Review, March–April 1997, Harvard Business Review Publishing, Boston, MA.
[33] Mentzer, J.T., DeWitt, W., Keebler, J.S., Nix, N.W., Smith, C.D., Zacharia, Z.G. (2001). Defining Supply Chain Management. Journal of Business Logistics, 22 (2).
[34] Fu, M.C. (2002). Optimization for Simulation: Theory vs. Practice. INFORMS Journal on Computing, 14 (3), Summer 2002, pp. 192–215.


**Section 3** 



## **Applications of Discrete Event Simulation Towards Various Systems**


**Chapter 5** 


## **Human Evacuation Modeling**

Stephen Wee Hun Lim and Eldin Wee Chuan Lim

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/49938

## **1. Introduction**

The modeling of movement patterns of human crowds at the exit point of an enclosed space is a complex and challenging problem. In a densely populated space, if all the occupants are simultaneously rushing for the exits, shuffling, pushing, crushing and trampling of people in the crowd may cause serious injuries and even loss of lives. An analytical study of crowd dynamics through exits may provide useful information for crowd control purposes. Proper understanding of the evacuation dynamics will allow, for example, improvements of designs of pedestrian facilities. In particular, the dynamics of evacuation through a narrow door during an emergency is a complex problem that is not yet well understood. The possible causes for evacuation may include building fires, military or terrorist attacks, natural disasters such as earthquakes, etc. In the light of tightened homeland security, research on evacuation modeling has been gaining impetus and attracting the attention of researchers from various fields.

In the published literature, one of the first computational studies of human evacuation was reported by Helbing *et al*. [1]. They applied a model of pedestrian behavior to investigate the mechanisms of panic and jamming by uncoordinated motion in crowds and suggested an optimal strategy for escape from a smoke-filled room involving a mixture of individualistic behavior and collective herding instinct. Subsequently, two main approaches, referred to as cellular automata or the lattice gas model and the continuum modeling framework, have been pursued by researchers in this field for modeling studies of human evacuation over the last decade. In the cellular automata approach, the computational domain is discretised into cells which can either be empty or occupied by exactly one human subject. Each human subject is then simulated to either remain stationary or move into an empty neighboring cell according to certain transition probability rules. Kirchner and Schadschneider [2] applied such an approach to model evacuation from a large room with one or two doors and observed that a proper combination of herding behavior and use of knowledge about the surroundings was necessary for achieving optimal evacuation times. Perez *et al.* [3] used the same modeling approach and found that in situations where exit door widths could accommodate the simultaneous exit of more than one human subject at any given time, subjects left the room in bursts of different sizes. Takimoto and Nagatani [4] applied the lattice gas model to simulate the evacuation process from a hall and observed that the average escape time was dependent on the average initial distance from the exit. The same conclusion was reached by Helbing *et al.* [5], who applied the same modeling approach and compared escape times with experimental results. Subsequently, the authors extended their lattice gas model to simulate evacuation of subjects in the absence of visibility and found that the addition of more exits did not improve escape time due to a kind of herding effect based on acoustic interactions in such situations [6]. Nagatani and Nagai [7] then derived the probability density distributions of the number of steps of a biased random walk to a wall during an evacuation process from a dark room, the first contact point on the wall and the number of steps of a second walk along the wall. In a following study, the probability density distributions of escape times were also derived and shown to be dependent on exit configurations [8]. Qiu *et al.* [9] simulated escaping pedestrian flow along a corridor under open boundary conditions using the cellular automata approach. It was found that transition times were closely dependent on the width of the corridor and the maximum speed of people but only weakly dependent on the width of doors. More recently, a contrasting mathematical approach for modeling crowd dynamics that is based on the framework of continuum mechanics has also been introduced by some research workers [10]. Such an approach uses the mass conservation equations closed by phenomenological models linking mass velocity to density and density gradients. These closures can take into account movement in more than one space dimension, the presence of obstacles, pedestrian strategies and panic conditions. However, it is also recognized that human evacuation systems do not strictly satisfy the classical continuum assumption [11], and so macroscopic models have to be considered as approximations of physical reality which in some cases, such as low density regimes, may not be satisfactory. Furthermore, such macroscopic models are derived based on the assumption that all individuals behave in the same way, or namely, that the system is homogeneous.

© 2012 Wee Hun Lim and Wee Chuan Lim, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


In the present study, a particle-based simulation approach known as the Discrete Element Method (DEM) was applied for modeling of human evacuation from a room with a single exit. The governing equations used in this method will be presented in the following section.

### **2. Mathematical model**

#### **2.1. Discrete Element Method**

The molecular dynamics approach to modeling of granular systems, otherwise known as the Discrete Element Method (DEM), has been applied extensively for studies of various aspects of granular behavior. The method of implementation in this proposed study followed that used by the author in previous studies of various types of granular systems [12–20]. The translational and rotational motions of individual solid particles are governed by Newton's laws of motion:


$$m_i \frac{dv_i}{dt} = \sum_{j=1}^{N} \left( f_{c,ij} + f_{d,ij} \right) \tag{1}$$

$$I_i \frac{d\omega_i}{dt} = \sum_{j=1}^{N} T_{ij} \tag{2}$$

where $m_i$ and $v_i$ are the mass and velocity of the $i$th particle respectively, $N$ is the number of particles in contact with the $i$th particle, $f_{c,ij}$ and $f_{d,ij}$ are the contact and viscous contact damping forces respectively, $I_i$ is the moment of inertia of the $i$th particle, $\omega_i$ is its angular velocity and $T_{ij}$ is the torque arising from contact forces which causes the particle to rotate.

Contact and damping forces have to be calculated using force-displacement models that relate such forces to the relative positions, velocities and angular velocities of the colliding particles. Following previous studies, a linear spring-and-dashpot model was implemented for the calculation of these collision forces. With such a closure, interparticle collisions are modeled as compressions of a perfectly elastic spring while the inelasticities associated with such collisions are modeled by the damping of energy in the dashpot component of the model. Collisions between particles and a wall may be handled in a similar manner but with the latter not incurring any change in its momentum. In other words, a wall at the point of contact with a particle may be treated as another particle but with an infinite amount of inertia. The normal (fcn,ij, fdn,ij) and tangential (fct,ij, fdt,ij) components of the contact and damping forces are calculated according to the following equations:

$$f_{cn,ij} = -\left(\kappa_{n,i}\,\delta_{n,ij}\right) n_i \tag{3}$$

$$f_{ct,ij} = -\left(\kappa_{t,i}\,\delta_{t,ij}\right) t_i \tag{4}$$

$$f_{dn,ij} = -\eta_{n,i}\left(v_r \cdot n_i\right) n_i \tag{5}$$

$$f_{dt,ij} = -\eta_{t,i}\left[\left(v_r \cdot t_i\right) t_i + \left(\omega_i \times R_i - \omega_j \times R_j\right)\right] \tag{6}$$

where $\kappa_{n,i}$, $\delta_{n,ij}$, $n_i$, $\eta_{n,i}$ and $\kappa_{t,i}$, $\delta_{t,ij}$, $t_i$, $\eta_{t,i}$ are the spring constants, displacements between particles, unit vectors and viscous contact damping coefficients in the normal and tangential directions respectively, $v_r$ is the relative velocity between particles and $R_i$ and $R_j$ are the radii of particles $i$ and $j$ respectively. If $\left|f_{ct,ij}\right| > \tan\phi\,\left|f_{cn,ij}\right|$, then 'slippage' between two contacting surfaces is simulated based on a Coulomb-type friction law, i.e. $\left|f_{ct,ij}\right| = \tan\phi\,\left|f_{cn,ij}\right|$, where $\tan\phi$ is analogous to the coefficient of friction.
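To make the force closure concrete, the following is a minimal two-dimensional sketch of the normal spring-and-dashpot force of Eqs. (3) and (5); the tangential terms are omitted, and the particle properties, spring constant and damping coefficient are illustrative assumptions rather than values from this study.

```cpp
#include <cmath>
#include <cstdio>

struct Vec2 { double x, y; };

static Vec2 sub(Vec2 a, Vec2 b) { return {a.x - b.x, a.y - b.y}; }
static double dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }
static double norm(Vec2 a) { return std::sqrt(dot(a, a)); }

struct Particle {
    Vec2 pos, vel;
    double radius, mass;
};

// Normal contact + damping force on particle i due to particle j:
// f_cn = -(kappa_n * delta_n) n_i and f_dn = -eta_n (v_r . n_i) n_i.
Vec2 normalContactForce(const Particle& pi, const Particle& pj,
                        double kappaN, double etaN) {
    Vec2 d = sub(pj.pos, pi.pos);
    double dist = norm(d);
    double overlap = pi.radius + pj.radius - dist;  // delta_n > 0 when in contact
    if (overlap <= 0.0 || dist == 0.0) return {0.0, 0.0};
    Vec2 n = {d.x / dist, d.y / dist};              // unit normal from i to j
    Vec2 vr = sub(pi.vel, pj.vel);                  // relative velocity
    double fn = -kappaN * overlap - etaN * dot(vr, n);  // spring plus dashpot
    return {fn * n.x, fn * n.y};                    // force acting on particle i
}

int main() {
    // two overlapping particles approaching each other (assumed values)
    Particle a{{0.0, 0.0}, {0.5, 0.0}, 0.25, 60.0};
    Particle b{{0.45, 0.0}, {-0.5, 0.0}, 0.25, 60.0};
    Vec2 f = normalContactForce(a, b, 5.0e4, 50.0);
    std::printf("normal contact force on a: (%g, %g) N\n", f.x, f.y);
    return 0;
}
```

The sign conventions follow Eqs. (3) and (5): with the unit normal pointing from particle $i$ to particle $j$, the spring term pushes the particles apart while the dashpot term opposes their relative approach.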

#### **2.2. Simulation conditions**

The geometry of the computational domain considered in this study was in the form of a room measuring 10 m × 10 m. A single exit located at the center of one of the walls of the room was simulated. The width of the exit was specified as 1 m. A total of 100 human subjects initially randomly distributed within the room were considered. During the evacuation process, each subject was simulated to move generally in the direction of the exit while interacting with other subjects through human-human collisions according to the governing equations of the model.
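As a rough illustration of this set-up, the sketch below initializes 100 agents at random positions in a 10 m × 10 m room and marches them towards a 1 m wide exit at a fixed preferred speed. The contact mechanics of Section 2.1 and the wall interactions are deliberately omitted for brevity, and the preferred speed and time step are assumed values, not parameters of the original study.

```cpp
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

struct Agent { double x, y; bool escaped = false; };

int main() {
    const double room = 10.0, exitX = 5.0, halfExitWidth = 0.5;
    const double speed = 1.0;   // assumed preferred walking speed, m/s
    const double dt = 0.05;     // assumed time step, s
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.5, room - 0.5);

    // 100 subjects initially randomly distributed within the room
    std::vector<Agent> crowd(100);
    for (auto& a : crowd) { a.x = u(rng); a.y = u(rng); }

    int remaining = 100;
    double t = 0.0;
    while (remaining > 0 && t < 600.0) {
        for (auto& a : crowd) {
            if (a.escaped) continue;
            double dx = exitX - a.x, dy = 0.0 - a.y;  // head for exit centre
            double d = std::sqrt(dx * dx + dy * dy);
            a.x += speed * dt * dx / d;
            a.y += speed * dt * dy / d;
            // count the subject as evacuated once it crosses the exit line
            if (a.y <= 0.0 && std::fabs(a.x - exitX) <= halfExitWidth) {
                a.escaped = true;
                --remaining;
            }
        }
        t += dt;
    }
    std::printf("evacuation completed at t = %.1f s\n", t);
    return 0;
}
```

In the full DEM model the inter-subject collision forces of the previous sketch would be added to each agent's equation of motion, which is what produces the clustering and jamming behavior discussed in the next section.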

**Figure 1.** Top view of an evacuation process involving 100 human subjects from a room measuring 10 m × 10 m.

## **3. Results and discussions**

Fig. 1 shows the top view of the evacuation process simulated. The exit of the room was simulated to be located at the centre of the bottom wall. The arrow symbols associated with each subject indicate the instantaneous direction of movement. The subjects were originally distributed randomly throughout the room and it was assumed that each subject sought to reach the exit in the most direct manner while obeying only basic laws of physics as defined by the governing equations of the DEM model. The typical phenomenon of jamming that is ubiquitous in various physical systems, such as the flows of granular materials for example, could be reproduced computationally with such an approach. It can be seen that there was a tendency for the subjects to first cluster round the exit of the room and then spread along the wall where the exit was situated. The limiting factor of the evacuation process in this case was the necessity for subjects to leave the room through the exit one at a time. The speed of movement during the initial stage of the evacuation process to form the human cluster around the exit did not play a significant role in determining the total amount of time required for the entire evacuation process to be completed. In other words, the limiting factor or bottleneck of the overall evacuation process in this case was movement of individual subjects through the exit. This is consistent with observations of other researchers utilizing other modeling approaches, such as cellular automata or the lattice gas model, for simulating such evacuation processes. This points towards the possibility of improving the evacuation time simply by increasing the width of the exit such that more than one subject can exit at any one time or by increasing the total number of exits of the room.
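This bottleneck argument can be made quantitative with a simple capacity estimate. The relation below and the numbers in the example are illustrative assumptions for orientation only, not quantities reported in this study:

$$T_{evac} \approx T_{cluster} + \frac{N}{Q}, \qquad Q \approx \rho\, v\, w$$

where $T_{cluster}$ is the time needed to form the cluster around the exit, $N$ is the number of subjects, and the exit discharge capacity $Q$ scales with the crowd density $\rho$ near the exit, the walking speed $v$ through the exit and the exit width $w$. For instance, assuming $\rho \approx 2\ \mathrm{m^{-2}}$, $v \approx 0.5\ \mathrm{m/s}$ and $w = 1\ \mathrm{m}$ gives $Q \approx 1$ subject per second, so 100 subjects would require on the order of 100 s beyond the clustering time. The roughly linear dependence on $w$ is consistent with the observation that widening the exit or adding exits shortens the evacuation.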

Fig. 2 shows the spatial distribution of collision forces that developed due to human-human collisions during the evacuation process. Here, the color contours indicate high (red) and low (blue) magnitudes of such collision forces. This ability to predict collision forces is a novel feature of the current approach for crowd dynamics modeling that is unavailable in all other approaches reported by other researchers in the literature to date. This will be important for subsequent estimations of the likelihood of the human subjects to sustain injuries as a result of the evacuation process and so will be crucial for casualty predictions. In terms of engineering designs of the interiors of buildings or any enclosed spaces, such predictions can also be applied in a reverse engineering sense with a view towards minimizing human casualties in such events of emergencies.



**Figure 2.** A novel feature of the current approach where collision forces developed due to human-human collisions during the evacuation process can be predicted by the algorithm.

### **4. Conclusions**

An agent based model has been applied for modeling of the human evacuation process in this study. A relatively simple configuration consisting of a room without any obstacles and a single exit was considered and the evacuation of 100 subjects was simulated. The typical phenomenon of jamming that is ubiquitous in various physical systems, such as the flows of granular materials for example, could be reproduced computationally with such an approach. The evacuation process was observed to consist of the formation of a human cluster around the exit of the room followed by departure of subjects one at a time that created a significant bottleneck for the entire process.

The application of the agent based approach to extensive parametric studies of the effects of various engineering factors on the evacuation process, such as the number of human subjects present, the initial configuration of the subjects, the placement and number of exits, the presence of unmovable obstacles, and the size and shape of the enclosed space, will be the subject of a future study.

In particular, in order to study human decisions underlying an evacuation process more closely, a multi-objective evolutionary algorithm for emergency response optimization can be applied. These algorithms are stochastic optimization methods that simulate the process of natural evolution [21]. Such an evolutionary approach is expected to discover and develop human factors and useful psychological models that determine decision-making processes in an emergency context.
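As a rough illustration of how such an evolutionary loop operates, the sketch below evolves a single design parameter (a hypothetical exit width) against a stand-in fitness function. A real application in the spirit of [21] would evaluate multiple objectives (evacuation time, casualty estimate, cost) with a full evacuation simulation; everything here, including the bounds and the fitness model, is an assumption for illustration.

```cpp
#include <algorithm>
#include <cstdio>
#include <random>
#include <vector>

// Stand-in fitness: a hypothetical trade-off in which evacuation time falls
// and construction cost rises with exit width. Higher fitness is better.
double fitness(double exitWidth) {
    double evacTime = 400.0 / exitWidth;
    double cost = 50.0 * exitWidth;
    return -(evacTime + cost);
}

int main() {
    std::mt19937 rng(1);
    std::uniform_real_distribution<double> init(0.5, 4.0);
    std::normal_distribution<double> mut(0.0, 0.2);

    std::vector<double> pop(20);
    for (auto& p : pop) p = init(rng);

    for (int gen = 0; gen < 100; ++gen) {
        // sort by fitness, keep the best half, refill with mutated copies
        std::sort(pop.begin(), pop.end(),
                  [](double a, double b) { return fitness(a) > fitness(b); });
        for (std::size_t i = pop.size() / 2; i < pop.size(); ++i) {
            double child = pop[i - pop.size() / 2] + mut(rng);
            pop[i] = std::max(0.5, std::min(4.0, child));   // clamp to bounds
        }
    }
    std::sort(pop.begin(), pop.end(),
              [](double a, double b) { return fitness(a) > fitness(b); });
    std::printf("best exit width found: %.2f m\n", pop[0]);
    return 0;
}
```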

### **5. Summary**


An agent based model was applied for crowd dynamics simulation in this study. The computational domain consisted of a room without any obstacles and a single exit, and the evacuation of 100 subjects from the room was simulated. The phenomenon of jamming that is typical of such systems was reproduced computationally with this approach. The evacuation process was observed to consist of the formation of a human cluster around the exit of the room followed by the departure of subjects one at a time, which created a significant bottleneck for the entire process. Future work can adopt an evolutionary algorithm to closely predict human decision processes in an emergency context.

## **Author details**

Stephen Wee Hun Lim and Eldin Wee Chuan Lim *National University of Singapore, Singapore* 

#### **Acknowledgement**

This study has been supported by the National University of Singapore.

#### **6. References**

[1] D. Helbing, I. Farkas, and T. Vicsek, "Simulating Dynamical Features of Escape Panic", Nature, vol. 407, pp. 487–490, 2000.

[2] A. Kirchner, and A. Schadschneider, "Simulation of Evacuation Processes using a Bionics-inspired Cellular Automaton Model for Pedestrian Dynamics", Physica A, vol. 312, pp. 260–276, 2002.

[3] G. J. Perez, G. Tapang, M. Lim, and C. Saloma, "Streaming, Disruptive Interference and Power-law Behavior in the Exit Dynamics of Confined Pedestrians", Physica A, vol. 312, pp. 609–618, 2002.

[4] K. Takimoto, and T. Nagatani, "Spatio-temporal Distribution of Escape Time in Evacuation Process", Physica A, vol. 320, pp. 611–621, 2003.

[5] D. Helbing, M. Isobe, T. Nagatani, and K. Takimoto, "Lattice Gas Simulation of Experimentally Studied Evacuation Dynamics", Physical Review E, vol. 67, pp. 067101, 2003.

[6] M. Isobe, D. Helbing, and T. Nagatani, "Experiment, Theory, and Simulation of the Evacuation of a Room without Visibility", Physical Review E, vol. 69, pp. 066132, 2004.

[7] T. Nagatani, and R. Nagai, "Statistical Characteristics of Evacuation without Visibility in Random Walk Model", Physica A, vol. 341, pp. 638–648, 2004.

[8] R. Nagai, T. Nagatani, M. Isobe, and T. Adachi, "Effect of Exit Configuration on Evacuation of a Room without Visibility", Physica A, vol. 343, pp. 712–724, 2004.

[9] B. Qiu, H. Tan, C. Zhang, L. Kong, and M. Liu, "Cellular Automaton Simulation of the Escaping Pedestrian Flow in Corridor", International Journal of Modern Physics C, vol. 16, pp. 225–235, 2005.

[10] V. Coscia, and C. Canavesio, "First-order Macroscopic Modelling of Human Crowd Dynamics", Mathematical Models and Methods in Applied Sciences, vol. 18, pp. 1217–1247, 2008.

[11] N. Bellomo, and C. Dogbe, "On the Modelling Crowd Dynamics from Scaling to Hyperbolic Macroscopic Models", Mathematical Models and Methods in Applied Sciences, vol. 18, pp. 1317–1345, 2008.

[12] E. W. C. Lim, C. H. Wang, and A. B. Yu, "Discrete Element Simulation for Pneumatic Conveying of Granular Material", AIChE Journal, vol. 52(2), pp. 496–509, 2006.

[13] E. W. C. Lim, Y. Zhang, and C. H. Wang, "Effects of an Electrostatic Field in Pneumatic Conveying of Granular Materials through Inclined and Vertical Pipes", Chemical Engineering Science, vol. 61(24), pp. 7889–7908, 2006.

[14] E. W. C. Lim, and C. H. Wang, "Diffusion Modeling of Bulk Granular Attrition", Industrial and Engineering Chemistry Research, vol. 45(6), pp. 2077–2083, 2006.

[15] E. W. C. Lim, Y. S. Wong, and C. H. Wang, "Particle Image Velocimetry Experiment and Discrete-Element Simulation of Voidage Wave Instability in a Vibrated Liquid-Fluidized Bed", Industrial and Engineering Chemistry Research, vol. 46(4), pp. 1375–1389, 2007.

[16] E. W. C. Lim, "Voidage Waves in Hydraulic Conveying through Narrow Pipes", Chemical Engineering Science, vol. 62(17), pp. 4529–4543, 2007.

[17] E. W. C. Lim, "Master Curve for the Discrete-Element Method", Industrial and Engineering Chemistry Research, vol. 47(2), pp. 481–485, 2008.

[18] E. W. C. Lim, "Vibrated Granular Bed on a Bumpy Surface", Physical Review E, vol. 79, pp. 041302, 2009.

[19] E. W. C. Lim, "Density Segregation in Vibrated Granular Beds with Bumpy Surfaces", European Physical Journal E, vol. 32(4), pp. 365–375, 2010.

[20] E. W. C. Lim, "Granular Leidenfrost Effect in Vibrated Beds with Bumpy Surfaces", AIChE Journal, vol. 56(10), pp. 2588–2597, 2010.

[21] P. S. Georgiadou, I. A. Papazoglou, C. T. Kiranoudis, and N. C. Markatos, "Multi-objective Evolutionary Emergency Response Optimization for Major Accidents", Journal of Hazardous Materials, vol. 178, pp. 792–803, 2010.



**Chapter 6** 

## **Discrete-Event Simulation of Botnet Protection Mechanisms**

Igor Kotenko, Alexey Konovalov and Andrey Shorov

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/50101

## **1. Introduction**


The widespread use of computers connected to the Internet, together with their insufficient level of security, allows malefactors to execute large-scale infrastructure attacks that engage a huge number of computing nodes in criminal activity. Attacks of this type have traditionally been performed by botnets, and there are examples of successful large-scale attacks fulfilled by armies of bots. For instance, distributed denial of service (DDoS) attacks aimed at government websites of Estonia in 2007 and Georgia in 2008 made these sites practically inaccessible for several days. In 2009 and 2010 the spying botnets "GhostNet" and "Shadow Network" were discovered in many countries around the world.

Further research on these botnets showed their presence on governmental servers containing important sensitive information. In 2009 the malware "Stuxnet" was discovered, which was capable of affecting SCADA systems and stealing the intellectual property of corporations. The "Worldwide Infrastructure Security Report", published by Arbor Networks in 2010, shows that the total capacity of DDoS attacks grew considerably in 2010 and overcame the barrier of 100 Gbit/sec. It notes that the power of DDoS attacks more than doubled in comparison with 2009 and grew more than tenfold in comparison with 2005.

On this basis, it becomes obvious that modern botnets are a very important phenomenon in network security, and the task of researching botnets and methods of protection against them is correspondingly important. One of the promising approaches to researching botnets and protection mechanisms is simulation.

This paper is devoted to investigating botnets which realize their expansion by network worm propagation mechanisms and perform attacks of the "distributed denial of service" (DDoS) type. The protection mechanisms against botnets are of top priority here. The main results of this work are the development of an integrated simulation environment, including the libraries for implementing models of botnets and models of protection mechanisms. As distinct from the authors' other papers (for example, [12, 13]), this paper specifies the architecture of the integrated simulation environment and describes the set of conducted experiments on simulation of botnets and protection mechanisms against them.

© 2012 Kotenko et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Compared with previous research, the architecture of the integrated simulation environment has been substantially revised: more emphasis has been placed on extending the libraries of attacks and protection mechanisms. In this version of the simulation environment we have also applied a hierarchical, component-based way of representing the architecture. The main attention in the paper is paid to the set of experiments which provided the opportunity to compare protection methods against botnets at different stages of their life cycle.

The rest of the paper is organized as follows. Section 2 discusses the related work. Section 3 presents the architecture of the integrated simulation environment developed. Section 4 considers the attack and defense models. Section 5 contains the description of implementation and main parameters and the plan of the experiments. Section 6 describes the results of experiments. Concluding remarks and directions for further research are given in Section 7.

## **2. Related work**

The current work is based on the results of three directions of research: the analysis of botnets as a phenomenon occurring in the Internet [2, 6, 8, 19, 21], including studies of botnet taxonomy; approaches to creating and improving techniques for counteracting modern botnets; and the enhancement of concepts and methods for efficient modeling and simulation of botnet infrastructure and counteraction.

In the public literature one can find many interpretations of different aspects of botnet functionality. A group of studies related to the analysis of botnets as a network phenomenon defines the botnet lifecycle [6, 21], which consists of several stages: the initial infection and spreading stage, the stage of 'stealth' operation, and the attack stage. Centralized [21] and decentralized [6, 8, 34] kinds of architectures are distinguished as results of the investigation of feasible node roles, and different types of botnet attacks are described.

The investigations devoted to botnet counteraction methods may be conditionally divided into two logical groups: methods based on the identification of predefined signatures [28], and methods which rely on the detection of local and network anomalies [3, 10, 18, 32]. The second group of methods has a significant advantage over the first in its ability to detect unknown threats without specific knowledge of their implementation [15]. On the other hand, the second group is much more resource consuming and more subject to false positive and false negative errors.

Due to significant differences between the stages of the botnet lifecycle, combined protection methods which take into account the specificities of each stage are used extensively.

Defense techniques "Virus Throttling" [37] and "Failed Connection" [4] are used to oppose botnet propagation on spreading stage. Such techniques as Threshold Random Walk [20] and Credit-based Rate Limiting also require consideration.


Among the many types of botnet attacks, we studied botnets which implement DDoS as the actual attack stage, and we considered protection methods for different phases of DDoS attacks. The approaches Ingress/Egress Filtering and SAVE (Source Address Validity Enforcement Protocol) [17] are used as attack prevention mechanisms; they realize filtering of traffic streams for which IP spoofing was detected. Moreover, such techniques as SIM (Source IP Address Monitoring) [23] and Detecting SYN Flooding [35] were taken into consideration as methods for discovering DDoS attacks.
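As an illustration of the anomaly-based flavour of these detectors, the sketch below applies a CUSUM-style test to the normalized difference between SYN and FIN counts per observation interval, loosely in the spirit of SYN flood detection [35]. The interval counts, the expected mean and the alarm threshold are fabricated for the example and are not values from the cited work.

```cpp
#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

int main() {
    // (SYN, FIN) counts per observation interval; the last three intervals
    // imitate a SYN flood, in which SYNs are not matched by FINs
    std::vector<std::pair<int, int>> intervals = {
        {100, 98}, {120, 118}, {110, 109}, {520, 90}, {800, 85}, {900, 80}};

    const double expectedMean = 0.05;  // assumed normal (SYN-FIN)/FIN ratio
    const double threshold = 2.0;      // assumed alarm threshold
    double cusum = 0.0;

    for (std::size_t i = 0; i < intervals.size(); ++i) {
        double syn = intervals[i].first, fin = intervals[i].second;
        double x = (syn - fin) / fin;  // normalized SYN-FIN difference
        // accumulate only positive deviations from the expected mean
        cusum = std::max(0.0, cusum + x - expectedMean);
        std::printf("interval %zu: x=%.2f cusum=%.2f%s\n", i, x, cusum,
                    cusum > threshold ? "  <-- SYN flood suspected" : "");
    }
    return 0;
}
```

The design intuition is that legitimate TCP traffic roughly balances connection openings (SYN) with closings (FIN), so a persistent surplus of SYNs accumulates in the statistic and eventually crosses the threshold, regardless of short-lived bursts.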

We also investigated protection methods intended to detect botnets of different architectures. Botnet architecture is defined by the applied communication protocol. At the present moment, IRC-, HTTP- and P2P-based botnet architectures [21] are important for consideration.

Research on botnet modeling and simulation is based on a variety of methods and approaches. A large set of publications is devoted to analytical modeling of botnets. For instance, a stochastic model of decentralized botnet propagation is presented in [26]. This model represents a botnet as a graph: the nodes of the graph represent botnet states, and the edges depict possible transitions between states. D. Dagon et al. [5] propose an analytical model of a global botnet which describes dependencies between the activities of botnet nodes and the time zones in which these nodes are located.

Another group of studies uses simulation as the main tool to investigate botnets and computer networks in general. Studies in this group mainly rely on methods of discrete-event simulation of processes being executed in network structures [29, 36], as well as on trace-driven models initiated by trace data taken from actual networks [22]. G. Riley et al. [25] use the GTNetS simulation environment to build a network worm propagation model. A. Suvatne [30] suggests a model of "Slammer" worm propagation using the "Wormulator" [14] simulation environment. M. Schuchard [27] presents a simulation environment which allows simulating a large-scale botnet containing 250 thousand nodes. Gamer et al. [7] consider a DDoS simulation tool called Distack, which is based on the OMNeT++ discrete simulation system. Li et al. [17] use their own simulation environment and testbeds to estimate the efficiency, scalability and cost of implementation of the protection mechanism SAVE.

Other techniques which are very important for the investigation of botnets are emulation; the combination of analytical, packet-based and emulation-based models of botnets and botnet defense (on the macro level); and the exploration of real small-sized networks (to investigate botnets on the micro level).

This paper describes an approach which combines discrete-event simulation, component-based design and packet-level simulation of network protocols. Initially this approach was suggested for network attack and defense simulation. In the present paper, as compared with the authors' other works, the various methods of botnet attacks and counteraction against botnets are explored by implementing comprehensive libraries of attack and defense components.


## **3. Simulation environment architecture**

The proposed simulation environment realizes a set of simulation models, called BOTNET, which implement processes of botnet operation and protection mechanisms.

As the context of consideration narrows, these models can be represented as a sequence of internal abstraction layers: (1) discrete event simulation on network structures, (2) a computational network with packet switching, (3) meshes of network services, and (4) attack and defense networks.

The specification of every subsequent layer is an extended specification of the previous one; the enhancement is achieved by adding new entities to the preceding layer of abstraction. The proposed view of the semantic decomposition of BOTNET models is shown in Fig. 1. The hierarchy of abstraction layers reproduces the structure of simulation components and modules.

The simulation environment relies on several libraries, some implemented by the authors of the paper and some provided by third parties. The functionality of each library matches the appropriate layer of abstraction. The library related to the attack and defense networks layer is implemented by the authors. All components of the simulation environment are implemented in the C++ programming language with standard runtime libraries.

**Figure 1.** Hierarchy of Abstraction Layers

The diagram representing the relations between layers of abstraction and implementation layers (libraries) is shown in Fig.2. Each particular library provides a set of modules and components, which are implementations of entities of appropriate semantic layer. Any given library can rely on the components exported by the libraries of the previous layer and can be used as a provider of components needed for the subsequent layer implementation.

The first layer of abstraction is implemented by use of discrete event simulation environment OMNET++ [31]. OMNET++ provides the tools for simulation of network structures of different kinds and processes of message propagation in these structures.
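The essential mechanism underneath this first layer can be illustrated in a few lines of C++: a priority queue of timestamped events executed in time order, with each event free to schedule further events. This is only a toy illustration of the discrete event principle, not OMNET++ code; the two-node "message propagation" and the delay value are invented for the example.

```cpp
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

// An event carries a timestamp and an action; events run in timestamp order.
struct Event {
    double time;
    std::function<void()> action;
    bool operator>(const Event& other) const { return time > other.time; }
};

int main() {
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> queue;
    double now = 0.0;

    // node A sends a message at t = 1.0 s; the link delay is 0.3 s
    queue.push({1.0, [&] {
        std::printf("t=%.1f  node A: message sent\n", now);
        // schedule the arrival at node B as a future event
        queue.push({now + 0.3, [&] {
            std::printf("t=%.1f  node B: message received\n", now);
        }});
    }});

    while (!queue.empty()) {
        Event e = queue.top();
        queue.pop();
        now = e.time;   // advance simulated time to the event timestamp
        e.action();
    }
    return 0;
}
```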


The INET Framework library [11] is used for simulation of packet-switching networks. This library provides components implemented as OMNET++ modules and contains a large variety of models of network devices and network protocols for wired and wireless networks.

Simulation of realistic computer networks is carried out using the ReaSE library [24], an extension of the INET Framework [11]. It provides tools for creating realistic network topologies whose parameters are statistically identical to the parameters of real computer network topologies. ReaSE also includes a realistic model of network traffic, modeled at the packet level [16, 38]. The models of network traffic are based on the approach presented in [33], which allows generating packet-level traffic with parameters that are statistically equivalent to the traffic observed in real computer networks.

Simulation of target domain entities is accomplished through the set of components implemented by the authors. These components are integrated into the BOTNET Foundation Classes library (Fig. 2). This library includes models of network applications belonging to botnets of various types and the appropriate defense methods.

**Figure 2.** Relation between Abstraction and Implementation Layers

On the fourth layer of abstraction, the whole set of target domain components is divided into two groups, related to the attack network and the defense network correspondingly.

The first group contains the components (1) responsible for propagation of the attack network, (2) supporting the attack network at the stage of 'stealth' operation, (3) hampering detection and suppression of the attack network, and (4) executing DDoS attacks.

The defense network group includes the components (1) detecting and suppressing the attack network during every stage of its lifecycle, (2) carrying out management and control of the defense network, and (3) providing robustness of the defense network (these components realize protocols of centralized and decentralized overlay networks).


According to the structure of the scenarios, the behavior of the BOTNET models is defined by a set of conditionally independent network processes. The model of legitimate traffic is based on the approach described in [33] and is implemented by the components of the ReaSE library [24].

## **4. Attack and defense models**

The *attack network model* specifies the set of activities generated by the attack network. In the current work we implemented this model as three relatively independent sub-models: the propagation model, the management and control model, and the attack phase model.

The *model of botnet propagation* implements the scenario of expanding the botnet over the computer network. The main goal of this scenario is to support the ongoing process of involving new nodes in the botnet. In this paper, the scenario is presented as a model of bot-agent code propagation by means of network worms of various types.

The participants of the botnet propagation scenario are the "IP Worm" and "Vulnerable App" components, which are the model of a network worm and the model of a vulnerable network server, respectively (Fig. 3).

**Figure 3.** Static diagram of components involved in the botnet propagation scenario

The component "IP Worm" is a model of client application that responsible for sending malformed network packets in order to compromise possible vulnerable hosts. The parameters of this model are: the algorithm of victim's address generation, the method of source address spoofing, and the frequency which is used to send malformed packets. The payload of the malformed packet includes the network address of the server, which is supposed to be the command center of the growing botnet.

The component "Vulnerable App" is a model of the vulnerable server, which can be infected by receiving a malformed packet.
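The worm model is thus fully characterized by a handful of parameters. As a rough illustration (not the authors' code), the following C++ sketch captures the parameters named above, assuming a simple uniform random scanner; the names `WormConfig` and `nextVictim` are hypothetical.

```cpp
#include <cstdint>
#include <random>

// The "IP Worm" parameters described above (illustrative struct).
struct WormConfig {
    uint32_t rangeStart;     // first IPv4 address of the scanned range
    uint32_t rangeEnd;       // last IPv4 address of the scanned range
    double   packetsPerSec;  // frequency of malformed packets (6 pkt/s in the experiments)
    uint32_t commandCenter;  // address carried in the malformed packet payload
};

// Random scanning: each malformed packet targets a uniformly random
// address inside the predefined range.
uint32_t nextVictim(const WormConfig& cfg, std::mt19937& rng) {
    std::uniform_int_distribution<uint32_t> dist(cfg.rangeStart, cfg.rangeEnd);
    return dist(rng);
}

// The interval between two malformed packets follows from the frequency.
double interPacketGap(const WormConfig& cfg) { return 1.0 / cfg.packetsPerSec; }
```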

The component interaction diagram is shown in Fig.4.


**Figure 4.** Component interaction diagram for the botnet propagation scenario

All communication between the models of network applications is carried out via instant messaging. The core component responsible for the messaging support is the "Worm Activity Director". On receiving a malformed network packet, the model of the vulnerable network service immediately notifies the "Worm Activity Director". The "Worm Activity Director" then decapsulates the addresses of the hosts that are supposed to be command centers and sends a message containing this server information to the "Worm App" component. The receipt of such a message by the "Worm App" component is treated both as the signal to transition to the infecting state and as the instruction to start spreading the worm from the given host. In the current research we implemented TCP- and UDP-based worms and vulnerable network applications.
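The notification chain can be summarized in a few lines. The following C++ sketch is a hedged illustration of that flow; the class and method names are assumptions, and the real components are OMNET++ modules exchanging messages rather than making direct calls.

```cpp
#include <cstdint>
#include <vector>

// Payload of a malformed packet: addresses of the supposed command centers.
struct MalformedPacket { std::vector<uint32_t> commandCenters; };

// Model of the worm client on the freshly compromised host.
struct WormApp {
    bool infecting = false;
    std::vector<uint32_t> commandCenters;
    // Receiving the command-center list is both the signal to enter the
    // infecting state and the instruction to start spreading the worm.
    void startSpreading(const std::vector<uint32_t>& centers) {
        commandCenters = centers;
        infecting = true;
    }
};

struct WormActivityDirector {
    WormApp* worm = nullptr;
    // Called by the vulnerable-service model when a malformed packet arrives.
    void onMalformedPacket(const MalformedPacket& p) {
        worm->startSpreading(p.commandCenters);  // decapsulate and forward
    }
};
```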

The *model of botnet control* implements the botnet control scenario. The goal of this scenario is to provide persistent controllability of the whole set of botnet nodes. Such a process includes procedures for supporting the connectivity of the nodes and methods intended to make the botnet responsive to the commands issued by the bot-master.

The model of botnet control presented in this paper implements two types of architectures.

The first type of botnet is based on the IRC protocol. It is a classic implementation of a centralized botnet with centralized command centers.

The other type of botnet is based on an implementation of a P2P protocol. Such botnets belong to the group of decentralized botnets [34]. The BOTNET Foundation Classes library includes components that are common to both types of architectures and components specific to each of them.


The static diagram of the components involved in this scenario is shown in Fig. 5.

**Figure 5.** Static diagram of components involved in the botnet control scenario

The main components directly responsible for the connectivity of the whole botnet are the models of network applications that implement the corresponding application protocols.

In the case of a centralized network, there is a subset of nodes, the command centers, which are responsible for the management of the rest of the botnet nodes. The model of the network service that corresponds to a command center is represented by the component "IRC Server App". This component is a particular implementation of a generalized model of the network server. It provides a partial implementation of the server side of the IRC protocol, sufficient to reproduce the essential aspects of network behavior. The clients of such a server are the "IRC Client App" components, which are particular implementations of the generalized model of the network client. These components likewise implement the client side of the IRC protocol.

The model of decentralized botnet control is represented by the "P2P Agent App" component, an implementation of the P2P protocol client. Owing to the specifics of the P2P protocol, this component is a particular implementation of both the network server and the network client at the same time.

The essential point is that the components "IRC Client App" and "P2P Client App" are models of network clients that implement the corresponding application layer protocols. Such clients are manipulated by means of the component that represents the model of the user. The commonality inherent in the client models makes it possible to distinguish an invariant interface through which user models manipulate client models. This interface also permits separating the business logic of the network application from the component that implements the communication protocol itself. Thus, the components "IRC Client App" and "P2P Client App" can be used over a variety of different protocols without any change to the business logic of the clients.


The user models are implementations of a generic user model represented by the component "Network App Generic". The generic user model is an abstract model that communicates in the network through the application layer protocol. The user model interacts with the protocol through the "IOverlayNetwork" interface. The specific implementations of the generic user model directly specify the logic of network activity. In the control scenario, two types of specific user models are realized: a bot-master model ("BotMaster") and a model of the bot-agent, which is located on the "zombie" nodes.
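To illustrate the separation described above, here is a minimal C++ sketch of how an invariant interface can decouple user-model logic from the concrete protocol. Only the name "IOverlayNetwork" comes from the text; the methods and classes shown are assumptions.

```cpp
#include <string>

// Invariant interface between user models and protocol clients.
class IOverlayNetwork {
  public:
    virtual ~IOverlayNetwork() = default;
    virtual void send(const std::string& msg) = 0;
    virtual void onReceive(const std::string& msg) = 0;
};

// IRC framing, channel handling etc. would live here.
class IrcClientApp : public IOverlayNetwork {
  public:
    void send(const std::string& msg) override { (void)msg; /* IRC PRIVMSG ... */ }
    void onReceive(const std::string& msg) override { (void)msg; }
};

// Peer lookup, gossip etc. would live here.
class P2PClientApp : public IOverlayNetwork {
  public:
    void send(const std::string& msg) override { (void)msg; }
    void onReceive(const std::string& msg) override { (void)msg; }
};

// A user model (bot-master or bot-agent) talks only to the interface,
// so the same business logic runs over either protocol implementation.
struct BotMaster {
    IOverlayNetwork* net = nullptr;
    void orderAttack(const std::string& victim) { net->send("attack " + victim); }
};
```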

In the present work, a Distributed Denial of Service (DDoS) attack is simulated in the botnet attack scenario. The signal to begin the attack is a special command of the botnet master. The specific implementations of this attack are "TCP Flood" and "UDP Flood". These components are directly related to the component representing the network layer protocols in the node model. Fig. 6 shows the diagram of components that implement the "TCP Flood" attack, i.e., the structure of a node included in the centralized botnet.

The "zombie"-node model consist of: (1) the IRC-client model that realize communication between the node and the rest of the network, (2) the zombie-agent model that implements the communication protocol between the zombie node and the bot-master, and (3) the model of the TCP Flood component that realize the attack of TCP Flood type. The component that implements the TCP Flood attack and the IRC-client use the services provided by the module realizing the network layer protocols (TCP, IP, and UDP).

The component interaction diagram for components involved in the botnet attack scenario is shown in Fig. 7.

The bot-master initiates the command to begin the attack and sends it to the zombies. This command includes the victim address in its data field. The message containing the command is then delivered to the zombies using the network protocols and is passed to the "zombie" components for processing in accordance with the logic of the control protocol.

The zombie component identifies the command, retrieves the victim address and notifies the "IP Flood" component to switch to attack mode. The attack target is the victim node with the given address. The time at which the attack is finalized is determined by the logic of its implementation.
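A hedged sketch of this zombie-side logic in C++ follows (illustrative names and message format; the chapter does not specify the actual command encoding):

```cpp
#include <sstream>
#include <string>

// Model of the flooding component on the zombie node.
struct Flooder {
    bool attacking = false;
    std::string victim;
    void start(const std::string& v) { attacking = true; victim = v; }
};

// Zombie-side handling of a control message: identify the attack command,
// extract the victim address from the data field, and switch the flooder
// to attack mode. The attack ends according to the flooder's own logic.
void onControlMessage(const std::string& msg, Flooder& flooder) {
    std::istringstream in(msg);
    std::string cmd, victim;
    in >> cmd >> victim;                 // e.g. "attack 10.0.3.7"
    if (cmd == "attack" && !victim.empty())
        flooder.start(victim);
}
```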

**Figure 6.** Diagram of components involved in the botnet attack scenario

**Figure 7.** Component interaction diagram for components involved in the botnet attack scenario

For each activity of the attack network, the *defense network* performs the opposite activity, aiming to suppress the corresponding activity of the attack network. Therefore, the defense network model is implemented by the three following sub-models: counteraction of attack network propagation, counteraction of attack network management and control, and counteraction of DDoS attacks. In addition to acting against the attack network, the defense network also performs some steps destined to ensure its own robustness.

The components that realize the entities of the attack and defense networks are given in Table 1.

| Module | Description |
|---|---|
| **Attack network modules** | |
| "Botnet Master" | Model of botmaster application |
| "Bot Client" | Model of zombie client application |
| "Worm" | Model of network worm application |
| "Vulnerable Application" | Model of vulnerable network application |
| "IRC client" | Model of IRC client application |
| "IRC Server" | Model of IRC server application |
| "P2P Agent" | Model of P2P client application |
| "UDP Flooder" | Model of UDP flooding application |
| "SYN Flooder" | Model of SYN flooding application |
| **Defense network modules** | |
| "IRC Monitor" | IRC traffic monitor |
| "Filtering router" | Model of router for filtering network traffic |
| "Failed Connection filter" | Traffic filter based on "Failed Connection" |
| "Worm Throttling filter" | Traffic filter based on "Worm Throttling" |
| "HIPC filter" | Traffic filter based on "Source IP Counting" |
| "Hop-Count Filter" | Traffic filter based on "Hop-Count Filtering" |
| "IRC Relationship filter" | IRC-related traffic filter based on the "Relationship" metric [1] |
| "IRC Synchronization filter" | IRC-related traffic filter based on the "Synchronization" metric [1] |
| "SIMP Filter" | Traffic filter based on "Source IP Address Monitoring" |
| "SAVE Filter" | Traffic filter based on "Source Address Validity Enforcement" |

**Table 1.** Modules of BOTNET Foundation Classes


## **5. Implementation and parameters of experiments**

An example of the simulation environment user interface in one of the experiments is shown in Fig.8. The main panel and the control elements are in the upper-left corner of the user interface. The main panel shows the components included in the BOTNET models.

The control elements allow the user to interact with these components. Model time control elements are optionally presented on the main panel; they allow, for example, executing the model step by step or in the fastest mode. There are also control elements that allow efficient searching for a particular instance and editing its state.

A fragment of the modeled network is also shown in Fig.8 (at the bottom left). Router models are depicted as cylinders with arrows, and host models are represented as computers of different colors. Color represents the node state: blue is used for the incoming nodes that have vulnerabilities, while legitimate nodes without vulnerabilities are not colored. The view window of one of the hosts (at the bottom right) and the editing window of the "router" object parameters (at the upper right) are also shown in Fig.8.

Network topology and configuration are modeled at two levels of detail.


**Figure 8.** User Interface of Simulation Environment

On the first level, the network topology is modeled at the level of autonomous systems (AS). We used the positive-feedback preference (PFP) technique [38] to model the computer network topology at the autonomous system level.

Networks consisting of 30 autonomous systems (AS-level topology) were modeled in the experiments (Fig. 9). To generate the AS-level graph, the following parameters were used: the threshold for treating AS nodes as transit (Transit Node Threshold = 20); the number of connections of a new node (P = 0.4); and the assortativity level of the generated network, which characterizes the degree of preference for nodes depending on their connectivity after a new node is added to the network (Delta = 0.04) [38].

Transit ASs are connected via communication channels with bandwidth dr = 10000 Mbit/s and delay d = 50 ms; limited ASs are connected with dr = 5000 Mbit/s and d = 20 ms.

**Figure 9.** AS-level topology


On the second level, the router-level topology is modeled for each AS (Fig. 10).

In this work we use the HOT model (Heuristically Optimal Topology) [16] with the following parameters: number of routers, from 5 to 20; share of core routers in the total number of routers, 1%; number of hosts per router, from 5 to 12; connectivity level of core routers, 0.2.

Core routers are connected via communication channels with bandwidth dr = 2500 Mbit/s and delay 1 ms; gateways are connected to core routers with dr = 1000 Mbit/s and delay 1 ms; gateways are connected to edge routers with dr = 155 Mbit/s and delay 1 ms; edge routers are connected to servers with dr = 10 Mbit/s and delay 5 ms. Edge routers are connected to client nodes as follows: towards the node, dr = 0.768 Mbit/s with delay 5 ms; from the node, dr = 0.128 Mbit/s with delay 5 ms.

On the basis of the parameters provided above, different networks were generated, including a network with 3652 nodes, which is used for the experiments described here. Ten of these nodes are servers (one DNS server, three web servers and six mail servers), and 1119 nodes (nearly 30% of the total) have vulnerabilities.

Also the node "master" is defined in the network. It works as the initial source of worm distribution and the initiator of botnet management commands. All nodes in the subnets are connected via edge routers. Root router "gateway" is defined in every subnet. Subnets are united via this router. User models, which send requests to the servers, are installed on the client nodes.

This is how legitimate traffic is created. A model of the standard protocol stack is installed on each node; this stack includes the PPP, LCP, IP, TCP, ICMP, ARP and UDP protocols. Models of the network components (which implement the appropriate functionality) can be installed additionally, depending on the node's functional role. The experiments include the investigation of botnet actions and defense activities during the stages of botnet propagation, botnet management and control (reconfiguration and preparation for attacks) and attack execution.


**Figure 10.** Router-level topology

## **6. Experiments**

As part of our research, a set of experiments was performed. They demonstrate the operability of the developed simulation environment and the main characteristics of the botnets and defense mechanisms investigated.

a. Botnet propagation and defense against propagation

At the 100th second of model time, the bot master initiates scanning of the network for vulnerable hosts using one of the network worm techniques. At the same time, it connects to the public IRC server and creates a new communication channel, thus turning into a kind of "command center". In our experiment we set the network scanning frequency to 6 packets per second. Random scanning over a range of predefined IP addresses is used. When a host is compromised, it becomes a "zombie". The "zombie" connects to the public IRC server (the "command center") and reports to the bot master its successful integration into the botnet infrastructure and its readiness to process further orders. The "zombie" also starts scanning the network for vulnerable hosts in the same way as the bot master did initially.

To protect against botnet propagation, a protection mechanism based on "Virus Throttling" is used, with the following parameters: the source buffer contains 300 traffic source addresses and operates on the FIFO principle; one buffer slot is allocated for each new source address; up to 5 authorized destination addresses are stored per source. If the buffer is full, one of its slots can be released every 5 seconds by the FIFO method, allowing a connection to a new remote host. This protection mechanism is installed on routers.
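One plausible reading of these parameters, expressed as a C++ sketch (the container layout and the per-source interpretation of the 5-destination limit are our assumptions, not the authors' implementation):

```cpp
#include <cstdint>
#include <deque>
#include <set>

// One FIFO slot per traffic source, holding its authorized destinations.
struct SourceSlot {
    uint32_t source;
    std::set<uint32_t> authorizedDests;   // at most MAX_DESTS destinations
};

class VirusThrottle {
    std::deque<SourceSlot> buffer;        // FIFO over traffic sources
    static constexpr size_t MAX_SLOTS = 300;  // source buffer size from the text
    static constexpr size_t MAX_DESTS = 5;    // authorized destinations from the text
  public:
    // Returns true if the connection is allowed to pass the router.
    bool allow(uint32_t src, uint32_t dst) {
        for (auto& slot : buffer)
            if (slot.source == src) {
                if (slot.authorizedDests.count(dst)) return true;
                if (slot.authorizedDests.size() < MAX_DESTS) {
                    slot.authorizedDests.insert(dst);
                    return true;
                }
                return false;             // source exceeded its destination quota
            }
        if (buffer.size() >= MAX_SLOTS)
            return false;                 // must wait for the 5-second release
        buffer.push_back({src, {dst}});
        return true;
    }
    // Invoked by a 5-second timer while the buffer is full.
    void releaseOldestSlot() { if (!buffer.empty()) buffer.pop_front(); }
};
```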

Several experiments were performed. Fig.11 shows the dependence of the number of infected hosts on the botnet propagation time for the cases without protection and with protection installed on 30%, 50% and 100% of the routers.

**Figure 11.** Number of Infected Hosts when using Virus Throttling


In the experiments we analyzed the dependence of the false positive rate (FP, when a legitimate packet is recognized as malicious), the false negative rate (FN, when a malicious packet is not detected) and the true positive rate (TP, when a malicious packet is detected) on the botnet propagation time.
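For reference, these rates follow directly from per-packet counters; a minimal sketch (the counter names are illustrative):

```cpp
// Per-packet counters accumulated at the filtering component.
struct DetectionCounters {
    long tp = 0;  // malicious packets detected
    long fn = 0;  // malicious packets missed
    long fp = 0;  // legitimate packets wrongly filtered
    long tn = 0;  // legitimate packets passed
};

// Rates as used in the plots (assumes at least one packet of each class).
inline double tpRate(const DetectionCounters& c) { return double(c.tp) / (c.tp + c.fn); }
inline double fnRate(const DetectionCounters& c) { return double(c.fn) / (c.tp + c.fn); }
inline double fpRate(const DetectionCounters& c) { return double(c.fp) / (c.fp + c.tn); }
```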

It was shown that the numbers of FP and FN differ only slightly when few protection mechanisms are installed (30%) and the source buffer is limited to 300 addresses. This occurs because Virus Throttling passes packets from infected nodes that were previously included in the source buffer but were displaced by new infected node addresses.

When the number of protection mechanisms is increased, the number of FN is significantly reduced.

Fig.12 shows the dependence of the volume of total and filtered traffic, as well as the FP and FN rates, on the botnet propagation time when Virus Throttling is used on 30% (a), 50% (b) and 100% (c) of the routers, respectively.


**Figure 12.** Main Characteristics of Virus Throttling

Fig.12 shows that when we increase the number of nodes with Virus Throttling from 30% to 50%, the filtered traffic also increases, although the total amount of traffic generated by worms decreases slightly. With full (100%) coverage of Virus Throttling, the volume of traffic generated by network worms and the amount of FN are significantly reduced.

Fig.13 shows the dependence of the percentage of filtered legitimate traffic (relative to all legitimate traffic) on the botnet propagation time when Virus Throttling is used on 30%, 50% and 100% of the routers.

**Figure 13.** Volume of Filtered Legitimate Traffic when using Virus Throttling


To investigate several other protection mechanisms, we performed the same set of experiments as for Virus Throttling. For example, for the "Failed Connection" technique, the relatively high levels of TP in all dependencies indicate that this technique allows filtering a large number of malicious packets.

However, Failed Connection does not significantly constrain botnet propagation under the current experimental parameters. Characteristics such as the ratio of vulnerable hosts to legitimate ones, the method of scanning for vulnerable hosts and the decision threshold have a great impact on the quality of this technique.

b. Botnet Management and Protection against the Botnet at this Stage

At this stage of the botnet life cycle, we mainly investigated the protection technique proposed by M. Akiyama et al. [1]. This technique involves monitoring the IRC traffic passing through an observer node and subsequently calculating the metrics "Relationship", "Response" and "Synchronization" based on the content of the network packets. The "Relationship" metric characterizes the distribution of clients in an IRC channel; too high a value of this metric is considered abnormal.

For example, the threshold for this metric can have a value of 30, 100 or 200 clients per channel. If the threshold is exceeded, the packets related to that IRC channel are filtered. The "Response" metric is calculated as the distribution of response times to a broadcast request. The "Synchronization" metric characterizes the synchronism in the group behavior of IRC clients. Let us consider examples of different experiments.

**IRC traffic monitoring and relationship metric calculation.** IRC traffic is monitored by using "observer" components installed on the core routers of network segments. Information about an IRC channel and its clients is obtained by analyzing IRC packets. Then, based on the data obtained, the relationship metrics of the observed channels are calculated in real time. It is expected that the data obtained from the observer components will strongly depend on the location of the observer relative to the main IRC flows, which merge near the network segment containing the IRC server.
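Table 2 below reports the metric as the share of a channel's clients seen by a given observer; under that reading, a hedged sketch of the calculation could look as follows (the names and the exact formula are assumptions):

```cpp
#include <map>
#include <set>
#include <string>

// One observer component installed on a core router.
struct IrcObserver {
    // channel name -> set of client identifiers seen at this router
    std::map<std::string, std::set<std::string>> seenClients;

    void onIrcPacket(const std::string& channel, const std::string& client) {
        seenClients[channel].insert(client);
    }
    // Fraction of the channel's known client population observed here.
    double relationship(const std::string& channel, size_t channelSize) const {
        auto it = seenClients.find(channel);
        size_t seen = (it == seenClients.end()) ? 0 : it->second.size();
        return channelSize ? double(seen) / channelSize : 0.0;
    }
};
```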


Table 2 shows a subset of the observed relationship metrics for IRC channels in various network locations. There are data for the botnet control channel (Irc-bot) and for two channels of legitimate IRC communication (Irc-1 and Irc-2). The number of clients in IRC channel 1 is 10; the number of clients in IRC channel 2 is 9. For the legitimate channels, we observe either full detection of all channel clients or a complete lack of detection. This is because legitimate IRC communication is done through the exchange of broadcast messages; thus, if an observer is situated on the path of the IRC traffic, it finds all the clients of the channel.


| Sensor | Irc-bot | Irc-1 | Irc-2 |
|---|---|---|---|
| sensor\_sas17 | 97.91% | 100.00% | 100.00% |
| sensor\_tas0 | 95.82% | 100.00% | 100.00% |
| sensor\_tas4 | 26.82% | 100.00% | 100.00% |
| sensor\_tas2 | 26.00% | 100.00% | 100.00% |
| sensor\_sas1 | 15.00% | 100.00% | 100.00% |
| sensor\_sas18 | 7.27% | 0.00% | 0.00% |
| sensor\_sas26 | 5.45% | 100.00% | 0.00% |
| sensor\_sas11 | 5.45% | 0.00% | 0.00% |
| sensor\_tas8 | 5.27% | 100.00% | 0.00% |
| sensor\_tas5 | 5.27% | 0.00% | 0.00% |
| sensor\_sas20 | 5.09% | 100.00% | 0.00% |
| sensor\_sas13 | 5.00% | 0.00% | 0.00% |

**Table 2.** Relationship Metrics of IRC Channels

We see a strong differentiation of the observed metrics depending on the observer's network position. This is due to the peculiarities of the communication of botnet clients in the IRC control channel: instead of using broadcast messages directed to all participants in the channel, bots exchange information only with a small number of nodes belonging to the botnet masters. For two routers in Table 2 we observe almost complete detection of the botnet control channel. Analysis of the topology of the simulated network shows that the segment sas17 (sensor\_sas17) contains the IRC server node. The segment tas0, located in proximity to the segment sas17, is transit for the traffic between the IRC server and most of the IRC bots.

Thus, we can suppose that a defense mechanism deployed on a small number of routers that are transit for the main IRC traffic can be as effective as a defense mechanism installed on a larger number of routers. We can also assume that a defense mechanism covering only a small part of the protected network will generally not be efficient, because only a small part of the IRC control traffic passes through the vast majority of routers.

**IRC traffic monitoring and synchronization metric calculation.** In these experiments the IRC traffic is monitored in different network locations. Based on monitoring results, the synchronization metrics are calculated. Let us consider the synchronization metrics determined by monitoring the traffic on the core router of network segment tas0 (Fig.14).

Starting from 200 seconds of simulation time, every 100 seconds we can observe sharp spikes in the traffic volume related to the botnet control IRC channel. These bursts are caused by response messages from zombies to a request from the botnet master.

**Figure 14.** Synchronization Metrics for tas0


Network segment tas0 is located in proximity to the network segment that includes the IRC server. Thus, a significant part of the IRC control traffic is transmitted through the router of network segment tas0. For this reason, the bursts of control channel traffic stand out markedly against the legitimate communication traffic.

Traffic on the router of network segment sas13 was measured (Fig.15) to evaluate the impact of the proximity of the observation point to the IRC server on the severity of the bursts of control traffic (and thus on the discernibility of the synchronization metric).

**Figure 15.** Synchronization Metric for sas13

The traffic measurements show a general decrease of the traffic level at the observation point sas13, as well as good visibility of the traffic spikes on the core router of this network segment. Thus, the experimental results demonstrate the applicability of the synchronization metric for detecting IRC control traffic.

**IRC traffic filtering based on relationship metric.** This filtering method is based on the assumption that IRC channels with a very large number of clients are anomalous. We carried out a series of experiments in which the relationship metric was used with different configurations of filtering components and different critical levels of relationship. It was shown that the efficiency of IRC traffic detection and filtering based on the relationship metric increases sharply when the routers that are transit for the IRC control traffic are fully covered by filtering components.
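A minimal sketch of such a relationship-based filtering rule, assuming the client count per channel is accumulated at the filtering router (illustrative code, not the chapter's implementation):

```cpp
#include <map>
#include <set>
#include <string>

// Drops packets of channels whose observed client count exceeds a
// critical level (30, 100 or 200 in the experiments).
struct RelationshipFilter {
    size_t threshold;                                     // e.g. 30, 100 or 200
    std::map<std::string, std::set<std::string>> clients; // channel -> clients seen

    bool shouldDrop(const std::string& channel, const std::string& client) {
        clients[channel].insert(client);
        return clients[channel].size() > threshold;       // anomalously crowded channel
    }
};
```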


**IRC traffic filtering based on synchronization metric.** This filtering method is based on the assumption that short synchronous messaging in a single IRC channel is anomalous. The observed synchronization metric is calculated as the number of IRC packets passing through the observation point during a fixed period of time. In the experiments performed, the filtering criterion is a fivefold increase in traffic within 20 seconds followed by a return to its original value. The experimental results suggest a low quality of the method in the current configuration, since the false positive rate is rather high.
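The burst-and-return criterion can be sketched as follows; the windowing and baseline handling are our assumptions, since the text only states the fivefold/20-second rule:

```cpp
#include <deque>

// Detects a fivefold rise of the per-interval IRC packet count within a
// 20-second window followed by a return to the baseline.
struct SyncSpikeDetector {
    std::deque<long> window;          // packet counts of the last intervals
    size_t windowLen = 20;            // 20 one-second intervals
    long baseline = 0;

    // Feed one packet count per second of model time; returns true when
    // the burst-and-return pattern is recognized.
    bool onInterval(long packets) {
        window.push_back(packets);
        if (window.size() > windowLen) window.pop_front();
        if (baseline == 0) { baseline = packets > 0 ? packets : 1; return false; }
        bool spiked = false, returned = false;
        for (long c : window) {
            if (c >= 5 * baseline) spiked = true;              // fivefold increase
            else if (spiked && c <= baseline) returned = true; // back to original value
        }
        return spiked && returned;
    }
};
```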

c. DDoS attacks and defense against them

For the experiments described, the DDoS attack module is parameterized as follows: type of attack, SYN flooding; flooding frequency, 10, 30 or 60 packets per second; total number of packets sent by a single host, 1000. The attacks take a web server as the target, so TCP port 80 is the destination port during the attack stage. Source IP address spoofing is performed in some experiments; to implement spoofing, we adopt an address range that is a subset of the address range of the whole network.
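Collecting these attack parameters in one place also makes the per-host attack duration explicit; a small illustrative sketch (the struct name is assumed):

```cpp
// Parameters of the SYN flooding module as described above.
struct SynFloodConfig {
    double packetsPerSec = 30;     // 10, 30 or 60 in the experiments
    long   totalPackets  = 1000;   // packets sent by a single host
    int    destPort      = 80;     // web server as the target
    bool   spoofSources  = false;  // spoofing within a subset of the address range
};

// At 10/30/60 packets per second, a host floods for 100/33/17 seconds.
inline double floodDuration(const SynFloodConfig& c) {
    return c.totalPackets / c.packetsPerSec;
}
```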

Two kinds of defense mechanisms are considered in the description of experimental results: SAVE and SIM.

At the 400th second of model time, the bot master initiates the attack stage by sending a broadcast network message through the IRC server. This message is transmitted to all "zombie" hosts involved in the botnet. The data enveloped in this message specifies the IP address and port of the target host. Every zombie that receives this message extracts this information and uses it as a parameter of its own DDoS-related activity.

Fig.16 shows the number of packets targeted at the victim host relative to the model time while the SAVE method is active. Different portions of the routers are used as hosts for deploying the defense method; the metrics for 30%, 50% and 100% router coverage are shown.

The false positive rate, false negative rate and correct detection rate relative to the model time are shown in Fig.17. These metrics are observed for the traffic passing through the core router of the network to which the victim host belongs; only the SIM method was enabled. Fig.18 depicts the relative estimate of filtered legitimate traffic against the model time. The relative estimate is calculated as the ratio of the number of filtered packets to the number of all packets passing through the SIM defense mechanism. Three cases with different protection coverage (30%, 50% and 100%) were considered.

**Figure 16.** Number of Incoming Packets on Target Host when SAVE Method is enabled

The SIM defense mechanism shows a high level of TP and a very low level of FN in all cases (Fig.16). A tiny spike of FN is observed only at the beginning of the attack stage (Fig.17). The level of FP increases gradually because, from the beginning of the attack stage, packets with unspecified IP addresses are continually dropped. Note that the ratio of filtered legitimate packets can reach 30-40% of the legitimate traffic (Fig.18).

**Figure 17.** Main metrics of SIM

**Figure 18.** Volume of filtered legitimate traffic

d. Comparison with the Results of Emulation

To verify the developed simulation models, we emulated the functioning of small networks, consisting of many nodes, on real computers combined into a network using Oracle VM VirtualBox. Typical software was installed on the emulated computers, and the work of legitimate users and malefactors was imitated. To emulate the botnet, hosts acting as the "master", the "control center" and the "vulnerable computers" were selected. Furthermore, software for monitoring network traffic was installed on the computers.


Using the developed network testbed, we compared the results obtained on the basis of the simulation models with the results of emulation. In cases of discrepancy between the results, the corresponding simulation models were corrected.

To test the adequacy of the simulation models, a network consisting of 20 virtual nodes was built.

Examples of the parameters evaluated for the real network are as follows: packet delay, packet loss, the rate of infection of vulnerable hosts, the number of bots participating in the attack, the load of the victim node during a DDoS attack, etc.

In the normal state of the network, the packet delay for legitimate packets to reach the victim node was from 3 to 7 milliseconds, and packet loss was less than 1%. When we emulated worm spreading, the time to infect 18 nodes was from 2.2 to 2.4 s. When performing a DDoS attack, the victim node received from 174 to 179 packets per second, and packet loss increased to 4%. In the simulation with the same parameters, the packet delay was 5 milliseconds with a dispersion of 2 milliseconds, and packet loss was 1%. The time to infect 18 nodes was about 2.4 s. The number of packets used for the DDoS attack was about 180.
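An adequacy check of this kind can be expressed as a simple tolerance comparison. The snippet below restates the reported figures; the 15% tolerance is an illustrative choice, as the chapter does not state a numeric acceptance criterion:

```python
# Reported emulation vs. simulation figures from the 20-node testbed.
emulated  = {"delay_ms": (3 + 7) / 2, "loss": 0.01, "infect_s": 2.3, "attack_pps": 176.5}
simulated = {"delay_ms": 5.0,         "loss": 0.01, "infect_s": 2.4, "attack_pps": 180.0}

def adequate(em, sim, tol=0.15):
    # Accept the model if every metric deviates from emulation by at
    # most tol (15% here, an assumed threshold).
    return all(abs(sim[k] - em[k]) <= tol * max(em[k], 1e-9) for k in em)

print(adequate(emulated, simulated))   # -> True for these values
```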

#### **7. Conclusion**

The paper suggested a common approach to investigative modelling and packet-level simulation of botnets and defense mechanisms against them in the Internet. We proposed a generalized architecture of a simulation environment aimed at analyzing botnets and defense mechanisms. On the basis of this architecture we designed and implemented a multilevel software simulation environment. This environment includes the discrete event simulation system (OMNeT++), the component for simulation of networks and network protocols (based on the INET Framework library), the component for simulation of realistic networks (using the ReaSE library) and the BOTNET Foundation Classes library, consisting of models of network applications related to botnets and defense against them.

The experiments investigated botnet actions and protection mechanisms at the stages of botnet propagation, botnet management and control (reconfiguration and preparation for attacks), and attack execution. We analyzed several techniques, including Virus Throttling and Failed Connection, to protect against botnets at the propagation stage; botnet propagation was performed via network worm spreading. To counteract botnets at the management and control stage, we researched techniques of IRC-oriented botnet detection based on the "Relationship" metric of particular IRC channels, the metric of the distribution of response times to a broadcast request ("Response") and the metric of botnet group behavior synchronization ("Synchronization"). We also analyzed techniques that work at the different stages of defense against DDoS attacks: SAVE (Source Address Validity Enforcement Protocol), SIM (Source IP Address Monitoring) and Hop-count filtering.


The purpose of this paper is to provide an environment for the simulation of computer networks, botnet attacks and defense mechanisms against them. This simulation environment allows investigating various processes in computer networks: the performance of communication channels and servers, the operation of computer network nodes during different attacks on them, the effect of protection mechanisms on the computer network, and the best strategies for the location and implementation of protection mechanisms.

The developed simulation environment allows changing the main parameters for conducting experiments. These parameters can be adjusted, over a wide range of values, to simulate different types of worms, DDoS attacks, command centers' operation, as well as various protection mechanisms.

By changing the values of the parameters used to model the life cycle of botnets and the protection mechanisms against them, we can generate different types of botnets. For example, it is possible to simulate the spread of a botnet within a few milliseconds, or the case of its blocking at the first stage of operation.

The experiments performed were based on typical parameter values that demonstrate the overall dynamics of the development and operation of botnets and the ability to implement different protection mechanisms. The conclusions derived from the simulation results are generalizable to other cases, where the values of these parameters lie outside the investigated range or differ from those studied.

The developed environment allows building models based on real network topologies, with highly detailed units included in the network, to provide high fidelity of the models.

We suppose that the suggested approach can be used to investigate the operation of different types of botnets, to evaluate the effectiveness of defense mechanisms against botnets and other network attacks, and to choose optimal configurations of such mechanisms.

Future research is connected with the analysis of the effectiveness of botnet operation and defense mechanisms, and with improvement of the implemented simulation environment. One of the main tasks of our current and future research is to improve the scalability and fidelity of the simulation. We are experimenting with parallel versions of the simulation environment and developing a simulation and emulation testbed, which combines a hierarchy of macro- and micro-level analytical and simulation models of botnets and botnet defense (analytical, packet-based, emulation-based) and real small-sized networks.



## **Author details**

Igor Kotenko, Alexey Konovalov and Andrey Shorov *Laboratory of Computer Security Problems, St.-Petersburg Institute for Informatics and Automation of Russian Academy of Sciences, St. Petersburg, Russia* 

## **Acknowledgement**

This research is being supported by grants of the Russian Foundation of Basic Research (project 10-01-00826), the Program of fundamental research of the Department for Nanotechnologies and Informational Technologies of the Russian Academy of Sciences (2.2), State contract #11.519.11.4008 and partly funded by the EU as part of the SecFutur and MASSIF projects.



**Chapter 7** 



## **Using Discrete Event Simulation for Evaluating Engineering Change Management Decisions**

Weilin Li


Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/50579

## **1. Introduction**

Today's hyper-competitive worldwide market, turbulent environment, demanding customers, and diverse technological advancements force corporations that develop new products to look into all possible areas of improvement across the entire product lifecycle management process. One such area, facing both practitioners and scholars, that has been overlooked in the past is Engineering Change Management (ECM).

On the one hand, even though demand has increased for more effective ECM as an important competitive advantage of product development companies, the existing ECM literature focuses mainly on the following topics: i) administrative evaluation that supports the formal EC approval, implementation, and documentation process, ii) ECM in product structure and material resource planning, and iii) change propagation and knowledge management. In addition, with a few exceptions [1, 2, 4, 12, 18, 19, 20, 26], almost all previous research and empirical studies are qualitative and descriptive in nature.

On the other hand, despite a rich body of concurrent engineering literature that emphasizes the iterative nature of the New Product Development (NPD) process, "these models see iterations as *exogenous* and *probabilistic*, and do not consider the source of iteration" [23], which makes the identified rework too general and therefore not sufficient for an effective ECM study. As a result, there is a lack of research–based analytical models to enhance the understanding of the complex interrelationships between NPD and ECM, especially from a systems perspective.

The vision behind this chapter is to ultimately bridge this gap between these two bodies of literature by recognizing the main characteristics of both New Product Development (NPD) and ECM processes, quantifying the interrelated connections among these process features in a Discrete Event Simulation (DES) model (Arena), experimenting with the model under different parameter settings and coordination policies, and finally, drawing decision-making suggestions considering EC impacts from an overall organizational viewpoint.

© 2012 Li, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **2. Background**

#### **2.1. Problem definition**

ECM refers to a collection of procedures, tools, and guidelines for handling modifications and changes to released product design specifications or locked product scope [4, 6, 22, 35]. ECs can be classified into two main categories [4, 5, 11, 13, 27]:

- *Emergent EC* (EEC) originates from problems or errors detected in activity outcomes (i.e., design data and information) that have already been frozen and formally released to the downstream phase. EECs are assumed to occur according to a certain probability determined by the conceptualized *solution uncertainty*;
- *Initiated EC* (IEC) is requested by sources outside the project's control, such as changing market conditions, arising customer requirements, new legislation, or emerging technology advances, at any point along the NPD process, in response to the conceptualized *environmental uncertainty*.

Under this classification scheme, design iterations within an NPD process and *problem–induced* EECs are very similar, but occur in different situations. Both aim at correcting mistakes or solving problems by repeatedly working toward initially set goals that remain unmet. EECs are requested rework of prior activities whose outcomes have already been finalized and released to the next phase, whereas NPD iterations take place before any design information is formally released to downstream phases; it therefore generally takes less time to handle iterations, due to both a smaller rework scope and a shorter approval processing time. For simplicity, the term "*rework*" will be used to refer to both iterations and EECs, unless a specific distinction is required. From another standpoint, *opportunity–driven* IECs arise from new needs and requirements, which result in the addition of functionality to a product [10], or an enlargement of the original design solution scope. A formal assessment and approval process is desirable in handling both types of ECs due to the associated complexity and potential risks [13, 35].
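As a rough data-structure sketch, this classification might be encoded as follows; all names are hypothetical and merely mirror the distinctions drawn above:

```python
from dataclasses import dataclass
from enum import Enum, auto

class ReworkKind(Enum):
    ITERATION = auto()   # pre-release, intra-phase correction
    EEC = auto()         # problem-induced change to released output
    IEC = auto()         # opportunity-driven, externally initiated change

@dataclass
class ChangeRequest:
    kind: ReworkKind
    affected_activity: str
    needs_formal_approval: bool   # True for EECs and IECs
    scope_hours: float            # iterations typically carry less rework

def classify(released: bool, external: bool) -> ReworkKind:
    # External sources drive IECs; otherwise the release status decides
    # between an iteration and an EEC.
    if external:
        return ReworkKind.IEC
    return ReworkKind.EEC if released else ReworkKind.ITERATION
```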

#### **2.2. Context**

ECM problems cannot be studied in isolation. Rather, they need to be addressed within a broader context comprising the following three principal facets: i) complexity, ii) concurrency and uncertainty, and iii) rework and change propagation.

#### *2.2.1. Complexity*

A new product is designed and developed via an NPD process through the efforts of a group of specialists under a dynamic internal and external environment. This DES model brings together the four main elements of complexity associated with design and product development [10], namely product, process, team (/designer), and environment (/user), to examine how iterations and ECs emerge and thus impact NPD project performance, and how they should be effectively managed by applying different coordination policies.

A highly engineered *product* is a complex assembly of interacting components [21, 25]. In the automobile industry, a fairly typical modern vehicle is composed of more than ten thousand manufactured component pieces, supplied by thousands of outside suppliers. In the face of such great quantities of components, complex products cannot be built all at once. They are decomposed into minimally coupled major systems, then further broken into smaller sub–systems of manageable size and complexity, and finally down to separate components or parts for individual detailed engineering design. On the other hand, the integration of interdependent decompositions within and across system(s) into the final overall solution also adds to the level of complexity and requires substantial coordination effort [31].

Similarly, a large, complex Product Development (PD) *process*, through which all the stages of a product's lifecycle occur, is itself a complex system involving hundreds or thousands of interrelated or interacting activities that transform inputs into outputs. As the PD literature shows, tremendous research effort has been devoted to exploring the complexity of PD processes, especially to studying the advantages and disadvantages of the parallel development process (also known as concurrent engineering) and the spiral development process (applied more often in the software industry) as compared with the traditional staged (also known as waterfall or sequential) development process. Some prior research particularly stressed structuring and managing the process by minimizing the interdependencies among tasks via process sequencing optimization [8, 9, 34].

Also, multi–disciplinary *teams* participating in an NPD project are typically composed of numerous decision makers from different functional areas (e.g., marketing, engineering, manufacturing, purchasing, quality assurance, etc.) with varied skill sets (e.g., degree of specialization, depth of knowledge, qualifications, work experience, etc.), responsibilities, and authorities working together and contributing to the achievement of the final product solution. These teams exhibit another set of complex and non–linear organizational behaviors in communication, collaboration, and integration when considering local task decisions as well as task interactions in determining aggregate system performance [28].

Last but not least, an NPD project interacts with its internal (e.g., simultaneous concurrent development of other products within the same organization) and external (e.g., customers/market, competitors, suppliers, and other socio–economic factors such as government regulations, etc.) *environments* throughout the project cycle. The dynamic and sometimes even chaotic competitive environmental factors also contribute significantly to the complexity in the coordination of NPD projects.

#### *2.2.2. Concurrency and uncertainty*


The concept of *concurrent engineering* is characterized by i) the execution of PD tasks concurrently and iteratively, and ii) cross–functional integration through improved coordination and incremental information sharing among participating groups. It has been widely embraced by both academia and industry for the well-documented advantages of NPD cycle acceleration, risk minimization through the detection of design errors in early stages, and overall quality improvement (e.g., [3, 17, 27]). It is one of the process features captured and thoroughly analyzed by the DES model proposed here.


Complexity drives *uncertainty*. Uncertainty is inherent to NPD projects, stemming from all the aspects of complexity associated with creating a new product discussed above. The inherent uncertainty in NPD processes is much greater and, interestingly, much more complicated than that in processes of other kinds (e.g., business or manufacturing processes), even though the latter also possess a certain degree of inherent unpredictability. Types of uncertainty in engineering design include *subjective uncertainty*, derived from incomplete information, and *objective uncertainty*, associated with the environment [37]. Moreover, concurrent processing of NPD activities further increases the uncertainty of an NPD project by starting activities with incomplete or missing input information. In this model, uncertainty is explicitly differentiated into three types: i) low–level *activity uncertainty*, represented by the stochastic activity duration, ii) medium–level *solution uncertainty*, which dynamically determines rework probability, and iii) high–level *environmental uncertainty*, captured by the arrival frequency and magnitude of IECs.

#### **2.3. Rework and change propagation**

Evidence clearly shows that excessive project budget and schedule overruns typically involve significant rework effort [14, 15, 16, 26, 29, 30, 32]. Moreover, Reichelt and Lyneis [32] claim that "these phenomena are not caused by late scope growth or a sudden drop in productivity, but rather by the late discovery and correction of rework created earlier in the project." In this study, primary characteristics of NPD projects are transformed into a DES model to study their relative impacts on the stochastic arrivals of *rework* (i.e., iterations or EECs).

Rework probability, when included in previous PD process models, is typically assigned a fixed value and remains static throughout the process [4, 8, 9, 15, 26]. In reality, however, this is not always the case. In the proposed DES model, rework probability is calculated from a dynamic, evolving solution uncertainty influenced by important feedback effects from other interrelated system variables, such as design solution scope and resource availability. Moreover, rework is usually discussed in the literature at an aggregate level, instead of being categorized into iterations, EECs, and IECs as in this study.

A change rarely occurs alone, and multiple changes can have interacting effects within complex change networks [13]. *Change propagation* is included by considering both dependent product items and interrelated NPD activities. A complex product usually consists of several interrelated major systems, each of which further contains interconnected subsystems, components, and elements. The interactions, in terms of spatial, energy, information, and material relations [31], that occur between the functional and physical items will cause an EC in one product item to propagate to others. Besides a highly dependent product configuration, product development activities are also coupled: an EC may propagate to later activities within the current phase or beyond. For example, an EC that solves a design fault may trigger further changes to downstream activities in the design or production phase.
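A minimal sketch of such propagation, assuming an invented matrix of pairwise propagation probabilities between activities; the activity names and probabilities are illustrative only:

```python
import random

# p[i][j] is the assumed chance that a change in activity i triggers one in j.
PROP = {
    "design":     {"analysis": 0.30, "production": 0.10},
    "analysis":   {"production": 0.20},
    "production": {},
}

def propagate(start, prop=PROP, rng=random.random):
    """Sample one propagation episode through the activity network."""
    triggered, frontier = {start}, [start]
    while frontier:
        src = frontier.pop()
        for dst, p in prop[src].items():
            if dst not in triggered and rng() < p:
                triggered.add(dst)
                frontier.append(dst)
    return triggered

print(propagate("design"))   # e.g. {'design'} or {'design', 'analysis'}
```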

## **3. Causal framework**


Before the actual construction of a computer simulation model that is quantitatively augmented by algebraic relationships among interrelated variables, causal loop diagrams are first constructed to study how external factors and the internal system structure (the interacting variables comprising the system and the cause-and-effect relationships among them) contribute qualitatively to specific behavioral patterns.

**Figure 1.** Feedback Loops of EEC Occurrence

Four feedback loops of various lengths (i.e., the number of variables contained within the loop) that drive EEC occurrence are illustrated in *Fig. 1* for demonstration purposes. Five interdependent variables<sup>1</sup> are considered to form these loops: i) *EEC size*, ii) *solution completeness*, iii) *solution uncertainty*, iv) *Learning Curve Effects (LCE)*, and v) *resource availability*.

#### **3.1. Balancing (negative) loops**

*Balancing Loop 1* **(# of EECs → Solution Completeness → Solution Uncertainty → # of EECs)** depicts the reduction in the number of incoming EECs as a result of handling EECs. This is due to the fact that processing more EECs leads to an increase in the solution completeness of the NPD project, and thus solution uncertainty decreases. Given the assumption that EEC probability decreases exponentially as the project's solution uncertainty decreases, the influence is along the same direction, and therefore the number of EEC occurrences decreases.

The reasoning behind *Balancing Loop 2* **(# of EECs → Learning Curve Effects → EEC Size → Resource Availability → Solution Completeness → Solution Uncertainty → # of EECs)** is that an increase in the occurrence of EECs leads to a reduction in later EEC durations compared with the original level (i.e., the basework duration of that particular activity) because of increasing LCE. As a result, resource availability increases, since less time is taken for completing EECs, which in turn accelerates the rate of solution completeness and thus leads to the decreasing occurrence of EECs.

<sup>1</sup> See *Subsection 4.3.3* for the detailed mathematical definition of variables ii) and iii).


#### **3.2. Reinforcing (positive) loops**

While the explanation of Balancing Loop 2 is based upon the indirect positive impact of EEC occurrence on resource availability through the reduction of later EEC durations owing to learning curve effects, *Reinforcing Loop 3* **(# of EECs → Resource Availability → Solution Completeness → Solution Uncertainty → # of EECs)** can be interpreted through the direct negative influence of EEC occurrence on resource availability: the more EECs occur, the more resources are allocated to process them. As opposed to Loop 2, a decrease in resource availability decelerates the rate of solution completeness, and thus causes an increasing occurrence of EECs.

Despite the indirect effect of EEC size reduction on an accelerating solution completeness rate, which results in an increase in resource availability, a decrease in EEC size also has a direct negative impact on solution completeness, because it contributes less to closing the information deficiency towards the final design solution; this is shown in *Reinforcing Loop 4* **(# of EECs → Learning Curve Effects → EEC Size → Solution Completeness → Solution Uncertainty → # of EECs)**.

#### **3.3. Summary**

The above four closed feedback loops depict how the initial occurrence of EECs leads to subsequent modification of the occurrence frequency, by taking into account other interrelated variables and presenting simple cause–and–effect relationships between them. The combination of both positive and negative feedback loops indicates that the complex and dynamic interrelationships among variables make the prediction of the occurrence patterns of iterations/EECs far from straightforward. This points to the necessity of constructing a simulation model that supports further quantitative analyses.
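As a toy illustration of how these loops could be quantified, the following sketch iterates entirely invented coefficients; the chapter's actual relations are formalized in Section 4:

```python
import math
import random

# Toy quantification of the loops in Fig. 1 (all numbers are assumed).
scope, completeness = 100.0, 0.0
L_f, L_min = 0.8, 0.3          # learning fraction and its floor
base_effort = 5.0              # effort closed by one basework step
attempts, eecs = 1, 0

while completeness < scope:
    uncertainty = 1.0 - completeness / scope
    # Loop 1: EEC probability falls exponentially with uncertainty.
    p_eec = 1.0 - math.exp(-3.0 * uncertainty)
    if random.random() < p_eec:
        eecs += 1
        attempts += 1
        lce = max(L_f ** (attempts - 1), L_min)  # Loops 2/4: learning shrinks EEC size
        completeness += base_effort * lce        # a smaller EEC closes less of the gap
    else:
        completeness += base_effort

print(f"project finished after {eecs} EECs")
```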

## **4. Model description**

#### **4.1. General assumptions**

This model has two constituent sections: *NPD Section with Reworks* and *IEC Section*. The primary model assumptions are listed below.

1. The overall structure of the NPD process can be systematically planned beforehand in an activity–based representation, according to historical data from previously accomplished projects of similar products as well as the teams' expertise. All NPD phases and activities, their expected durations and units of resource required, and the interdependency relationships among them are obtainable and remain stable as the NPD project evolves. Therefore, optimization of process sequencing and scheduling is not pursued by this study.
2. There is no overlapping between activities within the same phase. An NPD activity only receives finalized information from its upstream activity within one phase, but a downstream action can start with information in a preliminary form before all activities in the upstream phase are completed. In addition, there is no information exchange in the middle of an activity.
3. The resource demand of an NPD activity is assumed to be deterministic and fixed. However, the activity duration varies stochastically, subject to activity uncertainty and LCE, which vary depending on the number of attempts at that particular activity.
4. The dynamic progress of an NPD entity is reflected in the work flow within and among NPD phases. Workflow routing is probabilistically altered by either intra–phase iterations or inter–phase EECs according to dynamically updated rework probabilities, which are calculated based on the current value of solution uncertainty (see the sketch after this list).
5. Each IEC is initially associated with a directly affected NPD activity (and a directly affected product item when the product structure is modeled), and may further propagate to any downstream activities based on randomly assigned probabilities. IECs are modeled within a parallel co–flow structure similar to its NPD counterpart. The IEC work flow is restricted by the precedence constraints.
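A minimal sketch of the rework review gate implied by assumption 4, with illustrative weights; the model's actual weighted probabilities are defined later in the chapter:

```python
import random

def review_gate(solution_uncertainty, w_iteration=0.6, w_eec=0.4):
    """Post-activity review: route the NPD entity back for rework with
    probabilities tied to current solution uncertainty (weights assumed)."""
    p_iter = w_iteration * solution_uncertainty
    p_eec = w_eec * solution_uncertainty
    u = random.random()
    if u < p_iter:
        return "intra-phase iteration"
    if u < p_iter + p_eec:
        return "inter-phase EEC"
    return "release to next activity"
```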


Based upon these general assumptions, the notations of important model parameters and variables that will later be used in the mathematical formulation are introduced.

#### **4.2. Notations**

#### *4.2.1. Model parameters*

- $d_{ij}$ – duration of activity $j$ in phase $i$, sampled from the Erlang distribution $ERLANG(\beta, k)$<sup>2</sup>
- $r_{ijm}$ – resource requirement placed on department $m$ by activity $j$ in phase $i$
- $L_f$, $L_{min}$ – learning curve fraction and its minimum value ($0 < L_{min} < L_f < 1$)

#### *4.2.2. Model variables*

- $EN_m$ – total functional effort of planned NPD basework for department $m$
- $(EI_m)_t$ – cumulative functional effort of rework and IECs for department $m$ up to time $t$
- $(I_m)_t$ – cumulative functional effort of the ongoing rework(s)<sup>3</sup> at time $t$
- $(S_m)_t$ – evolving functional design solution scope for department $m$ at time $t$

<sup>2</sup> The Erlang distribution $ERLANG(\beta, k)$ is used as a description of the NPD activity duration.
<sup>3</sup> An aggregate term consisting of the ongoing rework(s)/rework propagations, each corresponding to its current stochastic functional effort value.

$$EN_m = \sum_{i=1}^{I}\sum_{j=1}^{J} e_{ijm} = \sum_{i=1}^{I}\sum_{j=1}^{J} \left(r_{ijm} \times d_{ij}\right) \tag{1}$$

$$(EI_m)_t = \sum_{l=1}^{L_t}\sum_{g=g_{l_1}}^{g_{l_{G_l}}} e_{lgm} + (I_m)_t = \sum_{l=1}^{L_t}\sum_{g=g_{l_1}}^{g_{l_{G_l}}} \left(s_{lgm} \times w_{lg}\right) + (I_m)_t \tag{2}$$

$$(S_m)_t = EN_m + (EI_m)_t \tag{3}$$
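A small numeric illustration of Eqs. (1)-(3) for one department $m$, with two phases of two activities each; all input numbers are invented for demonstration:

```python
# r[i][j]: resource demand rates r_ijm; d[i][j]: activity durations d_ij.
r = [[0.5, 1.0], [0.8, 0.2]]
d = [[10.0, 6.0], [8.0, 12.0]]

EN_m = sum(r[i][j] * d[i][j] for i in range(2) for j in range(2))  # Eq. (1)
EI_m_t = 4.0 + 2.5   # finished rework effort plus ongoing (I_m)_t, Eq. (2)
S_m_t = EN_m + EI_m_t              # evolving solution scope, Eq. (3)
print(EN_m, S_m_t)                 # 19.8  26.3
```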


#### *4.4.1. NPD activity duration and learning curve effects*

Low–level activity uncertainty is represented by random variation of the activity duration around its estimate. For each NPD activity, its duration $d$ is sampled from a pre–determined probability distribution. The Erlang distribution $ERLANG(\beta, k)$ is used as a description of the activity duration. Employing the Erlang distribution to represent the activity interval is based on the hypothesis that each NPD activity consists of $k$ random tasks, each individually having an identical exponentially distributed processing time with mean $\beta$. These mutually independent tasks can be considered the lowest un–decomposable units of the NPD process. The number of tasks $k$ comprising each activity and the anticipated task duration $\beta$ should be estimated by process participants and provided as model inputs.
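Sampling such a duration is straightforward, since $ERLANG(\beta, k)$ is a Gamma distribution with integer shape $k$; a brief sketch with illustrative inputs:

```python
import random

def activity_duration(k: int, beta: float) -> float:
    """Sample an NPD activity duration from ERLANG(beta, k): the sum of
    k independent exponential task times, each with mean beta."""
    # Erlang(k, beta) is a Gamma distribution with integer shape k.
    return random.gammavariate(k, beta)

# e.g. an activity made of 4 tasks averaging 2.5 days each:
samples = [activity_duration(4, 2.5) for _ in range(10000)]
print(sum(samples) / len(samples))   # close to k * beta = 10 days
```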

According to learning curve theory, the more often an activity is performed, the less time it requires to complete and thus the lower its cost. This well–recognized phenomenon is included as a process characteristic to improve the comprehensiveness of this DES model. Following the assumptions made in [9], LCE is modeled in the form of a linearly diminishing fraction ($0 < L_f < 1$) of the original duration whenever an activity is repeated, until the minimum fraction ($0 < L_{min} < L_f < 1$) is hit and the rework processing time remains unchanged afterward. That is to say, the learning curve improves through each round of rework until it reaches the minimum fraction of the basework duration that is indispensable for activity execution. Letting $N_{ij}$ be the number of times an activity is attempted, LCE can be expressed as:

$$LCE = \max\left(\left(L_f\right)^{N_{ij}-1}, L_{min}\right) \tag{4}$$


Therefore, the processing time of a rework to an NPD activity depends on two variables: the stochastic basework duration $d_{ij}$ of the activity and the number of times $N_{ij}$ it is attempted. All types of NPD rework, whether intra–phase iterations or inter–phase EECs, are assumed to be subject to the same LCE.
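A small sketch of how Eq. (4) scales rework duration; the 10-day basework duration is invented, and $L_f = 1/2$, $L_{min} = 0.1$ are the illustrative values used later in Section 5, not general recommendations:

```python
def lce(n: int, l_f: float = 0.5, l_min: float = 0.1) -> float:
    """Learning-curve fraction of Eq. (4): (l_f)^(n-1), floored at l_min.
    The first attempt (basework, n = 1) is not scaled."""
    return max(l_f ** (n - 1), l_min)

def attempt_duration(d: float, n: int) -> float:
    """Processing time of the n-th attempt, given basework duration d."""
    return d * lce(n)

# Hypothetical illustration: a 10-day activity reworked repeatedly.
print([attempt_duration(10.0, n) for n in range(1, 6)])
# [10.0, 5.0, 2.5, 1.25, 1.0] -- the 0.1 floor binds from the 5th attempt on
```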

#### *4.4.2. Overlapping and cross–functional interactions*

*Overlapping* is defined as the partial or full parallel execution of nominally sequential development activities [25]. The underlying risk of overlapping raised by Krishnan, that "the duration of the downstream activity may be altered in converting the sequential process into an overlapped process" [24], is addressed here in a slightly different way from directly increasing downstream duration and effort by a certain calculated value (e.g., [33]). The more activities start with information in a preliminary form or even with missing information, the lower the design solution completeness, which will in turn affect rework probabilities, as discussed in detail in the next section.

The concept of cross–functional integration among different functional areas during an NPD process is defined as *Departmental Interaction* (DI). One of the $M$ departments takes major responsibility for the phase in its own area with specialized knowledge, and is called the *major department* during that phase. However, the other $M-1$ departments, defined as *minor departments*, also need to participate, but with smaller resource requirements. Cross–functional integration enables a decentralized NPD process through facilitated communications among the involved departments. Resource consumption in the form of departmental interaction is, again, an estimate from process participants. Resources can represent staff, computers/machines, documentation, or any other individual servers. It is assumed that each resource is qualified to handle all the NPD activities within all phases.

#### *4.4.3. Solution uncertainty*


In the process modeling literature, NPD is often considered as a system of interrelated activities that aims to increase knowledge or reduce uncertainty of the final design solution [7, 24, 37]. This DES model assumes that any knowledge or experience accumulated through an NPD activity, whether accepted to be transferred to the next activity/activities or rejected and requested for rework, will contribute to the common knowledge base of the NPD project towards its final design solution. No development effort is ever wasted. In this context, knowledge/experience accumulation is simply measured by the cumulative effort that has been committed to the project in terms of person–days.

*Functional solution completeness* is defined as a criterion to reflect the effort gap between the actual cumulative functional effort accomplished to date and the evolving functional design solution scope $(S_m)_t$. The exact expression for $(C_{ijm})_t$ is determined by the amount of overlap between NPD activities. The more concurrency a process holds, the more complicated the expression will be. Eq. (5) is an illustration of solution completeness at time $t$ for the easiest case: a sequential process. $(C_{ijm})_t$ is improved by knowledge or experience accumulation through performing NPD basework (indicated by the first term in the numerator of Eq. (5)) and rework (the second term), and through handling IECs (the third term). Again, a generalized abstract term $(R_m)_t$ is used here to represent the cumulative functional effort of the ongoing rework(s) at time $t$.

$$(C_{ijm})_t = \frac{\left(\sum_{i'=1}^{i-1}\sum_{j'=1}^{J} e_{i'j'm} + \sum_{j'=1}^{j} e_{ij'm}\right) + (R_m)_t + \sum_{l=1}^{L_t}\sum_{g=g_{l_1}}^{g_{l_{G_l}}} e_{lgm}}{(S_m)_t} \tag{5}$$

On the contrary, *functional solution uncertainty* $(U_{ijm})_t$ reflects the degree of functional effort absence towards the design solution scope. Therefore, the solution uncertainty of activity $j$ in phase $i$ at time $t$ is $(U_{ijm})_t = 1 - (C_{ijm})_t$.
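A toy numeric sketch of the two criteria; the person–day figures are invented, not model outputs:

```python
def completeness(committed_effort: float, scope: float) -> float:
    """(C_ijm)_t: cumulative functional effort over the evolving scope (S_m)_t."""
    return committed_effort / scope

def uncertainty(committed_effort: float, scope: float) -> float:
    """(U_ijm)_t = 1 - (C_ijm)_t."""
    return 1.0 - completeness(committed_effort, scope)

# Hypothetical numbers: 4,200 of 7,000 person-days of scope committed so far.
print(completeness(4200.0, 7000.0), uncertainty(4200.0, 7000.0))  # 0.6 0.4
```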

#### *4.4.4. Rework probability*

After each activity, there is a rework review decision point that decides whether the activity output is acceptable and the NPD project entity gets through, or whether it needs to flow back for a rework according to a weighted rework probability determined by the latest levels of functional solution uncertainty. A critical assumption we made is that the *iteration probability* of an activity is negatively proportional to the NPD project's latest level of solution uncertainty. That is, the chance that an activity iterates before it is released to the next phase will increase as the project unfolds, with more information available and its solution uncertainty decreasing. Two arguments are presented to back up this assumption:

1. As the project unfolds, more information will be available to justify further iterative refinement of the design solution for each component [37].

2. Since a product architecture often consists of multiple conflicting targets that may be difficult to meet simultaneously and thus requires further trade–offs, "design oscillations" on a system level may occur due to the interdependencies among local components and subsystems even after the achievement of individual optimum [10, 28].


The functional iteration probability is formulated by a negative exponential function of uncertainty, as it appears in Eq. (6), where $\alpha$ is a process–specific *Iteration Probability Constant* (IPC) that should be determined beforehand as a model input.

$$(PI_{ijm})_t = \alpha^{(U_{ijm})_t + 1} \tag{6}$$


Since NPD activities are decentralized through cross–functional integration among participating departments, so is the decision–making process of carrying out rework. The overall iteration probability of activity $j$ in phase $i$ is the weighted mean by the number of resources each department commits to the activity.

$$(PI_{ij})_t = \frac{\sum_{m=1}^{M}\left(r_{ijm} \times (PI_{ijm})_t\right)}{\sum_{m=1}^{M} r_{ijm}} \tag{7}$$

Similarly, *EEC probability* is characterized by an *EEC Probability Constant* (EPC) $\gamma$. However, as opposed to the iteration probability, it is assumed to decrease exponentially as the project's solution uncertainty decreases. That is to say, the chance of revisiting NPD activities whose outputs have already been frozen and released to their successor phase is highest after the first activity of the second phase, and it continuously reduces according to the continually increasing design solution completeness.

$$(PE_{ijm})_t = \gamma^{(C_{ijm})_t + 1} = \gamma^{2 - (U_{ijm})_t} \tag{8}$$

$$(PE_{ij})_t = \frac{\sum_{m=1}^{M}\left(r_{ijm} \times (PE_{ijm})_t\right)}{\sum_{m=1}^{M} r_{ijm}} \tag{9}$$

Given the overall rework probability, the next step is to identify which upstream activity generated the design error disclosed by the rework review and therefore becomes the starting point of the rework loop. For simplicity, it is assumed that each upstream activity has an equal chance of initiating an intra–phase iteration loop or an inter–phase EEC loop.
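The following sketch strings Eqs. (6)–(9) together for one activity. The values $\alpha = \gamma = 0.3$ and the 60/20/20 resource split are borrowed from the numerical application purely for illustration, and the department-level uncertainty values are invented:

```python
def iteration_prob(u: float, alpha: float) -> float:
    """Departmental iteration probability of Eq. (6): alpha^(U + 1)."""
    return alpha ** (u + 1.0)

def eec_prob(u: float, gamma: float) -> float:
    """Departmental EEC probability of Eq. (8): gamma^(2 - U) = gamma^(C + 1)."""
    return gamma ** (2.0 - u)

def weighted(probs, resources):
    """Resource-weighted mean over departments, as in Eqs. (7) and (9)."""
    return sum(p * r for p, r in zip(probs, resources)) / sum(resources)

# Hypothetical inputs: three departments committing 60/20/20 resource units,
# each perceiving a different level of solution uncertainty.
u_by_dept = [0.5, 0.7, 0.6]
res = [60, 20, 20]
pi = weighted([iteration_prob(u, 0.3) for u in u_by_dept], res)
pe = weighted([eec_prob(u, 0.3) for u in u_by_dept], res)
print(round(pi, 3), round(pe, 3))
```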

#### *4.4.5. Rework criteria and rigidity of rework review*

According to the rationale explained in the previous subsections and the causal loop diagrams created, the occurrences of both iterations and EECs are governed by a combination of balancing and reinforcing loops. Take *Loop 3* as an example: lower resource availability resulting from increasing EEC arrivals will decelerate the rate of solution completeness and further increase the occurrence of EECs.

To avoid the dominance of such reinforcing loops, which would eventually lead to a net effect of overall divergence with no termination condition, *rework criteria* are established as the first step of rework review after the completion of an activity, to check whether the cumulative functional effort committed to the deliverable is high enough to provide a satisfying outcome and therefore let the NPD entity pass rework evaluation. If the cumulative devoted effort fails to meet the pre–determined criteria (i.e., the cumulative effort is less than the expected amount), the entity will be evaluated at the rework decision point and go for iteration or EEC according to the rework probability calculated from solution completeness. If the committed effort is higher than the pre–set amount, the NPD entity will conditionally pass rework evaluation and continue executing the next activity/activities.

Unger and Eppinger [36] define *rigidity* as the degree to which deliverables are held to previously–established criteria, as a metric to characterize design reviews. Put slightly differently, rigidity of rework review is considered in this DES model as the strictness of the pre–defined rework criteria with respect to the amount of cumulative effort committed to a particular NPD activity.

## **4.5. IEC framework**


Unlike iterations and EECs, IECs are studied through a separate DES model section from the NPD framework. The IEC framework explores how IECs emerging from outside sources after the NPD process begins are handled, and how an initiating IEC to a specific activity of a product item causes further change propagation in its downstream activities and other dependent product items.

#### *4.5.1. IEC processing rules*

IECs affecting activities in different NPD phases are assumed to arrive randomly after the NPD project starts. A checkpoint is inserted before the processing of an IEC to verify whether the directly affected NPD activity has started yet. The incoming IEC will be held until the beginning of processing of that particular activity.

During NPD rework reviews, the upcoming NPD activity will also be held from getting processed if there are IECs currently being handled with respect to any of its upstream activities, until new information from these IECs becomes available (i.e., the completion of the IECs). The purpose of such an inspection is to avoid unnecessary rework resulting from expected new information and updates. However, the NPD activity will not pause in the middle of its process due to the occurrence of IECs to any of its upstream activities.

*Fig. 3* summarizes the entire rework review process after the completion of each activity, which includes three major steps as discussed before:

1. Check if there are currently any IECs being handled with regard to any of its upstream activities. If the condition is true, wait until new information from all of these IECs becomes available; if the condition is false, go to the next step;

2. Compare the cumulative devoted functional effort so far to the pre–determined rework criteria. If the condition is true, the work flow conditionally passes the rework review and directly proceeds to the next activity/activities; if the condition is false, go to the next step;


3. As a result of cross–functional negotiation and integration, calculate the rework probability according to the current levels of functional solution uncertainty. The NPD project entity either flows back to the identified activity that contains design errors for rework or proceeds to the next activity/activities, according to this probability.

**Figure 3.** 3-Step NPD Rework Review Process
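A compact sketch of the 3-step review logic in Fig. 3; the function name, its inputs, and the numbers in the example call are hypothetical:

```python
import random

def rework_review(upstream_iecs_open: bool,
                  committed_effort: float,
                  rework_criterion: float,
                  rework_prob: float) -> str:
    """Step 1 holds the entity while upstream IECs are open, step 2 applies
    the rework criteria, step 3 draws against the weighted rework probability."""
    if upstream_iecs_open:
        return "hold until upstream IECs complete"     # step 1
    if committed_effort >= rework_criterion:
        return "conditionally pass rework evaluation"  # step 2
    if random.random() < rework_prob:                  # step 3
        return "flow back for iteration/EEC"
    return "proceed to next activity/activities"

random.seed(7)
print(rework_review(False, 3800.0, 4000.0, rework_prob=0.25))
```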

### *4.5.2. Frequency and resource consumption of IEC*

Compared with NPD activities, which are much more likely to adhere to a planned schedule, IECs can occur without any plan. Therefore, the Exponential distribution is used to represent IECs' arrival intervals. An IEC's processing time is assumed to follow the Triangular distribution, in which there is a most–likely time with some variation on the two sides, represented by the most likely (Mode), minimum (Min), and maximum (Max) values, respectively. The Triangular distribution is widely used in project management tools to estimate activity duration (e.g., Project Evaluation and Review Technique, Critical Path Method, etc.). The amount of resources required for an IEC to be processed is the *IEC effort*. When there are not enough resources available for both processes, a resource usage priority needs to be assigned so that either NPD or ECM seizes the necessary resources first.
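In code, both sampling assumptions are one-liners with Python's standard library; the parameter values below are placeholders, not calibrated inputs:

```python
import random

def iec_interarrival(mean_days: float) -> float:
    """Unplanned IEC arrivals: exponentially distributed inter-arrival time."""
    return random.expovariate(1.0 / mean_days)

def iec_processing(min_d: float, mode_d: float, max_d: float) -> float:
    """IEC processing time: Triangular(Min, Mode, Max).
    Note that random.triangular takes the mode as its third argument."""
    return random.triangular(min_d, max_d, mode_d)

# Hypothetical parameters for illustration only.
random.seed(1)
print(round(iec_interarrival(15.0), 1), round(iec_processing(2.0, 5.0, 10.0), 1))
```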

#### *4.5.3. IEC propagation*

*Change Propagation* (CP) is assumed to be rooted in both the interrelated *activities* of a PD process and the closely dependent constituent product *items*. That is, modifications to an initiating activity of one product item are highly likely to propagate to other activities within the same or different stages along the PD process, and may require further changes to other items that are interconnected through design features and product attributes [23].

This phenomenon is simulated by two layers of the *IEC propagation loop*. Firstly, CP review decisions are performed after the completion of an IEC, and the change then propagates to one of its downstream activities with predefined probabilities. We assume uni–directional change propagation based on process structure. That is, an IEC to one NPD activity will propagate only to its successor activities within the current or next phase. For example, an IEC to enhance a particular design feature may result in substantial alterations in prototyping and manufacturing. On the other hand, innovations in the manufacturing process will only cause modifications within the production phase, but not changes in design.

Secondly, the first–level activity IEC propagation loop is then nested within an outer loop determined by the particular dependency properties of the product configuration. Once an IEC to one product item and its CPs to the affected downstream activities are completed, it will further propagate to the item(s) directly linked to it.
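The first propagation layer can be sketched as follows for a 3-phase, 3-activity process like the one in Section 5. The single propagation probability constant and the uniform choice among candidates are simplifying assumptions of this sketch (the model's CP review uses predefined, activity-specific probabilities):

```python
import random

def downstream_candidates(current: int, per_phase: int = 3, n_total: int = 9):
    """Successor activities within the current or next phase, reflecting the
    uni-directional change propagation assumption (indices 0..8 = C1..P3)."""
    last = min((current // per_phase + 2) * per_phase, n_total) - 1
    return list(range(current + 1, last + 1))

def propagate_iec(start: int, p_propagate: float) -> list:
    """First-layer IEC propagation loop: after each IEC completes, a CP review
    passes the change to one downstream candidate with probability p_propagate
    (a hypothetical constant), until no further change is identified."""
    chain, current = [start], start
    while downstream_candidates(current) and random.random() < p_propagate:
        current = random.choice(downstream_candidates(current))
        chain.append(current)
    return chain

random.seed(3)
print(propagate_iec(start=0, p_propagate=0.5))  # e.g., an IEC chain from C1
```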

## **5. Numerical application**

A numerical example is presented in this section to illustrate how this DES model can actually be applied to facilitate ECM policy analysis. A combination of different process, product, team, and environment characteristics is tested through design of experiments. NPD project lead time, cost (or engineering effort in some cases), and quality are generated by the model as the three key performance measurements of the project under study, to evaluate overall product development efforts.

#### **5.1. NPD section**


The NPD section is demonstrated by a simple application of three representational phases of an NPD process: i) concept design and development (*Concept*), ii) detailed product design (*Design*), and iii) production ramp up (*Production*). Each phase consists of three sequentially numbered and chronologically related activities. The information flow between every two activities is indicated by solid arrows as shown in *Fig. 4*.

**Figure 4.** NPD Framework with Iterations and EECs

Through this 3–phase and 3–activity framework, various overlapping ratios of an NPD process (0%, 33%, 66%, or mixed, e.g., 0% overlap between Concept and Design and 33% overlap between Design and Production) can be constructed by connecting inter–phase activities via different combinations of dashed arrows.


#### **5.2. Overlapping strategy**

An NPD process with 0% overlapping is also called a *sequential* process, in which the downstream phase is allowed to start only after receiving the output information from the upstream phase in its finalized form. That is, different phases comprising an NPD process are connected in a completely linear fashion.

Besides its capability of representing a sequential process, this framework can also be assembled into *concurrent* processes by allowing the parallelization of upstream and downstream activities*.* For a 33% overlapped process, the first activity of downstream phase begins simultaneously with the last activity of upstream phase. For a 66% overlapped NPD process, the first activity of the following phase starts simultaneously with the second activity of the preceding phase.

**Figure 5.** NPD Framework with Iterations and EECs

Obviously, as compared to its counterpart in a sequential process, the solution uncertainty of downstream activity increases due to the fact that it begins before the completion of all upstream activities using only preliminary output information, while the solution uncertainty of the upstream activity remains unchanged. That is, only the solution uncertainty of overlapped activities in succeeding phases will be affected under the current model assumptions.
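Under this 3-activities-per-phase framework, the overlap ratio maps mechanically to the upstream activity alongside which the next phase starts. A small sketch of that mapping (the helper names are ours, not the chapter's):

```python
def overlapped_activities(overlap_ratio: float, acts_per_phase: int = 3) -> int:
    """Number of upstream activities still running when the next phase starts:
    0 for a sequential process, 1 at 33% overlap, 2 at 66% overlap."""
    return round(overlap_ratio * acts_per_phase)

def downstream_start_alongside(overlap_ratio: float, acts_per_phase: int = 3):
    """1-based index of the upstream activity alongside which the downstream
    phase begins (None for a purely sequential hand-off)."""
    k = overlapped_activities(overlap_ratio, acts_per_phase)
    return None if k == 0 else acts_per_phase - k + 1

for ratio in (0.0, 1 / 3, 2 / 3):
    print(ratio, downstream_start_alongside(ratio))
# 0.0  -> None (sequential)
# 0.33 -> 3 (starts with the last upstream activity)
# 0.67 -> 2 (starts with the second upstream activity)
```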

#### **5.3. NPD process parameter**


When considering the activity duration estimates, it is further assumed that the mutually independent and exponentially distributed task duration has a mean of $\beta = 2$ days for activities in all three phases. Furthermore, the number of tasks that compose activities within one phase remains the same, but increases from phase to phase to represent the increasing content and complexity of design and development activities as the NPD project unfolds: $k = 4$ for activities in the Concept phase, an intermediate value for the Design phase, and $k = 10$ for the Production phase. Note that when LCE is taken into account, the random variables described by the Erlang distribution $ERLANG(\beta, k)$ only represent the processing intervals of NPD basework. Rework duration is also subject to $N_{ij}$, the number of times that an activity is attempted, in the form of

$$LCE = \max\left(\left(\frac{1}{2}\right)^{N_{ij}-1}, 0.1\right)$$

To match the three major phases of the illustrated NPD process, it is assumed that there exist three different functional areas: *marketing*, *engineering*, and *manufacturing*, which participate in the overall NPD process through integrated DI. Based on the model assumption that each activity consumes a total of 100 resource units to complete, DI is defined as follows: 60 units are requested from the major department and 20 units from each of the other two minor departments. To estimate the final project cost, the busy usage cost rate is set at *\$25/hour* and the idle cost at *\$10/hour* for all resources.
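Given those rates, the final project cost reduces to a busy/idle weighted sum of resource hours; the hour totals in this sketch are invented for illustration:

```python
def project_cost(busy_hours: float, idle_hours: float,
                 busy_rate: float = 25.0, idle_rate: float = 10.0) -> float:
    """Final project cost from total resource busy/idle hours, at the stated
    $25/hour busy and $10/hour idle rates."""
    return busy_hours * busy_rate + idle_hours * idle_rate

# Hypothetical totals aggregated over all resources, for illustration only.
print(project_cost(busy_hours=250_000.0, idle_hours=80_000.0))  # 7050000.0
```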

Different rigidities of rework review, represented by various rework criteria ratios (i.e., relationships between the rework criteria and the evolving functional design solution scope $(S_m)_t$), will be explored in more depth through "what–if" analysis.

#### **5.4. IEC section**

*Fig. 5* gives an overview of the IEC model section applying the 33% overlapping strategy. It is assumed that an IEC will propagate to one of its downstream activities in the current or next phase with equal chances, and this propagation will continue in the same manner until the end of the IEC propagation loop, when no more changes are identified. For the purpose of demonstration, a full list of the potential downstream change propagations of each IEC is provided on the right side of the *IEC Propagation* decision point. In the actual simulation model, the verbal description is replaced by connectors between the IEC propagation decision point and the corresponding IEC process modules.

Take the IEC to activity Concept1 as an example: change propagation will result in a maximum of six follow–up IECs (i.e., IECs to C2, C3/D1, D2, D3/P1, P2, and P3) and a minimum of two (i.e., IECs to C3/D1/D2 and D3/P1/P2/P3). For simplicity, it is also assumed that each IEC, no matter in which activity it occurs, consumes 10 resource units from each of the three departments to get processed.



## **6. Results**

Impacts of the following managerial strategies and coordination policies on the responses of interest are investigated, and the root causes behind the performance of the measurement system are explored:

a. Impact of NPD process characteristics such as LCE, Rework Likelihood (RL) and *Overlapping Strategy* (OS);
b. Impact of rework review rigidity – *Rework Review Strategy* (RRS)<sup>7</sup>;
c. Impact of IEC arrival frequency;
d. Combined impact of IEC arrival frequency and size – *IEC batching policy*;
e. Impact of functional resource constraints – *resource assignment strategy*;
f. Impact of change propagation due to interconnected product configuration.



Due to space limits, only partial results of policy analysis (a) are presented, to demonstrate how the proposed DES model can be used as a valuable tool for evaluating ECM decisions. 200 *replicates* are generated under each combination of LCE, RL, and OS, resulting in altogether 2400 simulation runs, each using separate input random numbers. Performance data generated by the model are then exported to an Excel worksheet, in which individual project performance measures are recorded and various data graphs are generated.

Mean values of the experiment outcomes are displayed in *Table 3*. Columns *(i)* and *(ii)* record, in an absolute sense, the mean values of the observed lead time and project cost from the 200 replications of each scenario, while columns *(I)* and *(II)* show the percentage change of *(i)* and *(ii)* relative to the baseline case results *(BL1)*, respectively.

| *LCE* | *RL* ($\alpha$, $\gamma$) | *OS* | **(i) Lead Time (Days)** | **(I) Time %Change c/w BL1** | **(ii) Project Cost ($ × 1000)** | **(II) PC %Change c/w BL1** |
|---|---|---|---|---|---|---|
| **(BL1) Baseline** | No rework | **(a)** 0% | 119 | – | 7,168 | – |
| | | **(b)** 33% | 101 | – | 7,168 | – |
| | | **(c)** 66% | 81 | – | 7,169 | – |
| **(A)** No LCE | **(1)** *Low* $\alpha = \gamma = 0.3$ | **(a)** 0% | 158 | 32.0% | **10,781** | 48.2% |
| | | **(b)** 33% | 160 | 58.9% | 11,778 | 61.9% |
| | | **(c)** 66% | **131** | 62.6% | 12,107 | 66.6% |
| | **(2)** *High* $\alpha = \gamma = 0.45$ | **(a)** 0% | 176 | 47.2% | **11,948** | 64.2% |
| | | **(b)** 33% | 192 | 90.4% | 14,542 | 99.8% |
| | | **(c)** 66% | **162** | 100.1% | 14,927 | 105.4% |
| **(B)** $LCE = \max((1/2)^{N_{ij}-1}, 0.1)$ | **(1)** *Low* $\alpha = \gamma = 0.3$ | **(a)** 0% | 141 | 17.6% | 9,542 | 33.1% |
| | | **(b)** 33% | 129 | 28.1% | 9,436 | 31.6% |
| | | **(c)** 66% | **106** | 31.0% | **9,185** | 28.1% |
| | **(2)** *High* $\alpha = \gamma = 0.45$ | **(a)** 0% | 152 | 27.2% | **10,370** | 44.7% |
| | | **(b)** 33% | 158 | 56.6% | 12,044 | 68.0% |
| | | **(c)** 66% | **121** | 49.2% | 11,037 | 54.0% |

**Table 3.** Project Performance under the Impact of OS, RL and LCE

It is important to note that managerial suggestions are not made merely on the basis of the final output performance measures obtained for each scenario. Rather, attention is focused on the comparison of these numbers to their corresponding baseline results, which helps to provide an intuitive understanding of the impacts of reworks on project performance under different process features and parameter settings. Through the interpretation of the results presented in *Table 3*, several concluding observations can be issued:


<sup>7</sup> The first two strategies are analyzed with only the NPD section of the model.

<sup>8</sup> The model is continuously verified by reading through and examining the outputs for reasonableness and justification under a variety of scenarios and parameter settings.

1. When rework is not involved, the project performance stays consistent: the higher the activity overlapping ratio, the shorter the lead time, which can be obtained by summing up the durations of activities along the critical path. At the same time, since the total person–days of effort required for completing the project remain unchanged no matter which OS is applied, the final project cost for all levels of *OS* (i.e., *(a)*, *(b)*, and *(c)*) in the baseline case should be very similar, which is confirmed by the running results. This can be considered a simple model verification check<sup>8</sup>.

2. Effects of LCE: by comparing the mean values of lead time and project cost of scenarios *(A)* with scenarios *(B)* under different combinations of RL and OS levels, it can be concluded that the evaluation of learning curve effects unambiguously results in a remarkable decrease in both NPD lead time and cost.

3. Effects of RL: by comparing *(i)* and *(ii)* of scenarios *(1)* with scenarios *(2)* under different combinations of LCE and OS levels, it can be concluded that a higher likelihood of rework in NPD activities undoubtedly causes an increase in both lead time and cost.

4. Effects of OS w/o LCE: by comparing the lead time and project cost of scenarios *(A)* in a relative sense, we find that an increasing overlapping ratio aggravates the impact of NPD rework on both responses. That is, when NPD rework is included in the model but no LCE is considered, the greater the overlapping ratio, the higher the percentage increases in both lead time and project cost as compared to the baseline case. In addition, we notice the time–cost tradeoffs between a sequential process and a 66% overlapped process from columns *(i)* and *(ii)*. This observation agrees with the general acknowledgement that overlapping may save time but is more costly.

5. Effects of OS w/ LCE: the situation is not as predictable when LCE is taken into account and formulated as $LCE = \max((1/2)^{N_{ij}-1}, 0.1)$ in the model. The significant increase of both time and cost due to rework is alleviated by the evaluation of LCE. Under low RL circumstances ($\alpha = \gamma = 0.3$), a highly overlapped process excels in both response variables in an absolute sense. However, no clear trend is shown in the comparative values. Particularly, at the high level of RL ($\alpha = \gamma = 0.45$), we observe that a 33% overlapped process leads to both absolute (compared with the results of 0% and 66% in scenario *(B)–(2)*) and relative (compared with the 33% baseline results *(BL1)–(b)*) maximum values for lead time and project cost.

6. By comparing columns *(I)* and *(II)*, we observe a project behavioral pattern: the percentage increase of project cost is always higher than that of lead time at the occurrence of rework. That is to say, compared with lead time, project cost is more sensitive to rework. The difference between the two percentage increases is largest when a sequential NPD process is adopted. The only exception is scenario *(B)–(1)–(c)*, where the percentage increase of project cost is 0.9% lower than that of lead time.



After investigating the project cost performance, which reflects the overall effort devoted to the NPD project, how the amount of functional effort contributed by each participating department is affected by different LCE, RL, and OS levels is further examined. Three major conclusions can be drawn by breaking down the overall committed effort into the functional effort contributed by each department:


1. From *Fig. 7-a*, we observe that the differences between the committed effort from the major department (i.e., Mfg Effort) of the downstream phase (i.e., the Production phase) and the efforts devoted by the other two departments (i.e., Mkt Effort & Eng Effort) drop dramatically from a sequential process *(a)* to concurrent processes *(b)* and *(c)*, regardless of LCE or RL levels.

2. Moreover, from a relative perspective (*Fig. 7-b*), the percentage increase of Mfg Effort versus baseline is higher than those of Mkt and Eng Efforts in all sequential processes but *(A)–(2)–(a)*, in which Mfg Effort %Change = *72.5%* and is slightly lower than Mkt Effort %Change = *75.6%*. However, in concurrent processes, an inverse relationship, but of a much greater magnitude (especially at the high RL level), is observed. That is, by starting downstream activities early with only preliminary information, concurrent engineering tends to alleviate the impacts of rework on activities in the Production phase while intensifying those on activities in the two upstream phases. Although the concept of *cross–functional integration* has already been applied to the sequential process, allowing engineers from the Mfg Dept to be engaged early in both the Concept and Design phases (which differentiates it from a traditional waterfall process), the impact of rework mostly occurs in the Mfg Dept. A concurrent process tends to shift rework risks and even out committed efforts among the various functional areas owing to another critical characterization of concurrent engineering: *parallelization of activities*.

3. Mkt Effort undergoes the highest percentage increase when RL changes from low to high, regardless of LCE or OS levels, followed by Eng Effort. Mfg Effort has the least amount of fluctuation across different scenarios.


**Figure 7.** (a/b). Overall/ Percentage Change of Functional Effort Devoted


To better visualize the correlations between lead time and effort, scatter plots of 200 model replicates' lead time and total effort outcomes under different levels of OS and RL are demonstrated in *Fig. 8*. Red lines in the plots indicate the lead time and total effort required for BL1 baseline cases (an "ideally executed" project without accounting for rework).


**Figure 8.** (a/b). Scatter Plots of the RL Impact on Different OS

We can clearly observe that a majority of replications exceed the lead time and effort of *BL1* by a considerable amount because of rework. Furthermore, as the overlapping ratio and the rework probability constants (α for IPC and γ for EPC) increase, there is also a notable increase in the number of replicates that are off the trend line. This phenomenon reveals that a high overlap ratio of upstream and downstream activities, combined with a high likelihood of unanticipated activity rework that requires additional resources, will result in a strong tendency for NPD projects to behave in an unstable and unpredictable manner and lead to unforeseen departures from the predetermined baseline plan. Also note that there exist possibilities where the total effort, the lead time, or both are smaller than those required for the respective baseline cases, which is due to the stochastic nature of the model inputs (i.e., random inputs of activity duration, rework probabilities, etc.).

## **7. Conclusion**


### **7. Conclusion**

This research proposes a comprehensive discrete event simulation model that captures different aspects of PD project-related (i.e., product, process, team, and environment) complexity to investigate their resultant impacts on the occurrence and magnitude of iterations and ECs that stochastically arise during the course of an NPD project, and how the multiple dimensions of project performance, including lead time, cost, and quality, are consequently affected. In addition to integrating several critical characteristics of PD projects that have previously been developed and tested (e.g., a concurrent and collaborative development process, learning curve effects, resource constraints), this research introduces the following new features and dynamic structures that are explicitly modeled, verified, and validated for the first time (an illustrative sketch of features 2–5 follows the list):

1. This DES model *explicitly distinguishes between two different types of rework by the time of occurrence*: intra–phase iterations and inter–phase engineering changes (ECs). Moreover, *engineering changes are further categorized into two groups by their causes of occurrence*: emergent ECs, which "are necessary to reach an initially defined standard in the product" [13], and initiated ECs, which arise in response to new customer requirements or technology advances.
2. *Uncertainty is differentiated and conceptualized into three categories*. Activity uncertainty is reflected in stochastic activity durations drawn from probability distributions, and environmental uncertainty is primarily modeled by the arrival frequency and magnitude of IECs. Solution uncertainty, in particular, is an important model variable that dynamically determines the rework probability, as discussed next.
3. This study provides presumably the first attempt to integrate cause–and–effect relationships among project variables into a DES model of PD projects. A traditional DES model deals only with static project features in an "open–loop, single–link" causal relationship format [14] that remains constant as the model evolves. *Rework probability is no longer pre–determined* and fixed over the entire time frame of the NPD process, as in most previous studies; instead, it is calculated in real time by the model itself. That is, rework probability is now embedded in a *feedback structure* that changes over time in response to the project's evolving uncertainty levels.
4. The specific three–step *rework review process structure*, together with the *rigidity of rework reviews*, allows more explicit and detailed modeling of this critical aspect of ECM, which has not been attempted by previous studies. Decision points are used with rules to conditionally process ECs; they also give users the flexibility to define one or more rules in priority evaluation order.
5. *The traditional restrictive assumption of a stable development process with no environmental disturbance is also relaxed* by introducing the random occurrence of IECs, which enlarges the design solution scope of the final product and thus affects the project solution uncertainty.
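To make features 2–5 concrete, the sketch below shows one way a dynamic rework probability and prioritized review rules could be wired into a simulated activity. It is a minimal illustration under stated assumptions: the triangular duration parameters, the per-day IEC probability, the linear mapping in `rework_probability`, and the thresholds in `review_decision` are all invented for the example and do not reproduce the chapter's calibrated model.

```python
import random

# Illustrative parameter values only -- not taken from the chapter's model.
DUR_LOW, DUR_MODE, DUR_HIGH = 8.0, 10.0, 15.0  # activity uncertainty (days)
IEC_PROB_PER_DAY = 0.02                        # environmental uncertainty
BASE_REWORK_P = 0.05

def rework_probability(solution_uncertainty: float) -> float:
    """Feedback structure (feature 3): the rework probability is recomputed in
    real time from the current solution uncertainty instead of being fixed in
    advance; the linear form here is an assumption."""
    return min(1.0, BASE_REWORK_P + 0.5 * solution_uncertainty)

def review_decision(solution_uncertainty: float, ec_queue_length: int) -> str:
    """Rework review decision point (feature 4): rules are evaluated in a
    user-defined priority order and the first matching rule decides."""
    rules = [
        (solution_uncertainty > 0.6, "rework immediately"),
        (ec_queue_length >= 5,       "batch for a later review"),
        (True,                       "approve and continue"),
    ]
    return next(decision for condition, decision in rules if condition)

def run_activity(solution_uncertainty: float) -> tuple[float, float]:
    """One development activity; returns (elapsed days, updated uncertainty)."""
    elapsed = random.triangular(DUR_LOW, DUR_HIGH, DUR_MODE)
    # IEC arrivals enlarge the design solution scope (feature 5),
    # pushing solution uncertainty back up.
    iecs = sum(random.random() < IEC_PROB_PER_DAY for _ in range(round(elapsed)))
    solution_uncertainty = min(1.0, solution_uncertainty + 0.1 * iecs)
    # Rework loop: each pass re-evaluates the dynamic rework probability.
    while random.random() < rework_probability(solution_uncertainty):
        elapsed += 0.5 * random.triangular(DUR_LOW, DUR_HIGH, DUR_MODE)
        solution_uncertainty *= 0.8  # rework resolves part of the uncertainty
    return elapsed, solution_uncertainty
```

The only structure mirrored from the chapter is that the rework probability sits inside a feedback loop, re-evaluated from the project's evolving state, while review rules fire in priority order.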

Results show how, under different conditions of uncertainty, various strategies and policies, including process overlapping, rework review, IEC batching, and resource allocation, should be applied not only to achieve benefits but also to recognize potential tradeoffs among lead time, cost, and quality. The study concludes with the following observations, which either have been identified previously in the existing literature or are disclosed for the first time with the help of the newly added and verified model features (a toy comparison of the two IEC policies of observation 7 follows the list):

1. The significant increase of both time and cost due to rework is alleviated by the evaluation of LCE.
2. The percentage increase of project cost is always higher than that of lead time at the occurrence of rework and IECs; that is, compared with lead time, project cost is more sensitive to rework/IECs.
3. By starting downstream activities early with only preliminary information, concurrent engineering tends to alleviate the impacts of rework on activities in downstream phases while intensifying those on activities in the upstream phases. It also tends to shift rework risks and even out committed efforts among the various functional areas. In addition, departments that are mainly involved in upstream phases undergo higher fluctuation in effort.
4. A high overlap ratio between upstream and downstream activities, combined with a high likelihood of unanticipated activity rework that requires additional resources, results in a strong tendency for NPD projects to behave in an unstable and unpredictable manner and to depart unexpectedly from the predetermined baseline plan.
5. Adopting a more restrictive *RRS* (Convex–Up) leads to a longer NPD lead time and a higher project cost. There is no obvious distinction between the Stepped Linear and Linear *RRS*s. Also, the evaluation of *LCE* reduces the impacts of *RRS*.
6. When only IEC process propagation among development activities is examined, high correlations between lead time, cost, and quality are observed. However, when the effects of IEC product propagation among dependent product components/systems are included, the correlation between lead time and project cost, and that between lead time and quality, drop significantly.
7. Batching of IECs possesses a competitive advantage in lead time over handling IECs individually. This superiority is greatest when a sequential PD process is adopted and diminishes as the overlapping ratio increases. However, neither IEC policy shows a "dominant" advantage in project cost or quality.
8. Potential tradeoffs between NPD lead time and total cost are clearly identified when the resource assignment decision is to be made. A higher level of OS leads to a shorter NPD lead time and lower total cost given the same amount of functional resource allocation. However, the benefits of lead time reduction from assigning more resources are most obvious in a sequential process, and activity overlap reduces them: the higher the OS, the smaller the benefits.
9. Linearity between lead time and quality is observed at all three OS levels: the higher the functional resource availability, the shorter the lead time, and the lower the quality. The slope of this linear relationship increases as *OS* increases. The percentage decrease in quality versus the baseline case is largest in a sequential process and decreases as *OS* increases.
10. The evaluation of IEC product propagation leads to a general increase of the multiple dimensions of NPD project performance over the baseline case, except for a counterintuitive decrease in NPD project lead time for a less coupled product configuration under high environmental uncertainty and a high *RL*.
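Observation 7 can be illustrated with a deliberately simplified calculation. In the toy model below, every review session costs a fixed overhead: handling each IEC individually pays that overhead once per change, while batching amortizes it over several changes, so the batched policy adds less lead time whenever the overhead term dominates. The function names, the exponential processing times, and all parameter values are assumptions chosen only for the illustration; the chapter's full model additionally captures why the advantage shrinks as the overlapping ratio increases, which this sketch does not.

```python
import random

REVIEW_OVERHEAD = 4.0    # days of review work per review session (assumed)
MEAN_PROCESS_TIME = 2.0  # mean days to implement one IEC (assumed)
BATCH_SIZE = 5           # IECs accumulated before a batched review (assumed)

def added_lead_time_individual(n_iecs: int) -> float:
    # Every IEC triggers its own review, so each one pays the full overhead.
    return sum(REVIEW_OVERHEAD + random.expovariate(1.0 / MEAN_PROCESS_TIME)
               for _ in range(n_iecs))

def added_lead_time_batched(n_iecs: int) -> float:
    # One review per batch amortizes the overhead over BATCH_SIZE IECs.
    n_reviews = -(-n_iecs // BATCH_SIZE)  # ceiling division
    work = sum(random.expovariate(1.0 / MEAN_PROCESS_TIME) for _ in range(n_iecs))
    return n_reviews * REVIEW_OVERHEAD + work

# Average over replications, mirroring the chapter's Monte Carlo design.
reps = 200
print(sum(added_lead_time_individual(12) for _ in range(reps)) / reps)
print(sum(added_lead_time_batched(12) for _ in range(reps)) / reps)
```

Because both policies perform the same implementation work, cost and quality are untouched in this toy model, echoing the finding that neither policy dominates on those dimensions.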

Three possible main directions for future studies beyond the work presented here are summarized as follows:

1. Model features including i) different relationships between solution uncertainty and rework probability, ii) more detailed modeling of dynamic rework review criteria (in place of the current static criterion), and iii) a parallel rework policy need to be tested to assess their impacts on project performance measures.
2. The review of the literature indicated a lack of development process models that can be extended and implemented in a multi-project environment while still retaining detailed aspects of project complexity. The building blocks of the model framework presented here can be reconfigured and applied at various levels of detail. From the single-project level up to the entire organizational level, this opens possibilities for further analyses of multi-project management, such as workforce planning strategies and coordination policies for interdependent parallel projects.
3. This DES model can also be further extended across organizations. By relaxing the single-organization restriction of the current model and including inter-organizational influences, how engineering changes propagate along the supply chain and affect NPD project performance can be explored.

## **Author details**

Weilin Li
*Syracuse University, The United States*

## **8. References**

[1] Balakrishnan, N., A. K. Chakravarty. (1996). Managing Engineering Change: Market Opportunities and Manufacturing Costs. *Production and Operations Management*. 5(4):335–356.

[2] Barzizza, R., M. Caridi, R. Cigolini. (2001). Engineering Change: A Theoretical Assessment and A Case Study. *Production Planning & Control*. 12(7):717–726.

[3] Bhuiyan, N., D. Gerwin, V. Thomson. (2004). Simulation of the New Product Development Process for Performance Improvement. *Management Science*. 50(12):1690–1703.

[4] Bhuiyan, N., G. Gregory, V. Thomson. (2006). Engineering Change Request Management in a New Product Development Process. *European Journal of Innovation Management*. 9(1):5–19.

[5] Black, L. J., N. P. Repenning. (2001). Why Firefighting Is Never Enough: Preserving High-Quality Product Development. *System Dynamics Review*. 17(1):33–62.

[6] Bouikni, N., A. Desrochers. (2006). A Product Feature Evolution Validation Model for Engineering Change Management. *Journal of Computing and Information Science in Engineering*. 6(2):188–195.

[7] Browning, T. R. (1998). Modeling and Analyzing Cost, Schedule, and Performance in Complex System Product Development. Doctoral thesis. Massachusetts Institute of Technology. Cambridge, MA.

[8] Browning, T. R., S. D. Eppinger. (2002). Modeling Impacts of Process Architecture on Cost and Schedule Risk in Product Development. *IEEE Transactions on Engineering Management*. 49(4):428–442.

[9] Cho, S. H., S. D. Eppinger. (2005). A Simulation-Based Process Model for Managing Complex Design Projects. *IEEE Transactions on Engineering Management*. 52(3):316–328.

[10] Clark, K. B., T. Fujimoto. (1991). Product Development Performance: Strategy, Organization, and Management in the World Auto Industry. Boston, Mass.: Harvard Business School Press.

[11] Clarkson, P. J., C. Eckert. (2004). Design and Process Improvement: A Review of Current Practice. Springer. 1st edition.

[12] Earl, C., J. H. Johnson, C. Eckert. (2005). "Complexity" in *Design Process Improvement – A Review of Current Practice*. 174–196. Springer. ISBN 1-85233-701-X.

[13] Eckert, C., P. J. Clarkson, W. Zanker. (2004). Change and Customisation in Complex Engineering Domains. *Research in Engineering Design*. 15(1):1–21.

[14] Ford, D. N. (1995). The Dynamics of Project Management: An Investigation of the Impacts of Project Process and Coordination on Performance. Doctoral thesis. Sloan School of Management. Massachusetts Institute of Technology. Cambridge, MA.

[15] Ford, D. N., J. D. Sterman. (1998). Dynamic Modeling of Product Development Processes. *System Dynamics Review*. 14(1):31–68.

[16] Ford, D. N., J. D. Sterman. (2003). Overcoming the 90% Syndrome: Iteration Management in Concurrent Development Projects. *Concurrent Engineering: Research and Applications*. 11(3):177–186.

[17] Ha, A. Y., E. L. Porteus. (1995). Optimal Timing of Reviews in Concurrent Design for Manufacturability. *Management Science*. 41(9):1431–1447.

[18] Hegde, G. G., S. Kekre, S. Kekre. (1992). Engineering Changes and Time Delays: A Field Investigation. *International Journal of Production Economics*. 28(3):341–352.

[19] Ho, C. J. (1994). Evaluating the Impact of Frequent Engineering Changes on MRP System Performance. *International Journal of Production Research*. 32(3):619–641.

[20] Ho, C. J., J. Li. (1997). Progressive Engineering Changes in Multi-level Product Structures. *Omega: International Journal for Management Science*. 25(5):585–594.

[21] Hobday, M. (1998). Product Complexity, Innovation, and Industrial Organization. *Research Policy*. 26(6):689–710.

[22] Huang, G. Q., K. L. Mak. (1999). Current Practices of Engineering Change Management in Hong Kong Manufacturing Industries. *Journal of Materials Processing Technology*. 19(1):21–37.

[23] Koh, E. C. Y., P. J. Clarkson. (2009). A Modelling Method to Manage Change Propagation. In *Proceedings of the 18th International Conference on Engineering Design*. Stanford, California.

[24] Krishnan, V., S. D. Eppinger, D. E. Whitney. (1997). A Model-Based Framework to Overlap Product Development Activities. *Management Science*. 43(4):437–451.

[25] Krishnan, V., K. T. Ulrich. (2001). Product Development Decisions: A Review of the Literature. *Management Science*. 47(1):1–21.

[26] Lin, J., K. H. Chai, Y. S. Wong, A. C. Brombacher. (2007). A Dynamic Model for Managing Overlapped Iterative Product Development. *European Journal of Operational Research*. 185:378–392.

[27] Loch, C. H., C. Terwiesch. (1999). Accelerating the Process of Engineering Change Orders: Capacity and Congestion Effects. *Journal of Product Innovation Management*. 16(2):145–159.

[28] Loch, C. H., J. Mihm, A. Huchzermeier. (2003). Concurrent Engineering and Design Oscillations in Complex Engineering Projects. *Concurrent Engineering: Research and Applications*. 11(3):187–199.

[29] Lyneis, J. M., D. N. Ford. (2007). System Dynamics Applied to Project Management: A Survey, Assessment, and Directions for Future Research. *System Dynamics Review*. 23(2/3):157–189.

[30] Park, M., F. Peña–Mora. (2003). Dynamic Change Management for Construction: Introducing the Change Cycle into Model-Based Project Management. *System Dynamics Review*. 19(3):213–242.

[31] Pimmler, T. U., S. D. Eppinger. (1994). Integration Analysis of Product Decompositions. In *Proceedings of the ASME Design Theory and Methodology Conference*, 343–351. Minneapolis, Minnesota: American Society of Mechanical Engineers.

[32] Reichelt, K., J. Lyneis. (1999). The Dynamics of Project Performance: Benchmarking the Drivers of Cost and Schedule Overrun. *European Management Journal*. 17(2):135–150.

[33] Roemer, T. A., R. Ahmadi. (2004). Concurrent Crashing and Overlapping in Product Development. *Operations Research*. 52(4):606–622.

[34] Smith, R. P., S. D. Eppinger. (1997). A Predictive Model of Sequential Iteration in Engineering Design. *Management Science*. 43(8):1104–1120.

[35] Terwiesch, C., C. H. Loch. (1999). Managing the Process of Engineering Change Orders: The Case of the Climate Control System in Automobile Development. *Journal of Product Innovation Management*. 16(2):160–172.

[36] Unger, D. W., S. D. Eppinger. (2009). Comparing Product Development Processes and Managing Risk. *International Journal of Product Development*. 8(4):382–401.

[37] Wynn, D. C., K. Grebici, P. J. Clarkson. (2011). Modelling the Evolution of Uncertainty Levels during Design. *International Journal on Interactive Design and Manufacturing*. 5:187–202.


**Discrete Event Simulations – Development and Applications**

*Edited by Eldin Wee Chuan Lim*

The Discrete Event Simulation (DES) method has received widespread attention and acceptance from both researchers and practitioners in recent years. The range of application of DES spans many different disciplines and research fields. In research, further development and advancement of the basic DES algorithm continue to be sought, while various hybrid methods derived by combining DES with other simulation techniques continue to be developed. This book presents state-of-the-art contributions on fundamental development of the DES method, novel integration of the method with other modeling techniques, and applications to simulating and analyzing the performance of various types of systems. The book will be of interest to undergraduate and graduate students, researchers, and professionals who are actively engaged in DES-related work.