**Robust Control for Single Unit Resource Allocation Systems**

Shengyong Wang, Song Foh Chew and Mark Lawley *University of Akron, Southern Illinois University Edwardsville, and Purdue University USA* 

## **1. Introduction**


Supervisory control for deadlock-free resource allocation has been an active area of manufacturing systems research. Most work, however, assumes that allocated resources do not fail. Little research has addressed allocating resources that may fail. Automated manufacturing systems have many types of components that may fail unexpectedly. We develop robust controllers for single unit resource allocation systems with unreliable resources (Chew et al., 2008; Chew et al., 2011; Chew & Lawley, 2006; Lawley, 2002; Lawley & Sulistyono, 2002; Wang et al., 2008; Wang et al., 2009). These controllers guarantee that when unreliable resources fail, parts requiring failed resources do not block the production of parts not requiring failed resources. Further, while resources are down, the system is controlled so that when repair events occur, the system is in a safe and admissible state.

There is little manufacturing research literature on robust supervision. Reveliotis (1999) considers the case where parts requiring a failed resource can be re-routed or removed from the system through human intervention. Park & Lim (1999) address existence questions for robust supervisors. Hsieh (2004) develops methods that determine the feasibility of production given a set of resource failures modelled as the extraction of tokens from a Petri net. In contrast, our work models the failure of the workstation server while assuming that buffer space remains accessible after the failure event. We assume that when the server of a workstation fails, we can continue allocating its buffer space up to capacity, but that none of the waiting parts can be processed and thus cannot proceed along their routes until the server is repaired. We further assume that server failure does not prevent finished parts occupying the workstation's buffer space from being moved away from the workstation and proceeding along their routes. Finally, we assume that server failure does not damage or destroy the part being processed and that failure can only occur when the server is working. The last two assumptions are made for notational efficiency and presentation clarity. They can be easily relaxed by adding appropriate events and state variables to our treatment.

Our objective is to control the system so that failure of an unreliable resource does not prevent processing of parts not requiring the failed resource. When a resource fails, all parts in the system requiring the failed resource for future processing are unable to complete until the failed resource is repaired. Because these parts occupy buffer space, they can block production of parts not requiring the failed resource. Thus, we want to assure that, when unreliable resources fail, the buffer space allocation can evolve under normal operation so that parts not requiring failed resources can continue production. Operation must continue to obey part routings and must assure that when a failed resource is repaired, the system is not in an unsafe state. We refer to supervisors guaranteeing this as *robust*.

The remainder of the chapter is organized as follows. Section 2 discusses the way we model our systems and presents an example system to motivate the properties that robust controllers must possess. In Section 3, we develop robust controllers for systems with multiple unreliable resources where each part type requires at most one unreliable resource. Specifically, Subsection 3.1 develops two robust controllers that combine a neighbourhood policy with a modified version of Banker's Algorithm and with a single-step look-ahead policy, respectively. Subsection 3.2 uses a resource order policy to construct another robust controller, and Subsection 3.3 employs a notion of shared buffer capacity to develop a further robust controller. Relaxing this restriction, Section 4 builds robust controllers for systems in which part types may require multiple unreliable resources. Finally, Section 5 concludes the chapter and discusses future research directions.

## **2. Modelling of robust control**

This section has two subsections. Subsection 2.1 describes how we model our systems, and Subsection 2.2 provides examples that motivate the properties robust controllers must possess.

### **2.1 The discrete event system**

We model our systems using the approach of Ramadge & Wonham (1987). This is necessary to define the properties that we want our supervisors to enforce. The following model is similar to that developed by Lawley & Sulistyono (2002), but differs in that now we have more complex failure scenarios and thus some of the underlying formalism has to be generalized. Figure 1 provides an example for the following development.

The system is defined as a 9-tuple $S = \langle R, C, P, \rho, Q, Q_0, \Sigma, \xi, \delta \rangle$. In $S$, $R$ is the set of system resource types, with $R = R^R \cup R^U$ and $R^R \cap R^U = \emptyset$, where $R^R$ is the set of reliable resource types, not subject to failure, and $R^U$ is the set of unreliable resource types, subject to failure. Let $C = \langle C_i : i = 1, \dots, |R| \rangle$, where $C_i$ is the capacity of the buffer space associated with system resource type $r_i \in R$.

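The set $P$ of part types is produced by the system, with each part type $P_j \in P$ representing an ordered set of processing stages, $P_j = \langle P_{j1}, \dots, P_{j|P_j|} \rangle$, where $P_{jk}$ represents the $k$th processing stage of $P_j$. Also, let $RP_{jk} = \langle P_{jk}, \dots, P_{j|P_j|} \rangle$ be the *residual* part stages. We will use $p_{jk}$ to represent a part instance of $P_{jk}$. Let $\rho : \bigcup_j P_j \rightarrow R$ such that $\rho(P_{jk})$ returns the resource type required by $P_{jk}$. Thus, the route of $P_j$ is $T_j = \langle \rho(P_{j1}), \dots, \rho(P_{j|P_j|}) \rangle$, and the residual route is $RT_{jk} = \langle \rho(P_{jk}), \dots, \rho(P_{j|P_j|}) \rangle$. Finally, let $\Omega_i = \{P_{jk} : \rho(P_{jk}) = r_i \in R\}$, the set of part type stages associated with resource $r_i \in R$.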
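The routing definitions above can be illustrated with a minimal Python sketch that encodes the routes of the example system of Figure 1 and derives the stage sets and residual routes from them. The names `ROUTES`, `rho`, `residual_route` and `omega` are illustrative choices for this sketch and are not part of the chapter's formalism.

```python
from collections import defaultdict

# Routes T_j of the Figure 1 example: ROUTES[j][k-1] is the resource of stage P_jk.
ROUTES = {
    1: ["r6", "r3", "r2", "r1", "r2"],
    2: ["r2", "r3", "r4", "r6", "r5", "r1", "r2", "r3"],
    3: ["r8", "r7", "r5", "r6", "r9", "r4"],
    4: ["r4", "r5"],
}


def rho(j, k):
    """Resource type required by stage P_jk (k is 1-based)."""
    return ROUTES[j][k - 1]


def residual_route(j, k):
    """Residual route RT_jk: resources required from stage k to the end."""
    return ROUTES[j][k - 1:]


def omega():
    """Map each resource r_i to its stage set, stored as (j, k) pairs."""
    stages = defaultdict(list)
    for j, route in ROUTES.items():
        for k, resource in enumerate(route, start=1):
            stages[resource].append((j, k))
    return dict(stages)


OMEGA = omega()
print(OMEGA["r2"])           # stages of P1 and P2 that are processed at r2
print(residual_route(2, 6))  # remaining route of a part at stage P26
```

Storing stages as `(j, k)` pairs keeps the sketch close to the double-index notation used in the text.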

We will suppose that our resource types are workstations with buffer space for staging and storing parts and a processor or server for operating on parts. We will use the standard assumption from queuing theory that the server is not idle so long as there are unfinished parts in a workstation's buffer space. The resource units that we are concerned with allocating are instances of the workstation's buffer space. The controllers that we design are not intended to allocate the server among parts waiting at the workstation. We assume this to be done by some local queuing discipline.


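$R = \{r_1, r_2, r_3, r_4, r_5, r_6, r_7, r_8, r_9\}$, $R^R = \{r_1, r_3, r_4, r_5, r_6, r_7, r_8\}$, $R^U = \{r_2, r_9\}$

$C = \langle C_1, C_2, C_3, C_4, C_5, C_6, C_7, C_8, C_9 \rangle = \langle 1, 1, 1, 2, 1, 1, 1, 1, 1 \rangle$

$P = \{P_1, P_2, P_3, P_4\}$

$P_1 = \{P_{11}, P_{12}, P_{13}, P_{14}, P_{15}\}$, $T_1 = \langle r_6, r_3, r_2, r_1, r_2 \rangle$

$P_2 = \{P_{21}, P_{22}, P_{23}, P_{24}, P_{25}, P_{26}, P_{27}, P_{28}\}$, $T_2 = \langle r_2, r_3, r_4, r_6, r_5, r_1, r_2, r_3 \rangle$

$P_3 = \{P_{31}, P_{32}, P_{33}, P_{34}, P_{35}, P_{36}\}$, $T_3 = \langle r_8, r_7, r_5, r_6, r_9, r_4 \rangle$

$P_4 = \{P_{41}, P_{42}\}$, $T_4 = \langle r_4, r_5 \rangle$

$\Sigma_c = \{\alpha_{1i} : i = 1, \dots, 6\} \cup \{\alpha_{2i} : i = 1, \dots, 9\} \cup \{\alpha_{3i} : i = 1, \dots, 7\} \cup \{\alpha_{4i} : i = 1, \dots, 3\}$

$\Sigma_{u1} = \{\beta_{1i} : i = 1, \dots, 5\} \cup \{\beta_{2i} : i = 1, \dots, 8\} \cup \{\beta_{3i} : i = 1, \dots, 6\} \cup \{\beta_{4i} : i = 1, 2\}$

$\Sigma_{u2} = \{\kappa_2, \kappa_9, \eta_2, \eta_9\}$

Events of part types requiring $r_2$: $\{\alpha_{1i} : i = 1, \dots, 6\} \cup \{\alpha_{2i} : i = 1, \dots, 9\} \cup \{\beta_{1i} : i = 1, \dots, 5\} \cup \{\beta_{2i} : i = 1, \dots, 8\}$. Events of part types requiring $r_9$: $\{\alpha_{3i} : i = 1, \dots, 7\} \cup \{\beta_{3i} : i = 1, \dots, 6\}$. The event set for $\{r_2, r_9\}$ is the union of these two sets.

Fig. 1. An example system with two unreliable resources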

Workstation *failure* will imply the failure of the workstation's server, not any of its buffer space. We will assume that when the server of a workstation fails, we can continue to allocate its buffer space up to capacity, but that none of the waiting parts can be processed and thus cannot proceed along their respective routes until the server is repaired. We further assume that server failure does not prevent finished parts occupying the workstation's buffer space from being moved away from the workstation and proceeding along their respective routes. Finally, we assume that server failure does not damage or destroy the part being processed and that failure can only occur when the server is working.

We are now in a position to define the system states and events. Let $Q$ represent the set of system states, where $Q \ni q = \langle sv_i, y_{jk}, x_{jk} : i = 1, \dots, |R|,\ j = 1, \dots, |P| \text{ and } k = 1, \dots, |P_j| \rangle$, with $sv_i$ being the status of the server of workstation $i$ (0 if failed, 1 if operational), $y_{jk}$ being the number of unfinished units of $P_{jk}$ (parts waiting or in-process) located in the buffer space of $\rho(P_{jk})$, and $x_{jk}$ being the number of finished units of $P_{jk}$ located in the buffer space of $\rho(P_{jk})$. $Q_0$ is the set of initial states, with $q_0 \in Q_0$ being the state in which no resources are allocated and all servers are operational. The dimension of $q$ is $|R| + 2\sum_{j=1}^{|P|} |P_j|$.
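As a worked instance of the state dimension, take the example system of Figure 1, which has nine resources and part types with 5, 8, 6 and 2 stages. The state vector then has one server-status component per resource and two counters per stage:

$$|R| + 2\sum_{j=1}^{|P|} |P_j| = 9 + 2\,(5 + 8 + 6 + 2) = 9 + 42 = 51.$$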

Let $\Sigma = \Sigma_c \cup \Sigma_u$, where $\Sigma_c = \{\alpha_{jk} : j = 1, \dots, |P| \text{ and } k = 1, \dots, |P_j| + 1\}$ is the set of controllable events, with $\alpha_{jk}$ representing the allocation of $\rho(P_{jk})$ to a part instance of $P_{jk}$; that is, $\alpha_{jk}$ is the event that a part instance of part type $P_j$ advances into the buffer space of the workstation that will perform its $k$th operation. Then, $\alpha_{j,|P_j|+1}$ represents a finished part of type $P_j$ leaving the system. We assume that the supervisor controls the occurrences of these events through resource allocation decisions.


The set of uncontrollable events is $\Sigma_u = \Sigma_{u1} \cup \Sigma_{u2}$, where $\Sigma_{u1} = \{\beta_{jk} : j = 1, \dots, |P| \text{ and } k = 1, \dots, |P_j|\}$ represents the completion of service for $P_{jk}$. Then, $\Sigma_{u2} = \{\kappa_i, \eta_i : r_i \in R^U\}$ represents the failure ($\kappa_i$) and repair ($\eta_i$) of the server of unreliable resource $r_i \in R^U$. Service completions, failures and repairs are assumed to be beyond the controller's influence.

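Let $\xi : Q \rightarrow 2^{\Sigma}$ be a function that, for a given state, returns the set of enabled events. This function is defined for a state $q \in Q$ as follows:

1. For $P_{j1} \in \Omega_i$, if $C_i - \sum_{P_{uv} \in \Omega_i} (y_{uv} + x_{uv}) > 0$, then $\alpha_{j1} \in \xi(q)$. Events that release new parts into the system are enabled when space is available on the first required workstation in the route.
2. For $P_{jk} \in \Omega_i$, if $y_{jk} > 0$ and $sv_i = 1$, then $\beta_{jk} \in \xi(q)$. If a part is at service, then the corresponding service completion event is enabled.
3. For $r_i \in R^U$ and $P_{jk} \in \Omega_i$, if $\beta_{jk} \in \xi(q)$, then $\kappa_i \in \xi(q)$. If the server is busy with a part, then the corresponding failure event is enabled.
4. For $r_i \in R^U$, if $sv_i = 0$, then $\eta_i \in \xi(q)$ and $\beta_{jk} \notin \xi(q)$ for all $P_{jk} \in \Omega_i$. If the server is failed, the corresponding repair event is enabled and the corresponding service completion events are disabled.
5. For $P_{jk} \in \Omega_i$ with $1 < k \le |P_j|$, if $x_{j,k-1} > 0$ and $C_i - \sum_{P_{uv} \in \Omega_i} (y_{uv} + x_{uv}) > 0$, then $\alpha_{jk} \in \xi(q)$. When a part finishes its current operation and buffer space becomes available at the next required workstation in its route, the event corresponding to advancing the part is enabled.
6. For $P_{j,|P_j|} \in \Omega_i$, if $x_{j,|P_j|} > 0$, then $\alpha_{j,|P_j|+1} \in \xi(q)$. If a part has finished all of its operations, the event corresponding to unloading it from the system is enabled.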
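The six rules above can be sketched in Python for the example system of Figure 1. This is only an illustration under stated assumptions: the dictionary-based state layout, the event tuples such as `("alpha", j, k)`, and the names `empty_state`, `free_space` and `enabled` are hypothetical choices made for this sketch, not notation from the chapter.

```python
# Figure 1 data: routes, buffer capacities and unreliable resources.
ROUTES = {
    1: ["r6", "r3", "r2", "r1", "r2"],
    2: ["r2", "r3", "r4", "r6", "r5", "r1", "r2", "r3"],
    3: ["r8", "r7", "r5", "r6", "r9", "r4"],
    4: ["r4", "r5"],
}
CAPACITY = {f"r{i}": 1 for i in range(1, 10)}
CAPACITY["r4"] = 2
UNRELIABLE = {"r2", "r9"}


def empty_state():
    """State q0: all servers operational, no buffer space allocated."""
    stages = [(j, k) for j, t in ROUTES.items() for k in range(1, len(t) + 1)]
    return {
        "sv": {r: 1 for r in CAPACITY},
        "y": {s: 0 for s in stages},  # unfinished units per stage
        "x": {s: 0 for s in stages},  # finished units per stage
    }


def free_space(q, resource):
    """C_i minus the parts (finished or not) held in the buffer of r_i."""
    used = sum(q["y"][(j, k)] + q["x"][(j, k)]
               for (j, k) in q["y"] if ROUTES[j][k - 1] == resource)
    return CAPACITY[resource] - used


def enabled(q):
    """Set of enabled events in state q, following rules 1 to 6."""
    events = set()
    for j, route in ROUTES.items():
        last = len(route)
        for k in range(1, last + 1):
            resource = route[k - 1]
            # Rules 1 and 5: release (k = 1) or advance (k > 1) into r_i.
            part_ready = k == 1 or q["x"][(j, k - 1)] > 0
            if part_ready and free_space(q, resource) > 0:
                events.add(("alpha", j, k))
            # Rules 2 and 4: completion needs an unfinished part and a live server.
            if q["y"][(j, k)] > 0 and q["sv"][resource] == 1:
                events.add(("beta", j, k))
                # Rule 3: a busy unreliable server may fail.
                if resource in UNRELIABLE:
                    events.add(("kappa", resource))
        # Rule 6: a part that finished its last stage may be unloaded.
        if q["x"][(j, last)] > 0:
            events.add(("alpha", j, last + 1))
    # Rule 4: a failed server may be repaired.
    for resource in UNRELIABLE:
        if q["sv"][resource] == 0:
            events.add(("eta", resource))
    return events


q = empty_state()
q["y"][(2, 7)] = 1          # one p27 in process at r2
print(sorted(enabled(q)))
```

In the empty state only the four release events are enabled. With a `p27` in process at `r2`, the sketch additionally enables its service completion and the failure of `r2`, and it disables releases of `P2` because the single buffer slot of `r2` is occupied.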

The state transition function is now defined as follows. The transition function, $\delta$, is a partial function from the cross product $Q \times \Sigma$ to the set $Q$ of system states. Specifically, let $\delta : Q \times \Sigma \rightarrow Q$ such that

$\delta(q, \alpha_{jk}) = q - e_{x_{j,k-1}} + e_{y_{jk}}$, advancing a part $p_{j,k-1}$;

$\delta(q, \beta_{jk}) = q - e_{y_{jk}} + e_{x_{jk}}$, service completion of a part $p_{jk}$;

$\delta(q, \kappa_i) = q - e_{sv_i}$, failure of server $i$;

$\delta(q, \eta_i) = q + e_{sv_i}$, repair of server $i$;

where $e_{x_{j,k-1}}$, $e_{y_{jk}}$, $e_{x_{jk}}$, and $e_{sv_i}$ are the standard unit vectors with components corresponding to $x_{j,k-1}$, $y_{jk}$, $x_{jk}$ and $sv_i$ being 1, respectively. Note that $e_{y_{j,|P_j|+1}} = e_{x_{j0}} = \mathbf{0}$, the zero vector with the same dimension, and that $p_{j0}$ represents a raw part of $P_j$ waiting to be released into the system.
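A minimal Python sketch of the transition function follows, assuming a state stored as dictionaries of counters and events written as tuples. The names `delta` and `route_len` and the event encoding are hypothetical, and the sketch does not check whether an event is enabled before applying it. The two boundary cases correspond to the zero vectors noted above: releasing a raw part decrements nothing, and unloading a finished part increments nothing.

```python
import copy


def delta(q, event, route_len):
    """Return the successor of state q under one event.

    q         : {"sv": {resource: 0 or 1}, "y": {(j, k): n}, "x": {(j, k): n}}
    event     : ("alpha", j, k), ("beta", j, k), ("kappa", r) or ("eta", r)
    route_len : {j: number of stages of part type P_j}
    """
    nxt = copy.deepcopy(q)
    kind = event[0]
    if kind == "alpha":
        _, j, k = event
        if k > 1:                  # k = 1 releases a raw part, nothing to remove
            nxt["x"][(j, k - 1)] -= 1
        if k <= route_len[j]:      # k = |P_j| + 1 unloads the part, nothing to add
            nxt["y"][(j, k)] += 1
    elif kind == "beta":
        _, j, k = event
        nxt["y"][(j, k)] -= 1      # the part is no longer unfinished
        nxt["x"][(j, k)] += 1      # it now waits as a finished part
    elif kind == "kappa":
        nxt["sv"][event[1]] -= 1   # server fails
    elif kind == "eta":
        nxt["sv"][event[1]] += 1   # server is repaired
    else:
        raise ValueError(f"unknown event {event!r}")
    return nxt


# Walk one part of type P4 (route r4, r5) through the system.
route_len = {4: 2}
q0 = {
    "sv": {"r2": 1, "r4": 1, "r5": 1},
    "y": {(4, 1): 0, (4, 2): 0},
    "x": {(4, 1): 0, (4, 2): 0},
}
q = q0
for ev in [("alpha", 4, 1), ("beta", 4, 1), ("alpha", 4, 2),
           ("beta", 4, 2), ("alpha", 4, 3)]:
    q = delta(q, ev, route_len)
print(q == q0)  # the part has left and the system is empty again
```

Returning a fresh copy rather than mutating `q` keeps earlier states available, which is convenient when exploring alternative event sequences from the same state.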

We assume that $|R^U| \ge 1$. In this case, any subset of the unreliable resources can be simultaneously in a failed state. Thus, if one of the $\binom{|R^U|}{i}$ subsets of size $i$, $i = 1, \dots, |R^U|$, is down, we want the remaining resources to continue producing parts not requiring any of the failed resources without human intervention to remove or rearrange the parts requiring failed resources. Further, when one of the failed resources is repaired, we want production of parts requiring that resource to resume. A robust controller must possess certain properties in order to achieve this behaviour.
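As a worked count of the failure scenarios a robust controller has to accommodate, summing over all subset sizes gives

$$\sum_{i=1}^{|R^U|} \binom{|R^U|}{i} = 2^{|R^U|} - 1.$$

For the example system of Figure 1, $|R^U| = 2$, so there are $\binom{2}{1} + \binom{2}{2} = 2 + 1 = 3$ possible sets of failed resources: $\{r_2\}$, $\{r_9\}$ and $\{r_2, r_9\}$.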


### **2.2 Motivating examples for properties of robust supervisory control**

This subsection motivates a set of desired properties for a robust controller based upon an example production system. Figure 1 presents an example manufacturing system with two unreliable resources. The stages, routes, and resource capacities are given, as is the complete discrete event model. This model enumerates the resources, capacities, events, and so forth. For now, we will constrain our discussion to the system states presented in Figures 2-4. We recall that, by definition, a resource allocation state is safe if, starting from that state, there exists a sequence of resource allocations/deallocations that completes all parts and takes the system to the empty and idle state, the state in which no resources are allocated and no servers are busy. Our underlying assumption is that if a resource allocation state is safe, then, under correct supervision and starting from that state, it is possible to produce all part types indefinitely.

We have several control objectives for the system of Figure 1. First, we desire that the controller guarantee deadlock-free operation, i.e., that it keeps the system producing all part types. Second, in the event that r2 fails, we want to continue producing part types not requiring r2, {P3,P4}, without having to intervene by clearing the system of parts requiring r2. Similarly, in the event that r9 fails, we want to continue producing part types not requiring r9, {P1,P2,P4}, again without having to intervene by clearing the system of parts requiring r9. Further, if both r2 and r9 are in the failed state, we want to continue producing part types not requiring r2 or r9, {P4}, again without explicit intervention.
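The part types named in these objectives can be checked directly against the routes of Figure 1. The following Python sketch enumerates every nonempty set of failed resources and lists the part types whose routes avoid all of them. The function name `unaffected_part_types` is an illustrative choice for this sketch.

```python
from itertools import combinations

# Routes of the Figure 1 example and its unreliable resources.
ROUTES = {
    1: ["r6", "r3", "r2", "r1", "r2"],
    2: ["r2", "r3", "r4", "r6", "r5", "r1", "r2", "r3"],
    3: ["r8", "r7", "r5", "r6", "r9", "r4"],
    4: ["r4", "r5"],
}
UNRELIABLE = ["r2", "r9"]


def unaffected_part_types(failed):
    """Part types whose routes use none of the failed resources."""
    failed = set(failed)
    return [f"P{j}" for j, route in ROUTES.items() if not failed & set(route)]


# Enumerate every nonempty subset of the unreliable resources.
for size in range(1, len(UNRELIABLE) + 1):
    for failed in combinations(UNRELIABLE, size):
        print(failed, "->", unaffected_part_types(failed))
```

The check works at the level of part types, so it identifies which production should continue after a failure. Whether the current distribution of parts actually allows that production to continue depends on the system state.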

Consider for example the state given in Figure 2. This state is safe; however, if r2 fails while processing part p27 in this state, the production of both P3 and P4 will be blocked by two p23s at r4. Note that if we advance a p23 from r4 to r6, then production of P4 can proceed. However, production of P3 will now be blocked. Thus, this state does not satisfy our condition that after the failure of r2, we should be able to continue producing both P3 and P4. As another example, consider the state of Figure 3. Again, we see that this state is safe. However, if r9 fails while processing part p35 in this state, production of part types P1 and P2 will be blocked by p34 at r6, although the production of P4 is unaffected.

Fig. 2. An undesirable system state since unreliable resource r2 may fail while processing part p27


These examples illustrate that parts requiring a failed resource can, through propagation of blocking, prevent the system from producing parts not requiring the failed resource. Our objective is to develop supervisory controllers that avoid this by guaranteeing that, if an unreliable resource fails, it is possible to redistribute the parts requiring that resource so that part types not requiring that resource can continue to be produced.

Fig. 3. An undesirable system state since unreliable resource r9 may fail while processing part p35

For the third objective, consider the state of Figure 4. Again, we see that the state is safe. If r2 fails, production of P3 is blocked by p11 at r6. Further, production of P4 is blocked by p33 at r5. We note that by advancing p11 from r6 to its next required resource, r3, the blockages of P3 and P4 can now be resolved and thus the system can continue producing both P3 and P4, as desired. However, when r2 is repaired, the system is no longer safe since resources r2 and r3 are now involved in deadlock. This illustrates that our controller must guarantee that any redistribution of parts requiring the failed resource does not result in system deadlock when the resource is repaired.

The above discussion lays a foundation for a robust supervisory controller. In summary, a supervisory controller is said to be robust to resource failures of RU if the supervisory controller satisfies Property 2.2.

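## **Property 2.2:**

**2.2.1**: The supervisory controller ensures continuing production of part types not requiring failed resources, given that additional failures/repairs do not occur.

**2.2.2**: The supervisory controller allows only those states that serve as feasible initial states if an additional resource failure occurs.

**2.2.3**: The supervisory controller allows only those states that serve as feasible initial states if a failed resource is repaired and becomes operational.
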
We say that a state is a *feasible initial state* if, starting from that state, it is possible to produce all part types not requiring failed resources. The formal development and definition of this property using language theory is presented in Chew and Lawley (2006).


Fig. 4. An undesirable system state since r2 may fail while processing part p27

## **3. Robust control for systems with multiple unreliable resources**

This section develops robust supervisory controllers for single unit resource allocation systems with multiple unreliable resources.

## **3.1 Robust control using a neighbourhood policy**

This subsection develops controllers that satisfy Property 2.2 above, while maintaining polynomial complexity. Each controller is a conjunction of a modified deadlock avoidance policy and a set of neighbourhood constraints. The deadlock avoidance policy guarantees deadlock-free operation, while the neighbourhood constraints control the distribution of parts that require unreliable resources. Subsection 3.1.1 develops the neighbourhood constraints, NHC. Subsection 3.1.2 constructs a supervisor based on a modified Banker's Algorithm and NHC, while subsection 3.1.3 develops a supervisor based on single-step look-ahead (SSL) and NHC. The complete proofs can be found in Chew and Lawley (2006).

## **3.1.1 A neighbourhood policy**

In this subsection, we discuss neighbourhood constraints based on the notion of *failure dependency*. Informally, a resource is failure-dependent if every part that enters its buffer space requires some future processing on an unreliable workstation. Thus, all unreliable resources are failure-dependent. Some reliable resources may also be failure-dependent if they only process parts that require future processing on an unreliable resource. This is defined more precisely later. For each failure-dependent resource, we generate a neighbourhood. The neighbourhood of a failure-dependent resource is a virtual space of finite capacity that is used to control the distribution of parts requiring that failure-dependent resource. Again, this is formalized in the following, where we extend the neighbourhood concepts presented by Lawley & Sulistyono (2002) to systems with multiple unreliable resources. We first discuss and illustrate neighbourhood concepts, and then illustrate how neighbourhood constraints are constructed for failure-dependent resources.


Recall that $R^U$ is the set of unreliable resources in the system S, and that $\Omega_i = \{P_{jk} : \rho(P_{jk}) = r_i \in R\}$ is the set of part type stages supported by resource $r_i$. If $r_i \in R^U$, then $r_v \in R$ is said to be *failure-dependent* on $r_i$ if $\forall P_{jk} \in \Omega_v$, $\exists P_{j,k+c} \in \Omega_i$ with $c \ge 0$. In other words, $r_v$ is failure-dependent on $r_i$ if every part that enters the buffer of $r_v$ requires future processing on unreliable resource $r_i$ (note that $r_i$ is failure-dependent on itself). For $r_i \in R^U$, let $R^{FD}_i = \{r_v : r_v \in R \text{ and } \forall P_{jk} \in \Omega_v, \exists P_{j,k+c} \in \Omega_i \text{ with } c \ge 0\}$ be the set of failure-dependent resources on $r_i$, and let $R^{FD} = \bigcup_{r_i \in R^U} R^{FD}_i$ and $R^{NFD} = R \setminus R^{FD}$.

For each failure-dependent resource of $R^{FD}_i$, we construct a neighbourhood. The neighbourhood of $r_v \in R^{FD}_i$, $\mathrm{NH}^i_v$, is defined as the set of part type stages that require $r_v$ now or later in their processing and have no intervening failure-dependent resources of $R^{FD}_i$. Formally, $\mathrm{NH}^i_v = \Omega_v \cup \{P_{jk} : \exists c > 0 \text{ with } P_{j,k+c} \in \Omega_v \text{ and } \forall d \in [0,c),\ \rho(P_{j,k+d}) \notin R^{FD}_i\}$. Thus, if $\rho(P_{j,k+c}) = r_v \in R^{FD}_i$, $\rho(P_{j,k-1}) = r_w \in R^{FD}_i$, with $r_v \ne r_w$, and $\{\rho(P_{jk}), \rho(P_{j,k+1}), \ldots, \rho(P_{j,k+c-1})\} \cap R^{FD}_i = \emptyset$, then $\{P_{jk}, P_{j,k+1}, \ldots, P_{j,k+c-1}, P_{j,k+c}\} \subseteq \mathrm{NH}^i_v$, and $P_{j,k-1} \notin \mathrm{NH}^i_v$. Let $\mathrm{NH}^i = \{\mathrm{NH}^i_v : r_v \in R^{FD}_i\}$ be the neighbourhood set for $r_i \in R^U$, and let $\mathrm{NH} = \{\mathrm{NH}^i : r_i \in R^U\}$.

For example, the system of Figure 1 has two unreliable resources, $R^U = \{r_2, r_9\}$. Note that anytime r1 appears in a route, r2 appears later in the route, and thus $R^{FD}_2 = \{r_1, r_2\}$. Also, anytime r7 or r8 appears in a route, r9 appears later in the route, so $R^{FD}_9 = \{r_7, r_8, r_9\}$. Thus, $\mathrm{NH}^2 = \{\mathrm{NH}^2_1, \mathrm{NH}^2_2\}$ and $\mathrm{NH}^9 = \{\mathrm{NH}^9_7, \mathrm{NH}^9_8, \mathrm{NH}^9_9\}$, where the neighbourhoods are as follows: $\mathrm{NH}^2_1 = \{P_{14}, P_{22}, P_{23}, P_{24}, P_{25}, P_{26}\}$, $\mathrm{NH}^2_2 = \{P_{11}, P_{12}, P_{13}, P_{15}, P_{21}, P_{27}\}$, $\mathrm{NH}^9_7 = \{P_{32}\}$, $\mathrm{NH}^9_8 = \{P_{31}\}$, $\mathrm{NH}^9_9 = \{P_{33}, P_{34}, P_{35}\}$.

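To understand this construction, consider $\mathrm{NH}^2_1$ and $\mathrm{NH}^2_2$. Note that $\Omega_1 = \{P_{14}, P_{26}\}$ and $\Omega_2 = \{P_{13}, P_{15}, P_{21}, P_{27}\}$. Since $\Omega_v \subseteq \mathrm{NH}^i_v$, $\{P_{14}, P_{26}\} \subseteq \mathrm{NH}^2_1$ and $\{P_{13}, P_{15}, P_{21}, P_{27}\} \subseteq \mathrm{NH}^2_2$. Now consider $T_1 = \{\rho(P_{11}), \rho(P_{12}), \rho(P_{13}), \rho(P_{14}), \rho(P_{15})\} = \{r_6, r_3, r_2, r_1, r_2\}$. Since $\{r_6, r_3\} \cap R^{FD}_2 = \emptyset$, $\{P_{11}, P_{12}\} \subseteq \mathrm{NH}^2_2$. Similarly, $T_2 = \{\rho(P_{21}), \rho(P_{22}), \rho(P_{23}), \rho(P_{24}), \rho(P_{25}), \rho(P_{26}), \rho(P_{27}), \rho(P_{28})\} = \{r_2, r_3, r_4, r_6, r_5, r_1, r_2, r_3\}$. Since $\{r_3, r_4, r_6, r_5\} \cap R^{FD}_2 = \emptyset$, $\{P_{22}, P_{23}, P_{24}, P_{25}\} \subseteq \mathrm{NH}^2_1$. Thus, we get $\mathrm{NH}^2_1 = \{P_{14}, P_{22}, P_{23}, P_{24}, P_{25}, P_{26}\}$ and $\mathrm{NH}^2_2 = \{P_{11}, P_{12}, P_{13}, P_{15}, P_{21}, P_{27}\}$.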
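The construction above can be traced with a short script. This is a minimal sketch, not the authors' implementation: routes T1 and T2 are quoted from the text, while the routes for P3 and P4 (`ROUTES[3]`, `ROUTES[4]`) are assumptions chosen only to be consistent with the neighbourhoods listed in the example. The function names are hypothetical.

```python
# Routes: T1 and T2 are quoted from the text; the routes for P3 and P4 are
# ASSUMED for illustration (chosen to agree with the listed neighbourhoods).
ROUTES = {
    1: ["r6", "r3", "r2", "r1", "r2"],
    2: ["r2", "r3", "r4", "r6", "r5", "r1", "r2", "r3"],
    3: ["r8", "r7", "r5", "r6", "r9"],  # assumed
    4: ["r4", "r5"],                    # assumed
}
UNRELIABLE = ["r2", "r9"]


def failure_dependent(routes, ri):
    """Resources r_v whose every stage is followed (c >= 0) by a stage at ri."""
    resources = {r for route in routes.values() for r in route}
    fd = set()
    for rv in resources:
        stages = [(j, k) for j, route in routes.items()
                  for k, r in enumerate(route) if r == rv]
        if all(ri in routes[j][k:] for j, k in stages):
            fd.add(rv)
    return fd


def neighbourhoods(routes, fd):
    """Assign each stage to the first failure-dependent resource it meets,
    scanning its route from the current stage onward."""
    nh = {rv: set() for rv in fd}
    for j, route in routes.items():
        for k in range(len(route)):
            first_fd = next((r for r in route[k:] if r in fd), None)
            if first_fd is not None:
                nh[first_fd].add(f"P{j}{k + 1}")
    return nh


for ri in UNRELIABLE:
    fd = failure_dependent(ROUTES, ri)
    nh = neighbourhoods(ROUTES, fd)
    print(ri, sorted(fd), {rv: sorted(s) for rv, s in sorted(nh.items())})
```

Under these assumed routes, r6 is excluded from both failure-dependent sets because it supports stages that lead to r2 as well as a stage that leads only to r9.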

Although all parts supported by r6 later need an unreliable resource, r6 is shared by r2 and r9, and thus it is not failure-dependent on either. This implies that failure-dependent sets are disjoint, i.e., $R^{FD}_2 \cap R^{FD}_9 = \emptyset$. Furthermore, we observe that no part stage is in more than one neighbourhood, i.e., $\mathrm{NH}^2_1 \cap \mathrm{NH}^2_2 \cap \mathrm{NH}^9_7 \cap \mathrm{NH}^9_8 \cap \mathrm{NH}^9_9 = \emptyset$. These and other important neighbourhood properties are established in Chew and Lawley (2006).

We restrict the number of parts allowed in a neighbourhood. Our intention is to guarantee that every part in the neighbourhood of a failure-dependent resource has capacity reserved at that resource. That is, we want to be able to advance every part requiring an unreliable resource into its associated failure-dependent resource in the event of a resource failure so that it will not block production of parts not requiring the failed resource. In the example, for a permissible state, we want every part in $\mathrm{NH}^9_9 = \{P_{33}, P_{34}, P_{35}\}$ to have a reserved unit of buffer at r9. As a consequence, we will reject a state if this constraint is violated. For instance, a state is not admissible if, at this state, the sum of parts in $\mathrm{NH}^9_9$ exceeds 1; recall that r9 has a single unit of capacity. To see this, at this inadmissible state, if r9 fails, at least one part of $\mathrm{NH}^9_9$ must reside at r5 or r6. Although P1, P2, and P4 do not require failed r9 in their processing, this distribution of parts may in turn block production of some of these part types. Our objective is to develop supervisory controllers capable of rejecting these undesirable states.

We now construct neighbourhood constraints to enforce the above intention. The constraint for a neighbourhood, say $\mathrm{NH}^i_v$, is an inequality of the form $Z^i_v \le C_v$, where $Z^i_v = \sum_{P_{jk} \in \mathrm{NH}^i_v} (x_{jk} + y_{jk})$. Recall that $x_{jk}$ is the number of finished instances, and $y_{jk}$ is the number of unfinished instances, of $P_{jk}$ located in the buffer of $\rho(P_{jk})$; and that the right hand side $C_v$ is the capacity of $r_v$. $\mathrm{NH}^i_v$ is said to be *capacitated* if $Z^i_v = C_v$ and *over-capacitated* if $Z^i_v > C_v$. Define the set of all possible neighbourhood constraints with respect to $r_i \in R^U$ as:

$$\mathrm{NHC}^i_1 = \{ Z^i_v \le C_v : \mathrm{NH}^i_v \in \mathrm{NH}^i \}.$$

In the example, we have


$$\begin{aligned} \mathrm{NHC}^2_1 &= \Big\{ Z^2_1 = \sum_{P_{jk} \in \mathrm{NH}^2_1} (x_{jk} + y_{jk}) \le C_1,\ Z^2_2 = \sum_{P_{jk} \in \mathrm{NH}^2_2} (x_{jk} + y_{jk}) \le C_2 \Big\}; \\ \mathrm{NHC}^9_1 &= \Big\{ Z^9_7 = \sum_{P_{jk} \in \mathrm{NH}^9_7} (x_{jk} + y_{jk}) \le C_7,\ Z^9_8 = \sum_{P_{jk} \in \mathrm{NH}^9_8} (x_{jk} + y_{jk}) \le C_8,\ Z^9_9 = \sum_{P_{jk} \in \mathrm{NH}^9_9} (x_{jk} + y_{jk}) \le C_9 \Big\}. \end{aligned}$$

Constraints of $\mathrm{NHC}^i_1$ assure that no neighbourhood of $\mathrm{NH}^i$ becomes over-capacitated. As Lawley & Sulistyono (2002) discuss, $\mathrm{NHC}^i_1$ may induce deadlock among failure-dependent resources of $R^{FD}_i$, since if all neighbourhoods are capacitated, parts cannot move from one neighbourhood to another without over-capacitating a neighbourhood. In the example, a state may satisfy both $1 = Z^2_1 \le C_1 = 1$ and $1 = Z^2_2 \le C_2 = 1$. But a part that moves from one of these neighbourhoods to the other must over-capacitate the other neighbourhood. To resolve this dilemma, we develop an additional set of constraints, $\mathrm{NHC}^i_2$.

It is first necessary to compute the set of strongly connected neighbourhoods for $\mathrm{NHC}^i_2$. To do this, for each $r_i \in R^U$, we construct a directed graph $(\mathrm{NH}^i, A^i)$ where $A^i = \{(\mathrm{NH}^i_g, \mathrm{NH}^i_h) : \exists P_{jk} \in \mathrm{NH}^i_g \text{ with } P_{j,k+1} \in \mathrm{NH}^i_h\}$. Thus, in operation, there will be part flow from $\mathrm{NH}^i_g$ to $\mathrm{NH}^i_h$. We then compute the set of strong components of $(\mathrm{NH}^i, A^i)$ using standard polynomial graph algorithms (Cormen et al., 2002). For example, we see that $\mathrm{NH}^2_1$ and $\mathrm{NH}^2_2$ are strongly connected, since $\{P_{14}, P_{26}\} \subseteq \mathrm{NH}^2_1$ and $\{P_{13}, P_{27}\} \subseteq \mathrm{NH}^2_2$. Therefore, in operation, there is flow from $\mathrm{NH}^2_1$ to $\mathrm{NH}^2_2$ and from $\mathrm{NH}^2_2$ to $\mathrm{NH}^2_1$. Let $SC^i$ be the set of strongly connected components of $(\mathrm{NH}^i, A^i)$. Then, $SC^2 = \{SC^2_1 = \{\mathrm{NH}^2_1, \mathrm{NH}^2_2\}\}$, and $SC^9 = \{SC^9_1 = \{\mathrm{NH}^9_7\}, SC^9_2 = \{\mathrm{NH}^9_8\}, SC^9_3 = \{\mathrm{NH}^9_9\}\}$. Then, $\mathrm{NHC}^i_2$ is stated as follows:

$$\mathrm{NHC}^i_2 = \{ Z^i_g + Z^i_h < C_g + C_h : \{\mathrm{NH}^i_g, \mathrm{NH}^i_h\} \subseteq SC^i_m \in SC^i,\ m = 1 \ldots |SC^i| \}.$$

Hence, for every strongly connected component of $(\mathrm{NH}^i, A^i)$, $\mathrm{NHC}^i_2$ guarantees that at most one neighbourhood can be capacitated at a time. In the example, we have the following:


$$\mathrm{NHC}^2_2 = \{ Z^2_1 + Z^2_2 < C_1 + C_2 \}, \quad \mathrm{NHC}^9_2 = \emptyset.$$

$\mathrm{NHC}^2_2$ guarantees that $\mathrm{NH}^2_1$ and $\mathrm{NH}^2_2$ are not simultaneously capacitated.
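The strong-component step can be sketched as follows. This is an illustrative sketch only: it uses the neighbourhoods listed in the example, encodes stage $P_{jk}$ as the pair `(j, k)`, and finds strong components by mutual reachability rather than by the linear-time algorithm a production implementation would use. All function names are hypothetical.

```python
from itertools import combinations

# Neighbourhoods copied from the example; stages are (j, k) pairs for P_jk.
NH = {
    "r2": {1: {(1, 4), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)},
           2: {(1, 1), (1, 2), (1, 3), (1, 5), (2, 1), (2, 7)}},
    "r9": {7: {(3, 2)}, 8: {(3, 1)}, 9: {(3, 3), (3, 4), (3, 5)}},
}


def arcs(nh):
    """Arc (g, h) when some P_jk in NH_g has its successor P_j,k+1 in NH_h."""
    return {(g, h) for g in nh for h in nh if g != h
            for (j, k) in nh[g] if (j, k + 1) in nh[h]}


def reachable(start, arc_set):
    """Nodes reachable from start along arcs (depth-first search)."""
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        for a, b in arc_set:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def strong_components(nh):
    """Group neighbourhoods that are mutually reachable."""
    a = arcs(nh)
    reach = {v: reachable(v, a) | {v} for v in nh}
    comps = {frozenset(u for u in nh if u in reach[v] and v in reach[u])
             for v in nh}
    return sorted(sorted(c) for c in comps)


def nhc2_pairs(nh):
    """Index pairs (g, h) that receive a constraint Z_g + Z_h < C_g + C_h."""
    return [pair for comp in strong_components(nh)
            for pair in combinations(comp, 2)]


for ri, nh in NH.items():
    print(ri, strong_components(nh), nhc2_pairs(nh))
```

For r2 the two neighbourhoods form one component and yield a single pair; for r9 the flow is one-directional, so every component is a singleton and no pair constraint is generated.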

To summarize, $\mathrm{NHC}^i = \mathrm{NHC}^i_1 \cup \mathrm{NHC}^i_2$ guarantees that no neighbourhood is over-capacitated, and that neighbourhoods with mutual flow dependencies are not simultaneously capacitated. The complete set of neighbourhood constraints is defined as:

$$\mathrm{NHC} = \{ \mathrm{NHC}^i : r_i \in R^U \}.$$

Note that in the worst case, we generate one constraint for each pair of resources and thus the size of NHC is of $O(|R|^2)$.
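To make the two constraint families concrete, the sketch below evaluates them on a small state. It is a hedged illustration, not the authors' code: the neighbourhoods and strong components are those of the example, the capacities of the failure-dependent resources are taken as 1, and a state is represented (as an assumption) by a dictionary mapping a part type stage to the number of parts at that stage.

```python
from itertools import combinations

# Neighbourhoods and strong components from the example; keys are
# (unreliable resource, failure-dependent resource index).
NH = {
    ("r2", 1): {"P14", "P22", "P23", "P24", "P25", "P26"},
    ("r2", 2): {"P11", "P12", "P13", "P15", "P21", "P27"},
    ("r9", 7): {"P32"},
    ("r9", 8): {"P31"},
    ("r9", 9): {"P33", "P34", "P35"},
}
STRONG_COMPONENTS = [[("r2", 1), ("r2", 2)], [("r9", 7)], [("r9", 8)], [("r9", 9)]]
CAPACITY = {1: 1, 2: 1, 7: 1, 8: 1, 9: 1}  # unit capacities, as in the example


def z(state, key):
    """Z^i_v: number of parts (finished or not) whose stage lies in NH^i_v."""
    return sum(n for stage, n in state.items() if stage in NH[key])


def satisfies_nhc(state):
    """True when the state meets every NHC_1 and NHC_2 inequality."""
    # NHC_1: no neighbourhood is over-capacitated.
    if any(z(state, key) > CAPACITY[key[1]] for key in NH):
        return False
    # NHC_2: two neighbourhoods of one strong component are never both full.
    for comp in STRONG_COMPONENTS:
        for g, h in combinations(comp, 2):
            if z(state, g) + z(state, h) >= CAPACITY[g[1]] + CAPACITY[h[1]]:
                return False
    return True


q = {"P23": 1, "P41": 1, "P33": 1}   # one part at each listed stage
q_plus_p11 = {**q, "P11": 1}         # the same state after loading a P11
print(satisfies_nhc(q), satisfies_nhc(q_plus_p11))
```

The first state passes both families; adding a P11 capacitates both neighbourhoods of r2 at once, so the pair constraint rejects it.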

Chew and Lawley (2006) establish several important properties of NHC. These properties are required to establish robustness of the two supervisors that we develop later. We next modify two deadlock avoidance policies that we use in conjunction with NHC to develop robust supervisors.

## **3.1.2 Banker's algorithm**

In this subsection, we configure Banker's Algorithm (BA) (Habermann, 1969) to work with NHC. BA is perhaps the most widely known deadlock avoidance policy (DAP), and its underlying concepts have influenced the thinking of numerous researchers. BA is a suboptimal DAP in the sense that it achieves computational tractability by sacrificing some safe states. BA avoids deadlock by allowing an allocation only if the requesting processes can be ordered so that the terminal resource needs for the $i$th process, $P_i$, in the ordering can be met by pooling available resources and those released by completed processes $P_1, P_2, \ldots, P_{i-1}$. The ordering is essentially a sequence in which all processes in the system can complete successfully. BA is of $O(mn \log n)$, where m is the number of resource types and n is the number of requests. Other manufacturing related work also uses BA (Ezpeleta et al., 2002; Lawley et al., 1998; Reveliotis, 2000).

Our modifications are straightforward and are a generalization of those undertaken by Lawley & Sulistyono (2002). Our objective is to search for an ordering of parts that advances failure-dependent parts (those requiring unreliable resources) into the resource of their current neighbourhood, and non-failure-dependent parts (those not requiring unreliable resources) out of the system. Again, the ordering is such that the resources required by the first part are all available, those required by the second part are all available after the first part has finished and released the resources held by the part, and so forth. If the system can be cleared in this way (all failure-dependent parts are advanced into failure-dependent resources and all non-failure-dependent parts are advanced out of the system), then we can guarantee that if any unreliable resource fails, the system can continue producing parts that do not require this failed resource.

In the following, let $\Omega = \Omega^{NFD} \cup \Omega^{FD}$ be the set of part type stages instantiated in q *whose parts hold non-failure-dependent resources*, where $\Omega^{NFD}$ is the set of non-failure-dependent part type stages (those that do not require failure-dependent resources in the residual route) and $\Omega^{FD}$ is the set of failure-dependent part type stages (those that do require failure-dependent resources in the residual route). We now present our modified version of BA as Algorithm A1.


Step 1 of the algorithm configures the data structures required. For every part type stage represented in the system, these capture the current resource holding and the future processing need. These structures also capture resource availability in the state being tested. Three additional comments regarding the algorithm are in order. First, the need for every failure-dependent resource is explicitly set to zero, so this version looks only at the availability of non-failure-dependent resources. Second, for non-failure-dependent part type stages (those not requiring unreliable resources), the need for every resource in the residual route (except the one held) is set to one. Finally, for failure-dependent part type stages (those requiring unreliable resources), the need for every resource in the residual route (except the one held) up to the one immediately preceding the first encountered failure-dependent resource is set to one; all others are set to zero. Note that these are the resources such a part will need to advance into the failure-dependent resource of its current neighbourhood. Step 2 then executes the usual Banker's logic.
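The two steps described above can be sketched as follows. This is an interpretation written from the prose description, not the published listing of Algorithm A1. Routes T1 and T2 come from the text; the routes for P3 and P4, the capacities of the non-failure-dependent resources, and the function name `a1_admissible` are assumptions made for illustration.

```python
ROUTES = {
    1: ["r6", "r3", "r2", "r1", "r2"],
    2: ["r2", "r3", "r4", "r6", "r5", "r1", "r2", "r3"],
    3: ["r8", "r7", "r5", "r6", "r9"],  # assumed route
    4: ["r4", "r5"],                    # assumed route
}
FD = {"r1", "r2", "r7", "r8", "r9"}                 # failure-dependent resources
CAPACITY = {"r3": 1, "r4": 2, "r5": 1, "r6": 1}     # assumed capacities


def a1_admissible(parts, routes, capacity, fd):
    """Banker's-style test over parts holding non-failure-dependent resources.

    parts: list of (part_type, stage_index) with a 0-based stage index.
    """
    # Step 1: holdings, needs, and availability (non-failure-dependent only).
    available = {r: c for r, c in capacity.items() if r not in fd}
    pending = []
    for j, k in parts:
        held = routes[j][k]
        if held in fd:
            continue                    # left to the neighbourhood constraints
        available[held] -= 1
        need = set()
        for r in routes[j][k + 1:]:
            if r in fd:
                break                   # need beyond this point is zero
            need.add(r)
        need.discard(held)
        pending.append((held, need))
    # Step 2: usual Banker's loop, one unit needed per listed resource.
    progress = True
    while pending and progress:
        progress = False
        for item in pending:
            held, need = item
            if all(available[r] >= 1 for r in need):
                available[held] += 1    # the part moves on and frees its unit
                pending.remove(item)
                progress = True
                break
    return not pending


q = [(2, 2), (4, 0), (3, 2)]            # p23 and p41 at r4, p33 at r5
q_after = [(2, 3), (4, 0), (3, 2)]      # p23 advanced into r6 (now p24)
print(a1_admissible(q, ROUTES, CAPACITY, FD),
      a1_admissible(q_after, ROUTES, CAPACITY, FD))
```

In the first state the parts can be cleared in the order p33, p23, p41; in the second, p24 at r6 and p33 at r5 each wait for the other's resource, so no ordering exists.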

Algorithm A1 is not correct by itself, since it does not handle allocation of failure-dependent resources (for detailed examples the reader is referred to the work by Lawley & Sulistyono

Robust Control for Single Unit Resource Allocation Systems 403

It is well known that certain system structures, such as a central buffer, input/output bins, and non-unit buffer capacities, eliminate the possibility of deadlock-free unsafe states (Lawley & Reveliotis, 2001). In these systems, every state is either deadlock or safe, and therefore, a single-step look-ahead policy (SSL) is a correct and optimal deadlock avoidance policy. Further, it is of polynomial complexity, and thus ideal for runtime applications in real systems. In the following, we will modify the SSL presented by Lawley (1999) so that it

A resource allocation graph (RAG) is a digraph that encodes the resource requests and allocations of parts (Lawley, 1999). For our purposes, let RAG=(R\RFD,E) where R\RFD is the set of system non-failure-dependent resource types and E={(ru,rv): ru,rv R\RFD and ru is holding a part pjk with (pj,k+1)=rv}. A subdigraph of RAG, say (R,E), is *induced* when R R\RFD and E ={(ru,rv):(ru,rv)E and ru,rv R}. A subdigraph, (R,E), forms a *knot* in RAG if ruR, (ru)= R, where (ru) is the set of all nodes reachable from ru in RAG. In other words, a set of nodes, R, forms a knot in RAG when, for every node in R, the set of nodes reachable along arcs in RAG is exactly R. Further, we define a *capacitated knot* to be a knot in which every resource in the knot is filled to capacity with parts requesting other resources in the knot. It is commonly known that a capacitated knot in RAG is a necessary and sufficient condition for deadlock in these types of sequential resource allocation systems. We now provide an algorithm, Algorithm A2, below to detect a capacitated knot in RAG = (R\RFD,E). This algorithm has the same polynomial complexity as that given by

Step 1: Compute the set of strongly connected components of RAG: C={C1…Cq}

Step 3: For every strongly connected component CiC such that (Ci,Cj)Ec j=1…q

Step 2: Construct digraph (C,Ec) such that C={C1…Cq} and Ec={(Ci,Cj):(ru,rv)E with ruCi

We note that, for our present work, this version of deadlock detection algorithm operates only on non-failure-dependent resources and parts held by these resources. In A2, Step 1 computes the set of strongly connected components in RAG. As mentioned earlier, this is a standard digraph operation. Step 2 constructs a digraph that defines the reachability relationship between these components. Step 3 looks for a component with no outgoing arc. If such a component is filled to capacity with parts requesting other resources in the component, then it is a capacitated knot, and deadlock exists. If no such capacitated knot

Note that A2 is not correct by itself since it considers only the non-failure-dependent resources. Failure-dependent resources can easily deadlock themselves. However, when A2 is taken in conjunction with NHC, it guarantees Property 2.2 and thus assures that the

system will continue to operate even when multiple unreliable resources are down.

**3.1.3 A Single step look ahead policy** 

Lawley (1999). **Algorithm A2:** 

Input: RAG=(R\RFD,E)

Output: DEADLOCK, NO DEADLOCK

If Ci is a capacitated knot Return DEADLOCK

and rvCj for ij}

End If

Step 4: Return NO DEADLOCK

exists then the RAG is deadlock-free.

End For

works with systems with multiple unreliable resources.

(2002)). However, A1 and NHC together form a robust controller, that is, if we allow the system to visit only those states acceptable to both A1 and NHC, then the system operation will satisfy the requirements of Property 2.2. The detailed proofs for this are given in Chew and Lawley (2006). The supervisor is defined as follows:

## **Definition 3.1.1**: Supervisor Δ1 = A1 ∧ NHC.

Supervisor Δ1 permits a system state only if it satisfies both A1 and NHC at runtime. Consider Figure 5, which illustrates a state, say q, in which r4 holds p23 and p41, and r5 holds p33. At q there exists an admissible sequence by A1: p33 can advance into failure-dependent resource r9; p23 can advance into failure-dependent resource r1; and finally, p41 can be advanced out of the system. In addition, q satisfies NHC since

$$NHC_1^2 = \{Z_1^2 = 1 \le 1 \ (\text{there is one } P_{23}),\ Z_2^2 = 0 \le 1\};$$

$$NHC_1^9 = \{Z_7^9 = 0 \le 1,\ Z_8^9 = 0 \le 1,\ Z_9^9 = 1 \le 1 \ (\text{there is one } P_{33})\};$$

$$NHC_2^2 = \{1 + 0 < 1 + 1 = 2\}; \quad NHC_2^9 = \emptyset.$$

Therefore, q is an admissible state by Δ1. Supervisor Δ1 will prohibit, at q, advancing p23 into r6 (where it becomes p24) because p24 and p33 would block each other, causing the resulting state to violate A1 although not NHC. Loading a p11 into r6 at q is also precluded by Δ1 since the resulting state violates $NHC_2^2$, although not A1. However, advancing p33 one step into r6 or loading a p31 into r8 will result in an admissible state.

Fig. 5. An admissible system state by supervisor Δ1

Supervisor Δ1 is of polynomial complexity since both A1 and NHC require polynomial time for runtime implementation. Chew and Lawley (2006) formally establish that Δ1 yields a robust supervisor for systems where every part type requires in its route at most one unreliable resource.
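The way a supervisor is formed as a conjunction of policies can be sketched in a few lines of Python. This is only an illustration: the state encoding, the event encoding, and the predicates `toy_a1` and `toy_nhc` are hypothetical stand-ins chosen to mimic the three moves discussed for state q, and they are not the definitions of A1 or NHC.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

State = Dict[str, int]                 # hypothetical encoding: part stage -> count
Event = Tuple[Optional[str], str]      # (stage removed or None for loading, stage added)
Policy = Callable[[State], bool]


def make_supervisor(policies: Iterable[Policy]) -> Policy:
    """Return a supervisor that admits a state only if every policy admits it."""
    policy_list = list(policies)
    return lambda state: all(policy(state) for policy in policy_list)


def next_state(state: State, event: Event) -> State:
    """Apply an advance or load event to a copy of the state."""
    removed, added = event
    new = dict(state)
    if removed is not None:
        new[removed] -= 1
    new[added] = new.get(added, 0) + 1
    return new


def admissible_events(state: State, events: List[Event], supervisor: Policy) -> List[Event]:
    """Keep only the events whose successor state the supervisor admits."""
    return [e for e in events if supervisor(next_state(state, e))]


# Toy stand-ins (assumptions for illustration, NOT the chapter's A1 and NHC).
toy_a1: Policy = lambda s: s.get("p24", 0) + s.get("p33", 0) < 2
toy_nhc: Policy = lambda s: s.get("p11", 0) + s.get("p23", 0) < 2

supervisor_1 = make_supervisor([toy_a1, toy_nhc])
q = {"p23": 1, "p41": 1, "p33": 1}
events = [("p23", "p24"), (None, "p11"), (None, "p31")]
allowed = admissible_events(q, events, supervisor_1)
```

With these stand-ins, advancing p23 fails the first predicate and loading p11 fails the second, so only loading p31 remains in `allowed`.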

## **3.1.3 A Single step look ahead policy**


It is well known that certain system structures, such as a central buffer, input/output bins, and non-unit buffer capacities, eliminate the possibility of deadlock-free unsafe states (Lawley & Reveliotis, 2001). In these systems, every state is either deadlock or safe, and therefore, a single-step look-ahead policy (SSL) is a correct and optimal deadlock avoidance policy. Further, it is of polynomial complexity, and thus ideal for runtime applications in real systems. In the following, we will modify the SSL presented by Lawley (1999) so that it works with systems with multiple unreliable resources.

A resource allocation graph (RAG) is a digraph that encodes the resource requests and allocations of parts (Lawley, 1999). For our purposes, let RAG = (R\RFD, E), where R\RFD is the set of non-failure-dependent resource types in the system and E = {(ru,rv): ru,rv ∈ R\RFD and ru is holding a part pjk with ρ(pj,k+1) = rv}. A subdigraph of RAG, say (R′,E′), is *induced* when R′ ⊆ R\RFD and E′ = {(ru,rv): (ru,rv) ∈ E and ru,rv ∈ R′}. A subdigraph (R′,E′) forms a *knot* in RAG if, for every node ru ∈ R′, the set of nodes reachable from ru along arcs in RAG is exactly R′. Further, we define a *capacitated knot* to be a knot in which every resource is filled to capacity with parts requesting other resources in the knot. It is commonly known that a capacitated knot in RAG is a necessary and sufficient condition for deadlock in these types of sequential resource allocation systems. Algorithm A2 below detects a capacitated knot in RAG = (R\RFD, E). It has the same polynomial complexity as the algorithm given by Lawley (1999).

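## **Algorithm A2:**

Input: RAG = (R\RFD, E)

Output: DEADLOCK, NO DEADLOCK

Step 1: Compute the strongly connected components C1, …, Cn of RAG

Step 2: Construct the digraph on {C1, …, Cn} that has an arc (Ci,Cj) whenever (ru,rv) ∈ E with ru ∈ Ci and rv ∈ Cj for i ≠ j

Step 3: For each Ci with no outgoing arc

If Ci is a capacitated knot

Return DEADLOCK

End If

End For

Step 4: Return NO DEADLOCK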

We note that, for our present work, this version of the deadlock detection algorithm operates only on non-failure-dependent resources and the parts held by these resources. In A2, Step 1 computes the set of strongly connected components in RAG. As mentioned earlier, this is a standard digraph operation. Step 2 constructs a digraph that defines the reachability relationship between these components. Step 3 looks for a component with no outgoing arc. If such a component is filled to capacity with parts requesting other resources in the component, then it is a capacitated knot, and deadlock exists. If no such capacitated knot exists, then the RAG is deadlock-free.

Note that A2 is not correct by itself since it considers only the non-failure-dependent resources. Failure-dependent resources can easily deadlock themselves. However, when A2 is taken in conjunction with NHC, it guarantees Property 2.2 and thus assures that the system will continue to operate even when multiple unreliable resources are down.
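The capacitated knot test can be illustrated with a short Python sketch. It is written under stated assumptions and is not the algorithm of Lawley (1999): instead of condensing strongly connected components as A2 does, it applies the knot definition directly by comparing reachable sets, which is still polynomial but less efficient. The function names and the data layout (`capacity` maps each non-failure-dependent resource to its buffer size, `requests` maps each resource to the next resource requested by each part it holds) are hypothetical.

```python
from typing import Dict, List, Set


def reachable(arcs: Dict[str, Set[str]], start: str) -> Set[str]:
    """Return the nodes reachable from start along one or more arcs."""
    seen: Set[str] = set()
    stack = list(arcs.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(arcs.get(node, ()))
    return seen


def has_capacitated_knot(capacity: Dict[str, int],
                         requests: Dict[str, List[str]]) -> bool:
    """Detect a capacitated knot among the resources listed in capacity.

    Requests for resources outside capacity (for example, failure-dependent
    resources) create no arc, mirroring the restriction of RAG to R\\RFD.
    """
    arcs = {r: {nxt for nxt in requests.get(r, []) if nxt in capacity}
            for r in capacity}
    for r in capacity:
        knot = reachable(arcs, r)
        # A knot: every member reaches exactly the same node set.
        if not knot or any(reachable(arcs, v) != knot for v in knot):
            continue
        # Capacitated: every member is full of parts requesting knot members.
        if all(len(requests.get(v, [])) == capacity[v]
               and all(nxt in knot for nxt in requests.get(v, []))
               for v in knot):
            return True
    return False


unit = {"r3": 1, "r4": 1, "r5": 1, "r6": 1}
deadlocked = has_capacitated_knot(unit, {"r5": ["r6"], "r6": ["r5"]})
escaping = has_capacitated_knot(unit, {"r5": ["r6"], "r6": ["r9"]})
```

In the first call r5 and r6 each hold one part requesting the other, so the pair forms a capacitated knot. In the second call the part on r6 requests r9, which lies outside the graph, so no knot exists.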


## **Definition 3.1.2**: Supervisor Δ2 = A2 ∧ NHC.

Supervisor Δ2 accepts a system state that contains no deadlock and satisfies NHC. For example, in Figure 1, suppose that every non-failure-dependent resource has non-unit capacity; that is, Ci > 1 for all ri ∈ R\RFD = {r3,r4,r5,r6}. Then, A2 permits any state in which no subset of parts residing on {r3,r4,r5,r6} is deadlocked on {r3,r4,r5,r6}. If the state also satisfies NHC, then Property 2.2 is guaranteed.

Note that Δ2 = A2 ∧ NHC is suited for real-time implementation since both A2 and NHC are of polynomial complexity. Chew and Lawley (2006) formally establish that Δ2 yields a robust supervisor for systems where every part type requires in its route at most one unreliable resource.

## **3.2 Robust control using a resource order policy**

This subsection configures a deadlock avoidance policy, the resource order policy (RO), which we employ in conjunction with the neighbourhood constraints of Subsection 3.1 to develop a robust controller. Consider, for configuration purposes, Figure 1. Define $RCO = R \setminus \bigcup_i R_i^{FD}$ as the set of non-failure-dependent resources. Since $R_2^{FD}$ = {r1,r2} and $R_9^{FD}$ = {r7,r8,r9}, RCO = {r3,r4,r5,r6}. Let $\alpha: RCO \to \mathbb{N}$ (the set of natural numbers) be a one-to-one mapping of the non-failure-dependent resources (α orders the non-failure-dependent resources so that RO can be applied); let PFD = {Pj: ρ(Pjk) ∈ RU for some k} (PFD is the set of part types requiring unreliable resources; thus, in Figure 1, PFD = {P1,P2,P3}); and let PNFD = P\PFD (PNFD is the set of part types not requiring any unreliable resources; hence, in Figure 1, PNFD = {P4}). For each Pj ∈ PFD, determine all maximal subsequences in the route of Pj that do not contain failure-dependent resources. For instance, in Figure 1, P3 ∈ PFD, where P3 = ⟨P31, P32, P33, P34, P35, P36⟩ with route ⟨r8,r7,r5,r6,r9,r4⟩. The maximal subsequences of ⟨r8,r7,r5,r6,r9,r4⟩ that do not contain failure-dependent resources are ⟨r5,r6⟩ and ⟨r4⟩.

To express this formally, for each Pj ∈ PFD, break the route of Pj into subroutes as follows. Write $P_j = \langle P_{j1} \ldots P_{j,k_1-1}, P_{j,k_1}, P_{j,k_1+1} \ldots P_{j,k_2-1}, P_{j,k_2}, P_{j,k_2+1} \ldots P_{j,k_{h_j}-1}, P_{j,k_{h_j}}, P_{j,k_{h_j}+1} \ldots \rangle$, where $\{P_{j,k_1}, P_{j,k_2} \ldots P_{j,k_{h_j}}\}$ is precisely the set of part type stages of Pj that are processed on failure-dependent resources (that is, $\{\rho(P_{jk}) : k = k_1, k_2 \ldots k_{h_j}\} \subseteq RFD$ and $\{\rho(P_{jk}) : k \neq k_1, k_2 \ldots k_{h_j}\} \cap RFD = \emptyset$). Let $P_j^1 = \langle P_{j1} \ldots P_{j,k_1-1}\rangle$, $P_j^2 = \langle P_{j,k_1+1} \ldots P_{j,k_2-1}\rangle$, $P_j^3 = \langle P_{j,k_2+1} \ldots P_{j,k_3-1}\rangle$, …, $P_j^{h_j} = \langle P_{j,k_{(h_j-1)}+1} \ldots P_{j,k_{h_j}-1}\rangle$ and $P_j^{h_j+1} = \langle P_{j,k_{h_j}+1} \ldots P_{j,|P_j|}\rangle$. For each Pj ∈ PNFD, rename Pj as $P_j^0$. Finally, let $P' = \{P_j^0 : P_j \in PNFD\} \cup \{P_j^k : k = 1 \ldots h_j \text{ and } P_j \in PFD\}$. Note that in P', a part type Pj ∈ PFD is replaced by a set of part types $\{P_j^1, P_j^2 \ldots P_j^{h_j}\}$, each having a route that is a maximal segment of the route of Pj not containing a failure-dependent resource.

In Figure 1, for example, P3 is replaced by $P_3^1$ = ⟨P33,P34⟩ with route ⟨r5,r6⟩ and $P_3^2$ = ⟨P36⟩ with route ⟨r4⟩, and P4 is renamed $P_4^0$. Thus, the revised set of part types is $P' = \{P_4^0\} \cup \{P_1^1, P_2^1, P_2^2, P_3^1, P_3^2\}$. Note that none of the routes of part types in P' contains any failure-dependent resources.
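The splitting of a route into maximal segments that avoid failure-dependent resources can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function name `non_fd_segments` is hypothetical, and the failure-dependent set is the union of the two sets quoted above for Figure 1.

```python
from typing import List, Set, Tuple


def non_fd_segments(route: List[str], fd_resources: Set[str]) -> List[List[Tuple[int, str]]]:
    """Split a route into maximal runs of stages on non-failure-dependent resources.

    Each stage is returned as (stage index starting at 1, resource).
    """
    segments: List[List[Tuple[int, str]]] = []
    current: List[Tuple[int, str]] = []
    for stage, resource in enumerate(route, start=1):
        if resource in fd_resources:
            # A failure-dependent stage closes the current segment, if any.
            if current:
                segments.append(current)
                current = []
        else:
            current.append((stage, resource))
    if current:
        segments.append(current)
    return segments


# Route of P3 in Figure 1 and the failure-dependent resources {r1,r2} and {r7,r8,r9}.
p3_route = ["r8", "r7", "r5", "r6", "r9", "r4"]
fd = {"r1", "r2", "r7", "r8", "r9"}
p3_segments = non_fd_segments(p3_route, fd)
```

For P3 the sketch yields stages 3 and 4 on r5 and r6, followed by stage 6 on r4, which agrees with the two replacement part types ⟨P33,P34⟩ and ⟨P36⟩ given in the text.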

We now use P' and RCO to construct a set of RO constraints as follows. For each $P_j^i = \langle P_{j,k_{(i-1)}+1} \ldots P_{j,k_i-1}\rangle \in P'$ and for each $P_{jk} \in P_j^i$, consider the inclusive remaining route, $\langle \rho(P_{jk}) \ldots \rho(P_{j,k_i-1})\rangle$, and its mapping, $\langle \alpha(\rho(P_{jk})) \ldots \alpha(\rho(P_{j,k_i-1}))\rangle$. (Recall that to implement RO, the resources must be ordered; α represents the ordering function.) If the mapping of the inclusive remaining route is strictly increasing (decreasing), then Pjk is classified as 'right' ('left'); if the mapping of the inclusive remaining route switches direction at some point, then Pjk is classified as 'undirected.' If Pjk is terminal, it is ignored. For rm ∈ RCO, let $\Pi_m^{RU}$ represent the set of right and undirected part type stages associated with rm, and $\Pi_m^{LU}$ the set of left and undirected part type stages associated with rm.

In the example, let α(r1)=1, α(r2)=2, α(r3)=3, α(r4)=4, α(r5)=5, α(r6)=6, α(r7)=7, α(r8)=8 and α(r9)=9. The inclusive remaining route of P33, ⟨r5,r6⟩ supporting ⟨P33,P34⟩ ∈ $P_3^1$, is strictly increasing under α; thus P33 is classified as 'right' and hence $\Pi_5^{RU}$ = {P33}. Since P25 ∈ $P_2^1$ is the terminal part type stage of $P_2^1$, P25 is ignored, and so $\Pi_5^{LU} = \emptyset$. On the other hand, the inclusive remaining route of P11, ⟨r6,r3⟩ supporting ⟨P11,P12⟩ ∈ $P_1^1$, is strictly decreasing under α; hence P11 is classified as 'left.' The inclusive remaining route of P24 is ⟨r6,r5⟩ supporting ⟨P24,P25⟩ ∈ $P_2^1$, which is strictly decreasing under α; hence P24 is classified as 'left.' Therefore, $\Pi_6^{LU}$ = {P11,P24}. Since P34 ∈ $P_3^1$ is the terminal part type stage of $P_3^1$, P34 is ignored, and so $\Pi_6^{RU} = \emptyset$. After all the part type stages are classified in this way, a constraint is generated for each pair of non-failure-dependent resources, yielding the RO constraints. We now define the RO constraints formally.

**Definition 3.2.1:** RORCO is the set of constraints:

$$\forall\, r_m, r_n \in \text{RCO such that } \alpha(r_m) < \alpha(r_n): \quad \sum_{P_{jk} \in \Pi_m^{RU}} (x_{jk} + y_{jk}) + \sum_{P_{jk} \in \Pi_n^{LU}} (x_{jk} + y_{jk}) < C_m + C_n$$

where Cm and Cn are the respective buffer capacities of rm and rn.

In the example, for r5, r6 ∈ RCO, we have (x33+y33) + (x11+y11+x24+y24) < 2, recalling that C5=C6=1. This constraint assures that, for every resource allocation state that the system is allowed to visit, the number of 'right' and 'undirected' parts occupying buffer space at r5 plus the number of 'left' and 'undirected' parts occupying buffer space at r6 will be less than the combined capacity of the two resources. Similar constraints are generated for the resource pairs {r3,r4}, {r3,r5}, {r3,r6}, {r4,r5} and {r4,r6}.
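The classification rule and the resulting pairwise constraint can be sketched in Python. The sketch makes several assumptions that are not in the source: the names `classify` and `ro_pair_ok` are hypothetical, a state is represented by a dictionary holding the count x+y of parts at each part type stage, and only the subroute tails quoted in the text are used as input because the full subroutes of Figure 1 are not restated here.

```python
from typing import Dict, List, Tuple

Stage = Tuple[str, str]  # (part type stage, resource), e.g. ("P33", "r5")


def classify(subroute: List[Stage], order: Dict[str, int]) -> Dict[str, str]:
    """Label each non-terminal stage of a subroute as right, left, or undirected."""
    labels: Dict[str, str] = {}
    for i, (stage, _) in enumerate(subroute[:-1]):  # the terminal stage is ignored
        ranks = [order[res] for _, res in subroute[i:]]  # inclusive remaining route
        steps = [b - a for a, b in zip(ranks, ranks[1:])]
        if all(s > 0 for s in steps):
            labels[stage] = "right"
        elif all(s < 0 for s in steps):
            labels[stage] = "left"
        else:
            labels[stage] = "undirected"
    return labels


def ro_pair_ok(counts: Dict[str, int], ru_m: List[str], lu_n: List[str],
               cap_m: int, cap_n: int) -> bool:
    """Check one RO constraint for a resource pair with the order of rm below rn."""
    occupied = sum(counts.get(s, 0) for s in ru_m) + sum(counts.get(s, 0) for s in lu_n)
    return occupied < cap_m + cap_n


order = {f"r{i}": i for i in range(1, 10)}  # the ordering used in the example
labels: Dict[str, str] = {}
for sub in ([("P33", "r5"), ("P34", "r6")],
            [("P11", "r6"), ("P12", "r3")],
            [("P24", "r6"), ("P25", "r5")]):
    labels.update(classify(sub, order))

ru_5, lu_6 = ["P33"], ["P11", "P24"]
one_part = ro_pair_ok({"P33": 1}, ru_5, lu_6, 1, 1)
two_parts = ro_pair_ok({"P33": 1, "P24": 1}, ru_5, lu_6, 1, 1)
```

With unit capacities, a single p33 at r5 satisfies the constraint for the pair r5, r6, whereas a p33 at r5 together with a p24 at r6 violates it.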

We are now in the position to establish that the conjunction of RORCO and NHC, call it supervisor Δ3, satisfies Property 2.2. Supervisor Δ3 is a control policy that disables an enabled event at state q whenever the state reached by executing that event violates either RORCO or NHC. Formally, it is stated as follows.

**Definition 3.2.2:** Supervisor Δ3 = RORCO ∧ NHC.

Chew et al. (2011) establish that Δ3 is a robust controller for systems where every part type requires at most one unreliable resource.

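Fig. 6. An example production system with three unreliable resources. The figure specifies the following system data:

R = {r1,r2,r3,r4,r5,r6,r7,r8}; P = {P1,P2,P3,P4}; RU = {r4,r6,r8}

P1 = {P11,P12,P13}, T1 = ⟨r1,r2,r3⟩

P2 = {P21,P22,P23,P24}, T2 = ⟨r2,r5,r4,r3⟩

P3 = {P31,P32,P33}, T3 = ⟨r5,r7,r8⟩

P4 = {P41,P42,P43}, T4 = ⟨r5,r6,r3⟩

C = ⟨C1,C2,...,C8⟩ = ⟨2,2,2,2,2,2,2,2⟩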

## **3.3 Robust control using shared resource capacity**

The robust supervisory control policies presented in Sections 3.1 and 3.2 assume that parts requiring failed resources can be advanced into FD buffer space. We refer to policies of this type as "absorbing" policies. This subsection relaxes this assumption because, in some systems, providing FD buffer space might be too expensive, or it might be desirable to load the system more heavily with FD parts. We develop a "distributing" control policy, which distributes parts requiring failed resources throughout the buffer space of shared resources so that these parts do not block the production of part types that do not require failed resources.

We now discuss the development of the distributing control policy, the RO4 policy, in detail. First, based on the definitions of resource sets in the previous sections, we define three resource regions: (1) the region of continuous operation, $RCO = R^{PFD} \cup R^{NFD}$; (2) the region of failure dependency, $RFD = R^{FD}$; and (3) the region of distribution, $ROD = RFD \setminus R^{U} = R^{FD} \setminus R^{U} = R^{R} \setminus R^{NFD}$. In the example system in Figure 6, we have RCO = {r1,r2,r3}; RFD = {r2,r4,r5,r6,r7,r8}; ROD = {r2,r5,r7}. The RO4 policy is the conjunction of four modified RO policies applied to different resource regions. We now define the RO constraints as follows.

**Definition 3.3.1**: RORCO is the set of constraints:

$$\sum_{P_{jk} \in \Omega_g} z_{jk} + \sum_{P_{uv} \in \Omega_h} z_{uv} < C_g + C_h$$

where $z_{st} = x_{st} + y_{st}$, $r_g, r_h \in RCO$, and $g \neq h$.

RORCO admits states that exhibit at most one capacitated resource in RCO.

**Definition 3.3.2**: RORFD is the set of constraints

$$\sum_{P_{jk} \in \Omega_g \cap P_i^{FD}} z_{jk} + \sum_{P_{uv} \in \Omega_h \cap P_i^{FD}} z_{uv} < C_g + C_h \quad \text{for } r_i \in R^{U}$$

where $z_{st} = x_{st} + y_{st}$, $r_g, r_h \in RFD$, and $g \neq h$.

RORFD admits states for which at most one resource of RFD is capacitated with $P_i^{FD}$ parts for each $r_i \in R^{U}$. Note that it does not place any constraint on the total number of RFD resources capacitated.

**Definition 3.3.3**: RORFD2 is the set of constraints

$$\sum_{P_{jk} \in \Omega_g \cap P^{FD}} z_{jk} + \sum_{P_{mn} \in \Omega_h \cap P^{FD}} z_{mn} + \sum_{P_{uv} \in \Omega_j \cap P^{FD}} z_{uv} < C_g + C_h + C_j$$

where $z_{st} = x_{st} + y_{st}$, $r_g, r_h, r_j \in RFD$, and $g \neq h \neq j$.

RORFD2 admits states for which at most two resources of RFD are capacitated with FD parts, but does not place any constraint on the total number of RFD resources capacitated.

**Definition 3.3.4**: ROROD is the set of constraints

$$\sum_{P_{jk} \in \Omega_g \cap P^{FD}} z_{jk} + \sum_{P_{uv} \in \Omega_h \cap P^{FD}} z_{uv} < C_g + C_h$$

where $z_{st} = x_{st} + y_{st}$, $r_g, r_h \in ROD$, and $g \neq h$.

ROROD admits states for which at most one resource of $ROD = RFD \setminus R^{U}$ is capacitated with FD parts, although it places no constraint on the number of unreliable resources that are capacitated.

For the example system in Figure 6, the constraints are as follows.

| Constraint set | Resources | Constraint |
|---|---|---|
| RORCO | r1r2 | z11+z12+z21 < 4 |
| RORCO | r1r3 | z11+z13+z24+z43 < 4 |
| RORCO | r2r3 | z12+z21+z13+z24+z43 < 4 |
| RORFD | r2r4 | z21+z23 < 4 |
| RORFD | r2r5 | z21+z22 < 4 |
| RORFD | r4r5 | z23+z22 < 4 |
| RORFD | r5r6 | z41+z42 < 4 |
| RORFD | r5r7 | z31+z32 < 4 |
| RORFD | r5r8 | z31+z33 < 4 |
| RORFD | r7r8 | z32+z33 < 4 |
| RORFD2 | r2r4r5 | z21+z23+z22+z31+z41 < 6 |
| RORFD2 | r2r4r6 | z21+z23+z42 < 6 |
| RORFD2 | r2r4r7 | z21+z23+z32 < 6 |
| RORFD2 | r2r4r8 | z21+z23+z33 < 6 |
| RORFD2 | r2r5r6 | z21+z22+z31+z41+z42 < 6 |
| RORFD2 | r2r5r7 | z21+z22+z31+z41+z32 < 6 |
| RORFD2 | r2r5r8 | z21+z22+z31+z41+z33 < 6 |
| RORFD2 | r2r6r7 | z21+z42+z32 < 6 |
| RORFD2 | r2r6r8 | z21+z42+z33 < 6 |
| RORFD2 | r2r7r8 | z21+z32+z33 < 6 |
| RORFD2 | r4r5r6 | z23+z22+z31+z41+z42 < 6 |
| RORFD2 | r4r5r7 | z23+z22+z31+z41+z32 < 6 |
| RORFD2 | r4r5r8 | z23+z22+z31+z41+z33 < 6 |
| RORFD2 | r4r6r7 | z23+z42+z32 < 6 |
| RORFD2 | r4r6r8 | z23+z42+z33 < 6 |
| RORFD2 | r4r7r8 | z23+z32+z33 < 6 |
| RORFD2 | r5r6r7 | z22+z31+z41+z42+z32 < 6 |
| RORFD2 | r5r6r8 | z22+z31+z41+z42+z33 < 6 |
| RORFD2 | r5r7r8 | z22+z31+z41+z32+z33 < 6 |
| RORFD2 | r6r7r8 | z42+z32+z33 < 6 |
| ROROD | r2r5 | z21+z22+z31+z41 < 4 |
| ROROD | r2r7 | z21+z32 < 4 |
| ROROD | r5r7 | z22+z31+z41+z32 < 4 |

We are now in the position to establish that the RO4 policy (the conjunction of RORCO, RORFD, RORFD2, and ROROD), call it supervisor Δ4, satisfies Property 2.2. Supervisor Δ4 is a control policy that admits the enabled controllable event α if and only if *δ*(*q,α*) satisfies RORCO ∧ RORFD ∧ RORFD2 ∧ ROROD. Formally, it is stated as follows.

**Definition 3.3.5:** Supervisor Δ4 = RORCO ∧ RORFD ∧ RORFD2 ∧ ROROD.

The intuition behind this control policy is that if a shared resource (i.e., a PFD resource) is filled with FD parts, at least one of them can be advanced out of the shared resources and, thus, out of RCO, which can then operate under RORCO. Furthermore, clearing RCO of this part will not create problems in the FD resources. To summarize, RORFD allows states with at most one FD resource filled with parts that are FD on the same unreliable resource. RORFD2 allows states for which at most two FD resources are capacitated with FD parts. ROROD admits states for which at most one resource of ROD is capacitated with FD parts. Wang et al. (2008) establish that Δ4 is a robust controller for systems where every part type requires at most one unreliable resource.
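To make the four constraint families concrete, the following Python sketch enumerates them for the Figure 6 system and checks a state against their conjunction. It is an illustration under assumptions, not the authors' implementation. The routes and capacities are those given with Figure 6, the three regions are taken as stated in the text rather than derived, and the helper names are hypothetical. The sketch also skips any resource group in which some resource never holds a relevant part, an assumption made here so that the enumeration matches the constraints listed for the example.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

routes = {1: ["r1", "r2", "r3"], 2: ["r2", "r5", "r4", "r3"],
          3: ["r5", "r7", "r8"], 4: ["r5", "r6", "r3"]}
capacity = {f"r{i}": 2 for i in range(1, 9)}
fd_on = {2: "r4", 3: "r8", 4: "r6"}  # part type -> unreliable resource it requires
RCO = ["r1", "r2", "r3"]
RFD = ["r2", "r4", "r5", "r6", "r7", "r8"]
ROD = ["r2", "r5", "r7"]

Constraint = Tuple[List[str], int]  # (z-variables summed, strict upper bound)


def stages_at(resource: str, part_types: Set[int]) -> List[str]:
    """Names zjk of the stages of the given part types processed on a resource."""
    return [f"z{j}{k}" for j in sorted(part_types)
            for k, r in enumerate(routes[j], start=1) if r == resource]


def constraints(region: List[str], part_types: Set[int], size: int) -> Dict[tuple, Constraint]:
    """Build one constraint per group of `size` resources drawn from the region."""
    out: Dict[tuple, Constraint] = {}
    for group in combinations(region, size):
        terms = [stages_at(r, part_types) for r in group]
        if all(terms):  # assumption: skip groups with a resource holding no such part
            out[group] = ([z for t in terms for z in t],
                          sum(capacity[r] for r in group))
    return out


rorco = constraints(RCO, set(routes), 2)
rorfd: Dict[tuple, Constraint] = {}
for j, unreliable in fd_on.items():  # one part type per unreliable resource here
    for group, c in constraints(RFD, {j}, 2).items():
        rorfd[(unreliable,) + group] = c
rorfd2 = constraints(RFD, set(fd_on), 3)
rorod = constraints(ROD, set(fd_on), 2)


def admitted(z: Dict[str, int]) -> bool:
    """True if the state satisfies every constraint of all four families."""
    return all(sum(z.get(v, 0) for v in terms) < bound
               for family in (rorco, rorfd, rorfd2, rorod)
               for terms, bound in family.values())
```

Under these assumptions the enumeration produces 3, 7, 20, and 3 constraints for RORCO, RORFD, RORFD2, and ROROD. A state with two p21 and two p23 is rejected because it violates the RORFD constraint for the pair r2, r4.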

## **4. Robust control for product routings with multiple unreliable resources**

In Section 3, we developed robust controllers for single unit resource allocation systems with multiple unreliable resources. These guarantee that if any subset of resources fails, parts in the system requiring failed resources do not block production of parts not requiring failed resources. To establish supervisor correctness, we assumed that each part type requires at most one unreliable resource in its route. We now relax this assumption using a central buffer and present robust controllers that guarantee robust operation without assumptions on route structure. To this end, we construct new robust controllers in conjunction with the robust controllers Δ1 and Δ2 developed in Subsection 3.1. The following three subsections demonstrate how we use a central buffer to extend Δ1 and Δ2 for systems where parts may require multiple unreliable resources.

## **4.1 Route partitioning algorithm**

We now show how to use a central buffer to extend controllers 1 and 2 to systems where parts may require multiple unreliable resources. We partition routes with multiple unreliable resources into subroutes, each of which contains at most one unreliable resource. A part in the last stage of a subroute can move to the first resource of the succeeding subroute or into the central buffer. With this partition, the system resembles one with at most one unreliable resource per route, allowing us to apply controllers 1 and 2.

The route partitioning algorithm (RPA) performs this operation. It starts with the last stage of a route and builds each subroute backwards. A subroute is extended until two unique unreliable resources are detected. Then, a new subroute is begun. We demonstrate below on P1 of Figure 7.

## **Route Partitioning Algorithm (RPA)**

Algorithm notation: j, q, and u are indices and counters; ⟨⟩ is the empty list; Ω is a temporary set; ρ(Pju) is the resource required by stage Pju.

for j = 1…|P|

let u = |Pj|, q = 1, SPj1 = ⟨⟩, Ω = ∅

while u > 0

(a) if ρ(Pju) ∈ RU\Ω, Ω = Ω ∪ {ρ(Pju)}

(b) if |Ω| < 2, SPjq = push(Pju, SPjq), u = u − 1

(c) else Ω = ∅, q = q + 1, SPjq = ⟨⟩

end while

NSj = q *(Number of Segments for Pj)*

*(Note: The function 'push' takes two parameters, an object and an ordered list of objects, and inserts the object at the head of the list.)*

For j = 1: u = |P1| = 8, q = 1, SP11 = ⟨⟩, Ω = ∅. Then, ρ(P18) = r1 ∉ RU\Ω = {r2,r4,r5,r7}, so execute (b): SP11 = ⟨P18⟩, u = 7.

Next, ρ(P17) = r7 ∈ RU\Ω = {r2,r4,r5,r7}, so execute (a): Ω = Ω ∪ {r7} = {r7}. Since |Ω| < 2, execute (b): SP11 = ⟨P17,P18⟩, u = 6.

Next, ρ(P16) = r6 ∉ RU\Ω = {r2,r4,r5}, so execute (b): SP11 = ⟨P16,P17,P18⟩ and u = 5.

Next, ρ(P15) = r5 ∈ RU\Ω = {r2,r4,r5}, so execute (a): Ω = Ω ∪ {r5} = {r5,r7}. Since |Ω| = 2, execute (c): Ω = ∅, q = 2, SP12 = ⟨⟩. This completes the first subroute SP11 = ⟨P16,P17,P18⟩.

Next, u is still 5 and ρ(P15) = r5 ∈ RU\Ω = {r2,r4,r5,r7}, so execute (a): Ω = Ω ∪ {r5} = {r5}. Since |Ω| < 2, execute (b): SP12 = ⟨P15⟩, u = 4.

Next, ρ(P14) = r4 ∈ RU\Ω = {r2,r4,r7}, so execute (a): Ω = Ω ∪ {r4} = {r4,r5}. Since |Ω| = 2, execute (c): Ω = ∅, q = 3, SP13 = ⟨⟩. This completes the second subroute SP12 = ⟨P15⟩.

Continuing as shown, RPA partitions P1 into four subpart types (the remaining two are SP13 = ⟨P13,P14⟩ and SP14 = ⟨P11,P12⟩) with subroutes TS11 = ⟨r6,r7,r1⟩, TS12 = ⟨r5⟩, TS13 = ⟨r3,r4⟩, and TS14 = ⟨r1,r2⟩. Note that each subroute requires at most one unreliable resource, although the number of times that resource appears in the subroute is not limited. RPA does not affect part types whose routes require at most one unreliable resource.

The number of iterations of the RPA while loop grows linearly with the number of part type stages, since each iteration either consumes a stage or opens a new subroute. Thus, RPA is no worse than O(CRL), where CRL = Σ |Pj|, summed over all Pj ∈ P, is the cumulative route length.
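The steps of RPA can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `partition_route`, the use of 1-based stage indices in place of the stage objects Pjk, and the dictionary layout of the Figure 7 routes are all assumptions made for the example.

```python
from typing import Dict, List, Set


def partition_route(route: List[str], unreliable: Set[str]) -> List[List[int]]:
    """Split one route into subroutes, scanning from the last stage backwards.

    Returns subroutes as lists of 1-based stage indices. Element 0 is the
    final subroute (SP_j1), element 1 is SP_j2, and so on.
    """
    subroutes: List[List[int]] = [[]]  # SP_j1 starts as the empty list
    seen: Set[str] = set()             # temporary set of unreliable resources
    u = len(route)                     # current stage index, counting down
    while u > 0:
        resource = route[u - 1]
        # Step (a): record an unreliable resource not yet seen in this subroute.
        if resource in unreliable and resource not in seen:
            seen.add(resource)
        if len(seen) < 2:
            # Step (b): push the stage onto the head of the current subroute.
            subroutes[-1].insert(0, u)
            u -= 1
        else:
            # Step (c): a second unique unreliable resource was found, so open
            # a new subroute without consuming the stage.
            seen = set()
            subroutes.append([])
    return subroutes


# Routes and unreliable resources as listed for Figure 7.
ROUTES: Dict[str, List[str]] = {
    "P1": ["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r1"],
    "P2": ["r1", "r3", "r4", "r6", "r1"],
    "P3": ["r1", "r5", "r3", "r2", "r3", "r5", "r3", "r5", "r6", "r7", "r1"],
    "P4": ["r1", "r6", "r3", "r1"],
}
UNRELIABLE = {"r2", "r4", "r5", "r7"}

if __name__ == "__main__":
    for name, route in ROUTES.items():
        print(name, partition_route(route, UNRELIABLE))
```

For P1 the sketch returns the stage groups [6, 7, 8], [5], [3, 4], and [1, 2], which correspond to SP11 through SP14 in the walkthrough above. Routes with at most one unreliable resource, such as P2 and P4, come back as a single subroute.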

## **4.2 Central buffer constraints**


The central buffer (CB) will be used to clear workstation buffer space of failure-dependent parts that have finished a subroute. If such parts have completely finished their original routes, they exit the system. Otherwise, they must have available space in the CB. This will ensure that they do not block the production of other part types.

For example, suppose the system of Figure 7 is in a state as follows: r7 is failed with p17 waiting for processing; r5 is holding a completed p15; and r4 is holding a completed p14. Because of the blocking effect of p14 and p15, it is not possible to produce all other part types. However, if we relocate p14 and p15 to the CB, the system can continue producing P2, P3, and P4. CB constraints are necessary to achieve this. For P1, we state the linear inequality: (x11+y11)+(x12+x′12+y12)+(x13+y13)+(x14+x′14+y14)+(x15+x′15+y15) ≤ B1, where xjk and yjk are the numbers of finished and unfinished pjk's at ρ(Pjk), x′jk is the number of finished pjk's relocated to the CB, and Bj is the CB space reserved for Pj.


With this constraint, finished parts p12, p14, and p15, for subpart types SP14, SP13, and SP12, respectively, can be moved to the CB. Thus, in the example, we can transfer the finished p14 and p15 to the CB, allowing P2, P3, and P4 to continue production. When we do so, we decrement x14 and x15 by 1, and increment x′14 and x′15 by 1. As an aside, we decrement x′14 by 1 and increment y15 by 1 when p14 advances from the CB into the buffer of r5.
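The counter updates described above can be sketched as follows. The class `PartTypeState`, its field names, and its two methods are hypothetical names introduced for this illustration; `x_cb` stands for the primed counter x′jk, and the keys are stage indices k of a single part type.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PartTypeState:
    """Counters for one part type, keyed by 1-based stage index k."""
    x: Dict[int, int] = field(default_factory=dict)     # finished, at the workstation
    y: Dict[int, int] = field(default_factory=dict)     # unfinished, at the workstation
    x_cb: Dict[int, int] = field(default_factory=dict)  # finished, relocated to the CB

    def relocate_to_cb(self, k: int) -> None:
        """Move one finished part of stage k from its workstation to the CB."""
        if self.x.get(k, 0) < 1:
            raise ValueError(f"no finished part at stage {k}")
        self.x[k] -= 1
        self.x_cb[k] = self.x_cb.get(k, 0) + 1

    def advance_from_cb(self, k: int) -> None:
        """Move one part of stage k from the CB into the buffer of stage k+1."""
        if self.x_cb.get(k, 0) < 1:
            raise ValueError(f"no part of stage {k} in the CB")
        self.x_cb[k] -= 1
        self.y[k + 1] = self.y.get(k + 1, 0) + 1


# State of P1 in the example: finished p14 and p15, and p17 waiting at failed r7.
p1 = PartTypeState(x={4: 1, 5: 1}, y={7: 1})
p1.relocate_to_cb(4)   # x14 -= 1, x'14 += 1
p1.relocate_to_cb(5)   # x15 -= 1, x'15 += 1
p1.advance_from_cb(4)  # x'14 -= 1, y15 += 1
```

The sketch only tracks the bookkeeping. Whether a move is admissible is decided by the supervisor, which is not modelled here.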

We now state the CB constraint, CBC. Let P\* = {Pj : Pj ∈ P ∧ |Tj ∩ RU| > 1} be the set of part types that require multiple unreliable resources, and let B be the total capacity of the CB. For a part type Pj ∈ P\*, let

$$Z_j = \sum_{P_{jk} \in P_j \setminus SP_{j1}} (x_{jk} + y_{jk}) + \sum_{P_{jk} \in LP_j} x'_{jk}$$


where LPj is the set of "last" part type stages in the subpart types of Pj, excluding the final subpart type SPj1, whose last stage is the final stage of Pj. For example, LP1 = {P12,P14,P15} and LP3 = {P32,P34,P38}. In general,

$$LP_j = \left\{ P_{j,|SP_{j,NS_j}|},\; P_{j,|SP_{j,NS_j}| + |SP_{j,NS_j-1}|},\; \ldots,\; P_{j,|SP_{j,NS_j}| + |SP_{j,NS_j-1}| + \cdots + |SP_{j2}|} \right\}.$$
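The index arithmetic in this definition amounts to a running sum of subroute lengths, taken in route order and stopping before the final subroute. A minimal sketch, assuming the subroute sizes are supplied in the order |SPj1|, |SPj2|, …, |SPj,NSj| (the function name `last_stages` is hypothetical):

```python
from itertools import accumulate
from typing import List


def last_stages(subroute_sizes: List[int]) -> List[int]:
    """Return the 1-based stage indices that make up LP_j.

    subroute_sizes[0] is |SP_j1| (the final subroute) and the last element
    is |SP_j,NSj| (the subroute that starts the route).
    """
    # Walk the route from its first subroute, SP_j,NSj, down to SP_j2;
    # the final subroute SP_j1 is excluded by the definition.
    sizes_in_route_order = list(reversed(subroute_sizes[1:]))
    return list(accumulate(sizes_in_route_order))


# P1 of Figure 7: |SP11| = 3, |SP12| = 1, |SP13| = 2, |SP14| = 2.
print(last_stages([3, 1, 2, 2]))  # [2, 4, 5], i.e. LP1 = {P12, P14, P15}
```

A part type with a single subroute yields an empty list, which matches the fact that such part types are not in P\* and need no CB space.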

Zj counts the instances of Pj ∈ P\* in the system that have not yet entered their final subroute, including those relocated to the CB. CBC is defined as:

$$\text{(i)}\quad Z_j \le B_j, \quad \forall P_j \in P^* \qquad \text{(ii)}\quad \sum_{P_j \in P^*} B_j \le B$$

CBC ensures that every part in the system requiring multiple unreliable resources has capacity reserved on the CB. CBC has no more than CRL·|P| constraints, and thus checking CBC is no worse than O(CRL·|P|), which is polynomial in stable measures of system size.
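Checking CBC for a given state reduces to evaluating Zj for each part type in P\* and testing conditions (i) and (ii). The sketch below is an illustration under stated assumptions: the function names, the dictionary-based state layout, and the values B1 = 2 and B = 3 are hypothetical and are not taken from the chapter.

```python
from typing import Dict, Iterable


def z_value(
    x: Dict[int, int],
    y: Dict[int, int],
    x_cb: Dict[int, int],
    final_subroute: Iterable[int],
    last_stages: Iterable[int],
) -> int:
    """Evaluate Z_j for one part type; dictionary keys are stage indices k."""
    final = set(final_subroute)
    # First sum: finished and unfinished parts at stages outside SP_j1.
    at_workstations = sum(
        x.get(k, 0) + y.get(k, 0) for k in set(x) | set(y) if k not in final
    )
    # Second sum: finished parts of the stages in LP_j that sit in the CB.
    in_central_buffer = sum(x_cb.get(k, 0) for k in last_stages)
    return at_workstations + in_central_buffer


def cbc_holds(z: Dict[str, int], reserved: Dict[str, int], capacity: int) -> bool:
    """Check (i) Z_j <= B_j for every part type in P* and (ii) sum of B_j <= B."""
    per_type_ok = all(z[name] <= reserved.get(name, 0) for name in z)
    total_ok = sum(reserved.values()) <= capacity
    return per_type_ok and total_ok


# P1 after the finished p14 and p15 have been relocated to the CB, with p17
# still waiting at the failed r7 (stage 7 lies in the final subroute SP11).
z1 = z_value(x={}, y={7: 1}, x_cb={4: 1, 5: 1},
             final_subroute=[6, 7, 8], last_stages=[2, 4, 5])
print(z1)                                                    # 2
print(cbc_holds({"P1": z1}, reserved={"P1": 2}, capacity=3)) # True
```

Here `reserved` is assumed to hold entries only for part types in P\*, so its total is the left-hand side of (ii).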

The level of Bj for Pj ∈ P\* can be fixed, in which case Bj does not change, or state-based, in which case we periodically reallocate CB space across all Pj ∈ P\*. Although we cannot preempt CB space from parts that have it reserved, we can reallocate CB space that is not reserved. One simple approach is to let Bj = Zj as long as (ii) holds. This represents a first-come, first-served rule. Alternatively, we can solve the following assignment problem:

$$\min \quad \sum_{i=1}^{B} \sum_{j=1}^{|P^*|} C_{ij} X_{ij} \tag{1}$$

$$\text{s.t.} \qquad B_j = \sum_{i=1}^{B} X_{ij}, \qquad j = 1, \ldots, |P^*| \tag{2}$$

$$Z_j \le \sum_{i=1}^{B} X_{ij}, \qquad j = 1, \ldots, |P^*| \tag{3}$$

$$\sum_{i=1}^{B} \sum_{j=1}^{|P^*|} X_{ij} \le B \tag{4}$$

$$X_{ij} \in \{0, 1\}, \qquad i = 1, \ldots, B, \quad j = 1, \ldots, |P^*| \tag{5}$$

Here, Xij is 1 if the ith unit of CB is assigned to Pj ∈ P\*, and 0 otherwise. The objective (1) minimizes assignment cost; (2) counts the assignment to each Pj ∈ P\*; (3) assures no preemption from parts in the system; and (4) assures the CB is not over-allocated. Cij is the cost of assigning CB space to Pj ∈ P\*. This cost could reflect production priorities or failure probabilities. This problem can be solved in polynomial time using the Hungarian Algorithm (Papadimitriou, 1982). The solution frequency is a topic for future research.
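For very small instances, the allocation can be illustrated by exhaustive search. The sketch below is not the Hungarian Algorithm and is not the authors' method; it enumerates candidate assignments and is practical only for a handful of CB units. It rests on two assumptions that are not stated in (1) through (5): each CB unit is given to at most one part type, and costs are nonnegative, so that an optimal solution assigns exactly Zj units to each part type. The function name `allocate_cb` and the cost matrix are hypothetical.

```python
from itertools import permutations
from typing import List, Optional, Tuple


def allocate_cb(
    cost: List[List[float]], z: List[int]
) -> Optional[Tuple[float, List[int]]]:
    """Assign CB units to part types at minimum total cost by brute force.

    cost[i][j] is the cost of giving CB unit i to part type j, and z[j] is Z_j.
    Returns (total cost, owner), where owner[i] is the part type index holding
    unit i, or -1 if the unit stays free. Returns None if the CB is too small.
    """
    units = len(cost)
    # One demand slot per part that must keep its reservation, constraint (3).
    demand = [j for j, need in enumerate(z) for _ in range(need)]
    if len(demand) > units:
        return None  # constraint (4) cannot be met
    best: Optional[Tuple[float, List[int]]] = None
    for chosen in permutations(range(units), len(demand)):
        total = sum(cost[i][j] for i, j in zip(chosen, demand))
        if best is None or total < best[0]:
            owner = [-1] * units
            for i, j in zip(chosen, demand):
                owner[i] = j
            best = (total, owner)
    return best


# Three CB units, two part types in P*, each with one part needing a reservation.
result = allocate_cb(cost=[[4, 1], [2, 3], [5, 5]], z=[1, 1])
print(result)  # (3, [1, 0, -1])
```

The reserved level of each part type then follows from (2) as `owner.count(j)`. For realistic CB sizes a polynomial-time assignment solver would replace the enumeration.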

## **4.3 Robust controllers with CBC**

We now define two supervisory controllers. The first is the conjunction of controller 1 and CBC, and the second is the conjunction of controller 2 and CBC. Recall that controllers 1 and 2 are the controllers of Subsection 3.1. Formally, the extended supervisors are stated as follows.

**Definition 4.3.1:** Supervisor 5 = controller 1 ∧ CBC.


**Definition 4.3.2:** Supervisor 6 = controller 2 ∧ CBC.

The following theorems establish that these supervisors ensure robust operation.

**Theorem 4.3.1:** Supervisor 5 is robust to failure of RU.

**Proof:** The structure of the proof is as follows. We assume the system is in an admissible state that contains parts requiring multiple unreliable resources, some of which have failed. We show that these parts can advance into the CB or into the buffer space of failure-dependent resources, where they do not block production of parts not requiring failed resources. Let Pj ∈ P\*. The subpart types of Pj constructed by RPA are {SPj,NSj, SPj,(NSj−1), …, SPj1}. Assume that in the current state, q, unreliable resources in the subroutes of Pj have failed and that q satisfies supervisor 5. We want to show that under supervisor 5, parts of type Pj do not block other part types from producing. We ignore parts of type Pj in the final subroute, since these are covered by controller 1. That is, controller 1 guarantees that parts in the final subroute can be advanced into the buffer space of the last resource, where they are completed and removed from the system if the resource is operational, or stored out of the way of part types not requiring failed resources if it is not.

Let Φj(q) = {pjk | Pjk ∈ SPjs, s = NSj, (NSj−1), …, 2} be the set of parts of Pj in state q that are outside the final subroute. Let Ψj(q) = {pjk | Pjk ∈ LPj} be the set of parts of Pj in the final stage of a subroute. By the definition of LPj, Ψj(q) ⊆ Φj(q). Now, controller 1 guarantees that all parts in Φj(q)\Ψj(q) can be advanced, perhaps through several processing steps, into the buffer spaces of resources required by stages of LPj. That is, controller 1 guarantees a sequence of part movements such that the system reaches a new state, say t, where Φj(t) = Ψj(t). In state t, all instances of Pj are at the end of a subroute.

The left-hand side of CBC does not change in moving from state q to state t. To see this, note that CBC is only affected by parts in P\*. Since we allow no new parts to be admitted and no part of P\* is required to move from one subroute to another (only to the end of its current subroute), the left-hand side of CBC does not change in magnitude. Thus, the part advancement under controller 1 does not violate CBC.

Now, CBC guarantees that every part of Φj(t) has capacity reserved on the CB, and any finished part of this set can be moved to the CB. Further, any unfinished part of Φj(t) can be finished and moved to the CB if its resource is operational. If the associated resource is not operational, the part can be stored at its failed resource, where it will not block the production of part types not requiring failed resources. Thus, all operational resources can be cleared of parts of type Pj. Under controller 1, the resulting state is a feasible initial state if resource repairs or additional failures occur.

**Theorem 4.3.2:** Supervisor 6 is robust to failure of RU.

**Proof:** The proof follows the same construction as that of Theorem 4.3.1. The main difference is in how BA and SSLA operate. Thus, supervisors 5 and 6 guarantee robust operation for systems where parts can require multiple unreliable resources. Note that if every resource is unreliable, both theorems continue to hold.

## **5. Conclusion and future research**

Supervisory control for resource allocation in manufacturing systems has been an active area of research. A significant body of theory and algorithms has been developed to allocate resources effectively and efficiently, and to guarantee important system properties such as liveness, traceability, and deadlock-free operation. However, most of this work assumes that resources never fail. Because resource failures in automated manufacturing systems are inevitable, we investigated system behaviour and control dynamics under such failures. First, we developed the notion of robust supervisory control for automated manufacturing systems with unreliable resources. Our objective is to allocate system buffer space so that when an unreliable resource fails, the system can continue to produce all part types not requiring the failed resource. We established properties that such a controller must satisfy, namely, that it ensure safety for the system given no resource failure; that it constrain the system to feasible initial states in case of resource failure; that it ensure safety for the system while the unreliable resource is failed; and that during resource repair it constrain the system to states that will be feasible initial states when the repair is completed.

We then developed a variety of control policies that satisfy these robust properties. Specifically, supervisory controllers 1-4 are for systems with multiple unreliable resources where each part type requires at most one unreliable resource. Supervisory controllers 5-6 control systems for which part types may require multiple unreliable resources. Another classification of the controllers is based on the underlying control mechanism: controllers 1-3 'absorb' all parts requiring failed resources into the buffer space of failure-dependent resources, controller 4 'distributes' parts requiring failed resources among the buffer space of shared resources, and controllers 5-6 use a central buffer to achieve robust operation. These robust controllers assure different levels of robust system operation and impose very different operating dynamics on the system, thus affecting system performance in different ways. An extensive simulation study has been conducted, and a set of implementation guidelines for choosing the best robust controller based on manufacturing system characteristics and performance objectives is developed in Wang et al. (2009).

A taxonomy is presented in Table 1 to help guide future research in the area of robust supervisory control. Given the possible combinations of system structure, presence or absence of a central buffer, flexible routing capability, required robustness level, and unreliable resource failure characteristics, a significant amount of research and development remains to be done to address a variety of system control and performance requirements. Although automated manufacturing systems are the context in which we developed this research, we expect to extend it to other application areas that have similar resource allocation requirements and workflow management complexity. The robust controllers we have developed so far address only a small subset of the research taxonomy. For example, controller 1 falls in the category (S1, C1, FR1, RB2, RC1, AA1). In particular, it would be interesting and challenging to develop supervisory control policies for systems with flexible routing and for systems where the failure characteristics of resources evolve dynamically and can be estimated through sensor monitoring and degradation modelling.

| Dimension | Class | Description |
|---|---|---|
| System Structure | S1 | at most one unreliable resource for each part type |
| | S2 | random number of unreliable resources for each part type |
| Central Buffer Capacity | C1 | without central buffer |
| | C2 | with central buffer |
| Flexible Routing | FR1 | every part type stage can be performed by exactly one resource |
| | FR2 | every part type stage can be performed by exactly two resources |
| | … | |
| | FRj | every part type stage can be performed by exactly j resources |
| Robustness Level | RB1 | no resource failures |
| | RB2 | at most one resource failure at any time |
| | RB3 | at most two resource failures at any time |
| | … | |
| | RBi | at most i resource failures at any time |
| Unreliable Resource Condition | RC1 | unreliable resources fail at any time |
| | RC2 | unreliable resource failure characteristics can be estimated |
| Application Areas | AA1 | Manufacturing Systems |
| | AA2 | Business Processes and Workflow Management |
| | AA3 | E-Commerce |
| | AA4 | Supply Chain Management |
| | AA5 | Internet Resource Management |
| | AA6 | Transportation Systems |
| | AA7 | Healthcare Systems |

Table 1. Taxonomy for future research directions

## **6. References**

Chew, S. & Lawley, M. (2006). Robust Supervisory Control for Production Systems with Multiple Resource Failures. *IEEE Transactions on Automation Science and Engineering*, Vol.3, No.3, (July 2006), pp. 309-323, ISSN 1545-5955

Chew, S.; Wang, S. & Lawley, M. (2008). Robust Supervisory Control for Product Routings with Multiple Unreliable Resources. *IEEE Transactions on Automation Science and Engineering*, Vol.6, No.1, (January 2009), pp. 195-200, ISSN 1545-5955

Chew, S.; Wang, S. & Lawley, M. (2011). Resource Failure and Blockage Control for Production Systems. *International Journal of Computer Integrated Manufacturing*, Vol.24, No.3, (March 2011), pp. 229-241, ISSN 0951-192X

Cormen, T.; Leiserson, C. & Rivest, R. (2002). *Introduction to Algorithms* (Second Edition), McGraw-Hill, ISBN 0072970545, New York, USA

Ezpeleta, J.; Tricas, F.; Garcia-Valles, F. & Colom, J. (2002). A Banker's Solution for Deadlock Avoidance in FMS with Flexible Routing and Multiresource States. *IEEE Transactions on Robotics and Automation*, Vol.18, No.4, (August 2002), pp. 621–625, ISSN 1042-296X

Habermann, A. (1969). Prevention of System Deadlocks. *Communications of the ACM*, Vol.12, No.7, (July 1969), pp. 373–377, ISSN 0001-0782

Hsieh, F. (2004). Fault-tolerant Deadlock Avoidance Algorithm for Assembly Processes. *IEEE Transactions on Systems, Man and Cybernetics, Part A*, Vol.34, No.1, (January 2004), pp. 65-79, ISSN 1083-4427


**19**

**Design of Robust Policies for Uncertain Natural Resource Systems: Application to the Classic Gordon-Schaefer Fishery Model**

Armando A. Rodriguez<sup>1</sup>, Jeffrey J. Dickeson<sup>2</sup>, John M. Anderies<sup>3</sup> and Oguzhan Cifdaloz<sup>4</sup>

<sup>1</sup>Electrical Engineering, Ira A. Fulton School of Engineering

<sup>2</sup>Electrical Engineering, Ira A. Fulton School of Engineering

<sup>3</sup>School of Human Evolution and Social Change, School of Sustainability

<sup>4</sup>ASELSAN, Inc. Microelectronics, Guidance and Electro-Optics Division, Turkey

*Arizona State University, USA*

## **1. Introduction**

**Introduction.** A critical challenge faced by sustainability science is to develop robust strategies to cope with highly uncertain social and ecological dynamics. The increasing intensity with which human societies utilize (limited) natural resources is fueling the global debate and urging the development of resource management methodologies/policies to effectively deal with very demanding socio-bio-economical issues. Unfortunately, despite concerted efforts by governments, many natural resources continue to be poorly managed. The collapse of many fisheries worldwide is the most notable example (Clark, 2006; Clark et al., 2006; Holland, Gudmundsson; Myers, Worm 2003; Sethi et al., 2005) but other examples include forests (Moran, Ostrom), groundwater basins (Shah, 2000), and soils (ISRIC, 1990). The suggested causes are varied but (Clark, 2006) highlights two: (1) lack of consideration of economic incentives actually faced by economic agents and (2) uncertainty associated with the dynamics of biological populations. In the case of fisheries, Clark notes that "complexity and uncertainty will always limit the extent to which the effects of fishing can be understood or predicted" (Clark, 2006, p. 98). This suggests that we need policies capable of effectively managing natural resource systems despite the fact that we understand them poorly at best.

**Real-World Management Issues.** Real-world resource management must address three components: goal setting, practical (robust) implementation, and learning. Clark and others (Clark, 2007; 2006; Clark et al., 2006) have recently noted that practical implementation issues are frequently at the root of fishery management failures. For most fisheries, the necessary institutional contexts exist (Wilen, Homans) and we know what to do, yet management efforts fail. This suggests a need to focus on the actual *process* of resource management. For example, how can managers make decisions with incomplete information concerning how the resource and the resource users will respond to management actions?

