
cycles are spent waiting for the completion of memory accesses [4]. A model of a pipelined processor at the instruction execution level is used in this chapter to study the mismatch of processor and memory performances.

This model of a processor is then used for performance analysis of shared-memory bus-based multiprocessors. The main objective of this analysis is to study the degradation of the processor's performance when the utilization of the (shared) bus approaches 100%. This performance degradation limits the number of processors in bus-based systems.

Modeling and analysis of shared-memory bus-based systems requires a flexible formalism that can easily handle concurrent activities as well as synchronization of different events and processes that occur in such systems. Petri nets [5, 6] are such formal models.

As formal models, Petri nets are bipartite directed graphs in which the two types of vertices represent, in a very general sense, conditions and events. An event can occur only when all conditions associated with it (represented by arcs directed to the event) are satisfied. An occurrence of an event usually satisfies some other conditions, indicated by arcs directed from the event. Hence, an occurrence of one event causes some other event (or events) to occur, and so on.

In order to study the performance aspects of systems modeled by Petri nets, the durations of modeled activities must also be taken into account. This can be done in different ways, resulting in different types of temporal nets [7]. In timed Petri nets [8], occurrence times are associated with events, and the events occur in real time (as opposed to instantaneous occurrences in other models). For timed nets with constant or exponentially distributed occurrence times, the state graph of a net is a Markov chain (or an embedded Markov chain). If the state space of a timed net is finite and reasonably small, the stationary probabilities of states can be determined by standard methods [9]. Then these stationary probabilities are used for the derivation of many performance characteristics of the model [10]. In other cases, discrete event simulation [11] is used to find the performance characteristics of a timed net.

In this chapter, timed Petri nets are used to model shared-memory bus-based multiprocessor systems. Section 2 recalls basic concepts of Petri nets and timed Petri nets. Section 3 discusses a model of a pipelined processor and its performance as a function of modeling parameters. Shared-memory bus-based systems are described and analyzed in Section 4. Section 5 concludes the chapter.

In Petri nets, concurrent activities are represented by tokens that can move within a (static) graph-like structure of the net. More formally, a marked place/transition Petri net M is defined as a pair M = (N, m0), where the structure N is a bipartite directed graph, N = (P, T, A), with two types of vertices, a set of places P and a set of transitions T, and a set of directed arcs A connecting places with transitions and transitions with places, A ⊆ P×T ∪ T×P. The initial marking function m0 assigns nonnegative numbers of tokens to places of the net, m0 : P → {0, 1, …}. Marked nets can be equivalently defined as M = (P, T, A, m0).
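The token game implied by this definition can be sketched in a few lines of Python (an illustrative sketch of the standard occurrence rule; the class and identifiers below are ours, not part of the formalism):

```python
# Minimal marked place/transition net: N = (P, T, A) with marking m0.
# Arcs are kept as input/output place lists per transition.
from collections import Counter

class MarkedNet:
    def __init__(self, inputs, outputs, m0):
        self.inputs = inputs      # inputs[t]: places with arcs directed to t
        self.outputs = outputs    # outputs[t]: places with arcs directed from t
        self.marking = Counter(m0)

    def enabled(self, t):
        # t is enabled when each of its input places holds at least one token
        return all(self.marking[p] >= 1 for p in self.inputs[t])

    def fire(self, t):
        # an occurrence removes one token from each input place
        # and deposits one token in each output place
        assert self.enabled(t)
        for p in self.inputs[t]:
            self.marking[p] -= 1
        for p in self.outputs[t]:
            self.marking[p] += 1

# Two places and one transition moving a token from p1 to p2:
net = MarkedNet({"t1": ["p1"]}, {"t1": ["p2"]}, {"p1": 1})
net.fire("t1")
print(dict(net.marking))  # {'p1': 0, 'p2': 1}
```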


76 Petri Nets in Science and Engineering


2. Timed Petri nets

A place is shared if it is connected to more than one transition. A shared place p is free-choice if the sets of places connected by directed arcs to all transitions sharing p are identical. A shared place p is (dynamically) conflict-free if for each marking reachable from the initial marking, at most one transition sharing p is enabled. If a shared place p is not free-choice and not conflict-free, the transitions sharing p are conflicting.

In timed nets [8], occurrence times are associated with transitions, and transition occurrences are real-time events, i.e., tokens are removed from input places at the beginning of the occurrence period and are deposited to the output places at the end of this period. All occurrences of enabled transitions are initiated in the same instants of time in which the transitions become enabled (although some enabled transitions may not initiate their occurrences). If, during the occurrence period of a transition, the transition becomes enabled again, a new, independent occurrence can be initiated, which will overlap with the other occurrence(s). There is no limit on the number of simultaneous occurrences of the same transition (sometimes this is called infinite occurrence semantics). Similarly, if a transition is enabled "several times" (i.e., it remains enabled after initiating an occurrence), it may start several independent occurrences in the same time instant.

More formally, a timed Petri net is a triple T = (M, c, f), where M is a marked net, c is a choice function which assigns probabilities to transitions in free-choice classes, or relative frequencies of occurrences to conflicting transitions, c : T → [0, 1], and f is a timing function, which assigns an (average) occurrence time to each transition of the net, f : T → R+, where R+ is the set of nonnegative real numbers.

The occurrence times of transitions can be either deterministic or stochastic (i.e., described by some probability distribution function). In the first case, the corresponding timed nets are referred to as D-timed nets [12]; in the second, for the (negative) exponential distribution of occurrence times, the nets are called M-timed nets (Markovian nets) [13]. In both cases, the concepts of state and state transitions have been formally defined and used in the derivation of different performance characteristics of the model. In simulation applications, other distributions can also be used; for example, the uniform distribution (U-timed nets) is sometimes a convenient option. In timed Petri nets, different distributions can be associated with different transitions in the same model, providing the flexibility that is used in the simulation examples that follow.

In timed nets, the occurrence times of some transitions may be equal to zero, which means that the occurrences are instantaneous; all such transitions are called immediate (while the others are called timed). Since immediate transitions have no tangible effects on the (timed) behavior of the model, it is convenient to "split" the set of transitions into two parts, the set of immediate and the set of timed transitions, and to first perform all occurrences of the (enabled) immediate transitions, and then (still in the same time instant), when no more immediate transitions are enabled, to start the occurrences of (enabled) timed transitions. It should be noted that such a convention effectively introduces the priority of immediate transitions over the timed ones, so the conflicts of immediate and timed transitions are not allowed in timed nets. Detailed characterization of the behavior of timed nets with immediate and timed transitions is given in [8].
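The two-phase convention described above (all enabled immediate transitions occur first; timed occurrences start only when no immediate transition remains enabled, still in the same time instant) can be sketched as follows. This is an illustrative fragment with hypothetical callback names, not the formal characterization of [8]:

```python
# One time instant of a timed net with immediate and timed transitions:
# immediate occurrences are exhausted first, then timed occurrences begin.
def step(enabled_immediate, enabled_timed, fire, start_occurrence):
    while True:
        ready = enabled_immediate()
        if not ready:
            break
        fire(ready[0])          # immediate occurrence: instantaneous
    for t in enabled_timed():
        start_occurrence(t)     # timed occurrence: begins now, ends later

# Toy driver: two immediate firings precede the start of one timed occurrence.
log, pending = [], ["i1", "i2"]
step(
    enabled_immediate=lambda: pending[:1],
    enabled_timed=lambda: ["t1"],
    fire=lambda t: (log.append(("fire", t)), pending.pop(0)),
    start_occurrence=lambda t: log.append(("start", t)),
)
print(log)  # [('fire', 'i1'), ('fire', 'i2'), ('start', 't1')]
```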


Performance Analysis of Shared-Memory Bus-Based Multiprocessors Using Timed Petri Nets

http://dx.doi.org/10.5772/intechopen.75589


3. Pipelined processors

A timed Petri net model of a pipelined processor [14] at the level of instruction execution is shown in Figure 1 (as usual, timed transitions are represented by solid bars, and immediate transitions by thin bars). It is assumed that the first-level cache does not delay the processor, while cache misses (at the first-level cache) introduce a delay of tc processor cycles. For simplicity, only two levels of cache memory are represented in the model; it appears that such a simplification does not affect the results in a significant way [15].

Place Pnxt is marked when the processor is ready to execute the next instruction. Pnxt is a free-choice place with three possible outcomes that model issuing an instruction without any further delay (Ts0 with the choice probability ps0), a single-cycle pipeline stall (modeled by Td1 with the choice probability ps1 associated with Ts1), and a two-cycle pipeline stall (modeled by Td2 and then Td1 with the choice probability ps2 assigned to Ts2). Other pipeline stalls could be represented in a similar way, if needed.

Marked place Cont indicates that an instruction is ready to be issued to the execution pipeline. It is assumed that once the instruction enters the pipeline, it will progress through all the stages and, eventually, leave the pipeline. Since the details of pipeline implementation are not important for performance analysis of the processor, they are not represented here. Only the first stage of the execution pipeline is shown as timed transition Trun.

Done is another free-choice place which determines whether the executing instruction results in a cache miss. Transition Tnxt occurs (with the corresponding probability) if a cache miss does not occur and the processor can continue fetching and issuing instructions. A cache miss is represented by Tsel. The choice probability associated with Tsel determines the instruction runlength, nℓ, i.e., the average number of instructions between two consecutive cache misses; if this choice probability is equal to 0.1, the runlength is equal to 10; if it is equal to 0.2, the runlength is 5; and so on.

Figure 1. Instruction-level Petri net model of a pipelined processor.
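The reciprocal relation between the choice probability of Tsel and the runlength can be checked directly (a trivial sketch; the function name is ours):

```python
# The runlength nl is the reciprocal of the cache-miss choice probability of Tsel.
def runlength(miss_probability):
    return 1.0 / miss_probability

print(runlength(0.1))  # 10.0
print(runlength(0.2))  # 5.0
```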

Psel is another free-choice place; it models the hits and misses of the second-level cache. The probability associated with transition Tloc represents the hit ratio of the second-level cache (the occurrence time of Tloc is the average access time to the second-level cache, tc), while the miss ratio is associated with transition Tmem, which represents accesses to the main memory (with the occurrence time tm).

Typical values of modeling parameters used in this chapter are shown in Table 1.

All temporal data in Table 1 (i.e., cache and memory access times) are in processor cycles.

Processor utilization as a function of h1, the hit rate of the first-level cache, is shown in Figure 2 for two values of the second-level cache access time, tc = 5 and tc = 10. It should not be surprising that processor utilization is quite sensitive to the values of h1, but is much less sensitive to the values of tc.


| Symbol | Parameter | Value |
|--------|-----------|-------|
| h1 | First-level cache hit rate | 0.9 |
| h2 | Second-level cache hit rate | 0.8 |
| tp | First-level cache access time | 1 |
| tc | Second-level cache access time | 5 |
| tm | Main memory access time | 25 |
| ps1 | Prob. of one-cycle pipeline stall | 0.1 |
| ps2 | Prob. of two-cycle pipeline stall | 0.05 |

Table 1. Modeling parameters and their typical values.


Figure 2. Processor utilization as a function of first-level cache hit rate for h2 = 0.8, ps = 0.2.




4. Shared-memory bus-based systems

An outline of a shared-memory bus-based multiprocessor is shown in Figure 5. The system is composed of n identical processors, which access the shared memory using a system bus. To reduce the average access time to the shared memory, the processors use (multilevel) cache memories. It is assumed that memory consistency is provided by a cache coherence mechanism [16], which usually increases the miss ratio of accessing caches (and is otherwise not represented in the model).

Figure 5. A shared-memory bus-based multiprocessor.

A timed Petri net model of a shared-memory bus-based multiprocessor is shown in Figure 6. It contains models of n processors (only two are shown in Figure 6), which are copies of the model shown in Figure 1 except for the main memory (transition Tmem), which becomes the shared memory in Figure 6. The remaining part of Figure 6 models the bus that coordinates accesses of processors to the shared memory.


Processor utilization as a function of h2, the hit rate of the second-level cache, is shown in Figure 3 for two values of the main memory access time, tm = 25 and tm = 50. Processor utilization is rather insensitive to the values of h2, and does not change much with tm.

Processor utilization as a function of the probability of pipeline stalls, ps = ps1 + 2*ps2, is shown in Figure 4 for three combinations of values of tc and tm.

Again, processor utilization is rather insensitive to the probability of pipeline stalls as well as the values of tc and tm.

Figure 3. Processor utilization as a function of second-level cache hit rate for h1 = 0.9, ps = 0.2.

Figure 4. Processor utilization as a function of probability of pipeline stalls for h1 = 0.9, h2 = 0.8.

For pipelined processors shown in Figure 1, processor utilization can be estimated using the following formula:

$$u_p = \frac{1}{1 + p_{s1} + 2\,p_{s2} + (1 - h_1)\,(t_c + (1 - h_2)\,t_m)}.\tag{1}$$

For the values of modeling parameters shown in Table 1, processor utilization is:

$$u_p = \frac{1}{1 + 0.1 + 0.1 + 0.1\,(5 + 0.2 \cdot 25)} \approx 0.45.\tag{2}$$

The estimated values agree quite well with the values shown in Figures 2–4.
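Formula (1) is straightforward to evaluate; the sketch below (parameter names mirror Table 1; the function name is ours) reproduces the estimate of Eq. (2):

```python
# Estimated processor utilization, Eq. (1):
# u_p = 1 / (1 + ps1 + 2*ps2 + (1 - h1) * (tc + (1 - h2) * tm))
def utilization(h1, h2, tc, tm, ps1, ps2):
    return 1.0 / (1.0 + ps1 + 2.0 * ps2 + (1.0 - h1) * (tc + (1.0 - h2) * tm))

# Table 1 values: h1 = 0.9, h2 = 0.8, tc = 5, tm = 25, ps1 = 0.1, ps2 = 0.05
print(round(utilization(0.9, 0.8, 5, 25, 0.1, 0.05), 2))  # 0.45
```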
