
**Chapter 13**

**Timed Petri Nets in Performance Exploration of Simultaneous Multithreading**

Wlodek M. Zuberek

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48601

©2012 Zuberek, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**1. Introduction**

In modern computer systems, the performance of the whole system is increasingly often limited by the performance of its memory subsystem [1]. Due to continuous progress in manufacturing technologies, processor performance has been doubling every 18 months (the so-called Moore's law [2]), while the performance of memory chips has been improving by only 10% per year [1], creating a "performance gap" between processor performance and the required memory bandwidth [3]. More detailed studies have shown that the number of processor cycles required to access main memory doubles approximately every six years [4]. In effect, the performance of applications increasingly depends on the performance of the system's memory hierarchy, and it is not unusual for processors to spend as much as 60% of their time waiting for the completion of memory operations [4].

Memory hierarchies, and in particular multi-level cache memories, have been introduced to reduce the effective latency of memory accesses [5]. Cache memories provide efficient access to information when the information is available at the lower levels of the memory hierarchy; occasionally, however, long-latency memory operations are needed to transfer the information from the higher levels of the memory hierarchy to the lower ones. Extensive research has focused on reducing and tolerating these large memory access latencies.

Techniques which tolerate long-latency memory accesses include out-of-order execution of instructions and instruction-level multithreading. The idea of out-of-order execution [1] is, instead of waiting for the completion of a long-latency operation, to execute instructions which (logically) follow the long-latency one but which do not depend upon its result. Since out-of-order execution exploits instruction-level concurrency in the executed sequential instruction stream, it conveniently maintains code-base compatibility [6]. In effect, the instruction stream is dynamically decomposed into micro-threads, which are scheduled and synchronized at no cost in terms of executing additional instructions. Although this is desirable, speedups using out-of-order
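The benefit of issuing independent instructions while a long-latency load is outstanding can be illustrated with a small Python sketch that contrasts in-order and out-of-order issue for a four-instruction fragment. The instruction encoding, the latency values, and the single-issue scheduling policy are simplifying assumptions introduced only for this illustration; they are not part of the chapter's models.

```python
# Toy single-issue pipeline: one instruction may be issued per cycle.
# Latencies are illustrative assumptions, not measured values.
LOAD_LATENCY = 10   # cycles for a (cache-missing) memory load
ALU_LATENCY = 1     # cycles for a simple register operation

# Each instruction: (name, destination register, source registers, latency).
# Registers r4 and r5 are never written here, so they are treated as
# available from cycle 0.
program = [
    ("load r1", "r1", [],           LOAD_LATENCY),  # long-latency memory access
    ("add r2",  "r2", ["r1"],       ALU_LATENCY),   # depends on the load
    ("mul r3",  "r3", ["r4", "r5"], ALU_LATENCY),   # independent of the load
    ("sub r6",  "r6", ["r4"],       ALU_LATENCY),   # independent of the load
]

def run(out_of_order):
    """Return the cycle at which the last result becomes available."""
    ready_at = {}            # register -> cycle when its value is available
    pending = list(program)
    cycle = 0
    while pending:
        issued = None
        for instr in pending:
            _, _, srcs, _ = instr
            if all(ready_at.get(s, 0) <= cycle for s in srcs):
                issued = instr   # all source operands are ready
                break
            if not out_of_order:
                break            # in-order: the oldest instruction blocks all younger ones
        if issued is not None:
            _, dst, _, lat = issued
            ready_at[dst] = cycle + lat
            pending.remove(issued)
        cycle += 1               # one issue slot per cycle
    return max(ready_at.values())

print("in-order completion cycle:    ", run(out_of_order=False))
print("out-of-order completion cycle:", run(out_of_order=True))
```

With these assumed latencies the out-of-order schedule finishes at cycle 11 rather than cycle 13, because the two instructions that do not depend on the load are issued while the load is still in progress.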
